Lecture #9, May 3, 2007
Cse322, Programming Languages and Compilers
104/21/23
Lecture #9, May 3, 2007
• Project #2
• Peephole optimizations
• Midterm Histogram
[Histogram of midterm scores, drawn with x marks over a horizontal axis running from 30 to 90]
Assignments
• Project #1 is due today.
– Email me your solution by midnight tonight
– All I want is your “Phase1.sml” file.
– PLEASE put your name as a comment in the file.
• Project #2 is officially assigned Tuesday May 8.
– Due 2 weeks from then, Tuesday May 22
– The template will be made available on Tuesday
– We will talk about it today in class
• Reading
– Optimizations
– Chapter 8 Section 8.4
– Chapter 10 Sections 10.1 – 10.3
Project 2
• Project 2 has three parts
1. Putting IR code in canonical form
» See lecture 8 (More about IR1)
2. Finalization of offsets
3. Writing a simple peephole optimizer for IR1
• Project #2 is due on Tuesday, May 22, 2007
– The template contains a complete solution to Project 1, so you might not want it until you hand in Project 1.
– You may start Project 2 by using only the IR1.sml file
– The template provides a mechanism for testing your code by parsing, and generating IR code for you to transform. It is not necessary to have the template to get started.
Canonical form
• Using the starting point discussed in Lecture 8, you should write a function that takes an IR.FUNC list to an IR.FUNC list
• It should remove all ESEQ constructors.
• The only expressions left should be pure ones without any embedded statements.
• This is a straightforward walk over all the IR datatypes, as illustrated in lecture 8.
• Just complete the code in S08code.sml from the notes webpage
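The idea of removing ESEQ can be sketched on a miniature IR. Everything in this block is hypothetical: the datatypes exp and stmt and the functions liftE and liftS are stand-ins invented for illustration, not the IR1 definitions or the code in S08code.sml. Each expression is split into a list of statements to run first and a pure expression. The sketch also assumes that statements lifted out of a right operand do not change the value of the left operand; a complete canonicalizer would have to save the left operand in a fresh temporary when that cannot be guaranteed.

```sml
(* Hypothetical miniature IR; the real IR1 datatypes have more constructors. *)
datatype exp
  = CONST of int
  | TEMP of int
  | BINOP of string * exp * exp
  | ESEQ of stmt list * exp          (* run the statements, then yield exp *)
and stmt
  = MOVE of exp * exp
  | EVAL of exp

(* liftE e = (ss, e'): e' contains no ESEQ, and ss must run before e'. *)
fun liftE (CONST n) = ([], CONST n)
  | liftE (TEMP t) = ([], TEMP t)
  | liftE (BINOP (oper, a, b)) =
      let val (sa, a') = liftE a
          val (sb, b') = liftE b
      in (* assumes sb does not change the value of a' *)
         (sa @ sb, BINOP (oper, a', b'))
      end
  | liftE (ESEQ (ss, e)) =
      let val (se, e') = liftE e
      in (List.concat (map liftS ss) @ se, e') end

(* liftS s = an equivalent list of statements with no ESEQ anywhere. *)
and liftS (MOVE (d, s)) =
      let val (sd, d') = liftE d
          val (ss, s') = liftE s
      in sd @ ss @ [MOVE (d', s')] end
  | liftS (EVAL e) =
      let val (se, e') = liftE e
      in se @ [EVAL e'] end
```

For example, liftS applied to MOVE (TEMP 0, ESEQ ([MOVE (TEMP 1, CONST 5)], BINOP ("+", TEMP 1, CONST 1))) produces two plain MOVE statements: the embedded assignment to TEMP 1 first, then the assignment to TEMP 0.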
Finalizing offsets
• Recall, method parameters (PARAM), local method variables (VAR), and object instance variables (MEMBER) are all logical indexes.
• The integer n denotes the nth parameter, local variable, or instance variable.
• We need to translate all these to a physical offset
• This requires computing the size of all parameters, variables, and instance variables and assigning an offset to each one.
• Assumptions
– All variables have the same size (4 bytes)
– Information about variables can be computed from information in the FUNC data structure. This is true only of parameters and local variables; it is not always the case for instance variables.
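Under the 4-byte assumption, turning a logical index into a physical offset is a multiplication. The sketch below is only an illustration: the names wordSize, physicalOffset, and assignOffsets are hypothetical, and it assumes zero-based indexes and offsets that grow upward from 0, neither of which the slides specify.

```sml
(* From the slide: every variable occupies 4 bytes. *)
val wordSize = 4

(* Assumption: logical indexes are zero-based and offsets start at 0. *)
fun physicalOffset n = n * wordSize

(* Pair each name, in declaration order, with its physical offset. *)
fun assignOffsets names =
  let fun go _ [] = []
        | go n (v :: vs) = (v, physicalOffset n) :: go (n + 1) vs
  in go 0 names end
```

With these assumptions, assignOffsets ["a", "b", "c"] pairs a with 0, b with 4, and c with 8.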
Peephole optimization
• After canonicalization we often generate code that could be simplified by looking at a small window of IR statements.
• For example, useless jumps
L0: if MEM(V1) == 1 GOTO L1   % Entry: x
    JUMP L4
L4: if MEM(V2) == 1 GOTO L5   % Entry: y && (!z)
    JUMP L2
L5: if MEM(P1) == 1 GOTO L2   % Entry: !z
    JUMP L1
L1: T0 := 1                   % True: x || (y && (!z))
    JUMP L3
L2: T0 := 0                   % False: x || (y && (!z))
L3:                           % Exit: x || (y && (!z))
• You are to write a peephole optimizer that, at a minimum, removes useless jumps. You may add other optimizations.
• Extra credit for each additional optimization. To get credit you must:
– Explain each optimization
– and provide tests that illustrate it
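In the listing above, JUMP L4 sits directly before the label L4, and JUMP L1 directly before L1, so both jumps are useless. A minimal sketch of removing that pattern follows. The datatype stmt and the function removeUselessJumps are hypothetical simplifications, not the IR1 types from the template; OTHER stands in for every statement that is neither a jump nor a label.

```sml
(* Hypothetical, simplified statement type. *)
datatype stmt
  = LABEL of string
  | JUMP of string
  | OTHER of string    (* any other IR statement, kept as text *)

(* Drop a JUMP whose target is the label that immediately follows it. *)
fun removeUselessJumps [] = []
  | removeUselessJumps (JUMP target :: LABEL l :: rest) =
      if target = l
      then removeUselessJumps (LABEL l :: rest)
      else JUMP target :: removeUselessJumps (LABEL l :: rest)
  | removeUselessJumps (s :: rest) = s :: removeUselessJumps rest
```

A jump to any label other than the next one, such as JUMP L2 before L5 in the listing, is left in place.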
More about Initialization and offsets of instance vars
• Finalizing offsets of instance variables is tricky
– class R { int x =0; int y =1 }
– class S extends R { int x=2; int z = 3}
– class T extends S { int y = 4; int w = 5}
– x has offset 0
– y has offset 1
– z has offset 2
– w has offset 3
– But in S, x appears to have offset 0, and z appears to have offset 1.
• Initialization is also tricky
– R { x =0; y = 1}
– S {x=2; y=1; z= 3}
– T {x=2; y = 4; z = 3; w = 5}
Where is this information?
• We need to decide how to maintain and use this information.
• By the time the ProgramTypes code has been translated to IR1, this information is sometimes missing.
• We need to do 2 things
– We need to construct a table, indexed by class and instance variable name.
– Make sure both class name and instance variable name are available
• We need both the instance variable and the class name to access this information
– obj.x            Member(loc,obj,R,x)
– obj.x = 25       Assign(SOME obj,x,NONE,25)
– obj.x[i] = 25    Assign(SOME obj,x,SOME i,25)
• Note: the class name is missing from assignments
Class Table
class R { int x =0; int y =1 }
class S extends R { int x=2; int z = 3}
class T extends S { int y = 4; int w = 5}
datatype entry = entry of string * (string* int* Exp option) list;
type table = entry list;
We must build this from ProgramTypes before translating, and use it in the finalization of offsets phase. It is also useful in the translation to IR1 phase (for the new object expression).
class  variable  offset  initialization
R      x         0       =0
       y         1       =1
S      x         0       =2
       y         1       =1
       z         2       =3
T      x         0       =2
       y         1       =4
       z         2       =3
       w         3       =5
The Class table
datatype entry =
entry of string *
(int *
Type *
string *
Exp option) list;
type table = entry list;
val classTable = ref ([]: entry list);
classTable is a global reference variable; it is set by the type checker.
Class Table
class R { int x =0; int y =1 }
class S extends R { int x=2; int z = 3}
class T extends S { int y = 4; int w = 5}
datatype entry =
entry of string *
(int* string*
int*
Exp option) list;
type table = entry list;
class  variable  offset  initialization
R      x         0       =0
       y         1       =1
S      x         0       =2
       y         1       =1
       z         2       =3
T      x         0       =2
       y         1       =4
       z         2       =3
       w         3       =5
Fixing things
class R { int x =0; int y =1 }
class S extends R { int x=2; int z = 3}
• fix super sub: fix {int x =0; int y =1} with {int x=2; int z = 3}
• gives {int x =2; int y = 1; int z = 3}
• The position comes from the super class, but the initialization comes from the sub class.
• Algorithm. For each var in super, scan over sub looking for that variable. If it's there, replace the initialization in super, and remove it from sub.
• After all the variables of super have been scanned, add any variables left in sub to the end of super.
ML code
datatype entry = entry of string * (string*int*Exp) list;
type table = entry list;

fun scan vSuper [] = (NONE,[])
  | scan vSuper ((vSub,init)::xs) =
      if vSuper = vSub
      then (SOME init,xs)
      else let val (exp,xs2) = scan vSuper xs
           in (exp,(vSub,init)::xs2) end;

fun number n [] = []
  | number n ((v,exp)::xs) = (v,n,exp)::number (n+1) xs

fun fix n [] sub = number n sub
  | fix n ((s,exp)::ss) sub =
      case scan s sub of
        (NONE,xs) => (s,n,exp):: fix (n+1) ss xs
      | (SOME init,xs) => (s,n,init):: fix (n+1) ss xs

scan: scan over sub looking for the variable. If it's there, return its initialization (to replace the one in super) and remove it from sub.
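A worked run of fix on the R, S, T classes from the earlier slides may help. So that the block runs on its own, it re-declares scan, number, and fix with plain integers standing in for Exp initializers. The helper dropOffsets is hypothetical: it strips the offsets from an already fixed class so that the result can serve as the super argument for the next subclass.

```sml
fun scan vSuper [] = (NONE, [])
  | scan vSuper ((vSub, init) :: xs) =
      if vSuper = vSub then (SOME init, xs)
      else let val (found, xs2) = scan vSuper xs
           in (found, (vSub, init) :: xs2) end

fun number n [] = []
  | number n ((v, init) :: xs) = (v, n, init) :: number (n + 1) xs

fun fix n [] sub = number n sub
  | fix n ((s, init) :: ss) sub =
      (case scan s sub of
         (NONE, xs) => (s, n, init) :: fix (n + 1) ss xs
       | (SOME init', xs) => (s, n, init') :: fix (n + 1) ss xs)

(* Hypothetical helper: forget offsets so a fixed class can act as a super. *)
fun dropOffsets vars = map (fn (v, _, init) => (v, init)) vars

(* Integers stand in for the Exp initializers of the slides. *)
val r = fix 0 [] [("x", 0), ("y", 1)]
val s = fix 0 (dropOffsets r) [("x", 2), ("z", 3)]
val t = fix 0 (dropOffsets s) [("y", 4), ("w", 5)]
```

Here s comes out as x at offset 0 with initializer 2, y at offset 1 with initializer 1, and z at offset 2 with initializer 3, matching the S rows of the class table. Note that t is built from the fixed s rather than from S's own declarations, which is why y and z keep their inherited positions.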
Does the order matter?
• Note we must process the super of the super (if any) before we process the subclass, or it won’t have its position correct.
• Solution.
– Perform a topological sort
– Use the class table (CTab) returned by the type checker to get the order right.
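One simple way to get a superclass-first order is to repeatedly pick out the classes whose superclass has already been placed. The sketch below is an assumption-laden illustration, not the template's code: it works on plain (class name, superclass name) pairs rather than on CTab, the function name superFirst is hypothetical, and it treats "object" as the root, following the process function shown in the template.

```sml
(* classes: (name, super) pairs. Result: the same pairs, supers first. *)
fun superFirst classes =
  let fun go placed [] = rev placed
        | go placed pending =
            let (* a class is ready once its super is the root or placed *)
                fun ready (_, super) =
                  super = "object" orelse
                  List.exists (fn (name, _) => name = super) placed
                val (now, later) = List.partition ready pending
            in if null now
               then raise Fail "cyclic or unknown superclass"
               else go (List.revAppend (now, placed)) later
            end
  in go [] classes end
```

For the running example, superFirst [("T", "S"), ("S", "R"), ("R", "object")] places R first, then S, then T. This version rescans the pending list on every round, which is quadratic in the worst case but adequate for a handful of classes.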
This code is in the template
fun cName (ClassDec(loc,this,super,vars,methods)) = this;
fun cVars (ClassDec(loc,this,super,vars,methods)) = vars;

fun findInstVars name [] = []
  | findInstVars name (c::cs) =
      if cName c = name
      then let fun project(VarDecl(l,t,n,i)) = (n,i)
           in map project (cVars c) end
      else findInstVars name cs;

fun process n "object" sub classes =
      entry(sub,fix 0 [] (findInstVars sub classes))
  | process n super sub classes =
      entry(sub,fix n (findInstVars super classes)
                      (findInstVars sub classes))
Small Changes to Program Types
• Old
datatype Stmt
= Assign of Exp option * Id * Exp option * Exp
• New
datatype Stmt
= Assign of (Exp*string) option * Id *
(Exp*Basic) option * Exp
This information is placed there by the type checker.
Example use: obj.x = 99
class T {
int instance2 = 0;
public int f(int j) { return j; }
}
class test05 {
int instance1 = 0;
public int test(int param1, T object1) {
int var1 = 0;
object1.instance2 = 99;
}
}
Translating
fun pass1E env exp =
case exp of
Assign(SOME (obj,class),x,NONE,v) =>
(* non-array e.x = v *)
let val target = pass1E env obj
val addr = AddressOfMember env target class x
val value = pass1E env v
in [MOVE(addr,value)] end
Resulting IR: MEM(P2) + 1 := 99
AddressOfMember adds the offset of x in class to the address target.
Notes about Project 2
• The class Table
– I have installed a class table that is initialized by the type checker.
– All the pertinent information about classes and instance variables is stored in the table.
• The drivers
– The drivers give you means to run the parser, the type checker, and the ir1 translation mechanism.
– You may either return the data structures or print them out.
• Templates for the three transformations
– I have provided a template for the three transformations.
Example information
class T has vars:
0: int instance2 := 0
class S has vars:
0: int instance2 := 1;
1: int y := 5
class R has vars:
0: int instance2 := 0;
1: int y := 6;
2: int w := 10
class test05 has vars:
0: int i0 := 0;
1: int i1 := 1
class T { int instance2 = 0;}
class S extends T { int instance2 = 1; int y = 5;}
class R extends T { int y = 6; int w = 10; }
class test05 { int i0 = 0; int i1 = 1;}
Access to the information
• You may access the information by fetching the table from the reference variable
– (! TypeChecker.classTable )
• Or you may print it out using
– TypeChecker.showTable ()
Template Drivers
• In the Driver file are a number of drivers you can use to access the parser, the typechecker, and the IR-translator.
fun parseFileToList file = parse file true
fun parseAndTypeCheck file =
TCProgram(parse file true);
fun parseTypeCheckPass1 file =
case parseAndTypeCheck file of
(classes,env) => pass1P [] (Program classes)
Showing
fun showParsedProgram file =
case parseFileToList file of
Program cs => print(plistf showClassDec "" cs);
fun showTypeCheckedProgram file =
case parseAndTypeCheck file of
(classes,env) => print(plistf showClassDec "" classes);
fun showPhase1IR file =
case parseAndTypeCheck file of
(classes,env) =>
let val cs = pass1P [] (Program classes)
val _ = print "================================="
val _ = TypeChecker.showTable()
val _ = print "=================================\n"
in print(plistf IR1.sFUNC "\n" cs) end;
Templates for the three transformations.
structure Phase2 = struct
fun cannonical x = x;
fun finalizeOffset table x = x;
fun peephole x = x;
Writing the transformations.
• The work of the transformations is done on the Exp and Stmt level. But the transformations work over programs.
• We need to drill our way down to the parts that matter.
Canonical
(* Shown top-down. In the file, define the helpers first or join the functions with "and". *)
fun cannonical (Program cs) =
      map cannonicalC cs;
fun cannonicalC (ClassDec(loc,name,super,vs,ms)) =
      ClassDec(loc,name,super
              ,map cannonicalVs vs
              ,map cannonicalMs ms)
fun cannonicalMs (MetDecl(loc,typ,nam,ps,vs,stmts)) = . . .
Finalize
• Finalize has a similar structure, but also takes a class table as input.
• This needs to be piped down as well.
• This will be useful when finalizing offsets for member access and assignment.
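Passing the table down can be sketched on a toy program structure. All types here (stmt, method, class, table) and the helper indexOf are hypothetical stand-ins invented for this illustration; only the name finalizeOffset comes from the Phase2 template. The sketch assumes 4-byte slots and zero-based logical indexes.

```sml
(* Hypothetical toy program structure. *)
datatype stmt
  = SetMember of string * string * int   (* class, field, value *)
  | SetSlot of int * int                 (* physical offset, value *)
datatype method = Method of string * stmt list
datatype class = Class of string * method list

(* Hypothetical table: class name -> (field name, logical index) list. *)
type table = (string * (string * int) list) list

val wordSize = 4

fun indexOf (table : table) cls field =
  case List.find (fn (c, _) => c = cls) table of
    NONE => raise Fail ("unknown class " ^ cls)
  | SOME (_, fields) =>
      (case List.find (fn (f, _) => f = field) fields of
         NONE => raise Fail ("unknown field " ^ field)
       | SOME (_, n) => n)

(* The table is threaded through every level until it reaches the statements. *)
fun finalizeS table (SetMember (cls, field, v)) =
      SetSlot (indexOf table cls field * wordSize, v)
  | finalizeS _ s = s
fun finalizeM table (Method (name, body)) =
      Method (name, map (finalizeS table) body)
fun finalizeC table (Class (name, ms)) =
      Class (name, map (finalizeM table) ms)
fun finalizeOffset table classes = map (finalizeC table) classes
```

Only finalizeS actually consults the table; the class and method levels just hand it on, which is the shape the slide describes.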
What to turn in
• I will provide a template containing a parser, pretty printer, and a type checker, just as before, with the small changes I mentioned.
• You will need to add the code for building and passing around the class table.
• Use your own IR translator, and add
– A post-processing canonical phase
– A finalization of offsets
– A simple peephole optimizer
• Hand in just this one file.
Optimization
• We will look at a number of optimizations to low level code.
• Peephole
• Local Optimizations
– Constant Folding
– Constant Propagation
– Copy Propagation
– Reduction in Strength
– Inlining
– Common sub-expression elimination
• Loop Optimizations
– Loop Invariants
– Reduction in strength due to induction variables
– Loop unrolling
• Global Optimizations
– Dead Code elimination
– Code motion
» Reordering
» Code hoisting
Inefficiencies
• Note that automatic translation schemes leave much to be desired. Consider
Push r13                 push it as an arg to -
Movi 1 r14               r14 := 1
Push r14                 push it as an arg to -
Pop r15                  get args to -
Pop r16
Prim - [r15 r16] r10     r10 := x2 - 1
• In a stack machine, we push arguments on the stack to protect them from recursive calls, only to pop them again, most of the time without any recursive call in between.
Another Example
Pop r9                   pop the result of recursive call
Push r9                  push it as arg to *
Pop r17                  pop the two args to times
Pop r18
Prim * [r17 r18] r6      perform the multiply
• Here we pop things, only to immediately push them back on the stack.
Peep Hole optimizations
Push r13                 push it as an arg to -
Movi 1 r14               r14 := 1
Push r14                 push it as an arg to -
Pop r15                  get args to -
Pop r16
Prim - [r15 r16] r10     r10 := x2 - 1
• In the first example, Push r14 is immediately followed by Pop r15, so the pair only copies r14 into r15. We can remove the Push ; Pop sequence by renaming r15 to r14 everywhere.
Push r13                 push it as an arg to -
Movi 1 r14               r14 := 1
Pop r16
Prim - [r14 r16] r10     r10 := x2 - 1
Code Movement
Push r13                 push it as an arg to -
Movi 1 r14               r14 := 1
Pop r16
Prim - [r14 r16] r10     r10 := x2 - 1
• Now note that the Movi instruction doesn't change the stack, so we could move it before the Push (or after the Pop) getting:
Movi 1 r14               r14 := 1
Push r13                 push it as an arg to -
Pop r16
Prim - [r14 r16] r10     r10 := x2 - 1
• But now we have a Push Pop sequence!
Movi 1 r14               r14 := 1
Prim - [r14 r13] r10     r10 := x2 - 1
Peephole Pattern Matching Implementation
• Using pattern matching, this is easy to implement.
• First we need a function that in a code sequence substitutes one register for another everywhere.
• Next we need to express the patterns we are looking for.
• Finally we need to apply these patterns on every code sequence.
• What does a pattern look like?
• (Push x) :: (Pop y) :: moreInstrs
Subreg
fun subreg M instr =
let fun lookup [] x = x
| lookup ((y,v)::m) x =
if x=y then v else lookup m x
in case instr of
Init => Init
| Halt => Halt
| Movi(n,r) => Movi(n,lookup M r)
| Mov(r1,r2) =>
Mov(lookup M r1, lookup M r2)
| Inc(r,n) => Inc(lookup M r,n)
| Push r => Push (lookup M r)
| Pop r => Pop(lookup M r)
| Ld(r1,r2) =>
Ld(lookup M r1, lookup M r2)
Subreg (continued)
| St(r1,r2) => St(lookup M r1, lookup M r2)
| Sw(r1,r2) => Sw(lookup M r1, lookup M r2)
| Brz(r,n) => Brz(lookup M r,n)
| Brnz(r,n) => Brnz(lookup M r,n)
| Skip n => Skip n
| Prim(s,rs,r) => Prim(s,map (lookup M) rs,lookup M r)
| Label s => Label s
| Movl(s,r) => Movl(s,lookup M r)
| Goto s => Goto s
| Brzl(r,s) => Brzl(lookup M r,s)
| Brnzl(r,s) => Brnzl(lookup M r,s)
end;
peep function
fun peep [] ans = reverse ans
| peep ((Push r1)::(Pop r2)::m) ans =
peep (map (subreg [(r2,r1)]) m) ans
| peep ((i as (Push r1)) ::
(z as ((Movi(n,r2)) ::
(Pop r3) :: m))) ans =
if r1<>r2
then peep
(map (subreg [(r3,r1)]) m)
((Movi(n,r2))::ans)
else peep z (i::ans)
| peep (i::is) ans = peep is (i::ans);
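To try peep without the full machine, the block below cuts the instruction set down to the four instructions used in the running example. The reduced instr datatype (with integer register numbers) and the driver peepFix are assumptions made for this sketch; peepFix simply repeats peep until the code stops changing. The Basis Library function rev is used where the slide writes reverse.

```sml
(* Assumption: a four-instruction subset with integer register numbers. *)
datatype instr
  = Push of int
  | Pop of int
  | Movi of int * int                 (* constant, destination register *)
  | Prim of string * int list * int   (* operator, arguments, destination *)

(* Rename registers in one instruction according to the association list m. *)
fun subreg m instr =
  let fun lookup [] x = x
        | lookup ((y, v) :: rest) x = if x = y then v else lookup rest x
  in case instr of
       Push r => Push (lookup m r)
     | Pop r => Pop (lookup m r)
     | Movi (n, r) => Movi (n, lookup m r)
     | Prim (s, rs, r) => Prim (s, map (lookup m) rs, lookup m r)
  end

fun peep [] ans = rev ans
  | peep (Push r1 :: Pop r2 :: m) ans =
      peep (map (subreg [(r2, r1)]) m) ans
  | peep ((i as Push r1) :: (z as (Movi (n, r2) :: Pop r3 :: m))) ans =
      if r1 <> r2
      then peep (map (subreg [(r3, r1)]) m) (Movi (n, r2) :: ans)
      else peep z (i :: ans)
  | peep (i :: is) ans = peep is (i :: ans)

(* Hypothetical driver: repeat peep until a pass changes nothing. *)
fun peepFix code =
  let val code' = peep code []
  in if code' = code then code else peepFix code' end

val example =
  [Push 13, Movi (1, 14), Push 14, Pop 15, Pop 16, Prim ("-", [15, 16], 10)]
val once = peep example []
val final = peepFix example
```

A single pass removes only the Push 14 ; Pop 15 pair, leaving Push 13, Movi 1 14, Pop 16, Prim - [14,16] 10. The second pass then matches the Push ; Movi ; Pop pattern, so final is Movi 1 14 followed by Prim - [14,13] 10. Each pass that changes anything deletes instructions, so peepFix terminates.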
How does this work?
Think of it as a pair of instruction streams where we move instructions from one stream to the other.
Push r13                 push it as an arg to -
Movi 1 r14               r14 := 1
Push r14                 push it as an arg to -
Pop r15                  get args to -
Pop r16
Prim - [r15 r16] r10     r10 := x2 - 1

input: Push 13, Movi 1 14, Push 14, Pop 15, Pop 16, Prim[15,16] 10
ans: (empty)
Example
Using the peep function defined above, the two streams start as:

input: Push 13, Movi 1 14, Push 14, Pop 15, Pop 16, Prim[15,16] 10
ans: (empty)

Push 13 and Movi 1 14 match neither Push pattern, so they move to ans:

input: Push 14, Pop 15, Pop 16, Prim[15,16] 10
ans: Push 13, Movi 1 14
Example (continued 1)
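Push 14 ; Pop 15 is removed, and r15 is renamed to r14 in the remaining input:

input: Pop 16, Prim[14,16] 10
ans: Push 13, Movi 1 14

The remaining instructions move to ans:

input: (empty)
ans: Push 13, Movi 1 14, Pop 16, Prim[14,16] 10

Start over again, with the result as the new input:

input: Push 13, Movi 1 14, Pop 16, Prim[14,16] 10
ans: (empty)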
Example (Continued 2)
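input: Push 13, Movi 1 14, Pop 16, Prim[14,16] 10
ans: (empty)

Push 13 ; Movi 1 14 ; Pop 16 matches the second pattern, so the Push and Pop are removed, r16 is renamed to r13, and the Movi moves to ans:

input: Prim[14,13] 10
ans: Movi 1 14

The last instruction moves to ans:

input: (empty)
ans: Movi 1 14, Prim[14,13] 10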