Basic Blocks
Mooly Sagiv
Schreiber 317, 03-640-7606
Wed 10:00-12:00
http://www.math.tau.ac.il/~msagiv/courses/wcc.html
Already Studied
Source program (string)
→ lexical analysis → Tokens
→ syntax analysis → Abstract syntax tree
→ semantic analysis → Abstract syntax tree
→ Translate → Tree IR
Mismatches between IR and Machine Languages
• CJUMP jumps to one of two labels
  – But typical machine instructions use one target
    BEQ Ri, Rj, L
• Optimizing IR programs is difficult due to side effects in expressions
  – ESEQ nodes
  – Call nodes
• Call nodes within expressions prevent passing arguments in registers
Mismatches between IR and Machine Languages
• Call nodes within expressions prevent passing arguments in registers
BINOP(PLUS, CALL(NAME f1, exp1), CALL(NAME f2, exp2))
Why can’t we be smarter?
• Avoid two-way jumps
• Do not use ESEQ expressions
Three Phase Solution
• Rewrite the tree into a list of canonical trees without SEQ or ESEQ nodes
• Group the list into basic blocks
• Order basic blocks into a set of traces
  – CJUMP is immediately followed by its false label
nfact example
function nfactor (n: int): int=
if n = 0
then 1
else n * nfactor(n-1)
MOVE(TEMP t103,
 ESEQ(CJUMP(EQ, TEMP t128, CONST 0, l0, l1),
  ESEQ(LABEL l0,
   ESEQ(MOVE(TEMP t129, CONST 1),
    ESEQ(JUMP(NAME l2),
     ESEQ(LABEL l1,
      ESEQ(MOVE(TEMP t129, BINOP(TIMES, TEMP t128, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))),
       ESEQ(LABEL l2, TEMP t129))))))))
SEQ(CJUMP(EQ, TEMP t128, CONST 0, l0, l1),
 SEQ(LABEL l0,
  SEQ(MOVE(TEMP t129, CONST 1),
   SEQ(JUMP(NAME l2),
    SEQ(LABEL l1,
     SEQ(SEQ(MOVE(TEMP t131, TEMP t128),
             SEQ(MOVE(TEMP t130, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1))),
                 MOVE(TEMP t129, BINOP(TIMES, TEMP t131, TEMP t130)))),
      SEQ(LABEL l2,
       MOVE(TEMP t103, TEMP t129))))))))
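Once no ESEQ nodes remain, a nested SEQ tree can be turned into a flat statement list by visiting its leaves from left to right. The following is a minimal sketch of that step, assuming a hypothetical miniature `Stm` type and a hypothetical helper `flatten`; neither is the `T_stm` type or the code in the course's canon module.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature statement type; not the T_stm of the course code. */
typedef struct Stm Stm;
struct Stm {
    int is_seq;         /* 1 for SEQ(left, right), 0 for any other statement */
    const char *text;   /* printable form of a non-SEQ statement */
    const Stm *left;    /* first child of a SEQ */
    const Stm *right;   /* second child of a SEQ */
};

/* Append the non-SEQ statements under s to out[] in execution order.
   n is the number of entries already in out[]; the new count is returned.
   The caller must supply an out[] large enough for every leaf. */
size_t flatten(const Stm *s, const Stm **out, size_t n)
{
    assert(s != NULL);
    if (s->is_seq) {
        n = flatten(s->left, out, n);     /* left subtree executes first */
        return flatten(s->right, out, n); /* then the right subtree */
    }
    out[n] = s;                           /* a leaf: one list element */
    return n + 1;
}
```

Calling `flatten(root, out, 0)` on a tree shaped like SEQ(a, SEQ(SEQ(b, c), d)) fills `out` with a, b, c, d in that order, regardless of how the SEQ nodes are nested.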
LABEL(l3)
CJUMP(EQ, TEMP t128, CONST 0, l0, l1)
LABEL( l0)
MOVE(TEMP t129, CONST 1)
JUMP(NAME l2)
LABEL( l1)
MOVE(TEMP t131, TEMP t128)
MOVE(TEMP t130, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))
MOVE(TEMP t129, BINOP(TIMES, TEMP t131, TEMP t130))
JUMP(NAME l2)
LABEL( l2)
MOVE(TEMP t103, TEMP t129)
JUMP(NAME lend)
LABEL(l3)
CJUMP(EQ, TEMP t128, CONST 0, l0, l1)
LABEL( l0)
MOVE(TEMP t129, CONST 1)
JUMP(NAME l2)
LABEL( l1)
MOVE(TEMP t131, TEMP t128)
MOVE(TEMP t130, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))
MOVE(TEMP t129, BINOP(TIMES, TEMP t131, TEMP t130))
/* JUMP(NAME l2) */
LABEL( l2)
MOVE(TEMP t103, TEMP t129)
JUMP(NAME lend)
Outline
• The Canon Interface
• Phase 1: Removal of ESEQ nodes
• Phase 2: Basic Blocks
• Phase 3: Order traces
  – CJUMP is followed by a false label
/* canon.h */
typedef struct C_stmListList_ *C_stmListList;
struct C_block { C_stmListList stmLists; Temp_label label; };
struct C_stmListList_ { T_stmList head; C_stmListList tail; };
T_stmList C_linearize(T_stm stm); /* Eliminate ESEQs */
struct C_block C_basicBlocks(T_stmList stmList);
T_stmList C_traceSchedule(struct C_block b);
/* main.c */
static void doProc(FILE *out, F_frame frame, T_stm body)
{ T_stmList stmList;
AS_instrList iList;
stmList = C_linearize(body); /* Removes ESEQs */
stmList = C_traceSchedule(C_basicBlocks(stmList));
iList = F_codegen(frame, stmList); /* 9 */
}
Canonical Trees (Phase 1)
• Rewrite the tree
  – No SEQ and ESEQ nodes
  – The parent of each CALL is either EXP or MOVE(TEMP t, …)
• Apply “meaning preserving” rewriting rules
• Sometimes generates temporaries
ESEQ(s1, ESEQ(s2, e))  ⇒  ESEQ(SEQ(s1, s2), e)
BINOP(op, ESEQ(s, e1), e2)  ⇒  ESEQ(s, BINOP(op, e1, e2))
MEM(ESEQ(s, e))  ⇒  ESEQ(s, MEM(e))
JUMP(ESEQ(s, e))  ⇒  SEQ(s, JUMP(e))
CJUMP(op, ESEQ(s, e1), e2, l1, l2)  ⇒  SEQ(s, CJUMP(op, e1, e2, l1, l2))
BINOP(op, e1, ESEQ(s, e2))  ⇒  ESEQ(s, BINOP(op, e1, e2))   when s and e1 commute
Which statements commute?
• In general very difficult
• Example: MOVE(MEM(e2), e3) and MEM(e1) commute only when e1 and e2 are different addresses
• The compiler decides conservatively
Which statements commute?
static bool commute(T_stm x, T_exp y)
{
  if (isNop(x)) return TRUE;
  if (y->kind == T_NAME || y->kind == T_CONST) return TRUE;
  return FALSE;
}
When s and e1 may not commute
BINOP(op, e1, ESEQ(s, e2))  ⇒  ESEQ(MOVE(TEMP t, e1), ESEQ(s, BINOP(op, TEMP t, e2)))
CJUMP(op, e1, ESEQ(s, e2), l1, l2)  ⇒  SEQ(MOVE(TEMP t, e1), SEQ(s, CJUMP(op, TEMP t, e2, l1, l2)))
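The two BINOP rules for an ESEQ in the right operand differ only in whether a temporary is needed. The following is a rough sketch of that choice, assuming a hypothetical miniature IR (`Exp`, `Stm`, `mk_exp`, `mk_move`, `lift_right_eseq` are all invented for illustration and are not the course's `T_exp`/`T_stm` API); `commutes` mirrors the conservative test shown above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical miniature IR; the real compiler uses T_exp and T_stm. */
typedef enum { E_CONST, E_NAME, E_TEMP, E_BINOP, E_ESEQ } ExpKind;
typedef enum { S_NOP, S_MOVE } StmKind;

typedef struct Exp Exp;
typedef struct Stm Stm;

struct Exp {
    ExpKind kind;
    int value;     /* constant, temp number, or operator code of a BINOP */
    Exp *left;     /* BINOP: left operand */
    Exp *right;    /* BINOP: right operand; ESEQ: result expression */
    Stm *stm;      /* ESEQ: statement executed first */
};

struct Stm {
    StmKind kind;
    Exp *dst, *src;   /* MOVE(dst, src) */
};

Exp *mk_exp(ExpKind kind, int value, Exp *left, Exp *right, Stm *stm)
{
    Exp *e = malloc(sizeof *e);
    assert(e != NULL);
    e->kind = kind; e->value = value;
    e->left = left; e->right = right; e->stm = stm;
    return e;
}

Stm *mk_move(Exp *dst, Exp *src)
{
    Stm *s = malloc(sizeof *s);
    assert(s != NULL);
    s->kind = S_MOVE; s->dst = dst; s->src = src;
    return s;
}

/* Conservative test: only a no-op statement, or a NAME/CONST expression,
   is known to commute. */
int commutes(const Stm *s, const Exp *e)
{
    if (s->kind == S_NOP) return 1;
    return e->kind == E_NAME || e->kind == E_CONST;
}

/* Rewrite BINOP(op, e1, ESEQ(s, e2)); *next_temp supplies fresh temp numbers. */
Exp *lift_right_eseq(Exp *binop, int *next_temp)
{
    assert(binop->kind == E_BINOP && binop->right->kind == E_ESEQ);
    int op = binop->value;
    Exp *e1 = binop->left;
    Stm *s = binop->right->stm;
    Exp *e2 = binop->right->right;

    if (commutes(s, e1))   /* s cannot change e1: ESEQ(s, BINOP(op, e1, e2)) */
        return mk_exp(E_ESEQ, 0, NULL, mk_exp(E_BINOP, op, e1, e2, NULL), s);

    /* Otherwise save e1 in a fresh temporary before s runs:
       ESEQ(MOVE(TEMP t, e1), ESEQ(s, BINOP(op, TEMP t, e2))) */
    Exp *t = mk_exp(E_TEMP, (*next_temp)++, NULL, NULL, NULL);
    Exp *inner = mk_exp(E_ESEQ, 0, NULL, mk_exp(E_BINOP, op, t, e2, NULL), s);
    return mk_exp(E_ESEQ, 0, NULL, inner, mk_move(t, e1));
}
```

Because the test is conservative, a left operand such as TEMP t128 always takes the second path, which is why the nfactor example ends up with an extra temporary.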
Worked example, starting from the original nfactor tree:
MOVE(TEMP t103,
 ESEQ(CJUMP(EQ, TEMP t128, CONST 0, l0, l1),
  ESEQ(LABEL l0,
   ESEQ(MOVE(TEMP t129, CONST 1),
    ESEQ(JUMP(NAME l2),
     ESEQ(LABEL l1,
      ESEQ(MOVE(TEMP t129, BINOP(TIMES, TEMP t128, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))),
       ESEQ(LABEL l2, TEMP t129))))))))
Lift the CJUMP out of the MOVE:
SEQ(CJUMP(EQ, TEMP t128, CONST 0, l0, l1),
 MOVE(TEMP t103,
  ESEQ(LABEL l0,
   ESEQ(MOVE(TEMP t129, CONST 1),
    ESEQ(JUMP(NAME l2),
     ESEQ(LABEL l1,
      ESEQ(MOVE(TEMP t129, BINOP(TIMES, TEMP t128, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))),
       ESEQ(LABEL l2, TEMP t129))))))))
Lift LABEL l0:
SEQ(CJUMP(EQ, TEMP t128, CONST 0, l0, l1),
 SEQ(LABEL l0,
  MOVE(TEMP t103,
   ESEQ(MOVE(TEMP t129, CONST 1),
    ESEQ(JUMP(NAME l2),
     ESEQ(LABEL l1,
      ESEQ(MOVE(TEMP t129, BINOP(TIMES, TEMP t128, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))),
       ESEQ(LABEL l2, TEMP t129))))))))
Lift MOVE(TEMP t129, CONST 1):
SEQ(CJUMP(EQ, TEMP t128, CONST 0, l0, l1),
 SEQ(LABEL l0,
  SEQ(MOVE(TEMP t129, CONST 1),
   MOVE(TEMP t103,
    ESEQ(JUMP(NAME l2),
     ESEQ(LABEL l1,
      ESEQ(MOVE(TEMP t129, BINOP(TIMES, TEMP t128, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))),
       ESEQ(LABEL l2, TEMP t129))))))))
Lift JUMP(NAME l2):
SEQ(CJUMP(EQ, TEMP t128, CONST 0, l0, l1),
 SEQ(LABEL l0,
  SEQ(MOVE(TEMP t129, CONST 1),
   SEQ(JUMP(NAME l2),
    MOVE(TEMP t103,
     ESEQ(LABEL l1,
      ESEQ(MOVE(TEMP t129, BINOP(TIMES, TEMP t128, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))),
       ESEQ(LABEL l2, TEMP t129))))))))
Lift LABEL l1:
SEQ(CJUMP(EQ, TEMP t128, CONST 0, l0, l1),
 SEQ(LABEL l0,
  SEQ(MOVE(TEMP t129, CONST 1),
   SEQ(JUMP(NAME l2),
    SEQ(LABEL l1,
     MOVE(TEMP t103,
      ESEQ(MOVE(TEMP t129, BINOP(TIMES, TEMP t128, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))),
       ESEQ(LABEL l2, TEMP t129))))))))
Lift the MOVE into TEMP t129:
SEQ(CJUMP(EQ, TEMP t128, CONST 0, l0, l1),
 SEQ(LABEL l0,
  SEQ(MOVE(TEMP t129, CONST 1),
   SEQ(JUMP(NAME l2),
    SEQ(LABEL l1,
     SEQ(MOVE(TEMP t129, BINOP(TIMES, TEMP t128, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))),
      MOVE(TEMP t103,
       ESEQ(LABEL l2, TEMP t129))))))))
Move the CALL result into a fresh temporary t130:
SEQ(CJUMP(EQ, TEMP t128, CONST 0, l0, l1),
 SEQ(LABEL l0,
  SEQ(MOVE(TEMP t129, CONST 1),
   SEQ(JUMP(NAME l2),
    SEQ(LABEL l1,
     SEQ(MOVE(TEMP t129,
              BINOP(TIMES, TEMP t128,
                    ESEQ(MOVE(TEMP t130, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1))),
                         TEMP t130))),
      MOVE(TEMP t103,
       ESEQ(LABEL l2, TEMP t129))))))))
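The MOVE of the CALL result may not commute with TEMP t128, so save TEMP t128 in a fresh temporary t131:
SEQ(CJUMP(EQ, TEMP t128, CONST 0, l0, l1),
 SEQ(LABEL l0,
  SEQ(MOVE(TEMP t129, CONST 1),
   SEQ(JUMP(NAME l2),
    SEQ(LABEL l1,
     SEQ(MOVE(TEMP t129,
              ESEQ(MOVE(TEMP t131, TEMP t128),
                   ESEQ(MOVE(TEMP t130, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1))),
                        BINOP(TIMES, TEMP t131, TEMP t130)))),
      MOVE(TEMP t103,
       ESEQ(LABEL l2, TEMP t129))))))))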
Lift MOVE(TEMP t131, TEMP t128) out of the MOVE into TEMP t129:
SEQ(CJUMP(EQ, TEMP t128, CONST 0, l0, l1),
 SEQ(LABEL l0,
  SEQ(MOVE(TEMP t129, CONST 1),
   SEQ(JUMP(NAME l2),
    SEQ(LABEL l1,
     SEQ(SEQ(MOVE(TEMP t131, TEMP t128),
             MOVE(TEMP t129,
                  ESEQ(MOVE(TEMP t130, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1))),
                       BINOP(TIMES, TEMP t131, TEMP t130)))),
      MOVE(TEMP t103,
       ESEQ(LABEL l2, TEMP t129))))))))
Lift the MOVE into TEMP t130:
SEQ(CJUMP(EQ, TEMP t128, CONST 0, l0, l1),
 SEQ(LABEL l0,
  SEQ(MOVE(TEMP t129, CONST 1),
   SEQ(JUMP(NAME l2),
    SEQ(LABEL l1,
     SEQ(SEQ(MOVE(TEMP t131, TEMP t128),
             SEQ(MOVE(TEMP t130, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1))),
                 MOVE(TEMP t129, BINOP(TIMES, TEMP t131, TEMP t130)))),
      MOVE(TEMP t103,
       ESEQ(LABEL l2, TEMP t129))))))))
Lift LABEL l2; no ESEQ nodes remain:
SEQ(CJUMP(EQ, TEMP t128, CONST 0, l0, l1),
 SEQ(LABEL l0,
  SEQ(MOVE(TEMP t129, CONST 1),
   SEQ(JUMP(NAME l2),
    SEQ(LABEL l1,
     SEQ(SEQ(MOVE(TEMP t131, TEMP t128),
             SEQ(MOVE(TEMP t130, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1))),
                 MOVE(TEMP t129, BINOP(TIMES, TEMP t131, TEMP t130)))),
      SEQ(LABEL l2,
       MOVE(TEMP t103, TEMP t129))))))))
A Theoretical Solution
• Apply rewriting rules until convergence
• The result need not be unique
• Efficiency and termination of the compiler become concerns
A Practical Solution
• Apply rewriting rules in “one” pass
• Two mutually recursive routines
  – do_stm(s) applies rewritings to s
  – do_exp(e) applies rewritings to e
• reorder(expRefList)
  – Returns the side effect statements in expRefList
  – Replaces expressions by temporaries
• Code distributed in “canon.c”
Taming Conditional Branch
• Reorder statements so that CJUMP is followed by a false label
• Two subphases:
  – Partition the statement list into basic blocks (straight-line programs starting with a label and ending with a branch)
  – Reorder basic blocks (Traces)
Phase 2: Basic Blocks
• The compiler does not know which branch will be taken
• Conservatively analyze the control flow of the program
• A basic block
  – The first statement is a LABEL
  – The last statement is JUMP or CJUMP
  – There are no other LABELs, JUMPs, or CJUMPs
An Algorithm for Basic Blocks
C_basicBlocks()
• Applied for each function body
• Scan the statement list from left to right
• Whenever a LABEL is found
  – a new block begins (and the previous block ends)
• Whenever a JUMP or CJUMP is found
  – the current block ends (and the next block begins)
• When a block ends without JUMP or CJUMP
  – Add a JUMP to the following LABEL
• When a block does not start with a LABEL
  – Add a LABEL
• At the end of the function body, jump to the beginning of the epilogue
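The scan described above can be sketched in a few lines of C. This is an illustrative sketch only, assuming a hypothetical flat `Stm` record with integer label names and hypothetical helpers `mk_stm` and `basic_blocks`; it is not the `C_basicBlocks` implementation, which works on `T_stmList`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flat statement record with integer label names. */
typedef enum { K_LABEL, K_JUMP, K_CJUMP, K_OTHER } Kind;

typedef struct {
    Kind kind;
    int label;    /* LABEL: its name; JUMP: target; CJUMP: true target */
    int label2;   /* CJUMP: false target; unused otherwise */
} Stm;

Stm mk_stm(Kind kind, int label, int label2)
{
    Stm s = { kind, label, label2 };
    return s;
}

/* Copy in[0..n) to out[], inserting LABELs and JUMPs so that every block
   starts with a LABEL and ends with a JUMP or CJUMP. out[] needs room for
   2*n + 1 statements. Returns the number of statements written and stores
   the number of blocks in *nblocks. */
size_t basic_blocks(const Stm *in, size_t n, Stm *out,
                    int done_label, int *next_label, size_t *nblocks)
{
    size_t m = 0;
    int open = 0;                 /* is a block currently in progress? */
    assert(next_label != NULL && nblocks != NULL);
    *nblocks = 0;
    for (size_t i = 0; i < n; i++) {
        if (in[i].kind == K_LABEL) {
            if (open)             /* previous block falls through: add JUMP */
                out[m++] = mk_stm(K_JUMP, in[i].label, 0);
            out[m++] = in[i];
            open = 1;
            (*nblocks)++;
            continue;
        }
        if (!open) {              /* block without a LABEL: invent one */
            out[m++] = mk_stm(K_LABEL, (*next_label)++, 0);
            open = 1;
            (*nblocks)++;
        }
        out[m++] = in[i];
        if (in[i].kind == K_JUMP || in[i].kind == K_CJUMP)
            open = 0;             /* a branch ends the current block */
    }
    if (open)                     /* last block jumps to the epilogue */
        out[m++] = mk_stm(K_JUMP, done_label, 0);
    return m;
}
```

The sketch treats every statement other than LABEL, JUMP, and CJUMP as opaque, which is all the partitioning step needs to know.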
LABEL(l3)
CJUMP(EQ, TEMP t128, CONST 0, l0, l1)
LABEL( l0)
MOVE(TEMP t129, CONST 1)
JUMP(NAME l2)
LABEL( l1)
MOVE(TEMP t131, TEMP t128)
MOVE(TEMP t130, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))
MOVE(TEMP t129, BINOP(TIMES, TEMP t131, TEMP t130))
JUMP(NAME l2)
LABEL(l2)
MOVE(TEMP t103, TEMP t129)
JUMP(NAME lend)
Traces
• Reorder basic blocks
  – Every CJUMP is followed by a false label
  – Many of the unconditional jumps are followed by the corresponding labels and can be eliminated
• A trace – a sequence of basic blocks that are executed sequentially
• A program has many overlapping traces
• Find a set of traces that exactly covers the program
  – every block appears in exactly one trace
• Minimize the number of traces
An Algorithm for Generating Traces
C_traceSchedule()
Put all the blocks of the program into a list Q
while Q is not empty do
    Start a new (empty) trace, call it T
    Remove the head element b of Q
    while b is not marked do
        mark b
        append b to the end of the current trace T
        if there is an unmarked successor c of b
            b := c
    end the current trace T
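The pseudocode above translates fairly directly into C. The sketch below is an assumption-laden illustration: `Block`, `MAX_SUCC`, and `build_traces` are hypothetical names, blocks are identified by array index, and successors are tried in the order the caller lists them (listing the false target of a CJUMP first is one plausible policy). Unlike the pseudocode, it skips already-marked heads instead of creating empty traces.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SUCC 2   /* a block ends in JUMP (1 successor) or CJUMP (2) */

/* Hypothetical block record: successors are block indices, -1 if absent. */
typedef struct {
    int succ[MAX_SUCC];
    int marked;
} Block;

/* Fill order[] with block indices, trace after trace, and trace_of[] with
   the trace number of each block. Returns the number of traces. */
size_t build_traces(Block *blocks, size_t n, size_t *order, size_t *trace_of)
{
    size_t ntraces = 0, pos = 0;
    for (size_t head = 0; head < n; head++) {   /* Q: blocks in given order */
        if (blocks[head].marked)
            continue;                           /* already in some trace */
        size_t b = head;
        for (;;) {
            int next = -1;
            blocks[b].marked = 1;
            order[pos++] = b;                   /* append b to current trace */
            trace_of[b] = ntraces;
            for (int k = 0; k < MAX_SUCC; k++) {
                int c = blocks[b].succ[k];
                assert(c < (int)n);
                if (c >= 0 && !blocks[c].marked) { next = c; break; }
            }
            if (next < 0)
                break;                          /* all successors are marked */
            b = (size_t)next;
        }
        ntraces++;                              /* end of current trace */
    }
    return ntraces;
}
```

Each block is marked exactly once, so every block lands in exactly one trace, which is the covering property required above.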
LABEL(l3)
CJUMP(EQ, TEMP t128, CONST 0, l0, l1)
LABEL( l0)
MOVE(TEMP t129, CONST 1)
JUMP(NAME l2)
LABEL( l1)
MOVE(TEMP t131, TEMP t128)
MOVE(TEMP t130, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))
MOVE(TEMP t129, BINOP(TIMES, TEMP t131, TEMP t130))
JUMP(NAME l2)
LABEL( l2)
MOVE(TEMP t103, TEMP t129)
JUMP(NAME lend)
(The slide marks two traces, T1 and T2, beside this listing.)
Finishing-Up
• CJUMP followed by a false label is left alone
• JUMP(NAME l) that is followed by the label l is removed
• CJUMP followed by a true label
  – swap the true and false labels and negate the condition
• If CJUMP(cond, a, b, lt, lf) is not followed by lt or lf
  – Replace by: CJUMP(cond, a, b, lt, l'f); LABEL(l'f); JUMP(NAME lf)
• At the end of the process, flatten the basic blocks into one statement list (trade simplicity for efficiency of the compiler and of the generated code)
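The clean-up rules above amount to one pass over the ordered statement list. The following is a minimal sketch under stated assumptions: `Stm`, `RelOp`, `negate`, and `finish` are hypothetical names, labels are non-negative integers, and every JUMP has the form JUMP(NAME l). It is not the code in canon.c.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { K_LABEL, K_JUMP, K_CJUMP, K_OTHER } Kind;
typedef enum { R_EQ, R_NE, R_LT, R_GE, R_GT, R_LE } RelOp;

/* Hypothetical flat statement record; labels are non-negative integers. */
typedef struct {
    Kind kind;
    RelOp op;     /* CJUMP: comparison; unused otherwise */
    int label;    /* LABEL: its name; JUMP: target; CJUMP: true target */
    int label2;   /* CJUMP: false target */
} Stm;

RelOp negate(RelOp op)
{
    switch (op) {
    case R_EQ: return R_NE;
    case R_NE: return R_EQ;
    case R_LT: return R_GE;
    case R_GE: return R_LT;
    case R_GT: return R_LE;
    case R_LE: return R_GT;
    }
    return op;
}

/* Copy in[0..n) to out[], applying the clean-up rules. out[] needs room
   for 3*n statements. Returns the number of statements written. */
size_t finish(const Stm *in, size_t n, Stm *out, int *next_label)
{
    size_t m = 0;
    assert(next_label != NULL);
    for (size_t i = 0; i < n; i++) {
        Stm s = in[i];
        /* Label that immediately follows s, or -1 if none does. */
        int follows = (i + 1 < n && in[i + 1].kind == K_LABEL)
                          ? in[i + 1].label : -1;
        if (s.kind == K_JUMP && s.label == follows)
            continue;                       /* jump to the next statement */
        if (s.kind == K_CJUMP && s.label2 != follows) {
            if (s.label == follows) {       /* true label follows: flip */
                int t = s.label;
                s.op = negate(s.op);
                s.label = s.label2;
                s.label2 = t;
            } else {                        /* neither label follows */
                Stm lab = { K_LABEL, R_EQ, (*next_label)++, 0 };
                Stm jmp = { K_JUMP, R_EQ, s.label2, 0 };
                s.label2 = lab.label;       /* fresh false label l'f */
                out[m++] = s;
                out[m++] = lab;
                out[m++] = jmp;
                continue;
            }
        }
        out[m++] = s;
    }
    return m;
}
```

After this pass every CJUMP is followed by its false label, so a code generator can emit a single conditional branch to the true label and fall through otherwise.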
LABEL(l3)
CJUMP(EQ, TEMP t128, CONST 0, l0, l1)
LABEL( l0)
MOVE(TEMP t129, CONST 1)
JUMP(NAME l2)
LABEL( l1)
MOVE(TEMP t131, TEMP t128)
MOVE(TEMP t130, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))
MOVE(TEMP t129, BINOP(TIMES, TEMP t131, TEMP t130))
JUMP(NAME l2)
LABEL( l2)
MOVE(TEMP t103, TEMP t129)
JUMP(NAME lend)
LABEL(l3)
CJUMP(EQ, TEMP t128, CONST 0, l0, l1)
LABEL( l0)
MOVE(TEMP t129, CONST 1)
JUMP(NAME l2)
LABEL( l1)
MOVE(TEMP t131, TEMP t128)
MOVE(TEMP t130, CALL(nfactor, BINOP(MINUS, TEMP t128, CONST 1)))
MOVE(TEMP t129, BINOP(TIMES, TEMP t131, TEMP t130))
/* JUMP(NAME l2) */
LABEL( l2)
MOVE(TEMP t103, TEMP t129)
JUMP(NAME lend)
Optimal Traces
• Optimizing compilers locate traces for frequently executed instructions
• Minimize the (dynamic) number of jumps
  – Improve instruction cache performance
• Improves register allocation
• Optimize loops
• Sometimes use
  – Static heuristics
  – Profiling information
  – Dynamic compilation
(a)
prologue statements
JUMP(NAME test)
LABEL(test)
CJ(=, i, N, done, body)
LABEL(body)
loop-body statements
JUMP(NAME test)
LABEL(done)
epilogue statements
(b)
prologue statements
JUMP(NAME test)
LABEL(test)
CJ(<=, i, N, body, done)
LABEL(done)
epilogue statements
LABEL(body)
loop-body statements
JUMP(NAME test)
(c)
prologue statements
JUMP(NAME test)
LABEL(body)
loop-body statements
JUMP(NAME test)
LABEL(test)
CJ(<=, i, N, body, done)
LABEL(done)
epilogue statements
Summary
• Tree-like IR saves temporary variables
• But ESEQ nodes may require some temporaries
• Rewriting rules are a powerful mechanism
• Estimating “commuting” statements is a challenge
• Traces may affect the performance of the generated code