Compilers and Optimizations NNGU,VMK 2004
2
Course introduction
• Oriented toward self-education
  – key documents
  – hands-on experience of working with a large, complex product
  – Contact me!
• Expected outcome: you will be able to…
  – understand the architecture of a typical compiler
  – study and work with GNU tools
  – analyze and fix compiler issues
  – implement experimental parts for GCC
4
Why compilers?
• A compiler is a tool that translates a high-level, human-oriented description of a program into binaries that hardware can execute to deliver the expected results.
• Heart of the programming industry
  – productivity
  – portability
  – easy to learn and use
5
Content
• Development tool chain
• Compiler overview
  – GNU compiler collection
• Compiler internals
6
Development tool chain (aged)
• User writes the program directly in target machine code
• I/O devices (toggle switches, punched-card readers): put the program into fixed memory
• Hardware conventions: how to start the program
• Target hardware: runs the program from a fixed address
7
Development tool chain (past)
• User writes the program in target assembly (textual!)
• Assembler: translates assembly into an object file containing target machine code
  – produces multiple object files with target machine code
• Linker: links objects into an executable, resolving references and adjusting the binary for execution
  – produces an executable file for the target machine and OS environment
• Target hardware: runs the program as an OS-conforming executable
• OS environment!
  – I/O and file system
  – task execution
  – execution conventions
8
Development tool chain (now)
• User writes the program in a high-level language
• Compiler: translates the high-level language to assembly (asm), object files (obj), or a binary file for execution
• Assembler: translates assembly language into an object file containing target machine code
• Linker: links multiple object files into an executable (exe) for the target machine and OS environment, resolving references and adjusting the binary for execution
• Target hardware: runs the program as an OS-conforming executable
9
GCC tool chain
• GNU compiler collection – “yellow” tools
  – C, C++, Objective-C, Fortran, Ada, Java, etc.
• Binutils package – “green” tools
  – BFD, linker: ld, assembler: as (gas)
  – other useful utilities (dumping, etc.)
• Target platform: hardware/OS – “blue”
  – numerous configurations (http://gcc.gnu.org/install/specific.html)
  – x86: aix-x86, linux-x86, beos-x86, mingw32-x86, etc.
10
Content
• Development tool chain
• Compiler overview
  – GNU compiler collection
• Compiler internals
11
Compiler Glossary
• IR – intermediate representation
• AST – abstract syntax tree
• FE – frontend
• BE – backend
12
Compilers
• “Source to source” compilers
  – HPF: converts Fortran to parallel Fortran
  – Pascal to C compilers
• “Source to binary” compilers
  – the most commonly used kind (Visual Studio, Intel compilers, GCC)
  – native, cross, “Canadian” (build, host, target)
  – output: target binary or assembler listing
• “Binary to source” compilers?
  – known as “decompilers” ;-)
• Binary optimizers
  – profile guided
13
Compiler components: “Source to source” compiler
Driver Module
Frontend
GLUE (FE IR to BE’s machine-independent IR converter)
Optimizer phases (machine-independent)
Code Generator: output new sources
14
Compiler components: “Source to target binary” compiler
Driver Module
Frontend
GLUE (FE IR to BE’s machine-independent IR converter)
Optimizer phases (machine-independent)
GLUE (BE machine-independent IR to BE machine-oriented IR )
Optimizer phases (machine-oriented)
Code Generator
15
Driver component
• Preliminary work
  – parses command-line arguments
  – sets up the compilation environment
  – invokes the subsequent compiler components
• Features
  – can be built in (Microsoft)
  – can be a separate program (Intel, GCC)
16
Frontend
• Parses source into a high-level IR (intermediate representation)
  – lexical and syntax analysis
  – diagnostics
  – machine independent (often)
  – can be generated from a description (grammar)
• GLUE
  – generator or converter to another IR, or even to target executable code
  – can be built into the FE (EDG FE for Java)
17
Optimizer (machine-independent)
• Works over machine-independent IR
  – classic optimizations (constant/copy propagation, dead code elimination, peepholes, etc.)
  – optimizations are beneficial in the general case and safe with respect to program state and values
  – can use SSA form
  – loop optimizations
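As a rough illustration of a classic machine-independent optimization, the sketch below performs constant propagation and folding over straight-line code. The `Insn` and `Operand` types are a hypothetical mini-IR invented for this example, not GCC data structures, and the pass assumes no branches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mini-IR (not a GCC structure): dst = a op b, where each
   operand is either a constant or a variable index below NVARS. */
enum { NVARS = 8 };

typedef struct { bool is_const; int value; /* constant or variable index */ } Operand;
typedef struct { int dst; char op; Operand a, b; } Insn;

/* Replaces variable operands whose value is known with constants and folds
   instructions whose operands are both constant. Valid for straight-line
   code only. Returns the number of folded instructions. */
int const_propagate(Insn *code, size_t n)
{
    bool known[NVARS] = { false };
    int  value[NVARS] = { 0 };
    int  folded = 0;

    for (size_t i = 0; i < n; i++) {
        Insn *in = &code[i];
        Operand *ops[2] = { &in->a, &in->b };
        assert(in->dst >= 0 && in->dst < NVARS);

        for (int k = 0; k < 2; k++) {
            Operand *o = ops[k];
            if (o->is_const)
                continue;
            assert(o->value >= 0 && o->value < NVARS);
            if (known[o->value]) {          /* propagate the known constant */
                int v = value[o->value];
                o->is_const = true;
                o->value = v;
            }
        }

        if (in->a.is_const && in->b.is_const) {
            int r = in->op == '+' ? in->a.value + in->b.value
                  : in->op == '-' ? in->a.value - in->b.value
                  :                 in->a.value * in->b.value;
            in->op = '+';                   /* rewrite as dst = r + 0 */
            in->a.value = r;
            in->b.value = 0;
            known[in->dst] = true;
            value[in->dst] = r;
            folded++;
        } else {
            known[in->dst] = false;         /* dst now holds an unknown value */
        }
    }
    return folded;
}
```

For `v0 = 2 + 3; v1 = v0 * 4; v2 = v1 - v7`, the first two instructions fold to constants and the third keeps its unknown operand `v7`. A real optimizer would follow this with dead code elimination to drop the now-unused definitions.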
18
Optimizer (machine-oriented)
• GLUE: converts IR to a machine-oriented language
  – simplifies code generation (the IR could even be target instructions)
• Works over machine-oriented IR
  – uses the target machine model
  – classic optimizations
  – pattern-driven optimizations
19
Code Generator
• Machine-code-specific optimizations
  – can be coupled with the machine-oriented optimizer
  – target machine model
  – register allocation
  – instruction scheduling
  – emission of executable code or assembler listings
20
Content
• Development tool chain
• Compiler overview
  – GNU compiler collection
• Compiler internals
21
GCC Specific Glossary
• AST – abstract syntax tree
  – IR of GCC frontends, several dialects
• GENERIC – IR for frontends
  – common for all languages
  – semantics should be preserved
• GIMPLE – IR for machine-independent optimizers
  – high-level abstraction of the program
• RTL – register transfer language
  – IR for the GCC portable backend
22
IR choices for optimizations - past
• AST
  + complete control, data and type information
  + suitable for high-level source transformations
  – each frontend has its own AST syntax
  – AST contains expressions with side effects and language-specific semantic constructs
• RTL
  – not suited for high-level transformations
  – too many target features
  – original data type information and control structures are lost
  – addressing modes have replaced variable references
  + basic optimizations are easy to implement in a portable way
23
IR choices for optimizations - future
• GENERIC
  + universal IR able to represent various languages
• GIMPLE
  + language/target independent
  + full type information preserved
  + no implicit side effects
24
25
26
Frontends in GCC
• Generated from a description (grammar)
  – GNU Flex and Bison
• Numerous supported languages
  – C/C++, Fortran 77/95, Ada, Java, …
• IR for frontends
  – up to 3.4.x: AST dialects
  – from 3.5.x: GENERIC (common)
27
AST example:
;; Function absf (absf)
absf (x)
@1   function_decl    name: @2  type: @3  srcp: t1.c:1
                      args: @4  extern  body: @5
@2   identifier_node  strg: absf  lngt: 4
@3   function_type    size: @6  algn: 8  retn: @7
                      prms: @8
@4   parm_decl        name: @9  type: @7  scpe: @1
                      srcp: t1.c:1  argt: @7  size: @10  algn: 32  used: 1
@5   bind_expr        type: @11  vars: @12  body: @13
@6   integer_cst      type: @14  low : 8
@7   real_type        name: @15  size: @10  algn: 32
                      prec: 32
@8   tree_list        valu: @7  chan: @16
@9   identifier_node  strg: x  lngt: 1
@10  integer_cst      type: @14  low : 32
@11  void_type        name: @17  algn: 8
@12  var_decl         name: @18  type: @7  scpe: @1
                      srcp: t1.c:2  artificial  size: @10  algn: 32  used: 1
@13  statement_list   0 : @19  1 : @20  2 : @21
@14  integer_type     name: @22  unql: @23  size: @24
                      algn: 64  prec: 36  unsigned  min : @25  max : @26
... and much more below
28
GENERIC example:
-fdump-tree-all
float absf (float x) {
  return x<0?-x:x;
}

absf (x)
{
  float T.0;
  float iftmp.1;

  if (x < 0.0)
    {
      iftmp.1 = -x;
    }
  else
    {
      iftmp.1 = x;
    }
  T.0 = iftmp.1;
  return T.0;
}

As of GCC 3.5.x, the GCC internals document says:
“The C and C++ front ends currently convert directly from frontend trees to GIMPLE, and hand that off to the back end rather than first converting to GENERIC.”
29
Backends in GCC
• Machine-independent optimizer (starting from 3.5.x)
  – GIMPLE
  – SSA
• Portable backend – works on RTL
  – machine model and optimization patterns
• Separate assembler output backend
30
GIMPLE example:
float absf (float x) {
  return x<0?-x:x;
}
-fdump-tree-cfg
GIMPLE-tree after building CFG.

Merging blocks 3 and 4
absf (x)
{
  float iftmp.1;
  float T.0;

  # BLOCK 0
  # PRED: ENTRY (fallthru)
  if (x < 0.0) goto <L0>; else goto <L1>;
  # SUCC: 2 (false) 1 (true)

  # BLOCK 1
  # PRED: 0 (true)
<L0>:;
  iftmp.1 = -x;
  goto <bb 3> (<L2>);
  # SUCC: 3 (fallthru)

  # BLOCK 2
  # PRED: 0 (false)
<L1>:;
  iftmp.1 = x;
  # SUCC: 3 (fallthru)

  # BLOCK 3
  # PRED: 2 (fallthru) 1 (fallthru)
<L2>:;
  T.0 = iftmp.1;
  return T.0;
  # SUCC: EXIT

}
31
GIMPLE/SSA example:
float absf (float x) {
  return x<0?-x:x;
}
-fdump-tree-ssa
Highlighted in blue on the original slide: SSA temps and Phi-nodes.

;; Function absf (absf)

absf (x)
{
  float iftmp.1;
  float T.0;

<bb 0>:
  if (x_2 < 0.0) goto <L0>; else goto <L1>;

<L0>:;
  iftmp.1_6 = -x_2;
  goto <bb 3> (<L2>);

<L1>:;
  iftmp.1_5 = x_2;

  # iftmp.1_1 = PHI <iftmp.1_5(2), iftmp.1_6(1)>;
<L2>:;
  T.0_3 = iftmp.1_1;
  return T.0_3;

}
32
RTL example:
;; Function absf

(note 2 0 5 NOTE_INSN_DELETED)
(note 5 2 3 0 [bb 0] NOTE_INSN_BASIC_BLOCK)
(note 3 5 4 0 NOTE_INSN_FUNCTION_BEG)
(note 4 3 6 0 NOTE_INSN_DELETED)
(note 6 4 8 1 [bb 1] NOTE_INSN_BASIC_BLOCK)
(insn 8 6 9 1 (set (reg:SF 61)
        (mem/i:SF (reg/f:SI 53 virtual-incoming-args) [0 x+0 S4 A32])) -1 (nil)
    (nil))
(insn 9 8 10 1 (set (reg:SF 62)
        (mem/u/i:SF (symbol_ref/u:SI ("*LC0") [flags 0x2]) [0 S4 A32])) -1 (nil)
    (expr_list:REG_EQUAL (const_double:SF 0 [0x0] 0.0 [0x0.0p+0])
        (nil)))
(jump_insn 10 9 37 1 (parallel [
            (set (pc)
                (if_then_else (gt (reg:SF 62)
                        (reg:SF 61))
                    (label_ref 13)
                    (pc)))
            (clobber (reg:CCFP 18 fpsr))
            (clobber (reg:CCFP 17 flags))
            (clobber (scratch:HI))
        ]) -1 (nil)
    (nil))
33
Machine model: Lisp-like descriptions
;; description of CPU units
(define_cpu_unit "pentium-u,pentium-v" "pentium")

;; description of issue logic
(define_reservation "pentium-np" "(pentium-u + pentium-v)")
(define_reservation "pentium-uv" "(pentium-u | pentium-v)")

;; description of instruction
(define_insn_reservation "pent_mul" 11
  (and (eq_attr "cpu" "pentium")
       (eq_attr "type" "imul"))
  "pentium-np*11")
34
Machine model: pattern-driven optimizations
;; Convert imul by three, five and nine into lea
;; imul eax,3 => lea eax,[eax+2*eax]
(define_peephole2
  [(parallel
    [(set (match_operand:SI 0 "register_operand" "")
          (mult:SI (match_operand:SI 1 "register_operand" "")
                   (match_operand:SI 2 "const_int_operand" "")))
     (clobber (reg:CC FLAGS_REG))])]
  "INTVAL (operands[2]) == 3
   || INTVAL (operands[2]) == 5
   || INTVAL (operands[2]) == 9"
  [(set (match_dup 0)
        (plus:SI (mult:SI (match_dup 1) (match_dup 2))
                 (match_dup 1)))]
  { operands[2] = GEN_INT (INTVAL (operands[2]) - 1); })
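The arithmetic behind this peephole can be checked with a small C sketch. It only models the rewrite x*k => x*(k-1) + x for k in {3, 5, 9}; the function names are hypothetical and nothing here is GCC code. Unsigned 32-bit arithmetic is assumed so that wraparound matches the machine behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors the peephole condition: the multiplier must be 3, 5 or 9. */
bool lea_applicable(int32_t k)
{
    return k == 3 || k == 5 || k == 9;
}

/* Computes x * k the way the replacement pattern does: x * (k - 1) + x.
   k - 1 is 2, 4 or 8, which are scale factors that x86 addressing
   (and therefore lea) supports. */
uint32_t mul_via_lea(uint32_t x, int32_t k)
{
    assert(lea_applicable(k));
    uint32_t scale = (uint32_t)(k - 1);
    return x * scale + x;
}
```

The pattern's C fragment `operands[2] = GEN_INT (INTVAL (operands[2]) - 1)` corresponds to computing `scale` here.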
35
Content
• Development tool chain
• Compiler overview
• Compiler internals
36
Intermediate data
• CFG – control flow graph
  – nodes represent calculations (basic blocks)
  – edges represent execution flow
  – basic block – a sequence of instructions such that
    • every possible flow enters at the first instruction
    • every possible flow leaves at the last instruction
• SYMTAB – symbol table
  – program identifiers (types, variables) and their attributes
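One common way to split linear code into basic blocks is to mark "leaders": the first instruction, every jump target, and every instruction that follows a jump or return. The sketch below assumes a hypothetical instruction encoding (`Kind`, `Insn`) made up for illustration; it is not how GCC represents code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical instruction encoding, for illustration only. */
typedef enum { PLAIN, GOTO, IF_GOTO, RETURN } Kind;
typedef struct { Kind kind; size_t target; /* used by GOTO and IF_GOTO */ } Insn;

/* Sets leader[i] to true for every instruction that starts a basic block
   and returns the number of basic blocks. */
size_t find_leaders(const Insn *code, size_t n, bool *leader)
{
    size_t blocks = 0;

    for (size_t i = 0; i < n; i++)
        leader[i] = false;
    if (n > 0)
        leader[0] = true;                    /* entry point */

    for (size_t i = 0; i < n; i++) {
        if (code[i].kind == GOTO || code[i].kind == IF_GOTO) {
            assert(code[i].target < n);
            leader[code[i].target] = true;   /* jump target */
        }
        if (code[i].kind != PLAIN && i + 1 < n)
            leader[i + 1] = true;            /* instruction after a jump/return */
    }

    for (size_t i = 0; i < n; i++)
        if (leader[i])
            blocks++;
    return blocks;
}
```

Encoding `absf` as `if (x < 0) goto 3; iftmp = x; goto 4; iftmp = -x; T = iftmp; return T` gives leaders at instructions 0, 1, 3 and 4, that is, four basic blocks.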
37
CFG example:
float absf (float x) {
  return x<0?-x:x;
}
-fdump-tree-cfg
GIMPLE-tree after building CFG.

Merging blocks 3 and 4
absf (x)
{
  float iftmp.1;
  float T.0;

  # BLOCK 0
  # PRED: ENTRY (fallthru)
  if (x < 0.0) goto <L0>; else goto <L1>;
  # SUCC: 2 (false) 1 (true)

  # BLOCK 1
  # PRED: 0 (true)
<L0>:;
  iftmp.1 = -x;
  goto <bb 3> (<L2>);
  # SUCC: 3 (fallthru)

  # BLOCK 2
  # PRED: 0 (false)
<L1>:;
  iftmp.1 = x;
  # SUCC: 3 (fallthru)

  # BLOCK 3
  # PRED: 2 (fallthru) 1 (fallthru)
<L2>:;
  T.0 = iftmp.1;
  return T.0;
  # SUCC: EXIT

}
38
Data Structures: Trees
• Tree is the most often used data type
  – syntax tree, dominator tree
  – binary (usually)
[Figure: syntax tree for “a = b*(-c) + b*(-c)”]
39
Data Structures: DAG
• DAG – directed acyclic graph
  – nodes for common subexpressions have several parents
  – compact, simplifies code generation
[Figure: DAGs for “a + a*(b-c) + (b-c)*d” and “i = i + 5”]
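One simple way to get such sharing is to look up an identical node before creating a new one. The sketch below uses a linear search over a fixed-size pool for brevity (a real implementation would more likely hash); the `Dag` and `Node` types and the `dag_*` helpers are hypothetical names for this example.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_NODES = 64 };

/* A node is a leaf (op == 0, name set) or an operator whose children are
   indices into the pool; -1 means "no child" (unary operators). */
typedef struct { char op; char name; int left, right; } Node;
typedef struct { Node nodes[MAX_NODES]; int count; } Dag;

/* Returns the index of an existing identical node, or appends a new one. */
static int dag_intern(Dag *d, Node n)
{
    for (int i = 0; i < d->count; i++) {
        const Node *m = &d->nodes[i];
        if (m->op == n.op && m->name == n.name &&
            m->left == n.left && m->right == n.right)
            return i;                    /* common subexpression: reuse it */
    }
    assert(d->count < MAX_NODES);
    d->nodes[d->count] = n;
    return d->count++;
}

int dag_leaf(Dag *d, char name)
{
    return dag_intern(d, (Node){ 0, name, -1, -1 });
}

int dag_unary(Dag *d, char op, int operand)
{
    return dag_intern(d, (Node){ op, 0, operand, -1 });
}

int dag_binary(Dag *d, char op, int left, int right)
{
    return dag_intern(d, (Node){ op, 0, left, right });
}
```

Building “a + a*(b-c) + (b-c)*d” bottom-up with these helpers creates 9 nodes, where a plain tree would need 13, because `a` and `(b-c)` are each stored once and shared by two parents.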
40
Data Structures: Three-address code
• Three-address code – a linear representation of a tree or DAG with explicit names for nodes
• Statement types:
  – A = B op C, where op is a binary operator
  – X = op Y, where op is a unary operator
  – X = Y
  – GOTO Label
  – IF X relop Y GOTO Label
41
Data Structures: Three-address code
Example: a = b*(-c) + b*(-c)
[Figure: syntax tree and DAG for the expression]

Three-address code for syntax tree:
t1 = - c
t2 = b * t1
t3 = - c
t4 = b * t3
t5 = t2 + t4
a = t5

Three-address code for DAG:
t1 = - c
t2 = b * t1
t3 = t2 + t2
a = t3
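The syntax-tree listing can be produced by a post-order walk that gives every operator node a fresh temporary. The sketch below is one possible way to do it; `Expr`, `Emitter`, `gen` and `gen_assign` are hypothetical names, and the output goes to a fixed-size text buffer for simplicity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical expression tree: a leaf (op == 0) or a unary/binary operator. */
typedef struct Expr {
    char op;                          /* 0 for a leaf, otherwise '+', '-', '*' */
    char name;                        /* variable name, used by leaves */
    const struct Expr *left, *right;  /* right == NULL for unary operators */
} Expr;

typedef struct { char text[512]; size_t len; int next_temp; } Emitter;

/* Emits code for e (children first) and stores in out the name that holds
   its value: a variable name or a fresh temporary such as "t3". */
static void gen(Emitter *em, const Expr *e, char out[16])
{
    char l[16], r[16];
    int n;

    if (e->op == 0) {
        snprintf(out, 16, "%c", e->name);
        return;
    }
    gen(em, e->left, l);
    if (e->right)
        gen(em, e->right, r);
    snprintf(out, 16, "t%d", ++em->next_temp);

    if (e->right)
        n = snprintf(em->text + em->len, sizeof em->text - em->len,
                     "%s = %s %c %s\n", out, l, e->op, r);
    else
        n = snprintf(em->text + em->len, sizeof em->text - em->len,
                     "%s = %c %s\n", out, e->op, l);
    assert(n > 0 && (size_t)n < sizeof em->text - em->len);
    em->len += (size_t)n;
}

/* Emits three-address code for the assignment "target = e". */
void gen_assign(Emitter *em, char target, const Expr *e)
{
    char v[16];
    int n;

    gen(em, e, v);
    n = snprintf(em->text + em->len, sizeof em->text - em->len,
                 "%c = %s\n", target, v);
    assert(n > 0 && (size_t)n < sizeof em->text - em->len);
    em->len += (size_t)n;
}
```

Called on a tree for `b*(-c) + b*(-c)` with target `a`, this emits six statements using temporaries `t1` to `t5`. Running the same walk over a DAG, and reusing the temporary of an already visited node, would give the shorter four-statement form.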
42
Intermediate Representation: Postfix notation
• Formal definition, P(E) = postfix notation of E:
  – If E is a constant or variable, then P(E) = E
  – If E is “E1 op E2”, where op is a binary operator, then P(E) = “E1’ E2’ op”, where E1’ = P(E1), E2’ = P(E2)
  – If E = (E’), then P(E) = P(E’)
• Naïve definition:
  – the order and number of arguments of each operator allow only one way of decoding
  – so it can be evaluated with a stack
43
Postfix notation: example
• Could also simplify code generation
• Example 1: (9-5)+2 => 95-2+
• Example 2: 9-(5+2) => 952+-
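The stack-evaluator view can be sketched in a few lines of C. The function below is an illustrative assumption, not part of the course material: it handles only single-digit operands and the binary operators +, - and *, which is enough for the two examples.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Evaluates a postfix string of single-digit operands and the binary
   operators + - *. Returns false on malformed input. */
bool eval_postfix(const char *s, int *result)
{
    int stack[64];
    int top = 0;                         /* number of values on the stack */

    assert(s != NULL && result != NULL);
    for (; *s; s++) {
        if (isdigit((unsigned char)*s)) {
            if (top == 64)
                return false;            /* expression too deep */
            stack[top++] = *s - '0';
        } else if (*s == '+' || *s == '-' || *s == '*') {
            if (top < 2)
                return false;            /* operator without two operands */
            int rhs = stack[--top];
            int lhs = stack[--top];
            stack[top++] = *s == '+' ? lhs + rhs
                         : *s == '-' ? lhs - rhs
                         :             lhs * rhs;
        } else {
            return false;                /* unknown character */
        }
    }
    if (top != 1)
        return false;                    /* leftover operands */
    *result = stack[0];
    return true;
}
```

Evaluating "95-2+" leaves 6 on the stack and "952+-" leaves 2, matching (9-5)+2 and 9-(5+2).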
44
Intermediate Representation: SSA
• Static Single Assignment form
  – each variable is assigned only once
  – new versions (temps) are created for multiple assignments of the same variable
  – Phi-functions resolve ambiguities at points where several definitions arrive
• Issues with aggregates
  – example: an access to one element of an array doesn’t change the whole array
45
SSA: example
int clip(int x) {
  int r;
  if(x<0)
    r=0;
  else
    r=x*x;
  return r;
}
-fdump-tree-ssa

clip (x)
{
  int r;
  int T.0;

<bb 0>:
  if (x_2 < 0) goto <L0>; else goto <L1>;

<L0>:;
  r_6 = 0;
  goto <bb 3> (<L2>);

<L1>:;
  r_5 = x_2 * x_2;

  # r_1 = PHI <r_5(2), r_6(1)>;
<L2>:;
  T.0_3 = r_1;
  return T.0_3;

}
46
Why SSA?
• Sparse representation of DEF/USE chains
  – compact, with efficient algorithms for building it
• Data flow analysis is much easier
  – explicit solution to several simple problems, e.g. is this definition dead?
• Robust implementation of classic optimizations
  – Partial Redundancy Elimination
  – Constant propagation, dead code elimination
47
Data structures: Graphs
• Graph is another of the most often used data types
  – Control Flow Graph, Data Dependence Graph
  – algorithms
• Domination and the dominator tree are used to build SSA form: they determine where to place Phi-nodes
  – bi back dominates bj if bi is on every path from entry to bj
  – bi forward dominates bj if bi is on every path from bj to all exits
  – strict domination is domination where bi != bj
  – the dominance frontier of bi is the set of all bj such that bi dominates a predecessor of bj but doesn’t strictly dominate bj
  – for each variable V, place Phi-functions in every node in the dominance frontier of each node that assigns V, including assignments through Phi-functions
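The definitions above can be turned into code fairly directly. The sketch below computes (back) dominators with a simple iterative data-flow solution and then derives dominance frontiers straight from the definition. It is a naive illustration on an adjacency matrix, assuming every node is reachable from node 0; `Cfg` and the function names are hypothetical, and production compilers use faster algorithms.

```c
#include <assert.h>
#include <stdbool.h>

enum { MAXN = 16 };

/* Hypothetical CFG: edge[p][s] is true when there is an edge p -> s.
   Node 0 is the entry; all nodes are assumed reachable from it. */
typedef struct { int n; bool edge[MAXN][MAXN]; } Cfg;

/* On return, dom[b][d] is true when d dominates b. */
void compute_dominators(const Cfg *g, bool dom[MAXN][MAXN])
{
    assert(g->n > 0 && g->n <= MAXN);

    /* Start from "everything dominates everything" except for the entry. */
    for (int b = 0; b < g->n; b++)
        for (int d = 0; d < g->n; d++)
            dom[b][d] = (b == 0) ? (d == 0) : true;

    for (bool changed = true; changed; ) {
        changed = false;
        for (int b = 1; b < g->n; b++) {
            for (int d = 0; d < g->n; d++) {
                if (d == b || !dom[b][d])
                    continue;
                /* d stays only if it dominates every predecessor of b */
                bool keep = true;
                for (int p = 0; p < g->n; p++)
                    if (g->edge[p][b] && !dom[p][d])
                        keep = false;
                if (!keep) {
                    dom[b][d] = false;
                    changed = true;
                }
            }
        }
    }
}

/* On return, df[b][y] is true when y is in the dominance frontier of b:
   b dominates a predecessor of y but does not strictly dominate y. */
void compute_frontier(const Cfg *g, bool dom[MAXN][MAXN], bool df[MAXN][MAXN])
{
    for (int b = 0; b < g->n; b++) {
        for (int y = 0; y < g->n; y++) {
            bool strictly = dom[y][b] && b != y;
            df[b][y] = false;
            for (int p = 0; p < g->n; p++)
                if (g->edge[p][y] && dom[p][b] && !strictly)
                    df[b][y] = true;
        }
    }
}
```

For the `absf` CFG (edges 0->1, 0->2, 1->3, 2->3), the frontier of blocks 1 and 2 is {3}. Both blocks assign `iftmp.1`, so the Phi-function for it belongs in block 3.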
48
Graph example (VCG format):
float absf (float x) {
  return x<0?-x:x;
}
-dav

graph: { title: "absf"
node: { title: "ENTRY" label: "ENTRY" }
node: { title: "EXIT" label: "EXIT" }
edge: { sourcename: "ENTRY" targetname: "0" linestyle: solid priority: 100 }
node: { title: "0" label: "#0\ncond_expr (2)\ncond_expr (2)"}
edge: { sourcename: "0" targetname: "2" priority: 100 linestyle: solid }
edge: { sourcename: "0" targetname: "1" priority: 100 linestyle: solid }
node: { title: "1" label: "#1\nlabel_expr (-1)\nmodify_expr (2)"}
edge: { sourcename: "1" targetname: "3" priority: 100 linestyle: solid }
node: { title: "2" label: "#2\nlabel_expr (-1)\nmodify_expr (2)"}
edge: { sourcename: "2" targetname: "3" priority: 100 linestyle: solid }
node: { title: "3" label: "#3\nlabel_expr (-1)\nreturn_expr (-1)"}
edge: { sourcename: "3" targetname: "EXIT" priority: 100 linestyle: solid }
}
49
VCG visualization:
float absf (float x) {
  return x<0?-x:x;
}
-dav
[Figure: VCG rendering of the graph for absf]
50
Intermediate Representation: GNU assembler
        .file   "t2.c"
        .text
        .p2align 2,,3
.globl _clip
        .def    _clip;  .scl    2;      .type   32;     .endef
_clip:
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %eax
        testl   %eax, %eax
        js      L6
        imull   %eax, %eax
        leave
        ret
        .p2align 2,,3
L6:
        xorl    %eax, %eax
        leave
        ret
51
Reference
• Slides #21-23: gccint (Passes section), nordu2003.pdf, nordu2003-slides.pdf
• Slide #24: Lex-YACC-HOWTO.pdf, GCC-Frontend-HOWTO.pdf
• Slides #44-46: p451-cytron.pdf
• Slide #47: p1-allen.pdf
• Slides #48-49: VCG tool on CD
• Slide #50: Assembly-HOWTO.pdf