1
COMP 3438 – Part II-Lecture 1: Overview of Compiler Design
Dr. Zili Shao
Department of Computing
The Hong Kong Polytechnic Univ.
2
Overview of the Subject (COMP 3438)
Course Organization (this lecture: Overview of Compiler Design)
Part I: Unix System Programming (Device Driver Development)
  Overview of Unix Sys. Prog.
  Process
  File System
  Overview of Device Driver Development
  Character Device Driver Development
  Introduction to Block Device Driver
Part II: Compiler Design
  Overview of Compiler Design
  Lexical Analysis (HW #3)
  Syntax Analysis (HW #4)
Outline
  Programming language: High-level vs. Low-level
  What is a compiler?
  Phases of a compiler
3
Programming language – Machine Language
Machine languages
Everything is a binary number: operations, data, addresses, …
e.g., in MIPS 2000,
0010 0100 1010 0110 0000 0000 0000 0100
# $t5 + 4 -> $t6
Machines like it, BUT we do not.
4
Programming language – Assembly Language
Assembly languages
Symbolic representation of machine language, e.g.
Machine Code:
0010 0100 1010 0110 0000 0000 0000 0100
# $t5 + 4 -> $t6
Assembly Code: add $t6, $t5, 4
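The 32-bit word above can be pulled apart field by field. The sketch below assumes the standard MIPS I-type layout (6-bit opcode, 5-bit rs, 5-bit rt, 16-bit immediate); the function name decode_itype is made up for illustration. Under that layout the word decodes to opcode 9 (addiu) with register numbers 5 and 6 and immediate 4, so the slide's "add", "$t5" and "$t6" are probably best read as informal shorthand for "add 4 to register 5 and put the result in register 6".

```python
def decode_itype(word):
    """Split a 32-bit MIPS I-type instruction word into its four fields."""
    opcode = (word >> 26) & 0x3F   # bits 31..26: operation
    rs = (word >> 21) & 0x1F       # bits 25..21: source register number
    rt = (word >> 16) & 0x1F       # bits 20..16: destination register number
    imm = word & 0xFFFF            # bits 15..0: immediate (left unsigned here)
    return opcode, rs, rt, imm


bits = "0010 0100 1010 0110 0000 0000 0000 0100"
word = int(bits.replace(" ", ""), 2)   # same value as 0x24A60004
print(decode_itype(word))              # (9, 5, 6, 4)
```

An assembler performs the reverse mapping, which is why assembly language is called a symbolic representation of machine language.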
5
High-level Programming language
High-level languages
Procedural (modular) programming
  Group instructions into meaningful abstractions, e.g., data types, control structures, functions, etc.
  C, Pascal, Perl
Object-oriented programming
  Group “data” and “methods” into “objects”
  Naturally represents the world around us
  C++, Java, JavaScript
Logic programming: Prolog
Functional programming: ML
6
Why High-level Languages?
Hide unnecessary details, giving a higher level of abstraction and increasing productivity
Make programs more robust, e.g., the meaning of information is specified before its use, enabling substantial error checking at compile time
Make programs more portable
7
Compilers are Translators
Source: C/C++, Fortran, Java, Perl, Matlab, Natural Language, Command
  -> Translate ->
Target: Machine code, Virtual Machine Code, Transformed code (C, Java, …), Lower level commands, Semantic components, …
8
Translation Mechanisms
Compilation
  To translate a source program in one language into an executable program in another language, and produce results while executing the new program
  Examples: C, C++, Fortran
Interpretation
  To read a source program and produce results while understanding that program
  Examples: Basic
Case Study: Java
  First, translate to Java bytecode (compilation)
  Second, execute by interpretation (JVM) or compilation (JIT (Just-In-Time))
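The two-step scheme in the Java case study (translate to bytecode, then let a virtual machine execute it) can be observed in miniature with Python's built-in compile and exec. This is only an analogy to the Java pipeline, not a description of the JVM; the source string and the "<example>" file name are arbitrary choices for illustration.

```python
source = "result = 6 * 7"

# Step 1 (compilation): translate the source text into a bytecode object.
code = compile(source, "<example>", "exec")

# Step 2 (interpretation): the virtual machine executes the bytecode.
namespace = {}
exec(code, namespace)

print(type(code.co_code))    # the raw bytecode is stored as bytes
print(namespace["result"])   # 42
```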
9
Comparison of Compiler/Interpreter
Compiler
  Overview: Source Code -> Compiler -> Object Code; Data -> Object Code -> Results
  Advantages: Fast program execution; Fully exploit architecture features
  Disadvantages: Pre-processing of program; Complicated
Interpreter
  Overview: Source Code + Data -> Interpreter -> Results
  Advantages: Easy to debug; Flexible to modify; Machine independent
  Disadvantages: Execution overhead
10
What is a compiler?
A compiler is a program that takes a program written in one language (called the source language) and translates it into an equivalent program in another language (called the target language).
It also reports to its user the presence of errors in the source program.
Source program -> Compiler -> Target program
Compiler -> Error messages
11
The Phases of a Compiler
Source program
Lexical Analyzer
Syntax Analyzer (Parser)
Semantic Analyzer
Intermediate Code Generator
Code Optimizer
Code Generator
Target program
Symbol-table Manager
Error Handler
12
Lexical Analysis
Scan the source program and group sequences of characters into tokens.
A token is the smallest element of a language: a group of characters (e.g., a series of alphabetic characters forms a keyword; a series of digits forms a number).
The sub-module of the compiler that performs lexical analysis is called a lexical analyzer.
Example: position := initial + rate * 60 (Pascal statement)
Value       Token Type
position    ID
:=          Operator
initial     ID
+           Operator
rate        ID
*           Operator
60          NUM
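The grouping of characters into tokens for the Pascal statement above can be sketched with regular expressions. This is a minimal illustration, not how any particular compiler does it: the token names ID, NUM and Operator follow the slide, while TOKEN_SPEC, MASTER and tokenize are names invented here, and only the operators := + - * / are assumed.

```python
import re

# One named pattern per token type; SKIP covers whitespace, which is discarded.
TOKEN_SPEC = [
    ("NUM", r"\d+"),
    ("ID", r"[A-Za-z_]\w*"),
    ("Operator", r":=|[+\-*/]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (value, token_type) pairs for the given source text."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.group(), match.lastgroup))
        pos = match.end()
    return tokens


for value, kind in tokenize("position := initial + rate * 60"):
    print(f"{value:10} {kind}")
```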
13
Syntax Analysis
Once the tokens are identified, syntax analysis groups sequences of tokens into language constructs
e.g., identifiers, numbers, and operators can be grouped into expressions.
e.g., keywords, identifiers, expressions and operators can be combined to form statements.
The sub-module of the compiler that performs syntax analysis is called the parser (syntax analyzer).
14
Syntax Analysis – Syntax (Parse) Tree
The result of syntax analysis is recorded in a hierarchical structure called a syntax tree: each node represents an operation, and its children represent the arguments of the operation.
Evaluation begins from the bottom and moves up.
e.g., parse tree for position := initial + rate * 60
      =
     / \
  id1   +
       / \
    id2   *
         / \
      id3   NUM (60)
15
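A syntax tree like the one on the Syntax (Parse) Tree slide can be built by a small recursive-descent parser. The sketch below is one possible approach under stated assumptions: the grammar (assignment, +, * only, with * binding tighter than +), the nested-tuple tree format and the name parse_assignment are all chosen here for illustration, and tokens are given as (value, type) pairs.

```python
def parse_assignment(tokens):
    """Parse  ID := expr,  expr -> term ('+' term)*,  term -> factor ('*' factor)*."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else (None, None)

    def advance():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        return tok

    def factor():
        value, kind = advance()
        if kind not in ("ID", "NUM"):
            raise SyntaxError(f"expected ID or NUM, got {value!r}")
        return (kind, value)

    def term():
        node = factor()
        while peek()[0] == "*":
            advance()
            node = ("*", node, factor())
        return node

    def expr():
        node = term()
        while peek()[0] == "+":
            advance()
            node = ("+", node, term())
        return node

    target = factor()
    if target[0] != "ID" or advance()[0] != ":=":
        raise SyntaxError("expected 'ID :='")
    tree = (":=", target, expr())
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing tokens")
    return tree


tokens = [("position", "ID"), (":=", "Operator"), ("initial", "ID"),
          ("+", "Operator"), ("rate", "ID"), ("*", "Operator"), ("60", "NUM")]
print(parse_assignment(tokens))
```

Because term is called from inside expr, the multiplication ends up deeper in the tree than the addition, which is what makes it evaluate first when the tree is processed bottom-up.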
Semantic Analysis
Determine the meaning using the syntax tree
Put semantic meaning into the syntax tree
Perform checks to ensure that components fit together meaningfully, e.g., type checking
      =
     / \
  id1   +
       / \
    id2   *
         / \
      id3   inttoreal
               |
            NUM (60)
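The type check that introduces the inttoreal node can be sketched as a bottom-up walk over the tree. This is a simplified illustration under several assumptions: only the types "int" and "real" exist, numeric literals are integers, all three identifiers are declared real, and the tuple tree format and the name annotate are invented here.

```python
def annotate(node, symbol_types):
    """Return (checked_node, type), wrapping an int operand in inttoreal when it meets a real."""
    kind = node[0]
    if kind == "NUM":
        return node, "int"                      # assumption: literals are integers
    if kind == "ID":
        return node, symbol_types[node[1]]
    op, left, right = node
    left, left_type = annotate(left, symbol_types)
    right, right_type = annotate(right, symbol_types)
    if op == ":=":
        if left_type == "real" and right_type == "int":
            right = ("inttoreal", right)
        elif left_type != right_type:
            raise TypeError("cannot assign real to int")
        return (op, left, right), left_type
    if left_type != right_type:                 # mixed int/real arithmetic
        if left_type == "int":
            left = ("inttoreal", left)
        else:
            right = ("inttoreal", right)
        return (op, left, right), "real"
    return (op, left, right), left_type


tree = (":=", ("ID", "id1"),
        ("+", ("ID", "id2"), ("*", ("ID", "id3"), ("NUM", "60"))))
types = {"id1": "real", "id2": "real", "id3": "real"}   # hypothetical declarations
checked, result_type = annotate(tree, types)
print(checked)
```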
16
Intermediate Code Generation
Generate IR (Intermediate Representation) code
      =
     / \
  id1   +
       / \
    id2   *
         / \
      id3   inttoreal
               |
            NUM (60)
temp1 := inttoreal(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3
Easier to generate machine code from IR code
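The step from the annotated tree to three-address IR code can be sketched as a post-order walk that creates one temporary per interior node. The nested-tuple tree format and the name generate_ir are assumptions made for this example; the temp1, temp2, … naming follows the slide.

```python
def generate_ir(tree):
    """Emit three-address code for an assignment tree, one temporary per operation."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"temp{counter}"

    def walk(node):
        kind = node[0]
        if kind in ("ID", "NUM"):
            return node[1]                      # leaves need no instruction
        if kind == "inttoreal":
            operand = walk(node[1])
            temp = new_temp()
            code.append(f"{temp} := inttoreal({operand})")
            return temp
        op, left, right = node
        left_name = walk(left)                  # children first (bottom-up)
        right_name = walk(right)
        temp = new_temp()
        code.append(f"{temp} := {left_name} {op} {right_name}")
        return temp

    _, target, value = tree                     # root is the assignment
    code.append(f"{target[1]} := {walk(value)}")
    return code


tree = (":=", ("ID", "id1"),
        ("+", ("ID", "id2"), ("*", ("ID", "id3"), ("inttoreal", ("NUM", "60")))))
for line in generate_ir(tree):
    print(line)
```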
17
Code Optimization
Modify the program representation so that the program can run faster, use less memory, power, …
IR Code:
temp1 := inttoreal(60)
temp2 := id3 * temp1
temp3 := id2 + temp2
id1 := temp3
Optimized Code:
temp1 := id3 * 60.0
id1 := id2 + temp1
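The rewrite from the four-instruction IR to the two-instruction version can be reproduced with two simple transformations: folding inttoreal of a constant at compile time, and removing a final copy by writing the result directly into its destination. The sketch below is one way to do this, not a general optimizer; the (dest, op, arg1, arg2) tuple format and the name optimize are invented here, and it assumes each temporary is used only once.

```python
def optimize(ir):
    """Fold constant conversions and remove trailing copies from tuple-form IR."""
    constants = {}                               # temp name -> folded constant text
    out = []
    for dest, op, a, b in ir:
        a = constants.get(a, a)                  # substitute known constants
        b = constants.get(b, b)
        if op == "inttoreal" and a.isdigit():
            constants[dest] = str(float(int(a))) # "60" -> "60.0", done at compile time
            continue
        if op == "copy" and out and out[-1][0] == a and a.startswith("temp"):
            prev = out.pop()                     # let the previous instruction
            out.append((dest,) + prev[1:])       # write straight into dest
            continue
        out.append((dest, op, a, b))

    names = {}                                   # renumber surviving temporaries

    def rename(x):
        if isinstance(x, str) and x.startswith("temp"):
            return names.setdefault(x, f"temp{len(names) + 1}")
        return x

    return [(rename(d), op, rename(a), rename(b)) for d, op, a, b in out]


ir = [
    ("temp1", "inttoreal", "60", None),
    ("temp2", "*", "id3", "temp1"),
    ("temp3", "+", "id2", "temp2"),
    ("id1", "copy", "temp3", None),
]
for dest, op, a, b in optimize(ir):
    print(f"{dest} := {a} {op} {b}")
```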
18
Code Generation
Generate target program.
Optimized Code:
temp1 := id3 * 60.0
id1 := id2 + temp1
Machine Code:
MOVF id3, R2
MULF #60.0, R2
MOVF id2, R1
ADDF R2, R1
MOVF R1, id1
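The translation from optimized IR to machine code can be sketched as a loop that loads the first operand into a register, applies the operation, and stores the result if the destination is a variable. The MOVF, MULF and ADDF mnemonics and the "source, destination" operand order are taken from the slide; everything else (the tuple IR format, the name generate_code, a fresh register per instruction with no reuse) is an assumption for illustration. The register numbers come out in a different order than on the slide, but the code is equivalent.

```python
MNEMONIC = {"*": "MULF", "+": "ADDF"}            # only the operations used on the slide


def generate_code(ir):
    """Translate (dest, op, arg1, arg2) tuples into two-operand assembly lines."""
    asm = []
    reg_of = {}                                  # temp name -> register holding it
    next_reg = 1

    def operand(x):
        if x in reg_of:
            return reg_of[x]                     # value already lives in a register
        if x[0].isdigit():
            return f"#{x}"                       # immediate constant
        return x                                 # variable in memory

    for dest, op, a, b in ir:
        reg = f"R{next_reg}"                     # assumption: registers never run out
        next_reg += 1
        asm.append(f"MOVF {operand(a)}, {reg}")
        asm.append(f"{MNEMONIC[op]} {operand(b)}, {reg}")
        if dest.startswith("temp"):
            reg_of[dest] = reg                   # keep temporaries in registers
        else:
            asm.append(f"MOVF {reg}, {dest}")    # store variables back to memory
    return asm


optimized = [("temp1", "*", "id3", "60.0"), ("id1", "+", "id2", "temp1")]
for line in generate_code(optimized):
    print(line)
```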
19
Symbol Table Management
Collect and maintain information about IDs
Attributes:
  Storage: where to store (Data, Heap, Stack, …)
  Type: char, int, pointer, …
  Scope: effective range
  Number: value
Information is added and used by all phases
Debuggers use symbol table
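A symbol table that records the type and storage attributes and respects scope can be sketched as a stack of dictionaries, one per nested scope. This is a minimal illustration; the class and method names are invented here, and real compilers store many more attributes.

```python
class SymbolTable:
    """A stack of scopes; each scope maps an identifier to its attributes."""

    def __init__(self):
        self.scopes = [{}]                       # start with the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, type_, storage):
        scope = self.scopes[-1]
        if name in scope:
            raise NameError(f"{name!r} already declared in this scope")
        scope[name] = {"type": type_, "storage": storage}

    def lookup(self, name):
        for scope in reversed(self.scopes):      # innermost scope wins
            if name in scope:
                return scope[name]
        raise NameError(f"{name!r} is not declared")


table = SymbolTable()
table.declare("position", "real", "Data")
table.enter_scope()
table.declare("i", "int", "Stack")
print(table.lookup("position"))                  # found in the enclosing scope
table.exit_scope()                               # "i" is no longer visible
```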
20
Front End and Back End
Source program
Front End: Lexical Analyzer, Syntax Analyzer (Parser), Semantic Analyzer, Intermediate Code Generator
Back End: Code Optimizer, Code Generator
Target program
Symbol-table Manager
Error Handler
21
Distinction between Phases and Passes
Passes: the number of times the compiler goes through a program representation
  1-pass, 2-pass, multiple-pass compilation
  As languages become more complex, more passes are needed
Phases: conceptual stages
  Not completely separate
  Semantic phase may do things that syntax should do
22
Compiler Tools
Phases                  Tools
Lexical Analysis        Lex, flex
Syntax Analysis         yacc, bison
Semantic Analysis
Intermediate Code
Code Optimization
Code Generation
23