Introduction (Chapter 1)
Course Overview
PART I: overview material
1 Introduction
2 Language processors (tombstone diagrams, bootstrapping)
3 Architecture of a compiler
PART II: inside the compiler
4 Syntax analysis
5 Contextual analysis
6 Runtime organization
7 Code generation
PART III: conclusion
8 Interpretation
9 Review
Chapter 1: Introduction
GOAL this lecture: What is this course about... a high-level perspective.
OVERVIEW
– Levels of programming languages
– Language processors
– Specification of a programming language
Levels of Programming Languages
High-level program:
  class Triangle {
    ...
    float area() { return b*h/2; }
  }
Low-level program:
  LOAD r1,b
  LOAD r2,h
  MUL r1,r2
  DIV r1,#2
  RET
Executable machine code:
  0001001001000101001001001110110010101101001...
Levels of Programming Languages
Some high-level languages:
– C, C++, Java, Pascal, Ada, Fortran, Cobol, Scheme, Prolog, Smalltalk, ...
Some low-level languages:
– x86 assembly language, PowerPC assembly language, SPARC assembly language, MIPS assembly language, ARM assembly language, ...
Levels of Programming Languages
What makes a high-level language different from a low-level language?
Things found in HL languages but typically not in LL languages:
- Expressions
- Control structures/abstractions: while, repeat-until, if-then-else, procedures
- Data types
  - distinguish several different types of data
  - composite data types
  - user-defined data types
- Encapsulation: modules, procedures, objects
Abstraction
A high-level language is more abstract than a low-level language.
More abstract? What does that mean?
Abstraction: Separate the ‘how’ from the ‘what’.
Or: separate what is implemented from how it is implemented.
e.g. procedural abstraction = separate ‘what does it do’ from ‘how does it do it’
HL languages abstract away from the underlying machine => much more portable
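A small illustration of procedural abstraction (a sketch that is not part of the original slides; both function names are hypothetical). The two functions share the same 'what', the area of a triangle with base b and height h, but differ in the 'how'. A caller depends only on the 'what'.

```python
def area(b: float, h: float) -> float:
    """What: the area of a triangle with base b and height h."""
    return b * h / 2


def area_alternative(b: float, h: float) -> float:
    """Same 'what', different 'how': halve the base first."""
    return (b / 2) * h


# Callers rely only on the 'what', so either implementation can be used.
print(area(3.0, 4.0), area_alternative(3.0, 4.0))
```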
Levels of Programming Languages
Q: How do the following make a HL language more abstract?
- Expressions
- Control structures: while, repeat-until, if-then-else, procedures
- Data types
- Encapsulation: modules, procedures, objects
Language Processors: What are they?
A programming language processor is any system (software or hardware) that manipulates programs.
Examples:
– Editors
– Translators (e.g. compiler, assembler, disassembler)
– Interpreters
Language Processors: Why do we need them?
[Figure: the layers between the hardware and the programmer. Labels: X86 Processor, JVM Interpreter, JVM Binary code, JVM Assembly code, Java Program, Concepts and Ideas.]
How to bridge the “semantic gap” between the programmer's idea (“Compute surface area of a triangle?”) and what the hardware executes (0101001001...)?
Programming Language Specification
• Why?
  – A communication device between people who need to have a common understanding of the PL:
    • language designer, language implementer, user
• What to specify?
  – Specify what is a ‘well formed’ program
    • syntax
    • contextual constraints (also called static semantics):
      – scoping rules
      – type rules
  – Specify what is the meaning of (well formed) programs
    • semantics (also called runtime semantics)
Programming Language Specification
• Why?
• What to specify?
• How to specify?
  – Formal specification: use some kind of precisely defined formalism
  – Informal specification: description in English.
  – Usually a mix of both (e.g. Java specification)
    • Syntax => formal specification using CFG/BNF
    • Contextual constraints and semantics => informal
Syntax Specification
Syntax is specified using “Context Free Grammars” (CFGs):
– A finite set of terminal symbols
– A finite set of non-terminal symbols
– A start symbol
– A finite set of production rules
Usually CFGs are written in “Backus-Naur Form” (BNF) notation.
A production rule in BNF notation is written as:
N ::= α
where N is a non-terminal and α is a sequence of terminals and non-terminals.
N ::= α | β | ... is an abbreviation for several rules with N as left-hand side.
Syntax Specification
A CFG defines a set of strings. This is called the language of the CFG.
Example:
Start ::= Letter
        | Start Letter
        | Start Digit
Letter ::= a | b | c | d | ... | z
Digit ::= 0 | 1 | 2 | ... | 9
Q: What is the “language” defined by this grammar?
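One way to explore this question is to turn the productions directly into a membership test. The sketch below is only an illustration (the function name in_language is hypothetical): each branch mirrors one alternative of the Start rule.

```python
import string

LETTERS = set(string.ascii_lowercase)   # Letter ::= a | b | ... | z
DIGITS = set(string.digits)             # Digit  ::= 0 | 1 | ... | 9


def in_language(s: str) -> bool:
    """Return True if the string s can be derived from Start."""
    if len(s) == 1:
        # Start ::= Letter
        return s in LETTERS
    if len(s) > 1:
        # Start ::= Start Letter | Start Digit
        last = s[-1]
        return (last in LETTERS or last in DIGITS) and in_language(s[:-1])
    return False  # the empty string has no derivation


for candidate in ["x", "x1", "abc42", "1x", ""]:
    print(repr(candidate), in_language(candidate))
```

Trying a few strings suggests what the accepted strings have in common: they start with a letter and continue with letters or digits.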
Example: Syntax of “Mini-Triangle”
Mini-Triangle is a very simple Pascal-like programming language.
An example program:
!This is a comment.
let const m ~ 7;
    var n: Integer
in begin
  n := 2 * m * m ;
  putint(n)
end
Annotations in the slide: "const m ~ 7; var n: Integer" are Declarations, "begin ... end" is a Command, and "2 * m * m" is an Expression.
Example: Syntax of “Mini-Triangle”
Program ::= single-Command
single-Command ::= V-name := Expression
                 | Identifier ( Expression )
                 | if Expression then single-Command else single-Command
                 | while Expression do single-Command
                 | let Declaration in single-Command
                 | begin Command end
Command ::= single-Command
          | Command ; single-Command
...
Example: Syntax of “Mini-Triangle” (continued)
Expression ::= primary-Expression
             | Expression Operator primary-Expression
primary-Expression ::= Integer-Literal
                     | V-name
                     | Operator primary-Expression
                     | ( Expression )
V-name ::= Identifier
Identifier ::= Letter
             | Identifier Letter
             | Identifier Digit
Integer-Literal ::= Digit
                  | Integer-Literal Digit
Operator ::= + | - | * | / | < | > | =
Example: Syntax of “Mini-Triangle” (continued)
Declaration ::= single-Declaration
              | Declaration ; single-Declaration
single-Declaration ::= const Identifier ~ Expression
                     | var Identifier : Type-denoter
Type-denoter ::= Identifier
Comment ::= ! CommentLine eol
CommentLine ::= Graphic CommentLine
              | Empty
Graphic ::= any printable character or space
Syntax Trees
A syntax tree or parse tree is an ordered labeled tree such that:
a) terminal nodes (leaf nodes) are labeled by terminal symbols
b) non-terminal nodes (internal nodes) are labeled by non-terminal symbols
c) each non-terminal node labeled by N has children X1, X2, ... Xn (in this order) such that N ::= X1 X2 ... Xn is a production.
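The three conditions can be checked mechanically. The following is a minimal sketch under stated assumptions: Node, PRODUCTIONS and is_syntax_tree are hypothetical names, and the grammar is a cut-down fragment of the expression syntax in which the tokens d, n, 10, + and * are treated as terminals.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)


# Assumed grammar fragment: non-terminal -> list of right-hand sides.
PRODUCTIONS = {
    "Expression": [["primary-Expression"],
                   ["Expression", "Operator", "primary-Expression"]],
    "primary-Expression": [["Integer-Literal"], ["V-name"]],
    "Operator": [["+"], ["*"]],
    "Integer-Literal": [["10"]],
    "V-name": [["d"], ["n"]],
}


def is_syntax_tree(node: Node) -> bool:
    if node.label not in PRODUCTIONS:
        # (a) a node labeled by a terminal symbol must be a leaf
        return not node.children
    # (b) + (c): the children's labels, in order, must form the
    # right-hand side of some production for this non-terminal
    child_labels = [child.label for child in node.children]
    return (child_labels in PRODUCTIONS[node.label]
            and all(is_syntax_tree(child) for child in node.children))


# Parse tree for: d + 10
d_expr = Node("Expression",
              [Node("primary-Expression", [Node("V-name", [Node("d")])])])
tree = Node("Expression", [
    d_expr,
    Node("Operator", [Node("+")]),
    Node("primary-Expression", [Node("Integer-Literal", [Node("10")])]),
])
print(is_syntax_tree(tree))
```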
Syntax Trees
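Example: syntax tree for d + 10 * n (children shown indented, in order)
Expression
  Expression
    Expression
      primary-Exp
        V-name
          Ident: d
    Op: +
    primary-Exp
      Int-Lit: 10
  Op: *
  primary-Exp
    V-name
      Ident: n
The root is an instance of the production Expression ::= Expression Op primary-Exp, with the three children numbered 1, 2, 3 in this order.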
Concrete and Abstract Syntax
The previous grammar specified the concrete syntax of Mini-Triangle.
The concrete syntax is important for the programmer who needs to know exactly how to write syntactically well-formed programs.
The abstract syntax omits irrelevant syntactic details and only specifies the essential structure of programs.
Example: different concrete syntaxes for an assignment
v := e
(set! v e)
e -> v
v = e
Example: Concrete/Abstract Syntax of Commands
Concrete Syntax:
single-Command ::= V-name := Expression
                 | Identifier ( Expression )
                 | if Expression then single-Command else single-Command
                 | while Expression do single-Command
                 | let Declaration in single-Command
                 | begin Command end
Command ::= single-Command
          | Command ; single-Command
Example: Concrete/Abstract Syntax of Commands
Abstract Syntax (each alternative is followed by its label):
Command ::= V-name := Expression                      AssignCmd
          | Identifier ( Expression )                 CallCmd
          | if Expression then Command else Command   IfCmd
          | while Expression do Command               WhileCmd
          | let Declaration in Command                LetCmd
          | Command ; Command                         SequentialCmd
Example: Concrete Syntax of Expressions
Expression ::= primary-Expression
             | Expression Operator primary-Expression
primary-Expression ::= Integer-Literal
                     | V-name
                     | Operator primary-Expression
                     | ( Expression )
V-name ::= Identifier
Example: Abstract Syntax of Expressions
Expression ::= Integer-Literal            IntegerExp
             | V-name                     VNameExp
             | Operator Expression        UnaryExp
             | Expression Op Expression   BinaryExp
V-name ::= Identifier                     SimpleVName
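In an implementation, each labeled alternative of the abstract syntax typically becomes one kind of tree node. The sketch below shows one possible encoding, not the course's own: the class names follow the labels above, while the field names are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class SimpleVName:          # V-name ::= Identifier
    identifier: str


@dataclass
class IntegerExp:           # Expression ::= Integer-Literal
    literal: str


@dataclass
class VNameExp:             # Expression ::= V-name
    vname: SimpleVName


@dataclass
class UnaryExp:             # Expression ::= Operator Expression
    operator: str
    operand: "Expression"


@dataclass
class BinaryExp:            # Expression ::= Expression Op Expression
    left: "Expression"
    operator: str
    right: "Expression"


Expression = Union[IntegerExp, VNameExp, UnaryExp, BinaryExp]

# d + 10 * n, grouped as (d + 10) * n, which is how the left-recursive
# concrete grammar groups it
ast = BinaryExp(
    BinaryExp(VNameExp(SimpleVName("d")), "+", IntegerExp("10")),
    "*",
    VNameExp(SimpleVName("n")),
)
print(ast)
```

Note that parentheses and the primary-Expression level of the concrete syntax have no node class of their own: the nesting of the tree already records the grouping.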
Abstract Syntax Trees
Abstract Syntax Tree for: d:=d+10*n
AssignmentCmd
  SimpleVName (VName)
    Ident: d
  BinaryExpression
    BinaryExpression
      VNameExp
        SimpleVName
          Ident: d
      Op: +
      IntegerExp
        Int-Lit: 10
    Op: *
    VNameExp
      SimpleVName
        Ident: n
Contextual Constraints
Syntax rules alone are not enough to specify the format of well-formed programs.
Example 1:
let const m~2
in putint(m + x)
Undefined! x is not declared, which violates the Scope Rules.
Example 2:
let const m~2 ;
    var n:Boolean
in begin
  n := m<4;
  n := n+1
end
Type error! n+1 applies + to a Boolean, which violates the Type Rules.
Scope Rules
Scope rules regulate visibility of identifiers. They relate every applied occurrence of an identifier to a binding occurrence.
Example 1
let const m~2;
    var r:Integer
in r := 10*m
Binding occurrence: e.g. m in "const m~2"
Applied occurrence: e.g. m in "10*m"
Example 2
let const m~2
in putint (m + x)
? The applied occurrence of x has no binding occurrence.
Terminology:
Static binding vs. dynamic binding
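A very small sketch of the check behind the two examples: every applied occurrence must be matched by a binding occurrence. The function name and the flat lists of identifiers are hypothetical simplifications (a real checker walks the syntax tree and handles nested scopes), and treating putint as predefined in a standard environment is an assumption.

```python
def undeclared_identifiers(declared, applied, standard_environment=("putint",)):
    """Return the applied occurrences that have no binding occurrence."""
    bindings = set(declared) | set(standard_environment)
    return [name for name in applied if name not in bindings]


# Example 1: let const m~2; var r:Integer in r := 10*m
print(undeclared_identifiers(declared=["m", "r"], applied=["r", "m"]))
# prints []

# Example 2: let const m~2 in putint (m + x)
print(undeclared_identifiers(declared=["m"], applied=["putint", "m", "x"]))
# prints ['x']
```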
Type Rules
Type rules regulate the expected types of arguments and types of returned values for the operations of a language.
Examples
Type rule of < :
  E1 < E2 is type-correct and has type Boolean if E1 and E2 are both type-correct and have type Integer
Type rule of while:
  while E do C is type-correct if E is type-correct and has type Boolean and C is type-correct
Terminology:
Static typing vs. dynamic typing
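The two type rules translate almost line by line into a checker. This is a sketch under assumptions, not the course's implementation: expressions are encoded as Python ints, identifier strings, or (left, op, right) tuples; commands are tuples tagged "assign" or "while"; and the rules for + and for assignment are added here for illustration only.

```python
INTEGER, BOOLEAN = "Integer", "Boolean"


class TypeCheckError(Exception):
    pass


def type_of(expr, env):
    """Return the type of expr, or raise TypeCheckError."""
    if isinstance(expr, int):                  # integer literal
        return INTEGER
    if isinstance(expr, str):                  # identifier
        if expr not in env:
            raise TypeCheckError(f"undeclared identifier: {expr}")
        return env[expr]
    left, op, right = expr
    left_type, right_type = type_of(left, env), type_of(right, env)
    if op == "<":
        # E1 < E2 has type Boolean if E1 and E2 both have type Integer
        if left_type == INTEGER and right_type == INTEGER:
            return BOOLEAN
        raise TypeCheckError("operands of < must have type Integer")
    if op == "+":                              # assumed rule for +
        if left_type == INTEGER and right_type == INTEGER:
            return INTEGER
        raise TypeCheckError("operands of + must have type Integer")
    raise TypeCheckError(f"unknown operator: {op}")


def check_command(cmd, env):
    """Raise TypeCheckError if cmd is not type-correct."""
    kind = cmd[0]
    if kind == "assign":                       # assumed rule for V := E
        _, name, expr = cmd
        if type_of(name, env) != type_of(expr, env):
            raise TypeCheckError(f"cannot assign to {name}: type mismatch")
    elif kind == "while":
        # while E do C: E must have type Boolean, C must be type-correct
        _, condition, body = cmd
        if type_of(condition, env) != BOOLEAN:
            raise TypeCheckError("while condition must have type Boolean")
        check_command(body, env)
    else:
        raise TypeCheckError(f"unknown command: {kind}")


env = {"m": INTEGER, "n": BOOLEAN}
check_command(("assign", "n", ("m", "<", 4)), env)      # accepted
try:
    check_command(("assign", "n", ("n", "+", 1)), env)  # n + 1 with Boolean n
except TypeCheckError as error:
    print("Type error:", error)
```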
Semantics
Specification of semantics is concerned with specifying the “meaning” of well-formed programs.
Terminology:
Expressions are evaluated and yield values (and may or may not perform side effects).
Commands are executed and perform side effects.
Declarations are elaborated to produce bindings.
Side effects:
• change the values of variables
• perform input/output
Semantics
Example: The (informally specified) semantics of commands in Mini-Triangle.
Commands are executed to update variables and/or to perform input/output.
The assignment command V := E is executed as follows:
first the expression E is evaluated to yield a value x
then x is assigned to the variable named V
The sequential command C1;C2 is executed as follows:
first the command C1 is executed
then the command C2 is executed
etc.
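The informal rules for the assignment and sequential commands read almost like an interpreter. A minimal sketch, assuming a hypothetical encoding in which commands are tagged tuples, the store is a dict from variable names to values, and an expression is stood in for by a Python function that takes the store and yields a value:

```python
def execute(command, store):
    """Execute a command, updating the store as a side effect."""
    kind = command[0]
    if kind == "assign":                 # V := E
        _, name, expression = command
        x = expression(store)            # first E is evaluated to yield a value x
        store[name] = x                  # then x is assigned to the variable named V
    elif kind == "seq":                  # C1 ; C2
        _, first, second = command
        execute(first, store)            # first the command C1 is executed
        execute(second, store)           # then the command C2 is executed
    else:
        raise ValueError(f"unknown command: {kind}")


# n := 2 * m * m ; d := n + 1   (m is held in the store for simplicity)
store = {"m": 7}
program = ("seq",
           ("assign", "n", lambda s: 2 * s["m"] * s["m"]),
           ("assign", "d", lambda s: s["n"] + 1))
execute(program, store)
print(store)
```

The order of the two steps in the sequential command matters here: the second assignment reads the value of n written by the first.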
Semantics
Example: The semantics of expressions.
An expression is evaluated to yield a value.
An (integer literal) expression IL yields the integer value of IL
The (variable or constant name) expression V yields the value of the variable or constant named V
The (binary operation) expression E1 O E2 yields the value obtained by applying the binary operation O to the values yielded by (the evaluation of) expressions E1 and E2
etc.
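These three rules can be sketched as a recursive evaluator. The encoding is an assumption made for illustration: an expression is a Python int (integer literal), a string (name), or a (E1, O, E2) tuple, and "/" is taken to be integer division.

```python
import operator

# Assumed mapping from operator symbols to binary operations.
BINARY_OPERATIONS = {
    "+": operator.add, "-": operator.sub, "*": operator.mul,
    "/": operator.floordiv,
    "<": operator.lt, ">": operator.gt, "=": operator.eq,
}


def evaluate(expr, env):
    """Evaluate expr to yield a value; env maps names to values."""
    if isinstance(expr, int):        # integer literal IL
        return expr
    if isinstance(expr, str):        # variable or constant name V
        return env[expr]
    left, op, right = expr           # binary operation E1 O E2
    return BINARY_OPERATIONS[op](evaluate(left, env), evaluate(right, env))


env = {"d": 5, "n": 3}
print(evaluate((("d", "+", 10), "*", "n"), env))  # (5 + 10) * 3, prints 45
```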
Semantics
Example: The semantics of declarations.
A declaration is elaborated to produce bindings. It may also have the side effect of allocating (memory for) variables.
The constant declaration const I~E is elaborated by binding the identifier I to the value yielded by E.
The variable declaration var I:T is elaborated by binding I to a newly allocated variable, whose initial value is undefined. The variable will be de-allocated upon exit from the let-command that contains the declaration.
The sequential declaration D1;D2 is elaborated by elaborating D1 followed by D2 and combining the bindings produced by both. D2 is elaborated in the environment of the sequential declaration overlaid by the bindings produced by D1.
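A sketch of elaboration under assumptions: declarations are tagged tuples, bindings and environments are dicts, an expression is stood in for by a function of the environment, and the hypothetical Variable class uses None for "undefined". De-allocation on exit from the let-command is not modeled.

```python
class Variable:
    """A newly allocated variable whose initial value is undefined."""

    def __init__(self):
        self.value = None   # None stands in for "undefined"


def elaborate(declaration, env):
    """Return the bindings produced by elaborating declaration in env."""
    kind = declaration[0]
    if kind == "const":                     # const I ~ E
        _, identifier, expression = declaration
        return {identifier: expression(env)}
    if kind == "var":                       # var I : T
        _, identifier, _type_denoter = declaration
        return {identifier: Variable()}
    if kind == "seq":                       # D1 ; D2
        _, d1, d2 = declaration
        bindings1 = elaborate(d1, env)
        # D2 sees the environment overlaid by the bindings produced by D1
        bindings2 = elaborate(d2, {**env, **bindings1})
        return {**bindings1, **bindings2}   # combine both sets of bindings
    raise ValueError(f"unknown declaration: {kind}")


# const m ~ 7; const k ~ m * 2; var n: Integer
declaration = ("seq",
               ("seq",
                ("const", "m", lambda env: 7),
                ("const", "k", lambda env: env["m"] * 2)),
               ("var", "n", "Integer"))
bindings = elaborate(declaration, {})
print(bindings["m"], bindings["k"], bindings["n"].value)
```

The second constant can refer to m only because D2 is elaborated in the overlaid environment.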
Conclusion / Summary
• This course is about compilers
• Compilers are language processors
  – translate high-level language into low-level language
  – help bridge the semantic gap
• Language specification
  – needed for communication between language designers, implementers, and users
  – Three “parts” we will study during this course
    • Syntax of the language: usually formal: Extended BNF
    • Contextual constraints: usually informal: scope rules and type rules (written in English)
    • Semantics: usually informal: descriptions in English