1 terminology l statement ( 敘述 ) »declaration, assignment containing expression ( 運算式 ) l...

39
1 Terminology Statement ( 敘敘 ) » declaration, assignment containing expression ( 敘敘敘 ) Grammar ( 敘敘 ) » a set of rules specify the form of legal statements Syntax ( 敘敘 ) vs. Semantics ( 敘敘 ) » Example: assuming J,K:integer and X,Y:float » I:=J+K vs I:=X+Y Compilation: 敘敘 » matching statements written by the programmer to st ructures defined by the grammar and generating the appropriate object code

Post on 21-Dec-2015

222 views

Category:

Documents


2 download

TRANSCRIPT

1

Terminology

Statement ( 敘述 )» declaration, assignment containing expression ( 運算式 )

Grammar ( 文法 )» a set of rules specify the form of legal statements

Syntax ( 語法 ) vs. Semantics ( 語意 )» Example: assuming J,K:integer and X,Y:float» I:=J+K vs I:=X+Y

Compilation: 編譯 » matching statements written by the programmer to structure

s defined by the grammar and generating the appropriate object code

2

System Software

Assembler Loader and Linker Macro Processor Compiler Operating System Other System Software

» RDBS» Text Editors» Interactive Debugging System

3

Basic Compiler

Lexical analysis -- scanner» scanning the source statement, recognizing and classifying

the various tokens

Syntactic analysis -- parser» recognizing the statement as some language construct

Code generation --

4

Scanner

PROGRAM

STATS

VAR

SUM

,

SUMSQ

,

I

SUM

:=

0

;

SUMSQ

:=

READ

(

VALUE

)

;

5

Parser

Grammar: a set of rules » Backus-Naur Form (BNF) » Ex: Figure 5.2

Terminology» Define symbol ::=» Nonterminal symbols <>» Alternative symbols |» Terminal symbols

6

Simplified Pascal Grammar

7

Parser

<read> ::= READ (<id-list>) <id-list>::= id | <id-list>,id

<assign>::= id := <exp> <exp> ::= <term> |

<exp>+<term> | <exp>-<term>

<term>::=<factor> | <term>*<factor> | <term> DIV <factor>

<factor>::= id | int | <exp>

READ(VALUE)

SUM := 0

SUM := SUM + VALUE

MEAN := SUM DIV 100

8

Syntax Tree

9

Syntax Tree for Program 5.1

10

Lexical Analysis

Function» scanning the program to be compiled and recognizing the tokens that make up the source statements

Tokens» Tokens can be keywords, operators, identifiers, integers, flo

ating-point numbers, character strings, etc.» Each token is usually represented by some fixed-length cod

e, such as an integer, rather than as a variable-length character string (see Figure 5.5)

» Token type, Token specifier (value) (see Figure 5.6)

11

Scanner Output

Token specifier» identifier name, integer value

Token coding scheme» Figure 5.5

12

Example - Figure 5.6

Statement Token type Token specifierPROGRAM STATS 1

22 ^STATSVAR 2SUM,SUMSQ,I, …, : INTEGER 22 ^SUM

1422 ^SUMSQ

14…..

1422 ^VARIANCE

136

BEGIN 3SUM:=0; 22 ^SUM

1523 #0

13

Token Recognizer

By grammar» <ident> ::= <letter> | <ident> <letter>| <ident><digit>» <letter> ::= A | B | C | D | … | Z» <digit> ::= 0 | 1 | 2 | 3 | … | 9

By scanner - modeling as finite automata» Figure 5.8(a)

14

Recognizing Identifier

Identifiers allowing underscore (_)» Figure 5.8(b)

State A-Z 0-9 _1 22 2 2 33 2 2

1 2A-Z 2

A-Z0-9

3

A-Z0-9

-

15

Recognizing Integer

Allowing leading zeroes» Figure 5.8(c)

Disallowing leading zeroes» Figure 5.8(d)

1 20-9 2

0-9

1 21-9 2

0-9

3

0

4space 4

16

Scanner -- Implementation

Figure 5.10 (a)» Algorithmic code for identifer recognition

Tabular representation of finite automaton for Figure 5.9

State A-Z 0-9 ;,+-*() : = .

1 2 4 5 62 2 2 334 456 77

17

Syntactic Analysis

Recognize source statements as language constructs or build the parse tree for the statements» bottom-up: operator-precedence parsing» top-down:: recursive-descent parsing

18

Operator-Precedence Parsing

Operator» any terminal symbol (or any token)

Precedence» * » +» + « *

Operator-precedence» precedence relations between operators

Precedence Matrix for the Fig. 5.2

20

Operator-Precedence Parse Example

BEGIN READ ( VALUE ) ;

(i) … id1 := id2 DIV

(ii) … id1 := <N1> DIV int -

(iii) … id1 := <N1> DIV <N2> -

(iv) … id1 := <N3> - id3 *

(v) … id1 := <N3> - <N4> * id4 ;

(vi) … id1 := <N3> - <N4> * <N5> ;

(vi) … id1 := <N3> - <N6> ;

(vii) … id1 := <N7> ;

23

Operator-Precedence Parsing

Bottom-up parsing Generating precedence matrix

» Aho et al. (1988)

24

Shift-reduce Parsing with Stack

Figure 5.14

25

Recursive-Descent Parsing

Each nonterminal symbol in the grammar is associated with a procedure

<read> ::= READ (<id-list>) <stmt> ::= <assign> | <read> | <write> | <for> Left recursion

» <dec-list> ::= <dec> | <dec-list>;<dec>

Modification» <dec-list> ::= <dec> {;<dec>}

26

27

Recursive-Descent Parse of READ

28

Simplified Pascal Grammar for Recursive-Descent Parser

29

30

31

32

33

34

35

36

37

38

39