D Goforth COSC 3127 1
Translating High Level Languages
Note error in assignment 1:
#4 - refer to Example grammar 3.4, p. 126
D Goforth COSC 3127 2
Stages of translation
Lexical analysis - the lexer or scanner
Syntactic analysis - the parser
Code generation
Linking
All of these happen before execution.
D Goforth COSC 3127 3
Lexical analysis
Translate stream of characters into lexemes
Lexemes belong to categories called tokens
Token identity of lexemes is used at the next stage of syntactic analysis
D Goforth COSC 3127 4
From characters to lexemes
yVal = x + 450 – min ( 100, 4xVal ));
D Goforth COSC 3127 5
Examples: tokens and lexemes
Some token categories contain only one lexeme:
semi-colon ;
Some tokens categorize many lexemes:
identifier count, maxCost,…
based on a rule for legal identifier strings
D Goforth COSC 3127 6
Tokens and Lexemes
yVal = x + 450 – min ( 100, 4xVal ));
Lexical analysis
•identifies lexemes and their token type
•recognizes illegal lexemes (4xVal)
•does NOT identify syntax error: ) )
[Figure labels on the statement: identifier, equal_sign, left_paren, illegal lexeme]
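A minimal sketch of this lexical analysis step, using Python's `re` module. The token names other than identifier, equal_sign and left_paren, and all of the patterns, are assumptions made for illustration. The minus sign is typed as an ASCII `-`.

```python
import re

# Token categories and their patterns (assumed for this sketch).
# Order matters: the illegal pattern is tried before numbers and identifiers.
TOKEN_SPEC = [
    ("illegal_lexeme", r"\d+[A-Za-z_]\w*"),  # digits then letters, e.g. 4xVal
    ("int_literal",    r"\d+"),
    ("identifier",     r"[A-Za-z_]\w*"),
    ("equal_sign",     r"="),
    ("plus_op",        r"\+"),
    ("minus_op",       r"-"),
    ("left_paren",     r"\("),
    ("right_paren",    r"\)"),
    ("comma",          r","),
    ("semi_colon",     r";"),
    ("skip",           r"\s+"),              # whitespace is discarded
    ("unknown_char",   r"."),                # anything else
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return a list of (token, lexeme) pairs for the character stream."""
    tokens = []
    for match in MASTER.finditer(source):
        kind = match.lastgroup  # name of the category that matched
        if kind != "skip":
            tokens.append((kind, match.group()))
    return tokens

tokens = tokenize("yVal = x + 450 - min ( 100, 4xVal ));")
for kind, lexeme in tokens:
    print(f"{kind:15} {lexeme}")
```

The lexer flags 4xVal as an illegal lexeme but reports both right parentheses as ordinary right_paren tokens. The unbalanced parenthesis is left for the parser to detect.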
D Goforth COSC 3127 7
Syntax or Grammar of Language
Rules for generating (used by programmer) or recognizing (used by parser) a valid sequence of lexemes
D Goforth COSC 3127 8
Grammars
4 categories of grammars (Chomsky)
Two categories are important in computing:
Regular expressions (pattern matching)
Context-free grammars (programming languages)
D Goforth COSC 3127 9
Context-free grammar
Meta-language for describing languages
States rules or productions for what lexeme sequences are correct in the language
Written in Backus-Naur Form (BNF) or EBNF
Syntax graphs
D Goforth COSC 3127 10
Example of BNF rule
PROBLEM: how to recognize all these as correct?
y = x
f = rVec.length + 1
button[4].label = “Exit”
RULE for defining assignment statement:
<assign> -> <variable> = <expression>
Assumes other rules for <variable>, <expression>
D Goforth COSC 3127 11
BNF rules
Non-terminal and terminal symbols:
Non-terminals are defined by at least one rule
<assignment> -> <var> = <expression>
Terminals are tokens (or lexemes)
D Goforth COSC 3127 12
Simple sample grammar (p. 123)
<assign> -> <id> = <expr>
<id> -> A | B | C // lexical
<expr> -> <id> + <expr>
| <id> * <expr>
| ( <expr> )
| <id>
Legend: terminals, <nonterminals>
D Goforth COSC 3127 13
Simple sample production
Apply one rule at each step, to the leftmost non-terminal:
<assign> => <id> = <expr>
=> B = <expr>
=> B = <id> * <expr>
=> B = A * <expr>
=> B = A * ( <expr> )
=> B = A * ( <id> + <expr> )
=> B = A * ( C + <expr> )
=> B = A * ( C + <id> )
=> B = A * ( C + C )
D Goforth COSC 3127 14
Sample parse tree
<assign>
  <id>
    B
  =
  <expr>
    <id>
      A
    *
    <expr>
      (
      <expr>
        <id>
          C
        +
        <expr>
          <id>
            C
      )
Leaves represent the sentence of lexemes
Each step down the tree is a rule application
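The sample grammar can be recognized by a small recursive-descent parser, one function per non-terminal. The following is a sketch only: the function names, the nested-tuple tree representation and the use of `ValueError` for syntax errors are assumptions, and the input is taken to be already split into lexemes.

```python
def parse_assign(tokens):
    """Build a parse tree for <assign> -> <id> = <expr>, or raise ValueError."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(lexeme):
        nonlocal pos
        if peek() != lexeme:
            raise ValueError(f"expected {lexeme!r}, found {peek()!r}")
        pos += 1
        return lexeme

    def parse_id():
        # <id> -> A | B | C
        nonlocal pos
        if peek() not in ("A", "B", "C"):
            raise ValueError(f"expected an identifier, found {peek()!r}")
        lexeme = tokens[pos]
        pos += 1
        return ("<id>", lexeme)

    def parse_expr():
        # <expr> -> ( <expr> ) | <id> + <expr> | <id> * <expr> | <id>
        if peek() == "(":
            expect("(")
            inner = parse_expr()
            expect(")")
            return ("<expr>", "(", inner, ")")
        left = parse_id()
        if peek() in ("+", "*"):
            op = expect(peek())
            return ("<expr>", left, op, parse_expr())
        return ("<expr>", left)

    tree = ("<assign>", parse_id(), expect("="), parse_expr())
    if peek() is not None:
        raise ValueError(f"unexpected lexeme {peek()!r}")
    return tree

def leaves(tree):
    """Collect the leaves of the tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    result = []
    for child in tree[1:]:  # tree[0] is the non-terminal's name
        result.extend(leaves(child))
    return result

sentence = "B = A * ( C + C )".split()
tree = parse_assign(sentence)
print(leaves(tree) == sentence)
```

Reading the leaves from left to right gives back the original sentence, which is the property the parse tree slide points out.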
D Goforth COSC 3127 15
Extended sample grammar
<stmt> -> <assign> | <ifstmt>
<ifstmt> -> if (<cond>) then <stmt>
| if (<cond>) then <stmt> else <stmt>
<cond> -> <expr> <compareop> <expr>
<compareop> -> < | > | <= | >= | == | ~=
How to add compound condition?
D Goforth COSC 3127 16
Ambiguous grammar
Different parse trees for same sentence
Different translations for same sentence
Different machine code for same source code!
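A small illustration of why this matters. Assume a hypothetical ambiguous grammar, not taken from the slides: `<expr> -> <expr> + <expr> | <expr> * <expr> | <num>`. The sentence `2 + 3 * 4` then has two parse trees, written here as nested tuples, and evaluating each tree gives a different result.

```python
def evaluate(tree):
    """Evaluate a parse tree of the form (left, op, right) or a plain number."""
    if isinstance(tree, int):
        return tree
    left, op, right = tree
    a, b = evaluate(left), evaluate(right)
    return a + b if op == "+" else a * b

def sentence(tree):
    """Read the leaves of the tree from left to right."""
    if isinstance(tree, int):
        return [tree]
    left, op, right = tree
    return sentence(left) + [op] + sentence(right)

tree_a = (2, "+", (3, "*", 4))   # multiplication grouped first
tree_b = ((2, "+", 3), "*", 4)   # addition grouped first

print(sentence(tree_a) == sentence(tree_b))   # same sentence
print(evaluate(tree_a), evaluate(tree_b))     # different meanings
```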
D Goforth COSC 3127 17
Grammars for ‘human’ conventions without ambiguity
Putting features of languages into grammars:
expression any length: lists, p. 121
precedence - an extra non-terminal: p. 125
associativity - order in recursive rules: p. 128
nested if statements - “dangling else” problem: p. 130
D Goforth COSC 3127 18
Forms for grammars
Backus-Naur form (BNF)
Extended Backus-Naur form (EBNF) - shortens set of rules
Syntax graphs - easier to read for learning language
D Goforth COSC 3127 19
EBNF
optional zero or one occurrence [..]
<expr> -> [ <expr> + ] <term>
optional zero or more occurrences {..}
<expr> -> <term> { + <term> }
‘or’ choice of alternative symbols |
<term> -> <term> [ (*|/) <term> ]
Syntax Graph - basic structures
[Figure: syntax graphs for expr and term, built from term, factor and the operators +, -, *, /]
BNF (p. 121)
<expr> -> <expr> + <term>
| <expr> - <term>
| <term>
<term> -> <term> * <factor>
| <term> / <factor>
| <factor>
EBNF
<expr> -> [<expr> (+|-)] <term>
<term> -> [<term> (*|/)] <factor>
<expr> -> <term> {(+|-) <term>}
<term> -> <factor> {(*|/) <factor>}
Syntax Graph
[Figure: syntax graphs for expr (terms joined by + or -) and term (factors joined by * or /)]
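The EBNF repetition form maps directly onto a loop in a hand-written parser: `<expr> -> <term> {(+|-) <term>}` becomes "parse one term, then repeat while the next lexeme is + or -". The sketch below evaluates as it parses. The slides do not define `<factor>`, so it is assumed here to be an unsigned integer literal, and the function names are hypothetical.

```python
def evaluate_expr(tokens):
    """Parse and evaluate a list of lexemes using the EBNF rules for expr and term."""
    pos = 0

    def factor():
        # Assumption: <factor> is an unsigned integer literal.
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise ValueError("expected an integer literal")
        value = int(tokens[pos])
        pos += 1
        return value

    def term():
        # <term> -> <factor> {(*|/) <factor>}
        nonlocal pos
        value = factor()
        while pos < len(tokens) and tokens[pos] in ("*", "/"):
            op = tokens[pos]
            pos += 1
            right = factor()
            value = value * right if op == "*" else value / right
        return value

    def expr():
        # <expr> -> <term> {(+|-) <term>}
        nonlocal pos
        value = term()
        while pos < len(tokens) and tokens[pos] in ("+", "-"):
            op = tokens[pos]
            pos += 1
            right = term()
            value = value + right if op == "+" else value - right
        return value

    result = expr()
    if pos != len(tokens):
        raise ValueError(f"unexpected lexeme {tokens[pos]!r}")
    return result

print(evaluate_expr("2 + 3 * 4".split()))
print(evaluate_expr("8 - 3 - 2".split()))
```

Because each loop folds its result into `value` from left to right, the operators come out left-associative, and because `expr` calls `term`, multiplication and division bind tighter than addition and subtraction.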
D Goforth COSC 3127 22
Attribute grammars
Problem: context-free grammars cannot describe some features needed in programming - “static semantics”
e.g.: rules for using data types
* Can’t assign real to integer (clumsy in BNF)
* Can’t access variable before assigning (impossible in BNF)
D Goforth COSC 3127 23
Attributes
Symbols in the grammar can have attributes (properties)
Productions can have functions of some of the attributes of their symbols that compute the attributes of other symbols
Predicates (boolean functions) inspect the attributes of non-terminals to see if they are legitimate
D Goforth COSC 3127 24
Using attributes
1) Apply productions to create parse tree (symbols have some intrinsic attributes)
2) Apply functions to determine remaining attributes
3) Apply predicates to test correctness of parse tree
D Goforth COSC 3127 25
Sebesta’s example
<assign> -> <var> = <expr>
<expr> -> <var> + <var>
| <var>
<var> -> A | B | C
Add attributes for type checking:
expected_type
actual_type
D Goforth COSC 3127 26
Sebesta’s example
<assign> -> <var> = <expr>
<expr> -> <var> + <var> | <var>
<var> -> A | B | C
[Figure: the attributes expected_type and actual_type attached to symbols in the rules]
D Goforth COSC 3127 27
Sebesta’s example
<assign> -> <var> = <expr>
<expr> -> <var> + <var> | <var>
<var> -> A | B | C
actual_type: determined from string (A, B, C), which has been declared
D Goforth COSC 3127 28
Sebesta’s example
<assign> -> <var> = <expr>
<expr> -> <var> + <var> | <var>
<var> -> A | B | C
actual_type: determined from <var> actual types
D Goforth COSC 3127 29
Sebesta’s example
<assign> -> <var> = <expr>
<expr> -> <var> + <var> | <var>
<var> -> A | B | C
expected_type: determined from <var> actual types
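The three steps of attribute evaluation can be sketched for this grammar as follows. This is an illustration under stated assumptions, not the textbook's rules verbatim: the declared types in `DECLARED` are made up, the function name is hypothetical, and the rule "int only if both operands are int, otherwise real" is assumed for the sum.

```python
# Hypothetical declarations: the type each variable was declared with.
DECLARED = {"A": "int", "B": "real", "C": "int"}

def check_assign(target, operands):
    """Type-check target = operands[0] (+ operands[1]) using attributes."""
    # actual_type of each <var> is looked up from its string (A, B or C).
    var_types = [DECLARED[name] for name in operands]

    # expected_type of <expr> comes from the actual_type of the target <var>.
    expected_type = DECLARED[target]

    # actual_type of <expr> is computed from the actual types of its <var>s.
    actual_type = "int" if all(t == "int" for t in var_types) else "real"

    # Predicate: the parse is legitimate only if the two attributes agree.
    return actual_type == expected_type

print(check_assign("A", ["A", "C"]))  # int = int + int
print(check_assign("A", ["A", "B"]))  # int = int + real
print(check_assign("B", ["B"]))       # real = real
```

The second call fails the predicate, which corresponds to the "can't assign real to integer" rule that is clumsy to express in plain BNF.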
D Goforth COSC 3127 30
Sebesta’s type rules p.138
D Goforth COSC 3127 31
Sebesta’s example
D Goforth COSC 3127 32
Sebesta’s example
D Goforth COSC 3127 33
Axiomatic semantics
Assertions about statements
Preconditions
Postconditions
like JUnit testing
Purpose
Define meaning of statement
Test for validity of computation (does it do what it is supposed to do?)
D Goforth COSC 3127 34
Example for assignment
What the statement should do is expressed as a postcondition
Based on the syntax of the assignment, a precondition is inferred
When statement is executed, conditions can be verified before and after
D Goforth COSC 3127 35
Example assignment statement
y = 25 + x * 2
postcondition: y > 40
y > 40
25 + x * 2 > 40
x * 2 > 15
x > 7.5 precondition
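The derived precondition can be spot-checked by running the assignment on a few values of x and comparing the precondition with the postcondition. This is only a sketch: the function names and the sample values are assumptions, and testing samples is not a proof.

```python
def precondition(x):
    return x > 7.5

def postcondition_after_assignment(x):
    y = 25 + x * 2   # the assignment statement
    return y > 40    # the required postcondition

# Sample values on both sides of the boundary (chosen to be exact in floating point).
samples = [-3, 0, 7.5, 7.75, 8, 100]
for x in samples:
    print(x, precondition(x), postcondition_after_assignment(x))
```

For every sample the two columns agree: the postcondition holds after the assignment exactly when x > 7.5 held before it. At x = 7.5 the assignment gives y = 40, which does not satisfy y > 40.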