syntax of all sorts
Post on 21-Jan-2016
35 Views
Preview:
DESCRIPTION
TRANSCRIPT
Syntax of All Sorts
COS 441
Princeton University
Fall 2004
Acknowledgements
Many slides from this lecture have been adapted from John Mitchell’s CS-242 course at Stanford– Good source of supplemental information
http://theory.stanford.edu/people/jcm/books/cpl-teaching.html– Differs in foundational approach when
compared to Harper’s approach will discuss this difference in later lectures
Interpreter vs Compiler
Source Program
Source Program
Compiler
Input OutputInterpreter
Input OutputTarget
Program
Interpreter vs Compiler
• The difference is actually a bit more fuzzy
• Some “interpreters” compile to native code– SML/NJ runs native machine code!– Does fancy optimizations too
• Some “compilers” compile to byte-code which is interpreted – javac and a JVM which may also compile
Interactive vs Batch
• SML/NJ is a native code compiler with an interactive interface
• javac is a batch compiler for bytecode which may be interpreted or compiled to native code by a JVM
• Python compiles to bytecode then interprets the bytecode it has both batch and interactive interfaces
• Terms are historical and misleading– Best to be precise and verbose
Typical CompilerSource Program
Lexical Analyzer
Syntax Analyzer
Semantic Analyzer
Intermediate Code Generator
Code Optimizer
Code Generator Target Program
Brief Look at Syntax
• Grammar e ::= n | e + e | e £ e n ::= d | nd d ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
• Expressions in languagee e £ e e £ e + e n £ n + n nd £ d + d dd £ d + d
… 27 £ 4 + 3
Grammar defines a languageExpressions in language derived by sequence of productions
Parse Tree
• Derivation represented by treee e £ e e £ e + e n £ n + n nd £ d + d dd £ d + d … 27 £ 4 + 3
e
e £
e +
e
e27
4 3
Tree shows parenthesization of expression
Dealing with Parsing Ambiguity
• Ambiguity– Expression 27 £ 4 + 3 can be parsed two ways– Problem: 27 £ (4 + 3) (27 £ 4) + 3
• Ways to resolve ambiguity– Precedence
• Group £ before + • Parse 3 £ 4 + 2 as (3 £ 4) + 2
– Associativity• Parenthesize operators of equal precedence to left (or right)• Parse 3 + 4 + 5 as (3 + 4) + 5
Rewriting the Grammar
• Harper describe a way to rewrite the grammar to avoid ambiguity
• Eliminating ambiguity in a parsing grammar is a dark art– Not always even possible– Depends on the parsing technique you use– Learn all you want to know and more in a compiler
book or course
• Ambiguity is bad when trying to think formally about programming languages
Abstract Syntax Trees
• We want to reason about expressions inductively– So we need an inductive description of
expressions
• We could use the parse tree– But it has all this silly term and factor stuff to
parsing unambiguous
• Introduce nicer tree after the parsing phase
Syntax Comparisons
Digits d ::= 0 | … | 9
Numbers n ::= d | n d
Expressions e ::= n
| e1 + e2
| e1 £ e2
Numbers n 2 N
Expressions e ::= num[n]
| plus(e1,e2)
| times(e1,e2)
Digits d ::= 0 | … | 9
Numbers n ::= d | n d
Expressions e ::= t | t + e
Terms t ::= f | f £ t
Factors f ::= n | (e)
Abstract
Ambiguous Concrete
Unambiguous Concrete
Induction Over Syntax
• We can think of the CFG notation as defining a predicate exp
• Thinking about things this way lets us use rule induction on abstract syntax
times(E1,E2)expE1 exp E2 exp
times-eplus(E1,E2)exp
E1 exp E2 expplus-e
num[N] expN 2 N
num-e
A More General Pattern
• One might ask what is the general pattern of predicates defined by CFG that represent abstract syntax– represents a signature that assigns arities
to operators such a num[N], plus, and times
– Note the T1 … Tn are all unique
OP(T1,…,Tn) term
T1 term Tn term(OP) = n
An Example
succ(X)natX nat
Szero nat
Z
An Example
succ(X)natX nat
Szero nat
Z
Operator Arity
zero 0
succ 1
An Example
succ(X)natX nat
Szero nat
Z
n ::= zero
| succ(n)
Operator Arity
zero 0
succ 1
Another Example
succ(X)oddX even
S-O
zero evenZ-E
succ(X)evenX odd
S-E
Another Example
succ(X)oddX even
S-O
zero evenZ-E
succ(X)evenX odd
S-E
Operator Arity
zero 0
succ 1
even ::= zero
| succ(odd)
odd ::= succ(even)
Another Example
succ(X)oddX even
S-O
zero evenZ-E
succ(X)evenX odd
S-E
Operator Arity
zero 0
succ 1
Operator Arity
succ 1
even ::= zero
| succ(odd)
odd ::= succ(even)
NOT Abstract Syntax
eq(X,X) equal
A(X,succ(Y),succ(Z))A(X,Y,Z)
F(succ(succ(N)),Z)F(succ(N),X) F(N,Y) A(X,Y,Z)
Abstract Syntax To ML
n ::= zero
| succ(n)
n 2 N
e ::= num[n]
| plus(e1,e2)
| times(e1,e2)
Abstract Syntax To ML
n ::= zero
| succ(n)
n 2 N
e ::= num[n]
| plus(e1,e2)
| times(e1,e2)
datatype nat = zero | succ of nat
Abstract Syntax To ML
n ::= zero
| succ(n)
n 2 N
e ::= num[n]
| plus(e1,e2)
| times(e1,e2)
datatype exp = num of nat | plus of (exp * exp)| times of (exp * exp)
datatype nat = zero | succ of nat
Abstract Syntax To ML
n ::= zero
| succ(n)
n 2 N
e ::= num[n]
| plus(e1,e2)
| times(e1,e2)
datatype exp = Num of int | Plus of (exp * exp)| Times of (exp * exp)
Capitalize by convention
datatype nat = Zero | Succ of nat
A small cheat for performance
Abstract Syntax To ML
• Converting things is not always straightforward– Constructor names need to be distinct
• ML lacks some features that would make some things easier– Operator overloading – Interacts badly with type-inference
• Subtyping would improve things too– We can get by with predicates sometimes instead
Abstract Syntax To ML (cont.)
n ::= zero
| succ(n)
datatype nat = Zero | Succ of nat
even ::= zero
| succ(odd)
odd ::= succ(even)
Abstract Syntax To ML (cont.)
n ::= zero
| succ(n)
datatype nat = Zero | Succ of nat
even ::= zero
| succ(odd)
odd ::= succ(even)
datatype even = EvenZero| EvenSucc of (odd)and odd = OddSucc of even
Abstract Syntax To ML (cont.)
n ::= zero
| succ(n)
datatype nat = Zero | Succ of nat
fun even(Zero) = true|even(Succ(n))= odd(n)and odd(Zero) = false |odd(Succ(n)) = even(n)
even ::= zero
| succ(odd)
odd ::= succ(even)
Syntax Trees Are Not Enough
• Programming languages include binding constructs– variables function arguments, variable function
declarations, type declarations
• When reasoning about binding constructs we want to ignore irrelevant details – e.g. the actual “spelling” of the variable– we only that variables are the same or different
• Still we must be clear about the scope of a variable definition
Abstract Binding Trees
• Harper uses abstract binding trees (Chapter 5)– This solution is based on very recent work
M. J. Gabbay and A.M. Pitts, A New Approach to Abstract Syntax with Variable Binding, Formal Aspects of Computing 13(2002)341-363
• The solution is new but the problems are old!Alonzo Church. A set of postulates for the foundation of logic. The
Annals of Mathematics, Second Series, Volume 33, Issue 2 (Apr. 1932), 346-366.
• We will talk about the problem with binding today and deal with the solutions in the next lecture
Lambda Calculus
• Formal system with three parts– Notation for function expressions– Proof system for equations– Calculation rules called reduction
• Basic syntactic notions– Free and bound variables– Illustrates some ideas about scope of binding
• Symbolic evaluation useful for discussing programs
History
• Original intention– Formal theory of substitution
• More successful for computable functions– Substitution ! symbolic computation– Church/Turing thesis
• Influenced design of Lisp, ML, other languages
• Important part of CS history and theory
Expressions and Functions
• Expressionsx + y x + 2 £ y + z
• Functionsx. (x + y) z. (x + 2 £ y + z)
• Application(x. (x + y)) 3 = 3 + y(z. (x + 2 £ y + z)) 5 = x + 2 £ y + 5
Parsing: x. f (f x) = x.( f (f (x)) )
Free and Bound Variables
• Bound variable is “placeholder”– Variable x is bound in x. (x+y) – Function x. (x+y) is same function as z. (z+y)
• Compare x+y dx = z+y dz x P(x) = z P(z)
• Name of free (=unbound) variable does matter– Variable y is free in x. (x+y) – Function x. (x+y) is not same as x. (x+z)
• Occurrences– y is free and bound in x. ((y. y+2) x) + y
Reduction
• Basic computation rule is -reduction
(x. e1) e2 [xÃe2]e1
where substitution involves renaming as
needed
• Example:
(f. x. f (f x)) (y. y+x)
Rename Bound Variables
• Rename bound variables to avoid conflicts
(f. z. f (f z)) (y. y+x)
…
z. (z+x)+x
• Substitute “blindly”
(f. x. f (f x)) (y. y+x)
…
x. (x+x)+x
Reduction
(f. x. f (f x)) (y. y+x)
Reduction
(f. z. f (f z)) (y. y+x)
Rename bound “x” to “z” to avoid conflict with free “x”
Reduction
(f. z. f (f z)) (y. y+x)
[fÃ(y. y+x)](z. f (f z))
Reduction
(f. z. f (f z)) (y. y+x)
z. (y. y+x) ((y. y+x) z))
Reduction
(f. z. f (f z)) (y. y+x)
z. (y. y+x) ((y. y+x) z))
z. (y. y+x) ([yÃz](y+x))
Reduction
(f. z. f (f z)) (y. y+x)
z. (y. y+x) ((y. y+x) z))
z. (y. y+x) (z+x)
Reduction
(f. z. f (f z)) (y. y+x)
z. (y. y+x) ((y. y+x) z))
z. (y. y+x) (z+x)
z. [yÃ(z+x)](y+x)
Reduction
(f. z. f (f z)) (y. y+x)
z. (y. y+x) ((y. y+x) z))
z. (y. y+x) (z+x)
z. ((z+x)+x)
Incorrect Reduction
(f. x. f (f x)) (y. y+x)
Incorrect Reduction
(f. x. f (f x)) (y. y+x)
[fÃ(y. y+x)](x. f (f x))
Incorrect Reduction
(f. x. f (f x)) (y. y+x)
x. (y. y+x) ((y. y+x) x))
Incorrect Reduction
(f. x. f (f x)) (y. y+x)
x. (y. y+x) ((y. y+x) x))
x. (y. y+x) ([yÃx](y+x))
Incorrect Reduction
(f. x. f (f x)) (y. y+x)
x. (y. y+x) ((y. y+x) x))
x. (y. y+x) (x+x)
Incorrect Reduction
(f. x. f (f x)) (y. y+x)
x. (y. y+x) ((y. y+x) x))
x. (y. y+x) (x+x)
x. [yÃ(x+x)](y+x)
Incorrect Reduction
(f. x. f (f x)) (y. y+x)
x. (y. y+x) ((y. y+x) x))
x. (y. y+x) (x+x)
x. ((x+x)+x)
Main Points About -calculus
• captures “essence” of variable binding– Function parameters– Bound variables can be renamed
• Succinct function expressions
• Simple symbolic evaluator via substitution
• Easy rule: always rename bound variables to be distinct
top related