syntax of all sorts

53
Syntax of All Sorts COS 441 Princeton University Fall 2004

Upload: abbott

Post on 21-Jan-2016

35 views

Category:

Documents


0 download

DESCRIPTION

Syntax of All Sorts. COS 441 Princeton University Fall 2004. Acknowledgements. Many slides from this lecture have been adapted from John Mitchell’s CS-242 course at Stanford Good source of supplemental information http://theory.stanford.edu/people/jcm/books/cpl-teaching.html - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Syntax of All Sorts

Syntax of All Sorts

COS 441

Princeton University

Fall 2004

Page 2: Syntax of All Sorts

Acknowledgements

Many slides from this lecture have been adapted from John Mitchell’s CS-242 course at Stanford– Good source of supplemental information

http://theory.stanford.edu/people/jcm/books/cpl-teaching.html– Differs in foundational approach when

compared to Harper’s approach will discuss this difference in later lectures

Page 3: Syntax of All Sorts

Interpreter vs Compiler

Source Program

Source Program

Compiler

Input OutputInterpreter

Input OutputTarget

Program

Page 4: Syntax of All Sorts

Interpreter vs Compiler

• The difference is actually a bit more fuzzy

• Some “interpreters” compile to native code– SML/NJ runs native machine code!– Does fancy optimizations too

• Some “compilers” compile to byte-code which is interpreted – javac and a JVM which may also compile

Page 5: Syntax of All Sorts

Interactive vs Batch

• SML/NJ is a native code compiler with an interactive interface

• javac is a batch compiler for bytecode which may be interpreted or compiled to native code by a JVM

• Python compiles to bytecode then interprets the bytecode it has both batch and interactive interfaces

• Terms are historical and misleading– Best to be precise and verbose

Page 6: Syntax of All Sorts

Typical CompilerSource Program

Lexical Analyzer

Syntax Analyzer

Semantic Analyzer

Intermediate Code Generator

Code Optimizer

Code Generator Target Program

Page 7: Syntax of All Sorts

Brief Look at Syntax

• Grammar e ::= n | e + e | e £ e n ::= d | nd d ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

• Expressions in languagee e £ e e £ e + e n £ n + n nd £ d + d dd £ d + d

… 27 £ 4 + 3

Grammar defines a languageExpressions in language derived by sequence of productions

Page 8: Syntax of All Sorts

Parse Tree

• Derivation represented by treee e £ e e £ e + e n £ n + n nd £ d + d dd £ d + d … 27 £ 4 + 3

e

e £

e +

e

e27

4 3

Tree shows parenthesization of expression

Page 9: Syntax of All Sorts

Dealing with Parsing Ambiguity

• Ambiguity– Expression 27 £ 4 + 3 can be parsed two ways– Problem: 27 £ (4 + 3) (27 £ 4) + 3

• Ways to resolve ambiguity– Precedence

• Group £ before + • Parse 3 £ 4 + 2 as (3 £ 4) + 2

– Associativity• Parenthesize operators of equal precedence to left (or right)• Parse 3 + 4 + 5 as (3 + 4) + 5

Page 10: Syntax of All Sorts

Rewriting the Grammar

• Harper describe a way to rewrite the grammar to avoid ambiguity

• Eliminating ambiguity in a parsing grammar is a dark art– Not always even possible– Depends on the parsing technique you use– Learn all you want to know and more in a compiler

book or course

• Ambiguity is bad when trying to think formally about programming languages

Page 11: Syntax of All Sorts

Abstract Syntax Trees

• We want to reason about expressions inductively– So we need an inductive description of

expressions

• We could use the parse tree– But it has all this silly term and factor stuff to

parsing unambiguous

• Introduce nicer tree after the parsing phase

Page 12: Syntax of All Sorts

Syntax Comparisons

Digits d ::= 0 | … | 9

Numbers n ::= d | n d

Expressions e ::= n

| e1 + e2

| e1 £ e2

Numbers n 2 N

Expressions e ::= num[n]

| plus(e1,e2)

| times(e1,e2)

Digits d ::= 0 | … | 9

Numbers n ::= d | n d

Expressions e ::= t | t + e

Terms t ::= f | f £ t

Factors f ::= n | (e)

Abstract

Ambiguous Concrete

Unambiguous Concrete

Page 13: Syntax of All Sorts

Induction Over Syntax

• We can think of the CFG notation as defining a predicate exp

• Thinking about things this way lets us use rule induction on abstract syntax

times(E1,E2)expE1 exp E2 exp

times-eplus(E1,E2)exp

E1 exp E2 expplus-e

num[N] expN 2 N

num-e

Page 14: Syntax of All Sorts

A More General Pattern

• One might ask what is the general pattern of predicates defined by CFG that represent abstract syntax– represents a signature that assigns arities

to operators such a num[N], plus, and times

– Note the T1 … Tn are all unique

OP(T1,…,Tn) term

T1 term Tn term(OP) = n

Page 15: Syntax of All Sorts

An Example

succ(X)natX nat

Szero nat

Z

Page 16: Syntax of All Sorts

An Example

succ(X)natX nat

Szero nat

Z

Operator Arity

zero 0

succ 1

Page 17: Syntax of All Sorts

An Example

succ(X)natX nat

Szero nat

Z

n ::= zero

| succ(n)

Operator Arity

zero 0

succ 1

Page 18: Syntax of All Sorts

Another Example

succ(X)oddX even

S-O

zero evenZ-E

succ(X)evenX odd

S-E

Page 19: Syntax of All Sorts

Another Example

succ(X)oddX even

S-O

zero evenZ-E

succ(X)evenX odd

S-E

Operator Arity

zero 0

succ 1

even ::= zero

| succ(odd)

odd ::= succ(even)

Page 20: Syntax of All Sorts

Another Example

succ(X)oddX even

S-O

zero evenZ-E

succ(X)evenX odd

S-E

Operator Arity

zero 0

succ 1

Operator Arity

succ 1

even ::= zero

| succ(odd)

odd ::= succ(even)

Page 21: Syntax of All Sorts

NOT Abstract Syntax

eq(X,X) equal

A(X,succ(Y),succ(Z))A(X,Y,Z)

F(succ(succ(N)),Z)F(succ(N),X) F(N,Y) A(X,Y,Z)

Page 22: Syntax of All Sorts

Abstract Syntax To ML

n ::= zero

| succ(n)

n 2 N

e ::= num[n]

| plus(e1,e2)

| times(e1,e2)

Page 23: Syntax of All Sorts

Abstract Syntax To ML

n ::= zero

| succ(n)

n 2 N

e ::= num[n]

| plus(e1,e2)

| times(e1,e2)

datatype nat = zero | succ of nat

Page 24: Syntax of All Sorts

Abstract Syntax To ML

n ::= zero

| succ(n)

n 2 N

e ::= num[n]

| plus(e1,e2)

| times(e1,e2)

datatype exp = num of nat | plus of (exp * exp)| times of (exp * exp)

datatype nat = zero | succ of nat

Page 25: Syntax of All Sorts

Abstract Syntax To ML

n ::= zero

| succ(n)

n 2 N

e ::= num[n]

| plus(e1,e2)

| times(e1,e2)

datatype exp = Num of int | Plus of (exp * exp)| Times of (exp * exp)

Capitalize by convention

datatype nat = Zero | Succ of nat

A small cheat for performance

Page 26: Syntax of All Sorts

Abstract Syntax To ML

• Converting things is not always straightforward– Constructor names need to be distinct

• ML lacks some features that would make some things easier– Operator overloading – Interacts badly with type-inference

• Subtyping would improve things too– We can get by with predicates sometimes instead

Page 27: Syntax of All Sorts

Abstract Syntax To ML (cont.)

n ::= zero

| succ(n)

datatype nat = Zero | Succ of nat

even ::= zero

| succ(odd)

odd ::= succ(even)

Page 28: Syntax of All Sorts

Abstract Syntax To ML (cont.)

n ::= zero

| succ(n)

datatype nat = Zero | Succ of nat

even ::= zero

| succ(odd)

odd ::= succ(even)

datatype even = EvenZero| EvenSucc of (odd)and odd = OddSucc of even

Page 29: Syntax of All Sorts

Abstract Syntax To ML (cont.)

n ::= zero

| succ(n)

datatype nat = Zero | Succ of nat

fun even(Zero) = true|even(Succ(n))= odd(n)and odd(Zero) = false |odd(Succ(n)) = even(n)

even ::= zero

| succ(odd)

odd ::= succ(even)

Page 30: Syntax of All Sorts

Syntax Trees Are Not Enough

• Programming languages include binding constructs– variables function arguments, variable function

declarations, type declarations

• When reasoning about binding constructs we want to ignore irrelevant details – e.g. the actual “spelling” of the variable– we only that variables are the same or different

• Still we must be clear about the scope of a variable definition

Page 31: Syntax of All Sorts

Abstract Binding Trees

• Harper uses abstract binding trees (Chapter 5)– This solution is based on very recent work

M. J. Gabbay and A.M. Pitts, A New Approach to Abstract Syntax with Variable Binding, Formal Aspects of Computing 13(2002)341-363

• The solution is new but the problems are old!Alonzo Church. A set of postulates for the foundation of logic. The

Annals of Mathematics, Second Series, Volume 33, Issue 2 (Apr. 1932), 346-366.

• We will talk about the problem with binding today and deal with the solutions in the next lecture

Page 32: Syntax of All Sorts

Lambda Calculus

• Formal system with three parts– Notation for function expressions– Proof system for equations– Calculation rules called reduction

• Basic syntactic notions– Free and bound variables– Illustrates some ideas about scope of binding

• Symbolic evaluation useful for discussing programs

Page 33: Syntax of All Sorts

History

• Original intention– Formal theory of substitution

• More successful for computable functions– Substitution ! symbolic computation– Church/Turing thesis

• Influenced design of Lisp, ML, other languages

• Important part of CS history and theory

Page 34: Syntax of All Sorts

Expressions and Functions

• Expressionsx + y x + 2 £ y + z

• Functionsx. (x + y) z. (x + 2 £ y + z)

• Application(x. (x + y)) 3 = 3 + y(z. (x + 2 £ y + z)) 5 = x + 2 £ y + 5

Parsing: x. f (f x) = x.( f (f (x)) )

Page 35: Syntax of All Sorts

Free and Bound Variables

• Bound variable is “placeholder”– Variable x is bound in x. (x+y) – Function x. (x+y) is same function as z. (z+y)

• Compare x+y dx = z+y dz x P(x) = z P(z)

• Name of free (=unbound) variable does matter– Variable y is free in x. (x+y) – Function x. (x+y) is not same as x. (x+z)

• Occurrences– y is free and bound in x. ((y. y+2) x) + y

Page 36: Syntax of All Sorts

Reduction

• Basic computation rule is -reduction

(x. e1) e2 [xÃe2]e1

where substitution involves renaming as

needed

• Example:

(f. x. f (f x)) (y. y+x)

Page 37: Syntax of All Sorts

Rename Bound Variables

• Rename bound variables to avoid conflicts

(f. z. f (f z)) (y. y+x)

z. (z+x)+x

• Substitute “blindly”

(f. x. f (f x)) (y. y+x)

x. (x+x)+x

Page 38: Syntax of All Sorts

Reduction

(f. x. f (f x)) (y. y+x)

Page 39: Syntax of All Sorts

Reduction

(f. z. f (f z)) (y. y+x)

Rename bound “x” to “z” to avoid conflict with free “x”

Page 40: Syntax of All Sorts

Reduction

(f. z. f (f z)) (y. y+x)

[fÃ(y. y+x)](z. f (f z))

Page 41: Syntax of All Sorts

Reduction

(f. z. f (f z)) (y. y+x)

z. (y. y+x) ((y. y+x) z))

Page 42: Syntax of All Sorts

Reduction

(f. z. f (f z)) (y. y+x)

z. (y. y+x) ((y. y+x) z))

z. (y. y+x) ([yÃz](y+x))

Page 43: Syntax of All Sorts

Reduction

(f. z. f (f z)) (y. y+x)

z. (y. y+x) ((y. y+x) z))

z. (y. y+x) (z+x)

Page 44: Syntax of All Sorts

Reduction

(f. z. f (f z)) (y. y+x)

z. (y. y+x) ((y. y+x) z))

z. (y. y+x) (z+x)

z. [yÃ(z+x)](y+x)

Page 45: Syntax of All Sorts

Reduction

(f. z. f (f z)) (y. y+x)

z. (y. y+x) ((y. y+x) z))

z. (y. y+x) (z+x)

z. ((z+x)+x)

Page 46: Syntax of All Sorts

Incorrect Reduction

(f. x. f (f x)) (y. y+x)

Page 47: Syntax of All Sorts

Incorrect Reduction

(f. x. f (f x)) (y. y+x)

[fÃ(y. y+x)](x. f (f x))

Page 48: Syntax of All Sorts

Incorrect Reduction

(f. x. f (f x)) (y. y+x)

x. (y. y+x) ((y. y+x) x))

Page 49: Syntax of All Sorts

Incorrect Reduction

(f. x. f (f x)) (y. y+x)

x. (y. y+x) ((y. y+x) x))

x. (y. y+x) ([yÃx](y+x))

Page 50: Syntax of All Sorts

Incorrect Reduction

(f. x. f (f x)) (y. y+x)

x. (y. y+x) ((y. y+x) x))

x. (y. y+x) (x+x)

Page 51: Syntax of All Sorts

Incorrect Reduction

(f. x. f (f x)) (y. y+x)

x. (y. y+x) ((y. y+x) x))

x. (y. y+x) (x+x)

x. [yÃ(x+x)](y+x)

Page 52: Syntax of All Sorts

Incorrect Reduction

(f. x. f (f x)) (y. y+x)

x. (y. y+x) ((y. y+x) x))

x. (y. y+x) (x+x)

x. ((x+x)+x)

Page 53: Syntax of All Sorts

Main Points About -calculus

• captures “essence” of variable binding– Function parameters– Bound variables can be renamed

• Succinct function expressions

• Simple symbolic evaluator via substitution

• Easy rule: always rename bound variables to be distinct