fundamentals of programming languages...

Post on 17-Aug-2020

Fundamentals of Programming Languages VIII: Lambda Calculus and Types

Guoqiang Li

School of Software, Shanghai Jiao Tong University

Assignment

Assignment 4 is announced!

Reports

Tell me your choice before the Friday of 16th week (Dec. 29)

Submit the report before the Friday of 18th week (Jan. 12)

If you submit the report before the Friday of the 16th week, you can get feedback within one week and have a second chance to resubmit!

The guide of the report can be found in the slides of the second lecture.

The History of λ-Calculus

• The λ-calculus was invented by Alonzo Church in the 1920s.
• In the 1960s, Peter Landin observed that a complex programming language can be understood by formulating it as a tiny core calculus: the λ-calculus.
• Landin’s work and John McCarthy’s Lisp made the λ-calculus the most widespread specification of PL features, in both design and implementation.

Syntax

• In arithmetic expressions, there are no functions.
• In the λ-calculus, everything is a function.

Syntax.

t ::=              terms
      x            variable
      λx.t         abstraction
      t t          application

Scope

• An occurrence of x is bound if it occurs in the body t of an abstraction λx.t.
• An occurrence of x is free if it is not bound.

(λx.λy.x y x) x

Not counting the binder λx, the first two occurrences of x are bound, and the third (the argument) is free.

• A term is closed if it has no free variables. Closed terms are also called combinators, e.g. id = λx.x

Operational Semantics, Informally

(λx.s) t is called a redex (“reducible expression”).

β-reduction: (λx.s) t −→ [x ↦ t]s

• Full β-reduction: any redex may be reduced at any time.
• Normal order strategy: the leftmost, outermost redex is reduced first.
• Call by name: the leftmost, outermost redex is reduced first, and no redex inside an abstraction may be reduced.
• Call by value: only outermost redexes are reduced, and a redex is reduced only when its argument has already been reduced to a value.

Syntax, Revisited

Syntax.

t ::=              terms
      x            variable
      λx.t         abstraction
      t t          application

v ::=              values
      λx.t         function value

Examples

Quiz. Find all the redexes of this term:

id (id (λz.id z))

1. the whole term, id (id (λz.id z));
2. the argument, id (λz.id z);
3. the innermost application, id z.

• Full β-reduction allows all of these redexes, e.g. the innermost one:

id (id (λz.id z)) −→ id (id (λz.z))

• Normal order strategy allows only the first one.

id (id (λz.id z)) −→ id (λz.id z) −→ λz.id z −→ λz.z

• Call-by-name is more restrictive than normal order: it never reduces inside an abstraction.

id (id (λz.id z)) −→ id (λz.id z) −→ λz.id z ↛

• Call by value requires the right-hand side (the argument) to be a value.

id (id (λz.id z)) −→ id (λz.id z) −→ λz.id z ↛

Which Strategy?

• Full β-reduction is nondeterministic.
• The call-by-value strategy is used by most languages.
• Call-by-name is sometimes called the lazy strategy.
• Haskell uses an optimized version of call-by-name, called call-by-need.

Terms and Variables

Terms. Let V be a set of variables. The set of terms is the smallest set T such that:

• x ∈ T for every x ∈ V;
• if t1 ∈ T and x ∈ V, then λx.t1 ∈ T;
• if t1, t2 ∈ T, then t1 t2 ∈ T.

Free variables.

FV(x) = {x}
FV(λx.t1) = FV(t1) \ {x}
FV(t1 t2) = FV(t1) ∪ FV(t2)
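The three FV equations translate directly into a recursive function. The sketch below is only an illustration: the tagged-tuple encoding of terms and the name `fv` are assumptions of this example, not part of the slides.

```python
# Terms as tagged tuples (an encoding assumed for this sketch):
#   ("var", x) | ("lam", x, body) | ("app", t1, t2)

def fv(t):
    """Return the set of free variables of term t."""
    tag = t[0]
    if tag == "var":                 # FV(x) = {x}
        return {t[1]}
    if tag == "lam":                 # FV(λx.t1) = FV(t1) \ {x}
        return fv(t[2]) - {t[1]}
    if tag == "app":                 # FV(t1 t2) = FV(t1) ∪ FV(t2)
        return fv(t[1]) | fv(t[2])
    raise ValueError(f"unknown term: {t!r}")

# (λx.λy.x y x) x : only the argument x is free.
example = ("app",
           ("lam", "x", ("lam", "y",
                         ("app", ("app", ("var", "x"), ("var", "y")),
                          ("var", "x")))),
           ("var", "x"))
identity = ("lam", "x", ("var", "x"))   # closed term (a combinator)
```

A term is closed exactly when `fv` returns the empty set, as it does for `identity`.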

Substitution

Can you find the mistake in this definition of substitution?

[x ↦ s]x = s
[x ↦ s]y = y                      if x ≠ y
[x ↦ s](λy.t) = λy.([x ↦ s]t)
[x ↦ s](t1 t2) = ([x ↦ s]t1) ([x ↦ s]t2)

Revised one:

[x ↦ s]x = s
[x ↦ s]y = y                      if x ≠ y
[x ↦ s](λx.t) = λx.t
[x ↦ s](λy.t) = λy.([x ↦ s]t)     if y ≠ x ∧ y ∉ FV(s)
[x ↦ s](t1 t2) = ([x ↦ s]t1) ([x ↦ s]t2)

Convention

Terms that differ only in the names of bound variables are interchangeable in all contexts.

Substitution, finally

[x ↦ s]x = s
[x ↦ s]y = y                      if x ≠ y
[x ↦ s](λy.t) = λy.([x ↦ s]t)     if y ≠ x ∧ y ∉ FV(s)
[x ↦ s](t1 t2) = ([x ↦ s]t1) ([x ↦ s]t2)
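On paper the side condition y ≠ x ∧ y ∉ FV(s) is met by silently renaming bound variables. An implementation has to do that renaming explicitly. The following is a minimal sketch under assumptions of this example only: terms are tagged tuples, and fresh names are generated as `v0`, `v1`, ...

```python
import itertools

# Terms (encoding assumed here): ("var", x) | ("lam", x, body) | ("app", t1, t2)

def fv(t):
    """Free variables of t."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return fv(t[2]) - {t[1]}
    return fv(t[1]) | fv(t[2])

def fresh(avoid):
    """Return a name not in `avoid` (the v0, v1, ... scheme is arbitrary)."""
    for i in itertools.count():
        name = f"v{i}"
        if name not in avoid:
            return name

def subst(x, s, t):
    """Compute [x ↦ s]t, renaming bound variables to avoid capture."""
    tag = t[0]
    if tag == "var":
        return s if t[1] == x else t
    if tag == "app":
        return ("app", subst(x, s, t[1]), subst(x, s, t[2]))
    y, body = t[1], t[2]
    if y == x:                       # x is rebound here: nothing to replace
        return t
    if y in fv(s):                   # side condition fails: rename y first
        z = fresh(fv(s) | fv(body) | {x})
        body = subst(y, ("var", z), body)
        y = z
    return ("lam", y, subst(x, s, body))
```

For instance, [x ↦ y](λy.x) must not become λy.y; with this sketch the binder is renamed and the result is λv0.y.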

Operational Semantics, Formally

Call by value.

                 t1 −→ t1′
(APP1)      -------------------
             t1 t2 −→ t1′ t2

                 t2 −→ t2′
(APP2)      -------------------
             v1 t2 −→ v1 t2′

(APPABS)    (λx.t) v −→ [x ↦ v]t
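The three call-by-value rules can be read as a one-step function. Below is a hedged sketch, not a reference implementation: the tuple encoding and the names `step` and `evaluate` are assumptions of this example, and substitution is the naive one, which is only safe because the program is assumed closed (so every substituted value is closed and capture cannot occur).

```python
# Terms (encoding assumed here): ("var", x) | ("lam", x, body) | ("app", t1, t2)

def is_value(t):
    return t[0] == "lam"

def subst(x, v, t):
    """[x ↦ v]t, assuming v is closed so no variable capture can occur."""
    tag = t[0]
    if tag == "var":
        return v if t[1] == x else t
    if tag == "lam":
        return t if t[1] == x else ("lam", t[1], subst(x, v, t[2]))
    return ("app", subst(x, v, t[1]), subst(x, v, t[2]))

def step(t):
    """Perform one call-by-value step, or return None if no rule applies."""
    if t[0] != "app":
        return None
    t1, t2 = t[1], t[2]
    if not is_value(t1):                        # APP1
        t1p = step(t1)
        return None if t1p is None else ("app", t1p, t2)
    if not is_value(t2):                        # APP2
        t2p = step(t2)
        return None if t2p is None else ("app", t1, t2p)
    return subst(t1[1], t2, t1[2])              # APPABS

def evaluate(t):
    """Apply step repeatedly until no rule applies."""
    while True:
        nxt = step(t)
        if nxt is None:
            return t
        t = nxt

ident = ("lam", "x", ("var", "x"))
# id (id (λz.id z))
term = ("app", ident, ("app", ident, ("lam", "z", ("app", ident, ("var", "z")))))
```

Here `evaluate(term)` stops at λz.id z, because call by value never reduces under the λ.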

Quiz. Please give the evaluation rules for call-by-name and for full β-reduction, respectively.

Programming in λ-Calculus

Multiple Arguments

• We do not write f = λ(x, y).s; instead, we write f = λx.λy.s.
• These two are different things. Informally, the first function takes a pair and returns s. The second takes an x and returns a function that takes a y and then returns s.
• The transformation of multi-argument functions into higher-order functions is called currying, in honor of Haskell Curry.

Church Booleans

tru = λt.λf.t
fls = λt.λf.f

Operators

test = λl.λm.λn.l m n
and = λm.λn.m n fls

Quiz.
1. Define the boolean operators or and not.
2. What is the difference between if-then-else and the test we define here?

Answer to 1: or = λm.λn.m tru n, not = λm.m fls tru

Pairs

pair = λf.λs.λb.b f s

Operators

fst = λp.p tru
snd = λp.p fls

Example.

fst (pair v w)
−→∗ fst (λb.b v w)
−→ (λb.b v w) tru
−→ tru v w
−→∗ v

Church Numerals

0 = λs.λz.z
1 = λs.λz.s z
2 = λs.λz.s (s z)
3 = λs.λz.s (s (s z))
· · ·

• A number n is a function that takes two arguments s and z (succ and zero) and applies s to z, n times.
• 0 is syntactically identical to fls.
• Successor function: succ = λn.λs.λz.s (n s z)

Operators

n = λs.λz.s (· · · (s z) · · ·)     (s applied n times)

• plus = λm.λn.λs.λz.m s (n s z)
• times = λm.λn.m (plus n) 0
• power = λm.λn.n (times m) 1
• iszero = λm.m (λx.fls) tru
• pred is quite a bit more difficult than addition:

zz = pair 0 0;
ss = λp.pair (snd p) (plus 1 (snd p));
prd = λm.fst (m ss zz);
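The numeral operators, including the pair-based predecessor, can be checked by transcribing them into Python lambdas. This is a sketch for experimentation only; the helpers `church` and `to_int`, which convert between Python integers and Church numerals, are additions of this example.

```python
# Church booleans and pairs, needed by prd and iszero.
tru = lambda t: lambda f: t
fls = lambda t: lambda f: f
pair = lambda f: lambda s: lambda b: b(f)(s)
fst = lambda p: p(tru)
snd = lambda p: p(fls)

# Church numerals and arithmetic.
zero = lambda s: lambda z: z
succ = lambda n: lambda s: lambda z: s(n(s)(z))
one = succ(zero)
plus = lambda m: lambda n: lambda s: lambda z: m(s)(n(s)(z))
times = lambda m: lambda n: m(plus(n))(zero)
power = lambda m: lambda n: n(times(m))(one)
iszero = lambda m: m(lambda x: fls)(tru)

# Predecessor: iterate (a, b) -> (b, b + 1) starting from (0, 0), take the first.
zz = pair(zero)(zero)
ss = lambda p: pair(snd(p))(plus(one)(snd(p)))
prd = lambda m: fst(m(ss)(zz))

def church(k):
    """Build the Church numeral for a non-negative Python int."""
    n = zero
    for _ in range(k):
        n = succ(n)
    return n

def to_int(n):
    """Decode a Church numeral by counting applications of s."""
    return n(lambda k: k + 1)(0)
```

With these definitions `to_int(prd(church(3)))` gives 2, and `prd` applied to `zero` stays at zero.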

Tuples

Define 〈M1, ..., Mn〉 = λz.z M1 ... Mn and the i-th projection π^n_i = λp.p (λx1...xn.xi). Then

π^n_i 〈M1, ..., Mn〉 ↠β Mi

for all 1 ≤ i ≤ n.

Lists

Define nil = λxy.y and H :: T = λxy.x H T. Then a function that adds up a list of numbers can be written as:

addlist l = l (λht.add h (addlist t)) (0)

Trees

A binary tree can be either a leaf, labeled by a natural number, or a node with two subtrees. Write leaf(n) for a leaf labeled n, and node(L, R) for a node with left subtree L and right subtree R.

leaf(n) = λxy.x n
node(L, R) = λxy.y L R

A program that adds all the numbers at the leaves of a tree:

addtree t = t (λn.n) (λlr.add (addtree l) (addtree r))

Recursion

Can all terms be evaluated to a normal form? In the λ-calculus, no. Here is a divergent term:

Ω = (λx.x x) (λx.x x)

Fixed-point combinator, or Y-combinator.

Call-by-name version: Y = λf.(λx.f (x x)) (λx.f (x x))
Call-by-value version: Y = λf.(λx.f (λy.x x y)) (λx.f (λy.x x y))

How to define a recursive function?

g = λfact.λx.(if x = 0 then 1 else x ∗ (fact (x − 1)))
factorial = Y g
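Python evaluates arguments before calling a function, so it behaves like call by value, and the call-by-value version of the combinator is the one to transcribe. The sketch below makes two assumptions of its own: the combinator is named `Z` (a common name for the call-by-value variant), and Python's built-in integers and conditional expression stand in for the arithmetic and if-then-else used in g.

```python
# Call-by-value fixed-point combinator: λf.(λx.f (λy.x x y)) (λx.f (λy.x x y))
Z = lambda f: (lambda x: f(lambda y: x(x)(y)))(lambda x: f(lambda y: x(x)(y)))

# g = λfact.λx.(if x = 0 then 1 else x * (fact (x - 1)))
g = lambda fact: lambda x: 1 if x == 0 else x * fact(x - 1)

factorial = Z(g)
```

The η-expansion λy.x x y is what delays the self-application x x until an argument arrives. Transcribing the call-by-name version instead would keep unfolding x x before g is ever called and would not terminate under Python's evaluation order.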

Quiz. Give the reduction of factorial 3.

Nameless Representation of Terms

Overview

• Conventions help us to discuss basic concepts.
• In implementations, we need to choose a single representation.

Candidates:
1 Renaming bound variables to “fresh” names;
2 Devising some “canonical” representation of variables and terms that does not require renaming;
3 Explicit substitutions [Abadi, Cardelli, Curien, and Lévy, 1991];
4 Combinatory logic [Curry and Feys, 1958; Barendregt, 1984].

We will use the formulation based on a well-known technique due to Nicolaas de Bruijn.

De Bruijn Terms

We can represent terms more straightforwardly by making variable occurrences point directly to their binders, rather than referring to them by name.

• λx.x to λ.0
• λx.λy.x (y x) to λ.λ.1 (0 1)

Terms

Let T be the smallest family of sets {T0, T1, T2, · · ·} such that

• k ∈ Tn whenever 0 ≤ k < n;
• if t1 ∈ Tn and n > 0, then λ.t1 ∈ Tn−1;
• if t1 ∈ Tn and t2 ∈ Tn, then (t1 t2) ∈ Tn.

The elements of Tn are terms with at most n free variables.

Naming Context

• Suppose we want to represent λx.y x as a nameless term. What’s the binder for y?
• To deal with terms containing free variables, we need the idea of a naming context.

Example. Given a naming context

Γ = x ↦ 4, y ↦ 3, z ↦ 2, a ↦ 1, b ↦ 0

• x (y z) is encoded as 4 (3 2);
• λw.y w is encoded as λ.4 0;
• λw.λa.x is encoded as λ.λ.6.

Definition. A naming context Γ = xn, xn−1, · · · , x1, x0 assigns to each xi the de Bruijn index i.
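Converting a named term to its nameless form amounts to carrying the naming context along and pushing each binder onto its front. The function below is a sketch under assumptions of this example: the name `remove_names`, the tuple encodings, and the representation of Γ as a Python list whose position 0 holds the name with index 0.

```python
# Named terms:    ("var", x) | ("lam", x, body) | ("app", t1, t2)
# Nameless terms: ("idx", k) | ("lam", body)    | ("app", t1, t2)

def remove_names(ctx, t):
    """Translate a named term to de Bruijn form.

    ctx lists names by index: ctx[0] has index 0, ctx[1] has index 1, ...
    """
    tag = t[0]
    if tag == "var":
        # list.index finds the first (innermost) binding, so shadowing works.
        return ("idx", ctx.index(t[1]))
    if tag == "lam":
        # Entering a binder shifts every existing index up by one.
        return ("lam", remove_names([t[1]] + ctx, t[2]))
    return ("app", remove_names(ctx, t[1]), remove_names(ctx, t[2]))

# Γ = x ↦ 4, y ↦ 3, z ↦ 2, a ↦ 1, b ↦ 0
gamma = ["b", "a", "z", "y", "x"]
```

With this `gamma`, the three encodings from the example come out as 4 (3 2), λ.4 0 and λ.λ.6. A variable missing from the context makes `list.index` raise `ValueError`.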

Shifting and Substitution

Example.

(λb.b (λa.b a)) (λb.b a)
−→ [b ↦ (λb.b a)](b (λa.b a))
= (λb.b a) (λc.(λb.b a) c)

Nameless representation under Γ = a:

(λ.0 (λ.1 0)) (λ.0 1) −→ [0 ↦ (λ.0 1)](0 (λ.1 0)) = (λ.0 1) (λ.(λ.0 2) 0)

In the substitution [j ↦ s]t where t is an abstraction λ.t′,
• j needs to be increased in t′;
• the free variables in s also need to be increased in the substitution applied to t′.

Shifting and Substitution

Definition. The d-place shift of a term t above cutoff c, written ↑^d_c(t), is defined as

↑^d_c(k) = k                        if k < c
↑^d_c(k) = k + d                    if k ≥ c
↑^d_c(λ.t1) = λ.↑^d_{c+1}(t1)
↑^d_c(t1 t2) = ↑^d_c(t1) ↑^d_c(t2)

We write ↑^d(t) for ↑^d_0(t).

Definition. The substitution of a term s for variable number j in a term t, written [j ↦ s]t, is defined as follows:

[j ↦ s]k = s                        if k = j
[j ↦ s]k = k                        otherwise
[j ↦ s](λ.t1) = λ.[j+1 ↦ ↑^1(s)]t1
[j ↦ s](t1 t2) = ([j ↦ s]t1) ([j ↦ s]t2)

Evaluation

(λ.t1) t2 −→ ↑^−1([0 ↦ ↑^1(t2)]t1)

Example.

(λb.w (λa.b a)) (λb.b a)
−→ [b ↦ (λb.b a)](w (λa.b a))
= w (λc.(λb.b a) c)

Nameless representation under Γ = wa:

(λ.2 (λ.1 0)) (λ.0 1)
−→ ↑^−1([0 ↦ ↑^1(λ.0 1)](2 (λ.1 0)))
= ↑^−1([0 ↦ (λ.0 2)](2 (λ.1 0)))
= ↑^−1(2 ([0 ↦ (λ.0 2)](λ.1 0)))
= ↑^−1(2 (λ.[1 ↦ ↑^1(λ.0 2)](1 0)))
= ↑^−1(2 (λ.[1 ↦ (λ.0 3)](1 0)))
= ↑^−1(2 (λ.(λ.0 3) 0))
= 1 (λ.(λ.0 2) 0)
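The shift, substitution and evaluation definitions can be transcribed almost line by line, which makes it possible to replay the worked example mechanically. This is a sketch only; the tuple encoding and the function names `shift`, `subst` and `beta` are assumptions of this example.

```python
# Nameless terms (encoding assumed here): ("idx", k) | ("lam", body) | ("app", t1, t2)

def shift(d, t, c=0):
    """The d-place shift of t above cutoff c."""
    tag = t[0]
    if tag == "idx":
        return t if t[1] < c else ("idx", t[1] + d)
    if tag == "lam":
        return ("lam", shift(d, t[1], c + 1))
    return ("app", shift(d, t[1], c), shift(d, t[2], c))

def subst(j, s, t):
    """[j ↦ s]t on nameless terms."""
    tag = t[0]
    if tag == "idx":
        return s if t[1] == j else t
    if tag == "lam":
        # Under a binder the target index and the free indices of s both go up.
        return ("lam", subst(j + 1, shift(1, s), t[1]))
    return ("app", subst(j, s, t[1]), subst(j, s, t[2]))

def beta(redex):
    """Reduce (λ.t1) t2 to shift(-1, [0 ↦ shift(1, t2)] t1)."""
    _, fun, t2 = redex
    t1 = fun[1]
    return shift(-1, subst(0, shift(1, t2), t1))

def I(k):
    return ("idx", k)

def lam(b):
    return ("lam", b)

def app(a, b):
    return ("app", a, b)

# (λ.2 (λ.1 0)) (λ.0 1), i.e. (λb.w (λa.b a)) (λb.b a) under Γ = wa
redex = app(lam(app(I(2), lam(app(I(1), I(0))))), lam(app(I(0), I(1))))
```

Calling `beta(redex)` yields the encoding of 1 (λ.(λ.0 2) 0), matching the last line of the derivation.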

Quiz

Quiz. Given Γ = wa, show the de Bruijn notation of (λx.λa.a x w) (λx.x) and evaluate it.

More on λ-Calculus

η-reduction

λx.M x →η M, where x ∉ FV(M).

Define the single-step βη-reduction →βη = →β ∪ →η and the multi-step βη-reduction ↠βη.

Church-Rosser Theorem

Theorem (Church and Rosser, 1936). Let ↠ denote either ↠β or ↠βη. Suppose M, N and P are lambda terms such that M ↠ N and M ↠ P. Then there exists a lambda term Z such that N ↠ Z and P ↠ Z.

This is the Church-Rosser property or confluence.

Summary

• The λ-calculus is one of the most important models in computation theory.
• The call-by-value strategy is used by most programming languages. The call-by-name strategy is also called the lazy strategy. The λ-calculus with either strategy is Turing complete.
• De Bruijn notation is a great way to deal with bound names, and is especially useful in implementations.

Typed Arithmetic Expressions

By Anonymous

Q: Why bother doing proofs about programming languages? They are almost always boring if the definitions are right.

A: The definitions are almost always wrong.

Arithmetic expressions

t ::=                        terms
      true                   constant true
      false                  constant false
      if t then t else t     conditional
      0                      constant zero
      succ t                 successor
      pred t                 predecessor
      iszero t               zero test

v ::=                        values
      true                   true value
      false                  false value
      nv                     numeric value

nv ::=                       numeric values
      0                      zero value
      succ nv                successor value

Types

• Recall that evaluating a term can either result in a value or else get stuck at some stage, by reaching a term like pred false.
• In fact, we can identify stuck terms without actually evaluating them.
• Coming soon: if a term is well typed, i.e., it has some type T, then it never gets stuck (never goes wrong).

Typing Relation

The typing relation for arithmetic expressions, written “t : T”, is defined by a set of inference rules assigning types to terms.

T ::=            types
      Bool       type of booleans
      Nat        type of natural numbers

Typing rules:

(T-True)      true : Bool

(T-False)     false : Bool

(T-Zero)      0 : Nat

                   t1 : Nat
(T-Succ)      -----------------
                succ t1 : Nat

                   t1 : Nat
(T-Pred)      -----------------
                pred t1 : Nat

               t1 : Bool    t2 : T    t3 : T
(T-If)        --------------------------------
                if t1 then t2 else t3 : T

                   t1 : Nat
(T-IsZero)    -------------------
               iszero t1 : Bool
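Because exactly one rule applies to each syntactic form, the typing rules can be read bottom-up as a recursive type checker. The sketch below assumes a tuple encoding of terms and a function name `typeof` that are not part of the slides; it signals an ill-typed term by raising `TypeError`.

```python
# Terms (encoding assumed here):
#   ("true",) ("false",) ("zero",)
#   ("succ", t) ("pred", t) ("iszero", t) ("if", t1, t2, t3)

def typeof(t):
    """Return "Bool" or "Nat" for a well-typed term; raise TypeError otherwise."""
    tag = t[0]
    if tag in ("true", "false"):                  # T-True, T-False
        return "Bool"
    if tag == "zero":                             # T-Zero
        return "Nat"
    if tag in ("succ", "pred"):                   # T-Succ, T-Pred
        if typeof(t[1]) != "Nat":
            raise TypeError(f"argument of {tag} is not a Nat")
        return "Nat"
    if tag == "iszero":                           # T-IsZero
        if typeof(t[1]) != "Nat":
            raise TypeError("argument of iszero is not a Nat")
        return "Bool"
    if tag == "if":                               # T-If
        if typeof(t[1]) != "Bool":
            raise TypeError("guard of conditional is not a Bool")
        ty2, ty3 = typeof(t[2]), typeof(t[3])
        if ty2 != ty3:
            raise TypeError("branches of conditional have different types")
        return ty2
    raise ValueError(f"unknown term: {t!r}")

# if iszero 0 then 0 else succ 0
ok = ("if", ("iszero", ("zero",)), ("zero",), ("succ", ("zero",)))
# pred false
stuck = ("pred", ("false",))
```

Here `typeof(ok)` is `"Nat"`, while `typeof(stuck)` raises `TypeError`, rejecting the stuck term pred false without evaluating it.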

Uniqueness of Types

When reasoning about the typing relation, we will often invert the typing relation.

Lemma
1 If true : R or false : R, then R = Bool.
2 If if t1 then t2 else t3 : R, then t1 : Bool, t2 : R, and t3 : R.
3 If 0 : R, then R = Nat. If succ t1 : R or pred t1 : R, then R = Nat and t1 : Nat.
4 If iszero t1 : R, then R = Bool and t1 : Nat.

Theorem [Uniqueness of Types]: Each term t has at most one type.

This theorem does not hold for languages with subtyping rules.

Most Basic Property of Type System

Safety = Progress + Preservation

• Progress: A well-typed term is not stuck (either it is a value or it can take a step according to the evaluation rules).
• Preservation: If a well-typed term takes a step of evaluation, then the resulting term is also well typed.

Progress

Lemma [Canonical forms]:
• If v is a value of type Bool, then v is either true or false.
• If v is a value of type Nat, then v is a numeric value according to the grammar.

Theorem [Progress]: Suppose t is a well-typed term (that is, t : T for some T). Then either t is a value or else there is some t′ with t −→ t′.

Proof. By induction on a derivation of t : T.

Preservation

Theorem [Preservation]: If t : T and t −→ t′, then t′ : T.

Proof. Either by induction on a derivation of t : T, or by induction on a derivation of t −→ t′.

Simply Typed λ-Calculus

Arrow Type

For a function,

1 we care about the types of both arguments and results:

arrow type T → T

Note the difference between T → T → T and (T → T) → T.

2 the type of an abstraction depends on the type of its argument, e.g.

λx.x : Bool → Bool or λx.x : Nat → Nat

3 Hence, the typing relation on abstractions should be written as

λx : T1.t2 : T1 → T2

But how do we derive T2? We assume x : T1! So we need an environment (context) for our typing relation:

Γ ⊢ t : T

Pure Simply Typed λ-Calculus (λ→)

Terms     t ::= x | λx : T.t | t t
Values    v ::= λx : T.t
Types     T ::= T → T
Contexts  Γ ::= ∅ | Γ, x : T

Typing

               x : T ∈ Γ
(T-Var)      -------------
               Γ ⊢ x : T

               Γ, x : T1 ⊢ t2 : T2
(T-Abs)      ---------------------------
              Γ ⊢ λx : T1.t2 : T1 → T2

              Γ ⊢ t1 : T1 → T2    Γ ⊢ t2 : T1
(T-App)      ----------------------------------
                      Γ ⊢ t1 t2 : T2

Quiz:
1. Please draw the type derivation tree of the term (λx : Bool → Nat.x true) (λx : Bool.if x then 0 else (succ 0)).
2. What about the term λx : Bool.x x?
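Read bottom-up, T-Var, T-Abs and T-App give a type checker in which the context Γ becomes a dictionary from variable names to types. The sketch below rests on assumptions of this example: the tuple encodings, the name `typeof`, and the addition of a base type Bool with constants true and false, since the pure grammar T ::= T → T has no base type to start from.

```python
# Types (encoding assumed here): "Bool" | ("->", T1, T2)
# Terms: ("true",) | ("false",) | ("var", x) | ("lam", x, T1, body) | ("app", t1, t2)

def typeof(ctx, t):
    """Return the type of t under context ctx; raise TypeError if ill-typed."""
    tag = t[0]
    if tag in ("true", "false"):          # base constants (added for this sketch)
        return "Bool"
    if tag == "var":                      # T-Var
        if t[1] not in ctx:
            raise TypeError(f"unbound variable {t[1]}")
        return ctx[t[1]]
    if tag == "lam":                      # T-Abs
        _, x, ty1, body = t
        ty2 = typeof({**ctx, x: ty1}, body)
        return ("->", ty1, ty2)
    if tag == "app":                      # T-App
        ty_fun = typeof(ctx, t[1])
        ty_arg = typeof(ctx, t[2])
        if isinstance(ty_fun, tuple) and ty_fun[0] == "->" and ty_fun[1] == ty_arg:
            return ty_fun[2]
        raise TypeError("function and argument types do not match")
    raise ValueError(f"unknown term: {t!r}")

bool_to_bool = ("->", "Bool", "Bool")
ident = ("lam", "x", "Bool", ("var", "x"))                                # λx:Bool.x
apply_true = ("lam", "f", bool_to_bool, ("app", ("var", "f"), ("true",)))  # λf:Bool→Bool.f true
```

For example, `typeof({}, ident)` is Bool → Bool and `typeof({}, ("app", apply_true, ident))` is Bool. Extending the dictionary in the T-Abs case is how the assumption x : T1 enters the context.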

Properties of Typing

Lemma [Inversion of the Typing Relation]:
1 If Γ ⊢ x : R, then x : R ∈ Γ.
2 If Γ ⊢ λx : T1.t2 : R, then R = T1 → R2 for some R2 with Γ, x : T1 ⊢ t2 : R2.
3 If Γ ⊢ t1 t2 : R, then there is some type T1 such that Γ ⊢ t1 : T1 → R and Γ ⊢ t2 : T1.

Theorem [Uniqueness of types]: In a given typing context Γ, a term t (with free variables all in the domain of Γ) has at most one type.

Progress

Lemma [Canonical forms]:
• If v is a value of type Bool, then v is either true or false.
• If v is a value of type T1 → T2, then v = λx : T1.t2.

Theorem [Progress]: Suppose t is a closed, well-typed term (that is, ∅ ⊢ t : T for some T). Then either t is a value or else there is some t′ with t −→ t′.

Preservation

Theorem [Preservation]: If Γ ⊢ t : T and t −→ t′, then Γ ⊢ t′ : T.

Quiz. Try to prove the above theorem and figure out what lemmas we need.

Lemma [Permutation]: If Γ ⊢ t : T and ∆ is a permutation of Γ, then ∆ ⊢ t : T. Moreover, the latter derivation has the same depth as the former.

Theorem [Weakening]: If Γ ⊢ t : T and x ∉ dom(Γ), then Γ, x : S ⊢ t : T. Moreover, the latter derivation has the same depth as the former.

Theorem [Preservation of types under substitution]: If Γ, x : S ⊢ t : T and Γ ⊢ s : S, then Γ ⊢ [x ↦ s]t : T.

Erasure and Typability

Type annotations are used during type checking, and are erased before evaluation.

Definition [Erasure]: The erasure of a simply typed term t is defined as follows:

erase(x) = x

erase(λx : T1.t2) = λx.erase(t2)

erase(t1 t2) = erase(t1) erase(t2)
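The erasure function is a plain structural recursion that drops the annotation on each binder. A minimal sketch, assuming tuple encodings chosen for this example (typed abstractions carry one extra field for the annotation):

```python
# Typed terms:   ("var", x) | ("lam", x, T1, body) | ("app", t1, t2)
# Untyped terms: ("var", x) | ("lam", x, body)     | ("app", t1, t2)

def erase(t):
    """Strip type annotations from a simply typed term."""
    tag = t[0]
    if tag == "var":                    # erase(x) = x
        return t
    if tag == "lam":                    # erase(λx:T1.t2) = λx.erase(t2)
        return ("lam", t[1], erase(t[3]))
    return ("app", erase(t[1]), erase(t[2]))   # erase(t1 t2) = erase(t1) erase(t2)

typed_id = ("lam", "x", "Bool", ("var", "x"))
nat_id = ("lam", "x", "Nat", ("var", "x"))
```

Both `typed_id` and `nat_id` erase to the same untyped term λx.x, which illustrates that one untyped term can be the erasure of several typed ones.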

Definition [Typability]: A term m in the untyped λ-calculus is said to be typable in λ→ if there are some simply typed term t, type T, and context Γ such that erase(t) = m and Γ ⊢ t : T.

Summary

• The typing system Γ ⊢ t : T can rule out some terms before they run into stuck states.
• However, it also rules out some well-behaved terms.
• Type safety = progress + preservation.
• For proving safety, some other properties such as canonical forms and uniqueness of types are needed.
• The simply typed λ-calculus is not Turing-complete.

Report

Rep10. State system F, a type system. (Maximum 3 students)
