
Continuation calculus

Masters thesis

Bram Geron

Supervised by Herman Geuvers

Assessment committee: Herman Geuvers, Hans Zantema, Alexander Serebrenik

Final version


Contents

1 Introduction
  1.1 Foreword
  1.2 The virtues of continuation calculus
    1.2.1 Modeling programs
    1.2.2 Ease of implementation
    1.2.3 Simplicity
  1.3 Acknowledgements

2 The calculus
  2.1 Introduction
  2.2 Definition of continuation calculus
  2.3 Categorization of terms
  2.4 Reasoning with CC terms
    2.4.1 Fresh names
    2.4.2 Term equivalence
    2.4.3 Program substitution and union
  2.5 Data
    2.5.1 Call-by-name and call-by-value functions
  2.6 Example: list multiplication
    2.6.1 Correctness proofs

3 Relation to programming languages
  3.1 ML+ syntax
  3.2 Example programs in ML+
  3.3 Using data types in ML+
  3.4 Reduction semantics of ML+
  3.5 Translation
    3.5.1 Inert terms
    3.5.2 Translation to CC

4 Relation to lambda calculus
  4.1 Embedding lambda calculus in continuation calculus
    4.1.1 The subset
    4.1.2 CPS transformation
    4.1.3 Supercombinator transformation
    4.1.4 Defunctionalization
  4.2 Embedding continuation calculus in lambda calculus
    4.2.1 Functionalization
    4.2.2 Cycle elimination

5 Related work

6 Conclusion and future work

A Proofs
  A.1 General
  A.2 Program substitution and union
  A.3 Term equivalence

Chapter 1

Introduction

1.1 Foreword

This thesis is about continuation calculus, or CC for short, a novel way of formally modeling programs. The author initially developed this calculus as a simple and uniform compilation target for programs, one that could subsequently be executed reasonably efficiently, so that functional languages could be readily built on it.

Similar goals are fulfilled by the abstract lambda calculus [3], or by the more practical spineless tagless G-machine [24] (STG for short), which is used by the popular Glasgow Haskell Compiler. Continuation calculus is an attempt to attain the simplicity of lambda calculus in a calculus that is straightforward to implement and natively supports continuations. The calculus was originally designed for a toy call-by-value language, but further examination revealed that call-by-name languages are also modeled by CC.

The author's research has focused on the definition of CC, how it relates to programming languages, and reasoning with CC programs. In this thesis, we try to sketch a complete picture of why CC is useful and how it can be used. Although the broad scope makes it impossible to give in-depth proofs of all intended properties, we do give formal proofs for some specific properties. The research has produced a forthcoming paper in collaboration with the author's supervisor Herman Geuvers [13]. The paper makes up Chapter 2 and Appendix A of this thesis, with only minor changes.

Research is still ongoing on the subject of ML+, an exploratory programming language to formalize the interplay of call-by-name and call-by-value. Although the author wants to research this topic further in time, and the current text on it is rough, he thinks that its semantics as given are already meaningful, and that the translation to CC is correct. Thus, ML+ backs the idea that CC supports modeling mixed call-by-name and call-by-value code, and makes concrete how this can be done in practice.

This introduction will continue by describing three particular qualities of continuation calculus. Firstly, we explain what functional programs are, the significance of call-by-value and call-by-name, and the added value of continuations. The latter feature is modeled by CC, but not by $\lambda$ or STG. Even though continuations are modeled by extensions of lambda calculus, this loses some of its simplicity. Secondly, we explain why continuation calculus is more straightforward to implement than lambda calculus. If we are looking for a code representation with the hope of eventually executing it, it is important that we do not force idiosyncrasies in the representation that cause otherwise-unneeded complexity over the whole chain. Finally, we argue that continuation calculus is a much simpler representation than STG.

Our claim is not that continuation calculus is the best in all three qualities: powerful, close-to-the-metal, and simple. The individual virtues are perhaps much better addressed by satisfiability modulo theories [19], assembly language, and a one instruction set computer [21], respectively. Instead, we claim that continuation calculus addresses all three qualities quite well: a sweet spot.

These virtues are expected to make continuation calculus attractive for numerous people who work with languages. Programming language designers should find CC a handy tool to express when computations are done, how data is grouped, and how data flows between control points. Programming language implementors should find it straightforward to make a simple implementation of continuation calculus, and hopefully already a fast one; furthermore, the structure that CC offers in the form of names with a fixed arity should help implementors to optimize implementations. Finally, programmers interested in optimizing their code, with a proof that the improved code has the same functionality, can be helped by equivalences and theorems on CC once the appropriate mapping between CC and programming languages has been deepened.

After this introductory section, we will briefly introduce the types of programming languages that we model with CC, and will explain the properties that we aim to find. We continue in Chapter 2 with an explanation of the calculus, a formal definition, and some mathematical tools to work with CC. We also explain a concrete program written in CC by relating it to a version in a more conventional programming language, and we prove its correctness. In Chapter 3, we show how programs can systematically be encoded in continuation calculus. For this purpose, we introduce a toy programming language called ML+, which supports all three of call-by-value, call-by-name, and continuations. We show how programs in ML+ can be encoded in continuation calculus. Finally, we explore the connection between continuation calculus and lambda calculus in Chapter 4.

The author wants to remark that there is an online and offline evaluator available through http://bgeron.nl/cc. It has helped the author to correct bugs in hand-coded CC programs. Hopefully, the evaluator may also aid the intuition of the reader. Some demo programs are included, and all programs in this thesis should be testable.

1.2 The virtues of continuation calculus

1.2.1 Modeling programs

Continuation calculus models programs, specifically functional programs. By functional, we mean that entities passed around from one code block to another do not change.

The dominant programming style in functional programming is to invoke a function, which will return a result. This is in contrast to what is sometimes called imperative programming: a style in which shared memory locations are written to. Components in such programs often communicate by modifying shared memory. One might say that imperative programs have side effects, and functional programs don't.

Return-based programming, often also known as functional programming (FP), enables programming techniques that aid modularity [15]. This is a necessary aspect of the long-term quality of software. Another important characteristic of FP is that it is easier to restructure software written by other people, because there can be no hidden interfaces. In effect, each subprogram becomes analyzable on its own.

Because functional languages are so structured, it is feasible to analyze them mathematically. Such analysis can provide certainty that particular changes in the program do not introduce faulty behavior. Furthermore, it can give programmers a comprehensive mental model of how their subprograms can be used. Finally, functional language developers may choose to evolve the language in a manner that retains programmers' mental models of the language, aided by this analysis. In effect, such analysis yields orthogonal languages.

Call-by-what? Functional languages can broadly be divided into two styles: call-by-need (or lazy) and call-by-value. These styles are distinguished most easily by considering how the computer evaluates the program. In call-by-value, evaluation order always follows the structure of program code, descending into the functions that it calls. In call-by-need, the program continuously generates terms that depend on one another, which are only elaborated when the computer finds it essential. So when the computer is computing the sum of two numbers, it finds where either natural came from, and executes that code before the addition. The origin of the arguments may be much further away than the calling site.

This difference has broad implications. For one, call-by-need allows one to work with infinite data structures, if only a finite part of that data is ever required for program execution. Speed is also affected in practice, for two reasons. On the one hand, call-by-need does not compute subresults that are not used. On the other hand, call-by-value has a predictable control flow, which helps the CPU to efficiently execute the machine code [20].

Continuation calculus models both call-by-value and call-by-name. The latter is functionally equivalent to call-by-need (infinite data structures are supported), but there are no facilities to remember results of previously done computation when they are reusable.
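The difference between call-by-name and call-by-need can be illustrated outside CC. The following Python sketch is not part of the calculus; the helper names `expensive` and `double` are made up for this illustration. It passes an argument as a thunk and counts how often the argument is computed, with and without remembering the result.

```python
from functools import lru_cache

calls = 0

def expensive():
    """Stand-in for an argument expression; counts how often it is evaluated."""
    global calls
    calls += 1
    return 21

def double(thunk):
    # The argument is used twice, so the thunk is forced twice.
    return thunk() + thunk()

# Call-by-name: the argument expression is re-evaluated at every use.
calls = 0
by_name = double(expensive)
calls_by_name = calls

# Call-by-need: the first result is remembered and reused.
calls = 0
by_need = double(lru_cache(maxsize=None)(expensive))
calls_by_need = calls

print(by_name, calls_by_name)  # 42 2
print(by_need, calls_by_need)  # 42 1
```

Both strategies produce the same value; only the amount of repeated work differs, which is the sense in which the two are functionally equivalent.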

Continuations. Another useful feature, included in some call-by-value languages and modeled by us, is continuations. When a program takes a continuation, it reifies what is currently planned to be the rest of the program's execution. This powerful construct allows the program to go back to decisions made earlier and change them, or it allows two algorithms to run in an interleaved fashion, perhaps to use the result of the algorithm that finishes first [14]. At their core, continuations liberate practical control flow from the syntax tree, so that the program may follow a structure that is most natural for humans.

Besides in call-by-value languages, continuations have recently been described for a call-by-name language [23]. However, the three features seem not to have been modeled together so far.

Continuations may be used to model a form of exceptions [17]. In this thesis, we use only undelimited continuations.

1.2.2 Ease of implementation

Continuation calculus is more straightforward to implement than lambda calculus, for three reasons. The three reasons contribute to the view that continuation calculus is closer to the machine than lambda calculus.

Firstly, continuation calculus separates code and data, a practice that is commonly called defunctionalization [7]. A continuation calculus program is always evaluated by looking up the head of the current term, and executing the corresponding rule. This rule may be precompiled, as the set of rules is known at compile time. Such precompilation is also possible in lambda calculus, but is frequently preceded by a defunctionalization [1].

Besides separating code and data, continuation calculus eliminates the need for contexts. In lambda calculus, it does not suffice to evaluate on the top level: an expression of the form $(\lambda x.M)\,t_1 \cdots t_k$ can only $\beta$-reduce $(\lambda x.M)\,t_1$ or a term in $M$. Finding the correct evaluation context is unnecessary in continuation calculus.

Lastly, there is only a single reduction order in continuation calculus. The semantics of a higher-level language are specified in the translation to CC, so that the intermediate CC program is free of evaluation ambiguities, and can be mixed with the CC translation of programs in higher-level languages with a different evaluation order. Terms in lambda calculus have an implicitly intended (but hidden) evaluation order. Although the CPS translation [25] allows lambda terms to be simulated in other evaluation orders, the existence of the translation disproves the universality of lambda calculus for the purpose of expressing program meaning.

1.2.3 Simplicity

Lambda and continuation calculus have very few language-specific features. For instance, neither acknowledges the existence of data constructors and case distinction, both present in the spineless tagless G-machine, which was mentioned early in the introduction. Such programming language features can be encoded in the existing facilities: lambda abstraction and application, and CC rules and applications.

This is in stark contrast with the spineless tagless G-machine, which gives special treatment to data constructors, case distinction, fixed-precision integers and operations on them. Such features are not necessary for Turing completeness, and could theoretically be removed. All features of lambda and continuation calculus are crucial to their Turing completeness: the abstraction, application, variable, and $\beta$-reduction of $\lambda$, and the rules, polyadic names, dot, and reduction of CC.

1.3 Acknowledgements

The author would like to thank Herman Geuvers for his supervision: without his faith in the project, continuation calculus would have remained mere loose ideas in the author's head. The author would also like to thank the reviewers of the forthcoming paper [13] for their helpful comments, together with Alexander Serebrenik and Hans Zantema, and other friends who have given useful feedback.

Chapter 2

The calculus

2.1 Introduction

Continuation calculus looks a bit like term rewriting [27] and a bit like $\lambda$-calculus, and it has ideas from both. A term in CC is of the shape

$$n.t_1.\cdots.t_k,$$

where $n$ is a name and the $t_i$ are themselves terms. The dot is a binary operator that associates to the left. Note that terms do not contain variables. A program $P$ is a list of program rules of the form

$$n.x_1.\cdots.x_k \longrightarrow u$$

where the $x_i$ are all different variables and $u$ is a term over variables $x_1 \ldots x_k$. This program rule is said to define $n$, and we make sure that in a program $P$ there is at most one definition of $n$. Here, CC already deviates from term rewriting, where one would have, for example:

$$\begin{aligned}
\mathit{Add}(0, m) &\longrightarrow m\\
\mathit{Add}(S(n), m) &\longrightarrow S(\mathit{Add}(n, m))
\end{aligned}$$

These syntactic case distinctions, or pattern matchings, are not possible in CC.

The meaning of the program rule $n.x_1.\cdots.x_k \longrightarrow u$ is that a term $n.t_1.\cdots.t_k$ evaluates to $u[\vec{x} := \vec{t}]$: the variables $\vec{x}$ in $u$ are replaced by the respective terms $\vec{t}$. A peculiarity of CC is that one cannot evaluate deep in a term: we do not evaluate inside any of the $t_i$, and if we have a term $n.t_1.\cdots.t_m$ where $m > k$, this term does not evaluate. (This will even turn out to be a meaningless term.)

We exemplify how CC works by explaining the natural numbers: how they are represented in CC and how one can program addition on them. A natural number is either $0$, or $S(m)$ for $m$ a natural number. We shall have a name $\mathit{Zero}$ and a name $S$. The number $m$ will be represented by $S.(\cdots.(S.\mathit{Zero})\cdots)$, with $m$ times $S$. So the numbers 0 and 3 are represented by the terms $\mathit{Zero}$ and $S.(S.(S.\mathit{Zero}))$.

The only way to extract information from a natural $m$ is to transfer control to that natural. Execution should continue in one piece of code when $m = 0$, and in a different piece of code when $m \geq 1$. This becomes possible by postulating the following rules for $\mathit{Zero}$ and $S$:

$$\begin{aligned}
\mathit{Zero}.z.s &\longrightarrow z\\
S.x.z.s &\longrightarrow s.x
\end{aligned}$$

The programmer is now to construct a term $z$ and a term $s$. We observe that we can separate the cases $t = \mathit{Zero}$ and $t = S.x$ by reducing $t.z.s \twoheadrightarrow z$ or $t.z.s \twoheadrightarrow s.x$.
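The case distinction above can be tried out with a few lines of Python. This is a minimal sketch under assumptions of our own, not the evaluator mentioned in the foreword: a name is a Python string, a dot application `a.b` is the pair `(a, b)`, variables are lower-case strings, and `IfZero` and `IfSucc` are hypothetical undefined names standing in for the terms $z$ and $s$.

```python
def dot(*ts):
    """Left-associative dot: dot(a, b, c) is (a.b).c."""
    t = ts[0]
    for u in ts[1:]:
        t = (t, u)
    return t

def spine(t):
    """Split n.t1.....tk into (n, [t1, ..., tk])."""
    args = []
    while isinstance(t, tuple):
        t, arg = t
        args.append(arg)
    return t, args[::-1]

def subst(rhs, env):
    """Replace the variables of a right-hand side by terms."""
    if isinstance(rhs, tuple):
        return (subst(rhs[0], env), subst(rhs[1], env))
    return env.get(rhs, rhs)

# Rules: name -> (variables of the left-hand side, right-hand side).
RULES = {
    "Zero": (["z", "s"], "z"),            # Zero.z.s -> z
    "S": (["x", "z", "s"], ("s", "x")),   # S.x.z.s  -> s.x
}

def step(t):
    """One evaluation step at the top level, or None if there is none."""
    head, args = spine(t)
    if head not in RULES or len(RULES[head][0]) != len(args):
        return None
    params, rhs = RULES[head]
    return subst(rhs, dict(zip(params, args)))

def numeral(m):
    """The term S.(...(S.Zero)...) with m times S."""
    t = "Zero"
    for _ in range(m):
        t = ("S", t)
    return t

print(step(dot(numeral(0), "IfZero", "IfSucc")))  # IfZero
print(step(dot(numeral(3), "IfZero", "IfSucc")))  # ('IfSucc', ('S', ('S', 'Zero')))
```

In the second case, control passes to `IfSucc` together with the predecessor, which is the only information a successor carries.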


We will now implement call-by-value (CBV) addition in CC on these natural numbers. The idea of CC is that a function application does not just produce an output value, but passes it to the next function, the continuation. So we are looking for a term $\mathit{AddCBV}$ that behaves as follows:

$$\mathit{AddCBV}.\langle m \rangle.\langle p \rangle.r \twoheadrightarrow r.\langle m + p \rangle \qquad (2.1)$$

for all $m, p, r$, where $\twoheadrightarrow$ is the multi-step evaluation, and $\langle l \rangle$ are the terms that represent a natural number $l$. Term $r$ indicates where evaluation should continue after the computation of $\langle m + p \rangle$.

Equation (2.1) is the specification of $\mathit{AddCBV}$. We will use the following algorithm:

$$\begin{aligned}
0 + p &= p\\
S(m) + p &= m + S(p)
\end{aligned}$$

To program $\mathit{AddCBV}$, we have to give a rule of the shape $\mathit{AddCBV}.x.y.r \longrightarrow t$. We need to make a case distinction on the first argument $x$. If $x = \mathit{Zero}$, then the result of the addition is $y$, so we pass control to $r.y$. If $x = S.u$, then control should eventually transfer to $r.(\mathit{AddCBV}.u.(S.y))$. Let us write down a first approximation of $\mathit{AddCBV}$:

$$\mathit{AddCBV}.x.y.r \longrightarrow x.(r.y).t$$

The term $t$ is yet to be determined. Now control transfers to $r.y$ when $x = \mathit{Zero}$, or to $t.u$ when $x = S.u$. From $t.u$, control should eventually transfer to $\mathit{AddCBV}.u.(S.y).r$. Let us write down a naive second approximation of $\mathit{AddCBV}$, in which we introduce a helper name $B$.

$$\begin{aligned}
\mathit{AddCBV}.x.y.r &\longrightarrow x.(r.y).B\\
B.u &\longrightarrow \mathit{AddCBV}.u.(S.y).r
\end{aligned}$$

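Unfortunately, the second line is not a valid rule: $y$ and $r$ are variables in the right-hand side of $B$, but do not occur in its left-hand side. We can fix this by replacing $B$ with $B.y.r$ in both rules.

$$\begin{aligned}
\mathit{AddCBV}.x.y.r &\longrightarrow x.(r.y).(B.y.r)\\
B.y.r.u &\longrightarrow \mathit{AddCBV}.u.(S.y).r
\end{aligned}$$

This is a general procedure for representing data types and functions over data in CC. We can now prove the correctness of $\mathit{AddCBV}$ by showing (simultaneously by induction on $m$) that

$$\begin{aligned}
\mathit{AddCBV}.\langle m \rangle.\langle p \rangle.r &\twoheadrightarrow r.\langle m + p \rangle\\
B.\langle p \rangle.r.\langle m \rangle &\twoheadrightarrow r.\langle m + p + 1 \rangle
\end{aligned}$$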
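The two statements can be checked on concrete numbers before proving them. The following Python sketch is an illustration under our own assumptions (names are strings, a dot application is a pair, and `R` is a hypothetical undefined name playing the role of the continuation $r$); it runs the final rules for `AddCBV` and `B` until no rule applies.

```python
def dot(*ts):
    """Left-associative dot: dot(a, b, c) is (a.b).c."""
    t = ts[0]
    for u in ts[1:]:
        t = (t, u)
    return t

def spine(t):
    """Split n.t1.....tk into (n, [t1, ..., tk])."""
    args = []
    while isinstance(t, tuple):
        t, arg = t
        args.append(arg)
    return t, args[::-1]

def subst(rhs, env):
    if isinstance(rhs, tuple):
        return (subst(rhs[0], env), subst(rhs[1], env))
    return env.get(rhs, rhs)

RULES = {
    "Zero":   (["z", "s"], "z"),
    "S":      (["x", "z", "s"], ("s", "x")),
    # AddCBV.x.y.r -> x.(r.y).(B.y.r)
    "AddCBV": (["x", "y", "r"], dot("x", ("r", "y"), dot("B", "y", "r"))),
    # B.y.r.u -> AddCBV.u.(S.y).r
    "B":      (["y", "r", "u"], dot("AddCBV", "u", ("S", "y"), "r")),
}

def step(t):
    head, args = spine(t)
    if head not in RULES or len(RULES[head][0]) != len(args):
        return None
    params, rhs = RULES[head]
    return subst(rhs, dict(zip(params, args)))

def evaluate(t, limit=10_000):
    """Apply step until the term is final; give up after `limit` steps."""
    for _ in range(limit):
        nxt = step(t)
        if nxt is None:
            return t
        t = nxt
    raise RuntimeError("step limit reached")

def numeral(m):
    t = "Zero"
    for _ in range(m):
        t = ("S", t)
    return t

# AddCBV.<2>.<3>.R reduces to R.<5>
print(evaluate(dot("AddCBV", numeral(2), numeral(3), "R")) == ("R", numeral(5)))  # True
# B.<3>.R.<2> reduces to R.<2 + 3 + 1>
print(evaluate(dot("B", numeral(3), "R", numeral(2))) == ("R", numeral(6)))       # True
```

Running a few instances is of course no substitute for the induction proof; it only guards against typing errors in the rules.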

We formally define and characterize continuation calculus in the following sections. In Section 2.5, we define the meaning of $\langle\!\langle \cdot \rangle\!\rangle$, which allows us to give a specification for call-by-name (CBN) addition, $\mathit{AddCBN}$:

$$\mathit{AddCBN}.\langle\!\langle m \rangle\!\rangle.\langle\!\langle p \rangle\!\rangle \in \langle\!\langle m + p \rangle\!\rangle$$

This statement means that $\mathit{AddCBN}.\langle m \rangle.\langle p \rangle$ is equivalent to and compatible with $S.(\cdots(S.\mathit{Zero})\cdots)$, with $m + p$ times $S$. The precise meaning of this statement will be given in Definition 32 and Remark 33. Note that while $\mathit{AddCBV}.\langle m \rangle.\langle p \rangle.r \twoheadrightarrow r.\langle m + p \rangle$ involved reduction, it involves no reduction to form the term $\mathit{AddCBN}.\langle m \rangle.\langle p \rangle$ compatible with $S.(\cdots(S.\mathit{Zero})\cdots)$: this computation is delayed until the case distinction between $m + p = 0$ and $m + p \geq 1$ is needed, analogously to the call-by-name paradigm.

The terms $\mathit{AddCBV}$ and $\mathit{AddCBN}$ are of a different kind. Nonetheless, we will see in Section 2.5.1 how call-by-value and call-by-name functions can be used together. We show additional examples with $\mathit{FibCBV}$ and $\mathit{FibCBN}$ in Section 2.5.1. Furthermore, we model and prove a program with call/cc in Section 2.6.


2.2 Definition of continuation calculus

Definition 1 (names). There is an infinite set $\mathcal{N}$ of names. Concrete names are typically denoted as upper-case letters ($A$, $B$, ...), or capitalized words ($\mathit{True}$, $\mathit{False}$, $\mathit{And}$, ...); we refer to any name using $n$ and $m$.

Interpretation. Names are used by programs to refer to functionality, and will serve the role of constructors, function names, as well as labels within a function.

Definition 2 (universe). The set of terms $\mathcal{U}$ in continuation calculus is generated by:

$$\mathcal{U} ::= \mathcal{N} \mid \mathcal{U}.\mathcal{U}$$

where $.$ (dot) is a binary constructor. The dot is neither associative nor commutative, and there shall be no overlap between names and dot-applications. We will often use $M$, $N$, $t$, $u$ to refer to terms. If we know that a term is a name, we often use $n$, $m$. We sometimes use lower-case words that describe its function, e.g. $\mathit{abort}$, or letters, e.g. $r$ for a return continuation.

The dot is read left-associative: when we write $A.B.C$, we mean $(A.B).C$.

Interpretation. Terms by themselves do not denote any computation, nor do they have any value of themselves. We inspect value terms by dotting other terms on them, and observing the reduction behavior. If for instance $b$ represents a boolean value, then $b.t.f$ reduces to $t$ if $b$ represents true; $b.t.f$ reduces to $f$ if $b$ represents false.

Definition 3 (head, length). All terms have a head, which is defined inductively:

$$\begin{aligned}
\mathrm{head}(n \in \mathcal{N}) &= n\\
\mathrm{head}(a.b) &= \mathrm{head}(a).
\end{aligned}$$

The head of a term is always a name.

The length of a term is determined by the number of dots traversed towards the head.

$$\begin{aligned}
\mathrm{length}(n \in \mathcal{N}) &= 0\\
\mathrm{length}(a.b) &= 1 + \mathrm{length}(a).
\end{aligned}$$

This definition corresponds to left-associativity: $\mathrm{length}(n.t_1.t_2.\cdots.t_k) = k$.
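Definitions 2 and 3 translate almost literally into code. The sketch below assumes a representation that is not prescribed by the thesis: a name is a Python string and a dot application `a.b` is the pair `(a, b)`. The helper `dot` is hypothetical and only spells out the left-associative reading.

```python
def dot(*ts):
    """Left-associative dot: dot(a, b, c) is (a.b).c."""
    t = ts[0]
    for u in ts[1:]:
        t = (t, u)
    return t

def head(t):
    """head(n) = n; head(a.b) = head(a)."""
    return t if isinstance(t, str) else head(t[0])

def length(t):
    """length(n) = 0; length(a.b) = 1 + length(a)."""
    return 0 if isinstance(t, str) else 1 + length(t[0])

abc = dot("A", "B", "C")          # A.B.C, read as (A.B).C
a_bc = ("A", ("B", "C"))          # A.(B.C), a different term

print(abc == (("A", "B"), "C"))   # True
print(head(abc), length(abc))     # A 2
print(head(a_bc), length(a_bc))   # A 1
```

The two example terms have the same head but different lengths, which shows why the dot cannot be treated as associative.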

Definition 4 (variables). There is an infinite set $\mathcal{V}$ of variables. Terms are not variables, nor is the result of a dot application ever a variable.

Variables are used in CC rules as formal parameters to refer to terms. We will use lower-case letters or words, or $x$, $y$, $z$ to refer to variables.

Note that we use similar notations for both variables and terms. However, variables exist only in rules, so we expect no confusion.

Definition 5 (rules). Rules consist of a left-hand and a right-hand side, generated by:

$$\begin{aligned}
\mathit{LHS} &::= \mathcal{N} \mid \mathit{LHS}.\mathcal{V} && \text{where every variable occurs at most once}\\
\mathit{RHS} &::= \mathcal{N} \mid \mathcal{V} \mid \mathit{RHS}.\mathit{RHS}
\end{aligned}$$

Therefore, any right-hand side without variables is a term in $\mathcal{U}$.

A combination of a left-hand and a right-hand side is a rule only when all variables in the right-hand side also occur in the left-hand side.

$$\mathit{Rules} ::= \mathit{LHS} \longrightarrow \mathit{RHS} \qquad \text{where all variables in } \mathit{RHS} \text{ occur in } \mathit{LHS}$$

A rule is said to define the name in its left-hand side; this name is also called the head. The length of a left-hand side is equal to the number of variables in it.
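The two side conditions of Definition 5 are easy to check mechanically. The following Python sketch is only an illustration: the class `Var` and the function `is_rule` are hypothetical names, names are plain strings, and a dot application is a pair. It tests the invalid helper rule $B.u \longrightarrow \mathit{AddCBV}.u.(S.y).r$ from Section 2.1 against its repaired version.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Var:
    """A variable; kept distinct from names, which are plain strings."""
    name: str

def variables(rhs):
    """The set of variables occurring in a right-hand side."""
    if isinstance(rhs, Var):
        return {rhs}
    if isinstance(rhs, tuple):
        return variables(rhs[0]) | variables(rhs[1])
    return set()  # a name contains no variables

def is_rule(lhs_vars, rhs):
    """Check both side conditions for n.x1.....xk -> rhs."""
    linear = len(set(lhs_vars)) == len(lhs_vars)   # every variable at most once
    closed = variables(rhs) <= set(lhs_vars)       # no unbound variable on the right
    return linear and closed

u, y, r = Var("u"), Var("y"), Var("r")
rhs = ((("AddCBV", u), ("S", y)), r)    # AddCBV.u.(S.y).r

print(is_rule([u], rhs))            # False: y and r are not bound by B.u
print(is_rule([y, r, u], rhs))      # True:  B.y.r.u binds all three
print(is_rule([y, y, r, u], rhs))   # False: y occurs twice on the left
```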


Definition 6 (program). A program is a finite set of rules, where no two rules define the same name. We denote a program by $P$.

$$\mathit{Programs} = \{P \subseteq \mathit{Rules} \mid P \text{ is finite and } \mathrm{head}(\cdot) \text{ is injective on the LHSes in } P\}$$

The domain of a program is the set of names defined by its rules.

$$\mathrm{dom}(P) = \{\mathrm{head}(\mathit{rule}) \mid \mathit{rule} \in P\}$$

We will frequently extend programs: an extension of a program $P$ is a program that is a superset of $P$.

Definition 7 (evaluation). A term can be evaluated under a program. Evaluation consists of zero or more sequential steps, which are all deterministic. For some terms and programs, evaluation never terminates.

We define the evaluation through the partial successor function $\mathrm{next}_P(\cdot) : \mathcal{U} \rightharpoonup \mathcal{U}$. We define $\mathrm{next}_P(t)$ when $P$ defines $\mathrm{head}(t)$, and $\mathrm{length}(t)$ equals the length of the corresponding left-hand side.

$$\mathrm{next}_P(n.t_1.t_2.\cdots.t_k) = r[\vec{x} := \vec{t}\,] \quad \text{when } n.x_1.x_2.\cdots.x_k \longrightarrow r \in P$$

It is allowed that $k = 0$:

$$\mathrm{next}_P(n) = r \quad \text{when } n \longrightarrow r \in P$$

More informally, we write $M \rightarrow_P N$ when $\mathrm{next}_P(M) = N$. The reflexive and transitive closure of $\rightarrow_P$ will be denoted $\twoheadrightarrow_P$. When $M \twoheadrightarrow_P N$, then we call $N$ a reduct of $M$, and $M$ is said to be defined. When $\mathrm{next}_P(M)$ is not defined, we write that $M$ is final. Notation: $M \downarrow_P$. We also combine the notations: if $\mathrm{next}_P(M) = N$ and $\mathrm{next}_P(N)$ is undefined, we may write $M \rightarrow_P N \downarrow_P$.

We will often leave the subscript $P$ implicit: $M \rightarrow N$.

In Section 2.3, we divide the final terms in three groups: undefined terms, incomplete terms, and invalid terms. Thus, these are the three cases where $\mathrm{next}_P(M)$ is undefined.
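The partial function $\mathrm{next}_P$ can be sketched in Python as follows, with `None` playing the role of "undefined". The representation (names as strings, dot applications as pairs, lower-case strings as variables) and the example program are assumptions made for this illustration; in particular the rules for `True`, `False`, `Not`, and `Main` are not taken from the thesis, although the boolean rules follow the interpretation given after Definition 2.

```python
def dot(*ts):
    """Left-associative dot: dot(a, b, c) is (a.b).c."""
    t = ts[0]
    for u in ts[1:]:
        t = (t, u)
    return t

def spine(t):
    """Split n.t1.....tk into (n, [t1, ..., tk])."""
    args = []
    while isinstance(t, tuple):
        t, arg = t
        args.append(arg)
    return t, args[::-1]

def subst(rhs, env):
    if isinstance(rhs, tuple):
        return (subst(rhs[0], env), subst(rhs[1], env))
    return env.get(rhs, rhs)

def next_p(program, t):
    """Successor of t under program, or None when t is final."""
    n, args = spine(t)
    if n not in program:
        return None                      # head is not defined
    params, rhs = program[n]
    if len(params) != len(args):
        return None                      # too few or too many arguments
    return subst(rhs, dict(zip(params, args)))

def trace(program, t, limit=1000):
    """All terms on the reduction path of t, ending in a final term."""
    steps = [t]
    while len(steps) <= limit:
        t = next_p(program, t)
        if t is None:
            return steps
        steps.append(t)
    raise RuntimeError("no final term within the step limit")

def show(t):
    n, args = spine(t)
    return ".".join([n] + [a if isinstance(a, str) else "(" + show(a) + ")" for a in args])

PROGRAM = {
    "True":  (["t", "f"], "t"),                          # True.t.f  -> t
    "False": (["t", "f"], "f"),                          # False.t.f -> f
    "Not":   (["b", "t", "f"], dot("b", "f", "t")),      # Not.b.t.f -> b.f.t
    "Main":  ([], dot("Not", "True", "Yes", "No")),      # a rule with k = 0
}

print(" -> ".join(show(t) for t in trace(PROGRAM, "Main")))
# Main -> Not.True.Yes.No -> True.No.Yes -> No
```

Each step only inspects the head and the number of arguments of the whole term; no search for a redex inside the term takes place.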

Definition 8 (termination). A term $M$ is said to be terminating under a program $P$, notation $M \twoheaddownarrow_P$, when it has a final reduct: $\exists N \in \mathcal{U} : M \twoheadrightarrow_P N \downarrow_P$. We often leave the subscript $P$ implicit.

2.3 Categorization of terms

A program divides all terms into four disjoint categories: undefined, incomplete, complete, and invalid. A term's evaluation behavior depends on its category, to which the term's arity is crucial.

Definition 9. The name $n$ has arity $k$ if $P$ contains a rule of the form $n.x_1.\cdots.x_k \longrightarrow q$. A term $t$ has arity $k - i$ if it is of the form $n.q_1.\cdots.q_i$, where $n$ has arity $k$ ($k \geq i$).

Definition 10. Term $t$ is defined in $P$ if $\mathrm{head}(t) \in \mathrm{dom}(P)$, otherwise we say that $t$ is undefined. Given a $t$ that is defined, we say that

• $t$ is complete if the arity of $t$ is 0

• $t$ is incomplete if the arity of $t$ is $j > 0$

• $t$ is invalid if it has no arity (that is, $t$ is of the form $n.q_1.\cdots.q_i$, where $n$ has arity $k < i$)
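The categorization depends only on the head of a term and on how many arguments follow it. A minimal Python sketch, assuming that names are strings, dot applications are pairs, and that the program is summarized by a hypothetical table `ARITY` from defined names to their arity (here for the rules of `Zero` and `S`):

```python
def dot(*ts):
    """Left-associative dot: dot(a, b, c) is (a.b).c."""
    t = ts[0]
    for u in ts[1:]:
        t = (t, u)
    return t

def spine(t):
    """Split n.q1.....qi into (n, [q1, ..., qi])."""
    args = []
    while isinstance(t, tuple):
        t, arg = t
        args.append(arg)
    return t, args[::-1]

# Arity of each defined name: Zero.z.s -> z and S.x.z.s -> s.x.
ARITY = {"Zero": 2, "S": 3}

def classify(arity, t):
    n, args = spine(t)
    if n not in arity:
        return "undefined"
    k, i = arity[n], len(args)
    if i == k:
        return "complete"      # the term has arity 0
    if i < k:
        return "incomplete"    # the term has arity k - i > 0
    return "invalid"           # more arguments than the rule accepts

print(classify(ARITY, "Zero"))                        # incomplete
print(classify(ARITY, dot("Zero", "A", "B")))         # complete
print(classify(ARITY, dot("Zero", "A", "B", "C")))    # invalid
print(classify(ARITY, dot("A", "Zero")))              # undefined
```

Note that the arguments themselves are never inspected: `A.Zero` is undefined even though `Zero` is a defined name.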

The four categories have distinct characteristics.


Undefined terms. Term $M$ is undefined iff $M.N$ is undefined. This does not depend on term $N$. Extension of the program causes undefined terms to remain undefined or become incomplete, complete, or invalid.

Interpretation. Because variables are not part of a term in continuation calculus, we use undefined names instead for similar purposes.¹ This means that all CC terms are closed in the lambda calculus sense.

The remaining three categories contain defined terms: terms with a head $\in \mathrm{dom}(P)$. Extension of the program does not change the category of defined terms.

Incomplete terms. If $M$ is incomplete, then $M.N$ can be incomplete or complete.

Interpretation. There are four important classes of incomplete terms.

• Data terms (see Section 2.5). If $d$ represents $c_k(v_1, \cdots, v_{n_k})$ of a data type $D$ with $m$ constructors, then $\forall t_1 \ldots t_m \in \mathcal{U} : d.\vec{t} \twoheadrightarrow t_k.\vec{v}$. Examples:

$$\begin{aligned}
\forall t_1, t_2 \in \mathcal{U} &: \mathit{Zero}.t_1.t_2 \twoheadrightarrow t_1 && \mathit{Zero} \text{ represents } 0\\
\forall t_1, t_2 \in \mathcal{U} &: S.(S.(S.\mathit{Zero})).t_1.t_2 \twoheadrightarrow t_2.(S.(S.\mathit{Zero})) && S.(S.(S.\mathit{Zero})) \text{ represents } S(S(S(0)))
\end{aligned}$$

Or, using more mnemonic variables $z$ and $s$:

$$\begin{aligned}
\forall z, s \in \mathcal{U} &: \mathit{Zero}.z.s \twoheadrightarrow z && \mathit{Zero} \text{ represents } 0\\
\forall z, s \in \mathcal{U} &: S.(S.(S.\mathit{Zero})).z.s \twoheadrightarrow s.(S.(S.\mathit{Zero})) && S.(S.(S.\mathit{Zero})) \text{ represents } S(S(S(0)))
\end{aligned}$$

• Call-by-name function terms. These are terms $f$ such that $f.v_1.\cdots.v_k$ is a data term of $D$ for all $\vec{v}$ in the appropriate domain. Example using Figure 2.1:

$$\begin{aligned}
\forall z, s \in \mathcal{U} &: \mathit{AddCBN}.\mathit{Zero}.\mathit{Zero}.z.s \twoheadrightarrow z\\
\forall z, s \in \mathcal{U} &: \mathit{AddCBN}.(S.\mathit{Zero}).(S.(S.\mathit{Zero})).z.s \twoheadrightarrow s.(\mathit{AddCBN}.\mathit{Zero}.(S.(S.\mathit{Zero})))
\end{aligned}$$

Recall that $\mathit{AddCBN}.(S.\mathit{Zero}).(S.(S.\mathit{Zero}))$ is a data term that represents 3. The second reduction shows that $1 +_{\mathrm{CBN}} 2 = S(x)$, for some $x$ represented by $\mathit{AddCBN}.\mathit{Zero}.(S.(S.\mathit{Zero}))$.

• Call-by-value function terms. These are terms $f$ of arity $n + 1$ such that for all $\vec{v}$ in a certain domain, $\forall r \in \mathcal{U} : f.v_1.\cdots.v_n.r \twoheadrightarrow r.t$ with data term $t$ depending only on $\vec{v}$, not on $r$. Example:

$$\forall r \in \mathcal{U} : \mathit{AddCBV}.(S.\mathit{Zero}).(S.(S.\mathit{Zero})).r \twoheadrightarrow r.(S.(S.(S.\mathit{Zero})))$$

• Return continuations. These represent the state of the program, parameterized over some values. Imagine a C program fragment "return abs(2 - ?);". If we were to resume execution from such a fragment, then the program would run to completion, but it is necessary to first fill in the question mark. If $r$ represents the above program fragment, then $r.3$ represents the completed fragment "return abs(2 - 3);".

If a return continuation has arity $n$, then it corresponds to a program fragment with $n$ question marks.

Invalid terms. All invalid terms will be considered equivalent. If $M$ is invalid, then $M.N$ is also invalid.

Complete terms. This is the set of terms that have a successor. If $M$ is complete, then $M.N$ is invalid.
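The two call-by-name reductions listed under the incomplete terms can be replayed with the following Python sketch. It is an illustration under assumptions: names are strings, dot applications are pairs, the helper name of Figure 2.1 is written `AddCBN'`, and `Z` and `Sc` are hypothetical undefined names used as the observers $z$ and $s$. The function `to_int` is not part of the thesis; it repeatedly performs the case distinction to read off which number a data term represents.

```python
def dot(*ts):
    """Left-associative dot: dot(a, b, c) is (a.b).c."""
    t = ts[0]
    for u in ts[1:]:
        t = (t, u)
    return t

def spine(t):
    args = []
    while isinstance(t, tuple):
        t, arg = t
        args.append(arg)
    return t, args[::-1]

def subst(rhs, env):
    if isinstance(rhs, tuple):
        return (subst(rhs[0], env), subst(rhs[1], env))
    return env.get(rhs, rhs)

RULES = {
    "Zero":    (["z", "s"], "z"),
    "S":       (["m", "z", "s"], ("s", "m")),
    # AddCBN.x.y.z.s -> x.(y.z.s).(AddCBN'.y.s)
    "AddCBN":  (["x", "y", "z", "s"],
                dot("x", dot("y", "z", "s"), dot("AddCBN'", "y", "s"))),
    # AddCBN'.y.s.x' -> s.(AddCBN.x'.y)
    "AddCBN'": (["y", "s", "x'"], ("s", dot("AddCBN", "x'", "y"))),
}

def evaluate(t, limit=10_000):
    """Reduce at the top level until no rule applies."""
    for _ in range(limit):
        head, args = spine(t)
        if head not in RULES or len(RULES[head][0]) != len(args):
            return t
        params, rhs = RULES[head]
        t = subst(rhs, dict(zip(params, args)))
    raise RuntimeError("step limit reached")

def to_int(d):
    """Observe a data term for a natural number, one constructor at a time."""
    result = evaluate(dot(d, "Z", "Sc"))
    if result == "Z":
        return 0
    if isinstance(result, tuple) and result[0] == "Sc":
        return 1 + to_int(result[1])
    raise ValueError("not a natural number")

one, two = ("S", "Zero"), ("S", ("S", "Zero"))

print(evaluate(dot("AddCBN", "Zero", "Zero", "Z", "Sc")))   # Z
print(evaluate(dot("AddCBN", one, two, "Z", "Sc"))
      == ("Sc", dot("AddCBN", "Zero", two)))                # True
print(to_int(dot("AddCBN", one, two)))                      # 3
```

The second result shows the delayed computation: the predecessor handed to `Sc` is the unevaluated term `AddCBN.Zero.(S.(S.Zero))`, and further work happens only when `to_int` inspects it.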

¹ This is substantiated by Theorem 12.

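Common definitions

$$\begin{aligned}
\mathit{Zero}.z.s &\longrightarrow z\\
S.m.z.s &\longrightarrow s.m\\
\mathit{Nil}.\mathit{ifempty}.\mathit{iflist} &\longrightarrow \mathit{ifempty}\\
\mathit{Cons}.n.l.\mathit{ifempty}.\mathit{iflist} &\longrightarrow \mathit{iflist}.n.l
\end{aligned}$$

Call-by-value functions

$$\begin{aligned}
\mathit{AddCBV}.x.y.r &\longrightarrow x.(r.y).(\mathit{AddCBV}'.y.r)\\
\mathit{AddCBV}'.y.r.x' &\longrightarrow \mathit{AddCBV}.x'.(S.y).r\\
\mathit{FibCBV}.x.r &\longrightarrow x.(r.\mathit{Zero}).(\mathit{FibCBV}_1.r)\\
\mathit{FibCBV}_1.r.y &\longrightarrow y.(r.(S.\mathit{Zero})).(\mathit{FibCBV}_2.r.y)\\
\mathit{FibCBV}_2.r.y.y' &\longrightarrow \mathit{FibCBV}.y.(\mathit{FibCBV}_3.r.y')\\
\mathit{FibCBV}_3.r.y'.\mathit{fib}_y &\longrightarrow \mathit{FibCBV}.y'.(\mathit{FibCBV}_4.r.\mathit{fib}_y)\\
\mathit{FibCBV}_4.r.\mathit{fib}_y.\mathit{fib}_{y'} &\longrightarrow \mathit{AddCBV}.\mathit{fib}_y.\mathit{fib}_{y'}.r
\end{aligned}$$

Call-by-name functions

$$\begin{aligned}
\mathit{AddCBN}.x.y.z.s &\longrightarrow x.(y.z.s).(\mathit{AddCBN}'.y.s)\\
\mathit{AddCBN}'.y.s.x' &\longrightarrow s.(\mathit{AddCBN}.x'.y)\\
\mathit{FibCBN}.x.z.s &\longrightarrow x.z.(\mathit{FibCBN}_1.z.s)\\
\mathit{FibCBN}_1.z.s.y &\longrightarrow y.(s.\mathit{Zero}).(\mathit{FibCBN}_2.z.s.y)\\
\mathit{FibCBN}_2.z.s.y.y' &\longrightarrow \mathit{AddCBN}.(\mathit{FibCBN}.y).(\mathit{FibCBN}.y').z.s
\end{aligned}$$

Figure 2.1: Continuation calculus representations of $+$ and fib. The functions are applied in a different way, as shown in Figure 2.2. This incompatibility is already indicated by the different arity: $\mathrm{arity}(\mathit{AddCBV}) = 3 \neq \mathrm{arity}(\mathit{AddCBN}) = 4$, and $\mathrm{arity}(\mathit{FibCBV}) = 2 \neq \mathrm{arity}(\mathit{FibCBN}) = 3$. Figure 2.2 shows how to use the four functions.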
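The call-by-value rules of Figure 2.1 can be executed with a small interpreter. The Python sketch below is an illustration, not the author's evaluator: the textual rule syntax with `->`, the spelling of subscripts and primes (`FibCBV1`, `y'`, `fiby'`), and the undefined name `R` for the continuation are assumptions of this example.

```python
import re

def parse(src):
    """Parse a dotted term such as "x.(r.y).(B.y.r)" into nested pairs."""
    tokens = re.findall(r"[A-Za-z0-9']+|[().]", src)
    pos = 0

    def atom():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            t = expr()
            pos += 1              # skip the closing parenthesis
            return t
        return tok

    def expr():
        nonlocal pos
        t = atom()
        while pos < len(tokens) and tokens[pos] == ".":
            pos += 1
            t = (t, atom())       # the dot associates to the left
        return t

    return expr()

def spine(t):
    args = []
    while isinstance(t, tuple):
        t, arg = t
        args.append(arg)
    return t, args[::-1]

def subst(rhs, env):
    if isinstance(rhs, tuple):
        return (subst(rhs[0], env), subst(rhs[1], env))
    return env.get(rhs, rhs)

FIGURE_2_1 = """
Zero.z.s -> z
S.m.z.s -> s.m
AddCBV.x.y.r -> x.(r.y).(AddCBV'.y.r)
AddCBV'.y.r.x' -> AddCBV.x'.(S.y).r
FibCBV.x.r -> x.(r.Zero).(FibCBV1.r)
FibCBV1.r.y -> y.(r.(S.Zero)).(FibCBV2.r.y)
FibCBV2.r.y.y' -> FibCBV.y.(FibCBV3.r.y')
FibCBV3.r.y'.fiby -> FibCBV.y'.(FibCBV4.r.fiby)
FibCBV4.r.fiby.fiby' -> AddCBV.fiby.fiby'.r
"""

RULES = {}
for line in FIGURE_2_1.strip().splitlines():
    lhs, rhs = line.split("->")
    name, params = spine(parse(lhs))
    RULES[name] = (params, parse(rhs))

def evaluate(t, limit=100_000):
    for _ in range(limit):
        head, args = spine(t)
        if head not in RULES or len(RULES[head][0]) != len(args):
            return t
        params, rhs = RULES[head]
        t = subst(rhs, dict(zip(params, args)))
    raise RuntimeError("step limit reached")

def numeral(m):
    t = "Zero"
    for _ in range(m):
        t = ("S", t)
    return t

# FibCBV.<6>.R should reduce to R.<8>, since fib(6) = 8.
print(evaluate((("FibCBV", numeral(6)), "R")) == ("R", numeral(8)))  # True
```

Because the loop in `evaluate` only ever looks at the head of the whole term, the nested helper names `FibCBV1` to `FibCBV4` act as the stack of pending work.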


2.4 Reasoning with CC terms

This section sketches the nature of continuation calculus through theorems. All proofs are included in the appendix.

2.4.1 Fresh names

Definition 11. When a name $\mathit{fr}$ does not occur in the program under consideration, then we call $\mathit{fr}$ a fresh name. Furthermore, all fresh names that we assume within theorems, lemmas, and propositions are understood to be different. When we say $\mathit{fr}$ is fresh for some objects, then it is additionally required that $\mathit{fr}$ is not mentioned in those objects.

We can always assume another fresh name, because programs are finite and there are infinitely many names.

Interpretation. Fresh names allow us to reason on arbitrary terms, much like free variables in lambda calculus.

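Theorem 12. Let $M$, $N$ be terms, and let name $\mathit{fr}$ be fresh. The following bi-implications hold:

$$\begin{aligned}
M \twoheadrightarrow N &\iff \forall t \in \mathcal{U} : M[\mathit{fr} := t] \twoheadrightarrow N[\mathit{fr} := t]\\
M \twoheaddownarrow &\iff \forall t \in \mathcal{U} : M[\mathit{fr} := t] \twoheaddownarrow
\end{aligned}$$

Lemma 13 (determinism). Let $M$, $\vec{t}$, $\vec{u}$ be terms, and let $m$, $n$ be undefined names in $P$. If $M \twoheadrightarrow_P m.t_1.\cdots.t_k$ and $M \twoheadrightarrow_P n.u_1.\cdots.u_l$, then $m.t_1.\cdots.t_k = n.u_1.\cdots.u_l$.

Remark 14. If $m$ or $n$ is defined, this may not hold. For instance, in the program "$A \longrightarrow B$; $B \longrightarrow C$", we have $A \twoheadrightarrow B$ and $A \twoheadrightarrow C$, yet $B \neq C$.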

2.4.2 Term equivalence

Besides syntactic equality ($=$), we introduce two equivalences on terms: common reduct ($=_P$) and observational equivalence ($\approx_P$).

Definition 15. Terms $M$, $N$ have a common reduct if $M \twoheadrightarrow t \twoheadleftarrow N$ for some term $t$. Notation: $M =_P N$.

Proposition 16. Suppose $M =_P N$. Then $M \twoheaddownarrow \iff N \twoheaddownarrow$.

Common reduct is a strong equivalence, comparable to $\beta$-conversion for lambda calculus. Terms $M \neq N$ can only have a common reduct if at least one of them is complete. This makes pure $=_P$ unsuitable for relating data or function terms, which are incomplete. In fact, $=_P$ is not a congruence.

To remedy this, we define an observational equivalence in terms of termination.

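Definition 17. Terms $M$ and $N$ are observationally equivalent under a program $P$, notation $M \approx_P N$, when for all extension programs $P' \supseteq P$ and terms $X$:

$$X.M \twoheaddownarrow_{P'} \iff X.N \twoheaddownarrow_{P'}$$

We may write $M \approx N$ if the program is implicit.

Examples: $\mathit{AddCBV}.\langle m \rangle.\langle 0 \rangle \approx \mathit{AddCBV}.\langle 0 \rangle.\langle m \rangle$ and $\langle 0 \rangle \approx \langle \mathit{True} \rangle$, but $\langle 0 \rangle \not\approx \langle 1 \rangle$; see Section 2.5.

Lemma 18. $\approx$ is a congruence. In other words, if $M \approx M'$ and $N \approx N'$, then $M.N \approx M'.N'$.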


Characterization. The reduction behavior of complete terms divides them in three classes. Observational equivalence distinguishes the classes.

Nontermination. When M is nonterminating and the program is extended, M remains nonterminating.

If the reduction path of M is finite, we call it terminating, and we may write $M{\downarrow}$. This is shorthand for $\exists N \in \mathcal{U} : M \twoheadrightarrow N \not\to$.

Proper reduction to an incomplete or invalid term. All such M are observationally equivalent to an invalid term. When the program is extended, such terms remain in their execution class.

Proper reduction to an undefined term. Observational equivalence distinguishes terms M, N if the head of their final term is different. Therefore, there are infinitely many subclasses.

When the program is extended, the final term may become defined. This can cause such M to fall in a different class.

The following proposition and theorem show that $\approx$ distinguishes three equivalence classes: if $M \approx N$, then M and N are in the same class.

Proposition 19. If $M \approx N$, then $M{\downarrow} \iff N{\downarrow}$.

Theorem 20. Let $M \approx N$ and $M \twoheadrightarrow \mathit{fr}.t_1.\cdots.t_k$ with $\mathit{fr} \notin \mathrm{dom}(P)$. Then $N \twoheadrightarrow \mathit{fr}.u_1.\cdots.u_k$ for some u.

Retrieving observational equivalence. Complete terms with a common reduct are observationally equal. If M, N are incomplete, but they have common reducts when extended with terms, then also $M \approx N$.

Theorem 21. Let M, N be terms with arity k. If $M.t_1.\cdots.t_k =_P N.t_1.\cdots.t_k$ for all t, then $M \approx N$.

Corollary 22. Suppose $M =_P N$ and arity(M) = arity(N) = 0. Then $M \approx N$.

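Remark 23. $M \twoheadrightarrow N$ does not always imply $M \approx N$ if arity(N) > 0. For instance, take the following program:

$$\mathit{Goto}.x \to x$$

$$\mathit{Omega}.x \to x.x$$

Then $\mathit{Goto}.\mathit{Omega} \to \mathit{Omega}$, an incomplete term. We cannot fix Goto.Omega by appending another term: Goto.Omega.Omega is invalid. Name Goto is defined for one operand term, and the superfluous Omega term cannot be memorized as with lambda calculus. On the other hand, $\mathit{Omega}.\mathit{Omega} \to \mathit{Omega}.\mathit{Omega}$ is nonterminating. Hence, $\mathit{Goto}.\mathit{Omega} \twoheadrightarrow \mathit{Omega}$ but $\mathit{Goto}.\mathit{Omega} \not\approx \mathit{Omega}$.

Note that having both $\mathit{Goto}.\mathit{Omega} \twoheadrightarrow \mathit{Omega}$ and $\mathit{Goto}.\mathit{Omega} \not\approx \mathit{Omega}$ is only possible because $\mathrm{arity}(\mathit{Goto}.\mathit{Omega}) \neq \mathrm{arity}(\mathit{Omega})$.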
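The classification used in Remark 23 can be checked mechanically. The following Python sketch is not part of the thesis. It assumes a hypothetical encoding in which a name is a string and the dot-application M.N is the pair (M, N); the helpers `dot`, `spine`, `subst`, `classify` and `step` are illustrative names. Variables and names share one string namespace here, so rule bodies must not use a variable that is spelled like a name.

```python
# Terms: a name is a str; the dot-application M.N is the pair (M, N).
def dot(head, *args):
    """Build the left-associated term head.a1. ... .ak."""
    term = head
    for a in args:
        term = (term, a)
    return term

def spine(term):
    """Split a term into its head name and its list of arguments."""
    args = []
    while isinstance(term, tuple):
        term, arg = term
        args.append(arg)
    return term, args[::-1]

def subst(body, env):
    """Replace the variables of a rule body by the terms bound in env."""
    if isinstance(body, tuple):
        return (subst(body[0], env), subst(body[1], env))
    return env.get(body, body)

# A program maps each defined name to (parameters, body).
PROGRAM = {
    "Goto": (["x"], "x"),           # Goto.x -> x
    "Omega": (["x"], ("x", "x")),   # Omega.x -> x.x
}

def classify(term, program):
    """Return 'undefined', 'incomplete', 'invalid' or 'complete'."""
    head, args = spine(term)
    if head not in program:
        return "undefined"
    arity = len(program[head][0])
    if len(args) < arity:
        return "incomplete"
    if len(args) > arity:
        return "invalid"
    return "complete"

def step(term, program):
    """Perform one reduction step; only complete terms reduce."""
    if classify(term, program) != "complete":
        return None
    head, args = spine(term)
    params, body = program[head]
    return subst(body, dict(zip(params, args)))

print(step(dot("Goto", "Omega"), PROGRAM))                 # the reduct of Goto.Omega
print(classify(dot("Goto", "Omega", "Omega"), PROGRAM))    # class of Goto.Omega.Omega
```

In this encoding, Goto.Omega steps to the incomplete term Omega, Goto.Omega.Omega is classified as invalid, and Omega.Omega steps to itself, which matches the three observations of the remark.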

2.4.3 Program substitution and union

Definition 24 (fresh substitution). Let $n_1 \ldots n_k$ be names, and $m_1 \ldots m_k$ be fresh for M, all different. Then $M[\vec{n} := \vec{m}]$ is equal to M where all occurrences of $\vec{n}$ are simultaneously replaced by $\vec{m}$, respectively. The fresh substitution $P[\vec{n} := \vec{m}]$ replaces all $\vec{n}$ by $\vec{m}$ in both left and right hand sides of the rules of P.

We can combine two programs by applying a fresh substitution to one of them, and taking the union. As the following theorem shows, this preserves most interesting properties.


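Theorem 25. Suppose that $P' \supseteq P$ is an extension program, and M, N are terms. Then the left hand equations hold. Let $\sigma$ denote a fresh substitution $[\vec{n} := \vec{m}]$. Then the right hand equations hold.

$$\begin{array}{ll}
M \to_P N \implies M \to_{P'} N & M \to_P N \iff M\sigma \to_{P\sigma} N\sigma \\
M \twoheadrightarrow_P N \implies M \twoheadrightarrow_{P'} N & M \twoheadrightarrow_P N \iff M\sigma \twoheadrightarrow_{P\sigma} N\sigma \\
M{\downarrow_P} \impliedby M{\downarrow_{P'}} & M{\downarrow_P} \iff M\sigma{\downarrow_{P\sigma}} \\
M =_P N \implies M =_{P'} N & M =_P N \iff M\sigma =_{P\sigma} N\sigma \\
M \approx_P N \implies M \approx_{P'} N & M \approx_P N \iff M\sigma \approx_{P\sigma} N\sigma
\end{array}$$

Remark 26. Names $\vec{n}$ are not mentioned in $M\sigma$ and $P\sigma$, so we can apply Theorem 25 with $\sigma^{-1}$ on $M\sigma$ and $P\sigma$.

Theorem 27. Suppose that $P'$ extends P, but $\mathrm{dom}(P' \setminus P)$ are not mentioned in M, N, or P. Then $M \approx_P N \iff M \approx_{P'} N$.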

2.5 Data

In this section, we show how to program some standard data in continuation calculus. We first give a canonical representation of data as CC terms. We then give essential semantic characteristics, and show that other terms have those characteristics as well. Observational equivalence guarantees that termination of the whole program is only dependent on those characteristics. In fact, it will prove possible to implement call-by-name values, which delay computation until it is needed, by relying on those characteristics.

Standard representation of data. In Section 2.1, we postulated terms for natural numbers in continuation calculus. We will now give this standard representation formally, as well as the representation of booleans and natural lists.

Definition 28. For a mathematical object o, we define a standard representation $\langle o \rangle$ of that object as a CC term, which we call a data term. We postulate that the rules in the middle column are included in programs that use the corresponding terms.

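$$\begin{array}{lll}
\langle \mathit{True} \rangle = \mathit{True} & \mathit{True}.t.f \to t & \text{booleans} \\
\langle \mathit{False} \rangle = \mathit{False} & \mathit{False}.t.f \to f & \\
\langle 0 \rangle = \mathit{Zero} & \mathit{Zero}.z.s \to z & \text{naturals} \\
\langle m + 1 \rangle = \mathit{S}.\langle m \rangle & \mathit{S}.x.z.s \to s.x & \\
\langle [\,] \rangle = \mathit{Nil} & \mathit{Nil}.e.c \to e & \text{lists of naturals} \\
\langle m : l \rangle = \mathit{Cons}.\langle m \rangle.\langle l \rangle & \mathit{Cons}.x.\mathit{xs}.e.c \to c.x.\mathit{xs} &
\end{array}$$

Theorem 29. $\mathit{True} \not\approx \mathit{False}$.

Proof. Observe that for all t, f, $\mathit{True}.t.f \to t$ and $\mathit{False}.t.f \to f$. Take two fresh names t and f. Contraposition of Theorem 20 proves $\mathit{True}.t.f \not\approx \mathit{False}.t.f$. Because $\approx$ is a congruence with respect to dot, we can conclude $\mathit{True} \not\approx \mathit{False}$.

Similar results hold for $\mathbb{N}$ and $\mathrm{List}_{\mathbb{N}}$, but we do not provide a proof here.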

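A broader definition. The behavioral essence of these data terms is that they take a continuation for each constructor, and they continue execution in the respective continuation, augmented with the constructor arguments. For instance, $\langle 0 \rangle.z.s \to z$ and $\langle n + 1 \rangle.z.s \to s.\langle n \rangle$. We can capture this essence in the following term sets; $\llbracket \mathbb{N} \rrbracket$ and $\llbracket \mathrm{List}_{\mathbb{N}} \rrbracket$ are the smallest sets satisfying the following equalities.

$$\begin{aligned}
\llbracket \mathbb{B} \rrbracket &= \{ M \in \mathcal{U} \mid \forall t, f \in \mathcal{U} : M.t.f \twoheadrightarrow t \;\lor\; M.t.f \twoheadrightarrow f \} \\
\llbracket \mathbb{N} \rrbracket &= \{ M \in \mathcal{U} \mid (\forall z, s \in \mathcal{U} : M.z.s \twoheadrightarrow z) \;\lor\; \exists x \in \llbracket \mathbb{N} \rrbracket\ \forall z, s \in \mathcal{U} : M.z.s \twoheadrightarrow s.x \} \\
\llbracket \mathrm{List}_{\mathbb{N}} \rrbracket &= \{ M \in \mathcal{U} \mid (\forall e, c \in \mathcal{U} : M.e.c \twoheadrightarrow e) \;\lor\; \exists x \in \llbracket \mathbb{N} \rrbracket, \mathit{xs} \in \llbracket \mathrm{List}_{\mathbb{N}} \rrbracket\ \forall e, c \in \mathcal{U} : M.e.c \twoheadrightarrow c.x.\mathit{xs} \}
\end{aligned}$$

Remark 30. These sets are dependent on the program. The sets are monotone with respect to program extension: if $M \in \llbracket \mathbb{B} \rrbracket$, $\llbracket \mathbb{N} \rrbracket$, or $\llbracket \mathrm{List}_{\mathbb{N}} \rrbracket$ for a program, then M is also in the corresponding set for any extension program.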

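The sets include other terms besides $\langle \mathit{True} \rangle$, $\langle \mathit{False} \rangle$, $\langle n \rangle$, and $\langle l \rangle$. Consider the following program fragment, which implements the $\leq$ operator on natural numbers.

$$\mathit{Leq}.x.y.t.f \to x.t.(\mathit{Leq}'.y.t.f)$$

$$\mathit{Leq}'.y.t.f.x' \to \mathit{Leq}.y.x'.f.t$$

Given naturals m, p and this program fragment, $\mathit{Leq}.\langle m \rangle.\langle p \rangle \in \llbracket \mathbb{B} \rrbracket$. Even more, $\mathit{Leq}.\langle m \rangle.\langle p \rangle \approx \langle m \leq p \rangle$. In general, it follows from Theorem 21 that all $M \in \llbracket \mathbb{B} \rrbracket$ are observationally equivalent to True or False. The appendix contains a proof of the analogous statement for $\llbracket \mathbb{N} \rrbracket$:

Proposition 31. All terms in $\llbracket \mathbb{N} \rrbracket$ are observationally equivalent to $\langle k \rangle$ for some k.

For further reasoning, it is useful to split up $\llbracket \mathbb{N} \rrbracket$ in parts as follows.

Definition 32. For a natural number k, the set $\llbracket k \rrbracket$ is defined as $\{ M \in \llbracket \mathbb{N} \rrbracket \mid M \approx \langle k \rangle \}$. We define $\llbracket b \rrbracket$ and $\llbracket l \rrbracket$ analogously for booleans b and lists of naturals l.

With this definition, we may say $\mathit{Leq}.\langle 3 \rangle.\langle 4 \rangle \in \llbracket \mathit{True} \rrbracket$. In fact, if $a \in \llbracket 3 \rrbracket$ and $b \in \llbracket 4 \rrbracket$, then $\mathit{Leq}.a.b \in \llbracket \mathit{True} \rrbracket$.$^2$ To support this pattern of reasoning, we allow $\llbracket \cdot \rrbracket$ to be lifted, denoting a term. The resulting statements are implicitly quantified universally and existentially, and are usable in proof chains.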

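Remark 33. For data terms, we would like to reason and compute with equivalence classes of representations, $\llbracket k \rrbracket$, instead of with the representations themselves, $\langle k \rangle$. Of course, a CC program will always compute with a term (and not with an equivalence class of terms), but we would like this computation to only depend on the characterization of the equivalence class.

For example, we want to compute a CBN addition function AddCBN, such that for all $m, p \in \mathbb{N}$, $\forall t \in \llbracket m \rrbracket\ \forall u \in \llbracket p \rrbracket : \mathit{AddCBN}.t.u \in \llbracket m + p \rrbracket$. As a specification, we want to summarize this as:

$$\mathit{AddCBN}.\llbracket m \rrbracket.\llbracket p \rrbracket \in \llbracket m + p \rrbracket$$

We will also summarize a statement of the form $\forall t_1 \in \llbracket m \rrbracket\ \exists t_2 \in \llbracket m \rrbracket\ \exists t_3 \in \llbracket l \rrbracket : A.t_1 \twoheadrightarrow B.t_2.t_3$ with the shorthand $A.\llbracket m \rrbracket \twoheadrightarrow B.\llbracket m \rrbracket.\llbracket l \rrbracket$. If we know $A.\llbracket m \rrbracket \twoheadrightarrow B.\llbracket m \rrbracket.\llbracket l \rrbracket$ and $B.\llbracket m \rrbracket.\llbracket l \rrbracket \twoheadrightarrow C.\llbracket m \rrbracket$, then we may logically conclude

$$\forall t_1 \in \llbracket m \rrbracket\ \exists t_2 \in \llbracket m \rrbracket\ \exists t_3 \in \llbracket l \rrbracket\ \exists t_4 \in \llbracket m \rrbracket : A.t_1 \twoheadrightarrow B.t_2.t_3 \twoheadrightarrow C.t_4,$$

which we will summarize as $A.\llbracket m \rrbracket \twoheadrightarrow B.\llbracket m \rrbracket.\llbracket l \rrbracket \twoheadrightarrow C.\llbracket m \rrbracket$. Analogous statements of this form, and longer series, will be summarized in a similar way. In particular, it will suit us to also use $\approx$ and $=_P$ in longer derivations.

$^2$ To see this, observe $\mathit{Leq}.a.b \approx \mathit{Leq}.\langle 3 \rangle.\langle 4 \rangle \approx \langle \mathit{True} \rangle$ by congruence, then use Theorems 20 and 12.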


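Example: delayed addition. We will program a different addition on natural numbers: one that delays work as long as possible, like in call-by-name programming languages. We use the following algorithm, for natural numbers m, p:

$$0 + p = p$$

$$S(m) + p = S(m + p)$$

The resulting name AddCBN will be a call-by-name function, with specification $\mathit{AddCBN}.\llbracket m \rrbracket.\llbracket p \rrbracket \in \llbracket m + p \rrbracket$, so we have to build a rule for AddCBN. Because $\mathit{AddCBN}.\langle m \rangle.\langle p \rangle \in \llbracket \mathbb{N} \rrbracket$, arity(AddCBN) = 4. We reduce the specification with a case distinction on the first argument.

$$\mathit{AddCBN}.\llbracket 0 \rrbracket.\llbracket p \rrbracket.z.s =_P \llbracket p \rrbracket.z.s \quad \text{($\mathit{AddCBN}.\llbracket 0 \rrbracket.\llbracket p \rrbracket$ has the same specification as $\llbracket p \rrbracket$)}$$

$$\mathit{AddCBN}.\llbracket S(m) \rrbracket.\llbracket p \rrbracket.z.s \twoheadrightarrow s.\llbracket m + p \rrbracket$$

We must make the case distinction by using the specified behavior of the first argument. This suggests a rule of the form $\mathit{AddCBN}.x.y.z.s \to x.(y.z.s).(s.(\mathit{AddCBN}.x'.y))$. It almost works:

$$\mathit{AddCBN}.\llbracket 0 \rrbracket.\llbracket p \rrbracket.z.s \twoheadrightarrow \llbracket p \rrbracket.z.s$$

$$\mathit{AddCBN}.\llbracket S(m) \rrbracket.\llbracket p \rrbracket.z.s \twoheadrightarrow s.(\mathit{AddCBN}.x'.\llbracket p \rrbracket).\llbracket m \rrbracket$$

However, variable $x'$ is not in the left-hand side, so this is not a valid rule. Furthermore, if $x = S(x'')$, then $x''$ would be erroneously appended to $s.(\mathit{AddCBN}.x'.y)$. We fix AddCBN with a helper name $\mathit{AddCBN}'$, with specification $\mathit{AddCBN}'.\llbracket p \rrbracket.s.\llbracket m \rrbracket \twoheadrightarrow s.\llbracket m + p \rrbracket$.

$$\mathit{AddCBN}.x.y.z.s \to x.(y.z.s).(\mathit{AddCBN}'.y.s)$$

$$\mathit{AddCBN}'.y.s.x' \to s.(\mathit{AddCBN}.x'.y)$$

This version conforms to the specification.

$$\mathit{AddCBN}.\llbracket 0 \rrbracket.\llbracket p \rrbracket.z.s \twoheadrightarrow \llbracket p \rrbracket.z.s$$

$$\mathit{AddCBN}.\llbracket S(m) \rrbracket.\llbracket p \rrbracket.z.s \twoheadrightarrow \mathit{AddCBN}'.\llbracket p \rrbracket.s.\llbracket m \rrbracket \to s.(\mathit{AddCBN}.\llbracket m \rrbracket.\llbracket p \rrbracket) = s.\llbracket m + p \rrbracket$$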
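The delayed addition can be imitated with closures. The sketch below is only an analogy in Python, not the thesis' calculus: a natural is modelled as a function that receives a zero-continuation `z` (a thunk) and a successor-continuation `s`, and all names (`zero`, `succ`, `add_cbn`, `from_int`, `to_int`) are hypothetical. Unlike CC terms, Python functions return to their caller, so this only illustrates the data flow of the two rules for AddCBN and its helper.

```python
def zero(z, s):
    """Zero.z.s -> z: continue in the zero-continuation."""
    return z()

def succ(m):
    """S.m.z.s -> s.m: continue in the successor-continuation with the predecessor."""
    return lambda z, s: s(m)

def add_cbn(x, y):
    """Data term for x + y; nothing is computed until continuations arrive."""
    # AddCBN.x.y.z.s -> x.(y.z.s).(AddCBN'.y.s)
    # AddCBN'.y.s.x' -> s.(AddCBN.x'.y)
    return lambda z, s: x(lambda: y(z, s), lambda pred: s(add_cbn(pred, y)))

def from_int(n):
    """Standard representation S.(...(S.Zero)...) with n occurrences of S."""
    term = zero
    for _ in range(n):
        term = succ(term)
    return term

def to_int(term):
    """Observe a natural by repeatedly asking for its predecessor."""
    count = 0
    while True:
        pred = term(lambda: None, lambda m: m)
        if pred is None:
            return count
        count += 1
        term = pred

total = add_cbn(from_int(2), from_int(3))   # no addition has happened yet
print(to_int(total))
```

Building `total` performs no work; each call with continuations peels off exactly one constructor, which mirrors the specification that AddCBN applied to two naturals behaves like a natural.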

2.5.1 Call-by-name and call-by-value functions

We regard two kinds of functions. We call them call-by-name and call-by-value, by analogy with the evaluation strategies for lambda calculus. Figure 2.1 defines a CBN and CBV version of addition on naturals and the Fibonacci function. Figure 2.2 shows how to use them. It also illustrates that the CBV function performs work eagerly, while the CBN function delays work until it is needed: hence the analogy.

Call-by-name functions are terms f such that $f.v_1.\cdots.v_k$ is a data term for all $\vec{v}$ in a certain domain. Example specifications for such f:

$$\mathit{AddCBN}.\llbracket m \rrbracket.\llbracket p \rrbracket \in \llbracket m + p \rrbracket$$

$$\mathit{FibCBN}.\llbracket m \rrbracket \in \llbracket \mathit{fib}(m) \rrbracket$$

Call-by-value functions are terms f of arity n + 1 such that for all $\vec{v}$ in a certain domain, $\forall r : f.v_1.\cdots.v_n.r \twoheadrightarrow r.t$ with data term t depending only on $\vec{v}$, not on r. Example specifications for such f:

$$\forall r : \mathit{AddCBV}.\llbracket m \rrbracket.\llbracket p \rrbracket.r \twoheadrightarrow r.\llbracket m + p \rrbracket$$

$$\forall r : \mathit{FibCBV}.\llbracket m \rrbracket.r \twoheadrightarrow r.\langle \mathit{fib}(m) \rangle$$

The output of FibCBV is always a standard representation. Because our implementation of AddCBV does not inspect the second argument, its output may not be a standard integer. An example of this is shown in Figure 2.2.

We leave formal proofs of the specifications for future work.


Call-by-value fib(7):
  To apply f to x, evaluate $f.x.r \twoheadrightarrow r.y$ for some r. Then y is the result.
  The result of fib(7) is 13, obtained in 362 reduction steps: $\mathit{FibCBV}.7.\mathit{fr} \twoheadrightarrow \mathit{fr}.13$.

Call-by-name fib(7):
  To apply f to x, write f.x. This is directly a data term, no reduction happens.
  By the specification of FibCBN, we know $\mathit{FibCBN}.7 \in \llbracket 13 \rrbracket$. No reduction is involved.

Both 13 and FibCBN.7 can be used in other functions, like +. Because they are observationally equivalent, they can be substituted for each other in a term. That does not affect termination, or the head of the final term if that is undefined (Theorem 20). However, substituting 13 for FibCBN.7 may make the evaluation shorter.

  $13 +_{\mathit{CBV}} 0$ is obtained in 41 steps: $\mathit{AddCBV}.13.0.\mathit{fr} \twoheadrightarrow \mathit{fr}.13$ in 41 steps.
  $\mathit{FibCBN}.7 +_{\mathit{CBV}} 0$ is obtained in 304 steps: $\mathit{AddCBV}.(\mathit{FibCBN}.7).0.\mathit{fr} \twoheadrightarrow \mathit{fr}.13$ in 304 steps (263 more).

Our implementation of AddCBV does not examine the right argument, as the converse addition shows.

  $\mathit{AddCBV}.0.13.\mathit{fr} \twoheadrightarrow \mathit{fr}.13$ in 2 steps.
  $\mathit{AddCBV}.0.(\mathit{FibCBN}.7).\mathit{fr} \twoheadrightarrow \mathit{fr}.(\mathit{FibCBN}.7)$ in 2 steps.

Figure 2.2: Calculating fib(7), fib(7) + 0, and 0 + fib(7) using call-by-value and call-by-name. Effectively, FibCBN delays computation until it is needed. A natural number n stands for $\mathit{S}.(\cdots.(\mathit{S}.\mathit{Zero})\cdots)$, with S occurring n times.
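The step counts for AddCBV can be reproduced with a small step-counting reducer. This Python sketch is an illustration under assumptions: Figure 2.1 is not reproduced in this part of the text, so the two AddCBV rules below are reconstructed from the ML+ definition of AddCBV in Section 3.2 and should be read as hypothetical. Terms are flat tuples `(head, arg1, ..., argk)` and each rule is a Python function that builds the right-hand side.

```python
def app(term, *args):
    """Dot-application: append arguments to a term (head, arg1, ..., argk)."""
    return term + tuple(args)

def name(n):
    """A bare name is a term without arguments."""
    return (n,)

ZERO, S = name("Zero"), name("S")
ADD, ADD1 = name("AddCBV"), name("AddCBV'")

# name -> (arity, function building the right-hand side)
RULES = {
    "Zero":    (2, lambda z, s: z),                      # Zero.z.s -> z
    "S":       (3, lambda x, z, s: app(s, x)),           # S.x.z.s -> s.x
    # Assumed rules, reconstructed from the ML+ program for AddCBV:
    "AddCBV":  (3, lambda x, y, r: app(x, app(r, y), app(ADD1, y, r))),
    "AddCBV'": (3, lambda y, r, x1: app(ADD, x1, app(S, y), r)),
}

def run(term, limit=10_000):
    """Reduce until no rule applies; return the final term and the step count."""
    steps = 0
    while steps < limit:
        head, args = term[0], term[1:]
        if head not in RULES or RULES[head][0] != len(args):
            break
        term = RULES[head][1](*args)
        steps += 1
    return term, steps

def nat(n):
    """Standard representation of n: S.(...(S.Zero)...)."""
    term = ZERO
    for _ in range(n):
        term = app(S, term)
    return term

fr = name("fr")  # a fresh, undefined name that receives the result
print(run(app(ADD, nat(13), nat(0), fr))[1])
print(run(app(ADD, nat(0), nat(13), fr))[1])
```

Under these assumed rules, every S in the first argument costs three steps and the final Zero costs two, so 13 + 0 takes 41 steps and 0 + 13 takes 2 steps, in agreement with the figure.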

2.6 Example: list multiplication

To illustrate how control is fundamental to continuation calculus, we give an example program that multiplies a list of natural numbers, and show how an escape from a loop can be modeled without a special operator in the natural CC representation. We use an ML-like programming language for this example, and show the corresponding call-by-value program for CC.

The naive way to compute the product of a list is as follows:

    let rec listmult1 l = match l with
      | [] -> 1
      | (x : xs) -> x * listmult1 xs

$$\mathit{ListMult}.l.r \to l.(r.(\mathit{S}.\mathit{Zero})).(C.r)$$

$$C.r.x.\mathit{xs} \to \mathit{ListMult}.\mathit{xs}.(\mathit{PostMult}.x.r)$$

$$\mathit{PostMult}.x.r.y \to \mathit{Mult}.x.y.r$$

Note that if l contains a zero, then the result is always zero. One might wish for a more efficient version that skips all numbers after zero.

    let rec listmult2 l = match l with
      | [] -> 1
      | (x : xs) -> match x with
          | 0 -> 0
          | x' + 1 -> x * listmult2 xs

$$\mathit{ListMult}.l.r \to l.(r.(\mathit{S}.\mathit{Zero})).(B.r)$$

$$B.r.x.\mathit{xs} \to x.(r.\mathit{Zero}).(C.r.x.\mathit{xs})$$

$$C.r.x.\mathit{xs}.x' \to \mathit{ListMult}.\mathit{xs}.(\mathit{PostMult}.x.r)$$

$$\mathit{PostMult}.x.r.y \to \mathit{Mult}.x.y.r$$

However, listmult2 is not so efficient either: if the list is of the form $[x_1 + 1, \ldots, x_k + 1, 0]$, then we only avoid multiplying $0 \cdot \mathit{listmult2}\ []$. The other multiplications are all of the form $n \cdot 0 = 0$. We also want to avoid execution of those surrounding multiplications. We can do so if we extend ML with the call/cc operator, which creates alternative exit points that are invokable as a function.
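The effect of such an alternative exit point can be sketched in continuation-passing style. The Python function below is a hypothetical illustration, not the thesis' program: `ret` plays the role of the normal return continuation, `abort` plays the role of the exit point, and the counter only exists to make the skipped multiplications observable.

```python
def list_mult(numbers):
    """Product of a list; a zero jumps to the exit point and skips pending work."""
    multiplications = 0

    def go(rest, ret, abort):
        if not rest:
            return ret(1)
        x, tail = rest[0], rest[1:]
        if x == 0:
            # Escape: the pending continuation `ret` is dropped entirely.
            return abort()

        def after(product_of_tail):
            nonlocal multiplications
            multiplications += 1
            return ret(x * product_of_tail)

        return go(tail, after, abort)

    result = go(numbers, lambda product: product, lambda: 0)
    return result, multiplications

print(list_mult([2, 3, 4]))      # three multiplications are performed
print(list_mult([2, 3, 0, 5]))   # the zero triggers the escape
```

With a zero in the list, the accumulated chain of `after` continuations is never invoked, so no multiplication happens at all; without a zero, one multiplication is performed per element.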


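Proof. We use induction on l, and make a three-way case distinction.

Case 1. Base case: $l = [\,]$. Then:

$$\begin{array}{lll}
 & A.\llbracket [\,] \rrbracket.r.r_0 & \\
\to & \llbracket [\,] \rrbracket.(r.(\mathit{S}.\mathit{Zero})).(B.r.r_0) & \text{by definition} \\
\twoheadrightarrow & r.(\mathit{S}.\mathit{Zero}) & \text{by definition of } \llbracket [\,] \rrbracket \\
= & r.\llbracket \mathrm{product}\ [\,] \rrbracket & \mathit{S}.\mathit{Zero} \in \llbracket 1 \rrbracket = \llbracket \mathrm{product}\ [\,] \rrbracket
\end{array}$$

Case 2. $l = (0 : l')$. Then:

$$\begin{array}{lll}
 & A.\llbracket 0 : l' \rrbracket.r.r_0 & \\
\to & \llbracket 0 : l' \rrbracket.(r.(\mathit{S}.\mathit{Zero})).(B.r.r_0) & \text{by definition of } A \\
\twoheadrightarrow & B.r.r_0.\llbracket 0 \rrbracket.\llbracket l' \rrbracket & \text{by definition of } \llbracket 0 : l' \rrbracket \\
\to & \llbracket 0 \rrbracket.r_0.(C.r.r_0.\llbracket 0 \rrbracket.\llbracket l' \rrbracket) & \text{by definition of } B \\
\twoheadrightarrow & r_0 & \text{by definition of } \llbracket 0 \rrbracket \\
=_P & r.\mathit{Zero} & \text{by assumption} \\
= & r.\llbracket \mathrm{product}\ (0 : l') \rrbracket & \mathit{Zero} \in \llbracket 0 \rrbracket = \llbracket \mathrm{product}\ (0 : l') \rrbracket
\end{array}$$

Case 3. $l = (m + 1 : l')$. Then:

$$\begin{array}{lll}
 & A.\llbracket m + 1 : l' \rrbracket.r.r_0 & \\
\to & \llbracket m + 1 : l' \rrbracket.(r.(\mathit{S}.\mathit{Zero})).(B.r.r_0) & \text{by definition of } A \\
\twoheadrightarrow & B.r.r_0.\llbracket m + 1 \rrbracket.\llbracket l' \rrbracket & \text{by definition of } \llbracket m + 1 : l' \rrbracket \\
\to & \llbracket m + 1 \rrbracket.r_0.(C.r.r_0.\llbracket m + 1 \rrbracket.\llbracket l' \rrbracket) & \text{by definition of } B \\
\twoheadrightarrow & C.r.r_0.\llbracket m + 1 \rrbracket.\llbracket l' \rrbracket.\llbracket m \rrbracket & \text{by definition of } \llbracket m + 1 \rrbracket \\
\to & A.\llbracket l' \rrbracket.(\mathit{PostMult}.\llbracket m + 1 \rrbracket.r).r_0 & \text{by definition of } C \\
=_P & \mathit{PostMult}.\llbracket m + 1 \rrbracket.r.\llbracket \mathrm{product}\ l' \rrbracket & \text{by induction if } r_0 =_P \mathit{PostMult}.\llbracket m + 1 \rrbracket.r.\langle 0 \rangle \\
\to & \mathit{Mult}.\llbracket m + 1 \rrbracket.\llbracket \mathrm{product}\ l' \rrbracket.r & \text{by definition of } \mathit{PostMult} \\
\twoheadrightarrow & r.\llbracket (m + 1) \cdot \mathrm{product}\ l' \rrbracket & \text{spec } \mathit{Mult} \\
= & r.\llbracket \mathrm{product}\ (m + 1 : l') \rrbracket & \text{mathematics}
\end{array}$$

This chain proves that $A.\llbracket m + 1 : l' \rrbracket.r.r_0 =_P r.\llbracket \mathrm{product}\ (m + 1 : l') \rrbracket$.

The third case requires Lemma 35, which is proved below. This completes the induction, yielding:

$$A.l.r.r_0 =_P r.\llbracket \mathrm{product}\ l \rrbracket \quad \text{for all } l \in \llbracket \mathrm{List}_{\mathbb{N}} \rrbracket,\ r \in \mathcal{U}.$$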

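Lemma 35. Let $x \in \mathcal{U}$, $l \in \llbracket \mathrm{List}_{\mathbb{N}} \rrbracket$, $r, r_0 \in \mathcal{U}$ and $r.\langle 0 \rangle =_P r_0$. Then $\mathit{PostMult}.x.r.\langle 0 \rangle =_P r_0$.

Proof. By the following chain.

$$\begin{array}{lll}
 & \mathit{PostMult}.x.r.\langle 0 \rangle & \\
\to & \mathit{Mult}.x.\langle 0 \rangle.r & \text{by definition of } \mathit{PostMult} \\
\to & \langle 0 \rangle.(r.\mathit{Zero}).(\mathit{PostMult}.x.(\mathit{PostAdd}.x.r)) & \text{by definition of } \mathit{Mult} \\
\to & r.\mathit{Zero} = r.\langle 0 \rangle & \text{by definition of } \langle 0 \rangle \\
=_P & r_0 & \text{by assumption}
\end{array}$$

Theorem 36. The specification of ListMult is satisfied. That is: assume $l \in \llbracket \mathrm{List}_{\mathbb{N}} \rrbracket$, $r \in \mathcal{U}$. Then $\mathit{ListMult}.l.r \twoheadrightarrow r.\llbracket \mathrm{product}\ l \rrbracket$.

Proof. We fill in $r_0 = r.\mathit{Zero}$ in the specification of A; then $r.\mathit{Zero} =_P r.\langle 0 \rangle$ is satisfied by definition of $\langle 0 \rangle$.

$$A.l.r.(r.\mathit{Zero}) =_P r.\llbracket \mathrm{product}\ l \rrbracket \quad \text{for all } l \in \llbracket \mathrm{List}_{\mathbb{N}} \rrbracket,\ r \in \mathcal{U}$$

If we temporarily take r to be a fresh name, then we can change $=_P$ into $\twoheadrightarrow$ with Proposition 16.

$$A.l.r.(r.\mathit{Zero}) \twoheadrightarrow r.\llbracket \mathrm{product}\ l \rrbracket \quad \text{for all } l \in \llbracket \mathrm{List}_{\mathbb{N}} \rrbracket$$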


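We can generalize this again with Theorem 12:

$$A.l.r.(r.\mathit{Zero}) \twoheadrightarrow r.\llbracket \mathrm{product}\ l \rrbracket \quad \text{for all } l \in \llbracket \mathrm{List}_{\mathbb{N}} \rrbracket,\ r \in \mathcal{U}$$

Now our main correctness result follows rather straightforwardly.

$$\begin{array}{lll}
 & \mathit{ListMult}.l.r & \\
\to & A.l.r.(r.\mathit{Zero}) & \text{by definition} \\
\twoheadrightarrow & r.\llbracket \mathrm{product}\ l \rrbracket & \text{as just shown}
\end{array}$$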


Chapter 3

Relation to programming languages

In this section, we argue the connection between continuation calculus and various programming languages: call-by-value languages, call-by-name languages, languages with continuations, and languages that support any combination of the three. We do this by introducing a new programming language, called ML+, that supports a mix of call-by-value and call-by-name, and continuations. There are separate expressions to construct a call-by-name function ($\mathsf{byname}_n\ [x_1, \ldots, x_k]\ \mathsf{asif}\ e$), a call-by-value function ($\mathsf{byval}\ [x_1, \ldots, x_k](y_1, \ldots, y_k)\ \mathsf{return}\ e$), and application of either: $e[e_1, \ldots, e_k]$ respectively $e[e_1, \ldots, e_k](e'_1, \ldots, e'_k)$.

This section begins with an explanation of the core syntax, and some syntactic sugar to aid writing programs in ML+. It continues with some example ML+ programs in Section 3.2. The reduction of these programs is formalized with SOS semantics for ML+ in Section 3.4. In Section 3.5, the meaning is defined again: now implicitly, by a translation to continuation calculus.

One may note that the ML+ examples in Section 3.2 share function names with examples in Chapter 2. This is not a coincidence: we have tried to obtain the CC examples we have hand-coded in Chapter 2, but now starting from a more intuitive source language. A back-of-the-envelope translation of the ML+ examples showed that they translated to the CC programs in Chapter 2, modulo some simple optimizations. We leave a more in-depth account of these examples as well as the development of these optimizations for future research.

This section contains preliminary work, but the author opines that it is informative enough to include in this thesis. The reader is warned that this section may be slightly rough and incomplete.


3.1 ML+ syntax

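$$\begin{array}{rll}
e ::= & C^k_{i/n} & \text{(constructor)} \\
\mid & x & \text{(bound variable)} \\
\mid & f & \text{(global function)} \\
\mid & e(e_1, \ldots, e_k) & \text{(application to } k \text{ arguments, if the function has arity } k\text{)} \\
\mid & e[e_1, \ldots, e_k] & \text{(augmentation with } k \text{ arguments)} \\
\mid & \lambda\pi^{l_1, \ldots, l_n}_{e_1, \ldots, e_n}\ e & \text{(deconstructor for type of } n \text{ cases)} \\
\mid & \mathsf{let}\ x = e\ \mathsf{in}\ e & \text{(local binding)} \\
\mid & \mathsf{let}\ x = \mathsf{cc}\ \mathsf{in}\ e & \text{(get current continuation)} \\
\mid & \mathsf{byval}\ [x_1, \ldots, x_k](y_1, \ldots, y_k)\ \mathsf{return}\ e & \text{(inline CBV function)} \\
\mid & \mathsf{byname}_n\ [x_1, \ldots, x_k]\ \mathsf{asif}\ e & \text{(inline CBN function for type of } n \text{ cases)}
\end{array}$$

Syntactic sugar:

$$\begin{array}{rll}
\mid & \mathsf{if}\ e\ \mathsf{then}\ e_t\ \mathsf{else}\ e_f & \text{(pattern-match on booleans)} \\
\mid & \mathsf{match}\ e\ \mathsf{with} \mid 0 \to e_0 \mid 1 + y \to e_S & \text{(pattern-match on naturals)} \\
\mid & \mathsf{match}\ e\ \mathsf{with} \mid [\,] \to e_{\mathit{Nil}} \mid y :: \mathit{ys} \to e_{\mathit{Cons}} & \text{(pattern-match on lists)} \\
\mid & \mathsf{catch}\ x\ \mathsf{in}\ e & \text{(catch static exception)} \\
\mid & \mathsf{throw}\ e_{\mathit{exc}}\ \mathsf{in}\ e_{\mathit{val}} & \text{(raise static exception)}
\end{array}$$

$$\begin{array}{rll}
\mathit{Def} ::= & f = \mathsf{byval}\ [x_1, \ldots, x_k](y_1, \ldots, y_k)\ \mathsf{return}\ e & \text{(CBV function definition)} \\
\mid & f = \mathsf{byname}_n\ [x_1, \ldots, x_k]\ \mathsf{asif}\ e & \text{(CBN function definition)}
\end{array}$$

New in this syntax is the lambdapi operator for data deconstruction, written as the superposition of a lambda and a pi symbol (typeset here as $\lambda\pi$). Expression $\lambda\pi^{l_1, \ldots, l_n}_{e_1, \ldots, e_n}\ e$ is evaluated as follows: first, e is evaluated to a value of a type of n cases. If the value is of a type of fewer or more cases, evaluation halts. A case distinction is made on this resulting value, to find that the value corresponds to $C^k_{i/n}[v_1, \ldots, v_k]$: the ith data constructor with arity k for a type of n cases, with arguments $\vec{v}$. If $k = l_i$, the expression reduces to $e_i[](v_1, \ldots, v_k)$, else evaluation halts.

The combination of lambda and pi was chosen because the $\lambda$ operator customarily distinguishes values of a sum type, and the $\pi$ operator customarily projects values of a product type. This data construction/deconstruction mechanism is further illustrated in Section 3.3.

Syntactic sugar. We define if, match, catch, and throw to be equivalent to other syntax as follows. In effect, the four operators are shorthand notation.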


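$$\mathsf{if}\ e\ \mathsf{then}\ e_t\ \mathsf{else}\ e_f \;\overset{\mathrm{def}}{=}\; \lambda\pi^{0,0}_{\mathsf{byval}\ []()\ \mathsf{return}\ e_t,\ \mathsf{byval}\ []()\ \mathsf{return}\ e_f}\ e$$

$$\mathsf{match}\ e\ \mathsf{with} \mid 0 \to e_0 \mid 1 + y \to e_S \;\overset{\mathrm{def}}{=}\; \lambda\pi^{0,1}_{\mathsf{byval}\ []()\ \mathsf{return}\ e_0,\ \mathsf{byval}\ [](y)\ \mathsf{return}\ e_S}\ e$$

$$\mathsf{match}\ e\ \mathsf{with} \mid [\,] \to e_{\mathit{Nil}} \mid y :: \mathit{ys} \to e_{\mathit{Cons}} \;\overset{\mathrm{def}}{=}\; \lambda\pi^{0,2}_{\mathsf{byval}\ []()\ \mathsf{return}\ e_{\mathit{Nil}},\ \mathsf{byval}\ [](y, \mathit{ys})\ \mathsf{return}\ e_{\mathit{Cons}}}\ e$$

$$\mathsf{catch}\ x\ \mathsf{in}\ e \;\overset{\mathrm{def}}{=}\; \mathsf{let}\ x = \mathsf{cc}\ \mathsf{in}\ e$$

$$\mathsf{throw}\ e_{\mathit{exc}}\ \mathsf{in}\ e_{\mathit{val}} \;\overset{\mathrm{def}}{=}\; e_{\mathit{exc}}[e_{\mathit{val}}]$$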

3.2 Example programs in ML+

These are the ML+ programs that we demo in the paper. Their translations appeared identical or almost identical to the CC programs in the paper, based on an informal execution of the translation from Section 3.5.

    AddCBV = byval [x, y]() return match x with
      | 0 -> y
      | 1 + x' -> AddCBV[x', S[y]]

which is unsugared to:

$$\mathit{AddCBV} = \mathsf{byval}\ [x, y]()\ \mathsf{return}\ \lambda\pi^{0,1}_{\mathsf{byval}\ []()\ \mathsf{return}\ y,\ \mathsf{byval}\ [](x')\ \mathsf{return}\ \mathit{AddCBV}[x', C^1_{2/2}[y]]}\ x$$

    AddCBN = byname_2 [x, y] asif match x with
      | 0 -> y
      | x' + 1 -> S[AddCBN[x', y]]

which is unsugared to:

$$\mathit{AddCBN} = \mathsf{byname}_2\ [x, y]\ \mathsf{asif}\ \lambda\pi^{0,1}_{\mathsf{byval}\ []()\ \mathsf{return}\ y,\ \mathsf{byval}\ [](x')\ \mathsf{return}\ C^1_{2/2}[\mathit{AddCBN}[x', y]]}\ x$$

    FibCBV = byval [x]() return match x with
      | 0 -> 0
      | 1 + y -> match y with
          | 0 -> 1
          | 1 + y' -> AddCBV[FibCBV[y](), FibCBV[y']()]

    FibCBN = byname_2 [x] asif match x with
      | 0 -> 0
      | 1 + y -> match y with
          | 0 -> 1
          | 1 + y' -> AddCBN[FibCBN[y], FibCBN[y']]


    ListMult = byval [l]() return
      let return = cc in A[l](return[0])

    A = byval [l](abort) return match l with
      | [] -> 1
      | x :: xs -> match x with
          | 0 -> abort
          | 1 + x' -> Mult[x, A[xs](abort)]()

3.3 Using data types in ML+

Every data type in ML+ is a sum-of-product type.

$$T = (T^1_1 \times \cdots \times T^{k_1}_1) + \cdots + (T^1_n \times \cdots \times T^{k_n}_n)$$

The k-ary constructor of case i, out of n cases, is denoted $C^k_{i/n}$, such that elements of T are denoted $C^k_{i/n}[v_1, \ldots, v_k]$.

Deconstructing such elements happens with the lambdapi operator, written as the superposition of a lambda and a pi symbol (typeset here as $\lambda\pi$). For each case i of type T, we must supply a deconstructor $f_i$; then

$$\lambda\pi^{l_1, \ldots, l_n}_{f_1, \ldots, f_n}\ C^k_{i/n}[v_1, \ldots, v_k] = f_i(v_1, \ldots, v_k)$$
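The constructor and deconstructor mechanism can be mimicked in a few lines of Python. This is a hypothetical model, not ML+ itself: a value records its case index, the number of cases of its type and its arguments; `constructor` and `deconstruct` are illustrative names; and the "evaluation halts" outcome is modelled, as an assumption, by raising `TypeError`. The naturals and lists follow the typings that Section 3.3 gives for them.

```python
from collections import namedtuple

# A value C^k_{i/n}[v1, ..., vk]: case i out of n cases, carrying k arguments.
Value = namedtuple("Value", "case cases args")

def constructor(i, n, k):
    """The k-ary constructor of case i, out of n cases."""
    def build(*args):
        if len(args) != k:
            raise TypeError(f"expected {k} arguments, got {len(args)}")
        return Value(i, n, args)
    return build

def deconstruct(arities, handlers, value):
    """Dispatch to handler i when the case count and arity l_i match."""
    if len(arities) != len(handlers) or value.cases != len(handlers):
        raise TypeError("wrong number of cases")        # models 'evaluation halts'
    if len(value.args) != arities[value.case - 1]:
        raise TypeError("wrong constructor arity")      # models 'evaluation halts'
    return handlers[value.case - 1](*value.args)

# Naturals: N = 1 + N. Lists of naturals: List = 1 + N x List.
ZERO = constructor(1, 2, 0)()
def S(n): return constructor(2, 2, 1)(n)
NIL = constructor(1, 2, 0)()
def CONS(x, xs): return constructor(2, 2, 2)(x, xs)

def to_int(n):
    return deconstruct([0, 1], [lambda: 0, lambda m: 1 + to_int(m)], n)

def length(l):
    return deconstruct([0, 2], [lambda: 0, lambda x, xs: 1 + length(xs)], l)

print(to_int(S(S(ZERO))))
print(length(CONS(ZERO, CONS(S(ZERO), NIL))))
```

In this model `ZERO == NIL` holds, because both are the nullary first constructor of a two-case type; the arity list passed to `deconstruct` is what rejects a list where a natural is expected.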

Example: empty type. The empty type has zero cases: the empty sum. Although it has no constructors, it is possible to write a call-by-name function that returns an empty type. Such a function must diverge (loop infinitely or throw an exception). Hence, we can use $\lambda\pi\ e$ to model control flow to e from which control does not return.

Example: unit type. The unit type 1 has one case: the empty product. It has constructor $C^0_{1/1}$, and its only value is $C^0_{1/1}[]$. It can be trivially deconstructed with

$$\lambda\pi^{0}_{f_1}\ C^0_{1/1} = f_1()$$

Example: booleans. The set of booleans is the sum of two empty products: $\mathbb{B} = 1 + 1$. Its constructors are $C^0_{1/2}$ and $C^0_{2/2}$, and its values are $\mathit{True} = C^0_{1/2}[]$ and $\mathit{False} = C^0_{2/2}[]$.

We can implement the ternary operator as follows:

$$\mathsf{if}\ b\ \mathsf{then}\ f_1()\ \mathsf{else}\ f_2() \;\equiv\; \lambda\pi^{0,0}_{f_1, f_2}\ b$$

The rules allow to deconstruct that to

$$\lambda\pi^{0,0}_{f_1, f_2}\ \mathit{True} = f_1()$$

$$\lambda\pi^{0,0}_{f_1, f_2}\ \mathit{False} = f_2()$$

We write $\mathbb{B} = 1 + 1$ instead of $\mathbb{B} = \mathbf{1} + \mathbf{1}$ to avoid the implication that the booleans are a sum-of-product-of-sum-of-product type (namely unary products of the unit type $\mathbf{1}$).


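Example: natural numbers. The naturals can be typed $\mathbb{N} = 1 + \mathbb{N}$: the sum of an empty and a unary product. Its constructors are therefore $C^0_{1/2}$ and $C^1_{2/2}$, and its values $\mathit{Zero} = C^0_{1/2}[]$ and $\mathit{S}[n] = C^1_{2/2}[n]$ (for n a natural number). We can implement a basic pattern matching as follows:

$$\mathsf{case}\ n\ \mathsf{of} \mid 0 \to f_1() \mid 1 + n' \to f_2(n') \;\equiv\; \lambda\pi^{0,1}_{f_1, f_2}\ n$$

The rules allow to deconstruct that

$$\lambda\pi^{0,1}_{f_1, f_2}\ \mathit{Zero} = f_1()$$

$$\lambda\pi^{0,1}_{f_1, f_2}\ \mathit{S}[n'] = f_2(n')$$

Example: lists of natural numbers. Cons-lists can be typed $\mathrm{List}_{\mathbb{N}} = 1 + \mathbb{N} \times \mathrm{List}_{\mathbb{N}}$: the sum of an empty and a binary product. Its constructors are therefore $C^0_{1/2}$ and $C^2_{2/2}$, and its values $\mathit{Nil} = C^0_{1/2}[]$ and $\mathit{Cons}[n, l] = C^2_{2/2}[n, l]$ (for $n \in \mathbb{N}$, $l \in \mathrm{List}_{\mathbb{N}}$). We can implement a basic pattern matching as follows:

$$\mathsf{case}\ l\ \mathsf{of} \mid \mathit{Nil} \to f_1() \mid \mathit{Cons}[n, l'] \to f_2(n, l') \;\equiv\; \lambda\pi^{0,2}_{f_1, f_2}\ l$$

The rules allow to deconstruct that

$$\lambda\pi^{0,2}_{f_1, f_2}\ \mathit{Nil} = f_1()$$

$$\lambda\pi^{0,2}_{f_1, f_2}\ \mathit{Cons}[n, l'] = f_2(n, l')$$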

Remark 37. True = Zero = Nil, using these encodings. This applies also for the data representationthat we have given in Section 2.5.

3.4 Reduction semantics of ML+

For the semantics, it is necessary to define what is a value in ML+.

$$\begin{array}{rll}
v ::= & C^k_{i/n} & \text{(constructor)} \\
\mid & x & \text{(bound variable)} \\
\mid & f & \text{(global function)} \\
\mid & v[v_1, \ldots, v_k] & \text{(augmentation with } k \text{ arguments)} \\
\mid & \mathsf{byval}\ [x_1, \ldots, x_k](y_1, \ldots, y_k)\ \mathsf{return}\ e & \text{(inline CBV function)} \\
\mid & \mathsf{byname}_n\ [x_1, \ldots, x_k]\ \mathsf{asif}\ e & \text{(inline CBN function for type of } n \text{ cases)}
\end{array}$$

Values in ML+ do not reduce. All the expressions that are not values are nonvalues, and they are reducible. Reduction happens within a context $C\{\,\}$: an expression with a hole; this hole is in the spot where reduction happens. The set Context is generated as follows:


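$$\begin{array}{rll}
C ::= & \{\,\} & \text{(empty context)} \\
\mid & C[e_1, \ldots, e_k] & \text{(augmentation, } 0 \leq k\text{)} \\
\mid & v[v_1, \ldots, v_j, C, e_{j+2}, \ldots, e_k] & \text{(augmentation, } 0 \leq j < k\text{)} \\
\mid & C(e_1, \ldots, e_k) & \text{(application, } 0 \leq k\text{)} \\
\mid & v(v_1, \ldots, v_j, C, e_{j+2}, \ldots, e_k) & \text{(application, } 0 \leq j < k\text{)} \\
\mid & \lambda\pi^{l_1, \ldots, l_n}_{e_1, \ldots, e_n}\ C & \text{(deconstructor, } 0 \leq n\text{)} \\
\mid & \mathsf{let}\ x = C\ \mathsf{in}\ \mathit{Expr} & \text{(local binding)}
\end{array}$$

and, of course, the domain-specific syntactic sugar.

Reduction happens according to our definitions, a set of definitions of the form

$$\begin{array}{rll}
\mathit{Def} ::= & f = \mathsf{byval}\ [x_1, \ldots, x_k](y_1, \ldots, y_k)\ \mathsf{return}\ e & \text{(CBV function definition)} \\
\mid & f = \mathsf{byname}_n\ [x_1, \ldots, x_k]\ \mathsf{asif}\ e & \text{(CBN function definition),}
\end{array}$$

as defined before. The same variable f shall not appear more than once on the left-hand side of an equals sign. Right-hand sides may refer to any function symbol, even their own function symbol.

Using this, we can define our reduction semantics. We shall let $e, e_i$ denote expressions, $C\{\,\}$ a context, and $v, v_i$ values.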

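$$\frac{f = \mathsf{byval}\ [x_1, \ldots, x_k](y_1, \ldots, y_k)\ \mathsf{return}\ e}{C\{f\} \to C\{\mathsf{byval}\ [x'_1, \ldots, x'_k](y'_1, \ldots, y'_k)\ \mathsf{return}\ e[\vec{x}'/\vec{x}, \vec{y}'/\vec{y}]\}} \quad \text{where } \vec{x}', \vec{y}' \text{ are fresh variables}$$

$$\frac{f = \mathsf{byname}_n\ [x_1, \ldots, x_k]\ \mathsf{asif}\ e}{C\{f\} \to C\{\mathsf{byname}_n\ [x'_1, \ldots, x'_k]\ \mathsf{asif}\ e[\vec{x}'/\vec{x}]\}} \quad \text{where } \vec{x}' \text{ are fresh variables}$$

$$C\{v[v_1, \ldots, v_j][v_{j+1}, \ldots, v_k]\} \to C\{v[v_1, \ldots, v_k]\}$$

$$C\{\mathsf{let}\ x = v\ \mathsf{in}\ e\} \to C\{e[v/x]\}$$

$$C\{(\mathsf{byval}\ [x_1, \ldots, x_k](y_1, \ldots, y_k)\ \mathsf{return}\ e)[v_1, \ldots, v_k](u_1, \ldots, u_k)\} \to C\{e[\vec{v}/\vec{x}, \vec{u}/\vec{y}]\}$$

$$C\{\lambda\pi^{l_1, \ldots, l_n}_{e_1, \ldots, e_n}\ ((\mathsf{byname}_n\ [x_1, \ldots, x_j]\ \mathsf{asif}\ e)[v_1, \ldots, v_j])\} \to C\{\lambda\pi^{l_1, \ldots, l_n}_{e_1, \ldots, e_n}\ e[\vec{v}/\vec{x}]\}$$

$$C\{\lambda\pi^{l_1, \ldots, l_n}_{e_1, \ldots, e_n}\ C^{l_i}_{i/n}[v_1, \ldots, v_{l_i}]\} \to C\{e_i[](v_1, \ldots, v_{l_i})\}$$

$$\frac{C\{\,\} \neq \{\,\}}{C\{\lambda\pi\ e\} \to \lambda\pi\ e} \quad \text{($\lambda\pi\ e$ can never return)}$$

$$C\{\mathsf{let}\ x = \mathsf{cc}\ \mathsf{in}\ e\} \to C\{\mathsf{let}\ x = (\mathsf{byname}_0\ [y]\ \mathsf{asif}\ C\{y\})\ \mathsf{in}\ e\} \quad \text{(bind continuation)}$$

Note that we do not have a reduction rule for variables x, because we substitute in their values at application.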


3.5 Translation

Our translation will consist of a number of steps. Firstly, we identify a set of inert ML+ terms I, that have a direct translation into CC terms, without CC rules. Secondly, we will identify a set of standard ML+ terms, that include the inert terms, but whose translation adds 1 rule. Thirdly, we will explain how we can transform the rest of ML+ into such standard ML+ terms.

3.5.1 Inert terms

We identify a set I of the inert ML+ expressions. All such ML+ expressions are guaranteed to terminate. The CC translation $\langle e \rangle$ of an expression $e \in I$ represents the ML+ value e, and the only rules involved are the standard data constructors. Set I is generated as follows.

$$\begin{array}{rll}
I ::= & C^k_{i/n} & \text{(general constructor)} \\
\mid & x & \text{(local variable)} \\
\mid & f & \text{(defined function)} \\
\mid & I[I, \ldots, I] & \text{(augmentation)} \\
\mid & \mathsf{byval}\ [x_1, \ldots, x_k](y_1, \ldots, y_k)\ \mathsf{return}\ \mathit{Expr} & \text{(inline CBV function)} \\
\mid & \mathsf{byname}_n\ [x_1, \ldots, x_k]\ \mathsf{asif}\ \mathit{Expr} & \text{(inline CBN function)}
\end{array}$$

Syntactic sugar:

$$\begin{array}{rll}
\mid & \mathsf{true} \mid \mathsf{false} & \text{(boolean constructor)} \\
\mid & 0 \mid 1 + I & \text{(number constructor)} \\
\mid & [\,] \mid I :: I & \text{(list constructor)}
\end{array}$$

Remark 38. The function $\langle \cdot \rangle : I \to \mathrm{CC}$ defined here is an extension of the function defined in our paper to appear in proceedings of COS 2013.

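$\langle x \rangle = x$, where x is a CC variable corresponding to ML+ variable x.

$\langle f \rangle = F$, where F is a CC name corresponding to f, and the program shall include the rules for f, defined later.

$\langle C^k_{i/n} \rangle = C_{i,n,k}$, and the program shall include

$$C_{i,n,k}.x_1.\cdots.x_k.f_1.\cdots.f_n \to f_i.x_1.\cdots.x_k$$

$\langle e[e_1, \ldots, e_k] \rangle = \langle e \rangle.\langle e_1 \rangle.\cdots.\langle e_k \rangle$

$\langle t \rangle = F.z_1.\cdots.z_j$ if $t = \mathsf{byval}\ [x_1, \ldots, x_k](y_1, \ldots, y_k)\ \mathsf{return}\ t'$, where $\vec{z} = \mathrm{fv}(t)$ and F is a fresh CC name, and the program shall include a rule with left-hand side

$$F.z_1.\cdots.z_j.x_1.\cdots.x_k.r.y_1.\cdots.y_k$$

whose right-hand side is the call-by-value translation of $t'$ (Section 3.5.2).

$\langle t \rangle = F.z_1.\cdots.z_j$ if $t = \mathsf{byname}_n\ [x_1, \ldots, x_k]\ \mathsf{asif}\ t'$, where $\vec{z} = \mathrm{fv}(t)$ and F is a fresh CC name, and the program shall include a rule with left-hand side

$$F.z_1.\cdots.z_j.x_1.\cdots.x_k.r_1.\cdots.r_n$$

whose right-hand side is the n-ary call-by-name translation of $t'$ (Section 3.5.2).

Syntactic sugar, where the program includes the rules in the right column:

$$\begin{array}{ll}
\langle \mathsf{true} \rangle = \mathit{True} & \mathit{True}.t.f \to t \\
\langle \mathsf{false} \rangle = \mathit{False} & \mathit{False}.t.f \to f \\
\langle 0 \rangle = \mathit{Zero} & \mathit{Zero}.z.s \to z \\
\langle 1 + e \rangle = \mathit{S}.\langle e \rangle & \mathit{S}.m.z.s \to s.m \\
\langle [\,] \rangle = \mathit{Nil} & \mathit{Nil}.e.c \to e \\
\langle e_1 :: e_2 \rangle = \mathit{Cons}.\langle e_1 \rangle.\langle e_2 \rangle & \mathit{Cons}.x.\mathit{xs}.e.c \to c.x.\mathit{xs}
\end{array}$$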

3.5.2 Translation to CC

Each ML+ term lives either in a call-by-value context, in case the innermost enclosing binder is byval, or a call-by-name context, in case the innermost enclosing binder is byname. For CBN contexts, we furthermore specify the binder arity k. We give one translation to CC for ML+ terms in CBV context, and a separate translation for terms in n-ary CBN context. This is analogous to Plotkin [25], who uses separate notations for the CPS translation of a CBV term and the CPS translation of a CBN term.

The free variables of the CBV translation of t will consist of the return continuation variable r, in addition to the free variables of the source term t. The free variables of the n-ary CBN translation of t will consist of the case continuation variables $r_1, \ldots, r_n$, in addition to the free variables of the source term t.

Correct execution of the translated CC term requires that an accompanying set of rules is included in the program. This holds for the CBV translation of a term s as well as for its n-ary CBN translation.

We will give a direct translation for many ML+ terms t. For other terms, we indicate that t is equivalent to a different ML+ term $t'$ by a rewrite from t to $t'$. In such cases, we define the translations of t as the translations of the rewritten term. Note that we only have to execute these rewritings on the top level.

In the following translation scheme, assume that $s, s_j, t, t_j$ are ML+ terms, and $i, i_j \in I$. With $n, n_j$ we indicate terms that are not inert terms.


Call-by-value

Standard case:

s = i s = r.i

s =

s = i(i1, , ik) s = i.i1. .ik.r

s =

s = l1, ,lki1, ,ik i s = i.(i1.r). .(ik.r)

s =

s = let x = n in t s = n[(F.y1. .yj .r) /r] for F free, y = fv(t) \ {x}

s = n v {F.y.r.x n}

s = let x = cc in t s = t[r/x]

s = t

For all t t below, we define t = t and t = t.

let x = i(i1, , ik) in t t[i(i1, , ik)/x]

n(t1, , tk) let x = n in x(t1, , tk)

i(i1, , ij , n , tj+2, , tk) let x = n in i(i1, , ij , x , tj+2, , tk)

n[t1, , tk] let x = n in x[t1, , tk]

i[i1, , ij , n , tj+2, , tk] let x = n in i[i1, , ij , x , tj+2, , tk]

Call-by-name

Standard patterns:

s = i sn = i.r1. .rn

sn =

s = let x = i(i1, , ik) in e sn = i.i1. .ik.(g.r1. .rn.y) for G free, y = fv(e) \ {x}

sn = {G.r1 . .rn.y.x en}

s = l1, ,lmi1, ,imi sn =

i.(i1.r). .(ik.r)

[Invn.r1. .rn/r]

sn = {Invn.r1 . .rn.x x.r1 . .rn}

s = let x = cc in e sn = en[(Invn.r1. .rk) /x]

sn = en {Invn.r1 . .rn.y y.r1 . .rn}

For all t t below, we define tn = tn and tn = tn.

let x = i in t t[i/x]

t(t1, , tk) let x = t(t1, , tk) in x

t[t1, , tk] let x = t[t1, , tk] in x unless t and all ti are inert

let x = n(t1, , tk) in t let y = n in let x = y(t1, , tk) in t

let x = i(i1, , ij, n , tj+2, , tk) in t let y = n in let x = i(i1, , ij, y , tj+2, , tk) in t

n[t1, , tk] let x = n in x[t1, , tk]

i[i1, , ij, n , tj+2, , tk] let x = n in i[i1, , ij , x , tj+2, , tk]


Either call-by-value or call-by-name For all t t below, we define t = t, t = t,tn = t

n, and tn = tn.

let x = (let y = t in t) in t let y = t in let x = t in t

l1,lne1, ,enn let x = n in l1, ,lne1, ,en

x

l1, ,lme1, ,ej1,n,ej+1, ,emt l1, ,lme1, ,ej1,(byval [](x1, ,xlj )return n(x1, ,xlj )),ej+1, ,em

t

Conjecture 39. The rewrites terminate.

The proof is for future research. We also leave for future research an equivalence on ML+ terms. It is hoped that a natural equivalence on ML+ terms t and u is equivalent to observational equivalence on the CC translation of t and u. We finally leave for future research the following conjecture.

Conjecture 40. An ML+ term in normal form, with respect to the rewrites and either call-by-value or call-by-name, is uniquely translated to a CC term and program. Conversely: all ML+ terms that are translatable to CC are in normal form with respect to the rewrites.


Chapter 4

Relation to lambda calculus

As explained in Chapter 1, lambda calculus partially fulfills the same goals as continuation calculus: it models functional programs, can be used to reason about them, and is very simple. Continuation calculus is more powerful in a sense because continuations are expressible in pure continuation calculus. In lambda calculus, one can only use continuations if all other code is also CPS transformed.

As it happens, continuation calculus is quite similar to a subset of lambda calculus: the terms that are reducible using a novel reduction. This reduction is equivalent to a series of n $\beta$-reductions in a row, but is only applicable in specific situations: when the n $\beta$-reductions can be done simultaneously on the top level. If we regard only this novel reduction, then it suddenly becomes possible to use continuations again, because there are no enclosing terms that an inner term cannot cancel. Such cancellation can also be done in lambda calculus with control by the C or A operator [12], or by invoking a continuation; it cannot be done in vanilla lambda calculus.

Consider the set of lambda terms on which the novel reduction simulates call-by-name reduction. Continuation calculus is much like this set, but is slightly stricter, and defunctionalized. Every code point (lambda abstraction) in CC is given a name with a fixed arity. This makes explicit the control points and the data flow between control points. Continuation calculus is slightly more explicit in that it disambiguates forms such as $\lambda x.\lambda y.M$: is that a function which is supposed to take two arguments, or $\lambda x.x$ which happened to have a function substituted in its body? In the second case, if the function is called with two arguments, it is easier for the programmer to spot where the program went wrong. Furthermore, one can check syntactically if a term is a CC term. This is not the case for that set of lambda terms.

As it happens, when we simulate lambda calculus in continuation calculus, we can recognize these differences between CC and that set of lambda terms in the steps we take. Firstly, we encode the chosen reduction strategy (by-name or by-value) in the term with the CPS transformation. Secondly, we make data flow between the control points explicit using the supercombinator transformation. Thirdly, we make control flow more explicit using defunctionalization.

These steps are summarized in Figure 4.1; we indicate the term sets in which we can program with continuations. We explain the transformations from lambda calculus to CC and from CC to lambda calculus in more detail in Sections 4.1 and 4.2, respectively.

4.1 Embedding lambda calculus in continuation calculus

In this section, we explain how to simulate a lambda term in CC, mimicking either the call-by-name strategy or the call-by-value strategy. Our approach will use a subset of lambda terms, on which a new reduction is confluent with the call-by-name reduction.

The approach will consist of three steps:

1. CPS transformation. We use a continuation passing style transformation to transform a lambda term $M$ to a certain subset of lambda terms, described below. We use $\underline{M}$ if we want to mimic call-by-name lambda calculus, and $\overline{M}$ if we want to mimic call-by-value lambda calculus.

Plotkin [25] proves that $\underline{M}$ resp. $\overline{M}$ simulate the execution of $M$ in call-by-name resp. call-by-value. He proves this for slightly different definitions of $\underline{M}$ and $\overline{M}$, which are $\beta\eta$-convertible to our definitions.

2. Supercombinator transformation. The resulting lambda term is transformed to a supercombinator. (The term supercombinator will be explained shortly.) This step is a $\beta$ expansion.

3. Defunctionalization. Each lambda abstraction is replaced by a name. The lambda applications are changed to dot-applications. We make a program with a rule for every lambda abstraction.

Defunctionalization is closely related to lambda lifting [8]: both transform a block-structured program to recursive equations.

[Figure 4.1: Venn diagram; arrow labels: CPS transformation, supercombinator transformation, defunctionalization, functionalization, eliminate cyclic dependencies.]

Figure 4.1: Relation between lambda calculus, lambda calculus with control, and continuation calculus (CC). The ellipses form a Venn diagram of term properties. The five dashed arrows indicate transformations from one circle to another, and are described in subsections. For example, the CPS transformation takes a term in $\lambda C$, which may also be in $\lambda^\top$ or be a supercombinator, and the result of the transformation is a $\lambda^\top$ term, which may again be a supercombinator. The shaded areas are the terms where we can program with continuations.

We first give the subset of lambda terms, and then we detail the three steps.

4.1.1 The subset

We shall use a variant of lambda calculus as an intermediate language, which we shall call $\lambda^\top$. The terms of $\lambda^\top$ shall be a subset of the terms of lambda calculus. Reduction will be different: there shall only be the top-level rule $\beta^\top$, which is new.

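Definition 41 ($\beta^\top$-reduction). For lambda terms $M$, $\vec N$, and $n \ge 1$,

\[ (\lambda x_1 \cdots x_n.M)\,N_1 \cdots N_n \;\to_{\beta^\top}\; M[\vec N/\vec x] \qquad \text{at the top level} \]

Note that the call-by-name beta rule of lambda calculus simulates $\beta^\top$: whenever $M \to_{\beta^\top} N$, then $M \twoheadrightarrow_{\beta_N} N$ in $n$ steps. The reverse is not true: $\beta^\top$ does not always simulate $\beta_N$-reduction. An application of $\beta^\top$ will correspond to one reduction step in continuation calculus.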


Remark 42. Substitution in a lambda term replaces only the free variables. (This is a common definition [26].)

We formally define $\beta_N$-reduction, which is our name for $\beta$-reduction in the call-by-name strategy.

Definition 43 ($\beta_N$-reduction). For lambda terms $M$, $\vec N$, and $n \ge 1$,

\[ (\lambda x.M)\,N_1\,N_2 \cdots N_n \;\to_{\beta_N}\; M[N_1/x]\,N_2 \cdots N_n \qquad \text{at top level} \]
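The two top-level rules can be compared on small terms. The following is a minimal sketch, not part of the formal development: the tuple encoding of terms and the names `step_name`, `step_top`, `spine` and `leading_binders` are hypothetical, and fresh binders are generated with a numeric suffix on the assumption that no input variable already uses such a name. The multi-argument step is implemented as n call-by-name steps performed at once, which is how the text above describes it.

```python
import itertools

# Terms are tuples: ("var", x) | ("lam", x, body) | ("app", fun, arg).
_fresh = itertools.count()

def free_vars(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def subst(t, x, s):
    """Capture-avoiding t[s/x]; only free occurrences of x are replaced."""
    if t[0] == "var":
        return s if t[1] == x else t
    if t[0] == "app":
        return ("app", subst(t[1], x, s), subst(t[2], x, s))
    y, body = t[1], t[2]
    if y == x:
        return t  # x is rebound here, so it has no free occurrence below
    if y in free_vars(s):
        z = f"{y}_{next(_fresh)}"  # rename the binder to avoid capture
        body, y = subst(body, y, ("var", z)), z
    return ("lam", y, subst(body, x, s))

def spine(t):
    """Split t into its head and its list of top-level arguments."""
    args = []
    while t[0] == "app":
        args.append(t[2])
        t = t[1]
    return t, args[::-1]

def step_name(t):
    """One call-by-name step at the top level, or None if there is none."""
    head, args = spine(t)
    if head[0] != "lam" or not args:
        return None
    result = subst(head[2], head[1], args[0])
    for arg in args[1:]:
        result = ("app", result, arg)
    return result

def leading_binders(t):
    n = 0
    while t[0] == "lam":
        n, t = n + 1, t[2]
    return n

def step_top(t):
    """One top-level multi-argument step, or None if the head binds fewer
    variables than there are arguments (or there are no arguments)."""
    head, args = spine(t)
    if not args or leading_binders(head) < len(args):
        return None
    for _ in args:  # n call-by-name steps, taken together as one step
        t = step_name(t)
    return t

K = ("lam", "x", ("lam", "y", ("var", "x")))
I = ("lam", "x", ("var", "x"))
both = ("app", ("app", K, ("var", "a")), ("var", "b"))
only_name = ("app", ("app", I, ("var", "a")), ("var", "b"))

print(step_top(both))        # ('var', 'a')
print(step_top(only_name))   # None: one binder, two arguments
print(step_name(only_name))  # ('app', ('var', 'a'), ('var', 'b'))
```

The second term is one where call-by-name reduction makes progress while the multi-argument rule does not apply.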

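Of course, we may wonder if the $\beta^\top$ and $\beta_N$ reductions are always confluent. We know that if $M \to_{\beta_N} N'$ and $M \to_{\beta^\top} N''$, then $N' \twoheadrightarrow_{\beta_N} N''$. So if both reductions are possible, then they have a common reduct. The reductions are thus nonconfluent when $\beta_N$ makes progress but $\beta^\top$ fails to. That is the case exactly for these terms:

\[ (\lambda x_1 \cdots x_m.M\,N)\,t_1 \cdots t_n \quad (m < n) \qquad \text{and} \qquad (\lambda x_1 \cdots x_m.x_i)\,t_1 \cdots t_n \quad (m < n) \]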
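As a concrete instance of the second shape, take $m = 1$ and $n = 2$ (the subscripted arrow is written here as $\to_{\beta_N}$ for the call-by-name rule; this notation is an assumption of the illustration):

\[ (\lambda x.x)\,t_1\,t_2 \;\to_{\beta_N}\; t_1\,t_2 \]

The call-by-name rule makes progress, but the multi-argument top-level rule does not apply: the abstraction binds one variable while two arguments are supplied at the top level.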

We define $\lambda^\top$ as the set of terms on which $\beta^\top$ and $\beta_N$ are confluent.

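Definition 44 ($\lambda^\top$). The subset $\lambda^\top$ of lambda terms are those lambda terms $M$ such that when $M \to_{\beta^\top} N$ in 0 or more steps, and $N \to_{\beta_N} N'$, then there exists $N''$ such that $N \to_{\beta^\top} N''$, $N' \twoheadrightarrow_{\beta_N} N''$.

Now because $\beta^\top$ and $\beta_N$ reduction are deterministic, the $\beta^\top$ and $\beta_N$ reductions are confluent on terms in $\lambda^\top$. This is provable by induction.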

Proving inhabitance of $\lambda^\top$. We can find some terms in $\lambda^\top$ using a typing. Assume some opaque type $\bot$, that is: it shall be underivable that $\Gamma, M : \bot \vdash M\,N : \tau$ for some $\tau$. Now consider lambda terms $M$ such that all subterms $\lambda x_1 \cdots x_m.N$, with $N$ not an abstraction, are of some type $\tau_1 \to \cdots \to \tau_m \to \bot$. Then $\beta$ reduction, and in particular $\beta^\top$ and $\beta_N$ reduction, preserves this property. Furthermore, we can never encounter a well-typed term of the form $(\lambda x_1 \cdots x_m.N)\,t_1 \cdots t_n$, for $N$ not an abstraction and $m < n$, because $(\lambda x_1 \cdots x_m.N)\,t_1 \cdots t_m : \bot$ and $(\lambda x_1 \cdots x_m.N)\,t_1 \cdots t_{m+1} : \tau$ will be in violation of the opacity of $\bot$.

Note that we did not specify a concrete type system here on purpose. Any conventional type system in which $\bot$ is opaque, and which preserves types over reduction, suffices.

We now have sufficient machinery to define each of the three steps to simulate lambda calculus using continuation calculus.

4.1.2 CPS transformation

The first step of converting $\lambda$ terms to CC is to perform a continuation passing style (CPS) transformation. We give two transformations: $\underline{M}$ to simulate terms under call-by-name, and $\overline{M}$ to simulate terms under call-by-value. The result is a term in $\lambda^\top$ that simulates the original term. We prove inhabitance of $\lambda^\top$, but leave the simulation proofs to future work. The simulation proofs are expected to take a form that is very similar to the proof in Plotkin [25].

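Call-by-name. We define our call-by-name CPS transformation similar to that of Plotkin, as a function $\underline{\,\cdot\,} : \lambda \to \lambda^\top$ defined as follows:

\[
\begin{array}{rcl}
\underline{x} &=& \lambda k.x\,k \\
\underline{\lambda x.M} &=& \lambda k.k\,(\lambda x\kappa.\underline{M}\,\kappa) \\
\underline{M\,N} &=& \lambda k.\underline{M}\,(\lambda m.m\,\underline{N}\,k)
\end{array}
\]

For each application of this function, all of $k$, $\kappa$, $m$, $a$, $b$, $c$ and the other bound variables should be taken fresh. This CPS transformation is $\beta\eta$-convertible to Plotkin's transformation for variables, abstraction, and application.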
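As a small worked instance of the clauses for variables and application, consider one variable applied to another. The underline notation for the call-by-name transform and the indices on $k_1$, $k_2$ (used only to keep the fresh names apart) are conventions of this illustration:

\[
\underline{x\,y} \;=\; \lambda k.\underline{x}\,(\lambda m.m\,\underline{y}\,k) \;=\; \lambda k.(\lambda k_1.x\,k_1)\,(\lambda m.m\,(\lambda k_2.y\,k_2)\,k)
\]

The function position $x$ is handed a continuation that passes the untouched argument $\underline{y}$ on to the function $m$, as call-by-name requires.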


Remark 45. For a complete transformation from C to , the C operator needs a transformation.The author believes that a correct transformation would be C = k.k (a.a (b. b) (c.c)).However, he was unable to develop this belief so far, for instance by working out examples, by atyping as seen below, by similarity to C as researched by others, or by a proof.

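Theorem 46. The image of $\underline{\,\cdot\,}$ is in $\lambda^\top$.

In the proof, we type the terms using recursive typing [4]. Assume a base type $\bot$, and a recursive type $\tau = \neg\neg(\tau \to \tau)$; $\neg\alpha$ stands for $\alpha \to \bot$. Then we claim that $\vdash \underline{M} : \tau$ for all closed $M$. Furthermore, we claim that all the lambda abstractions in the image are, in fact, in $\lambda^\top$.

Proof. We will use a typing context $\Gamma(M) = \{x : \tau \mid x \in \mathrm{fv}(M)\}$, that is, a context in which all free variables of $M$ have type $\tau$.

The proof goes by induction on the size of $M$. We use the following induction hypothesis: $\Gamma(M) \vdash \underline{M} : \tau$, $\underline{M} \in \lambda^\top$, and all abstractions inside $\underline{M}$ are in $\lambda^\top$. This implies that $\vdash \underline{M} : \tau$ for closed $M$.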

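Base case.

\[
\begin{array}{ll}
\Gamma(x),\ k : \neg(\tau \to \tau) \vdash x : \tau & \text{by definition of } \Gamma(x) \\
\Gamma(x),\ k : \neg(\tau \to \tau) \vdash x : \neg\neg(\tau \to \tau) & \text{by definition of } \tau \\
\Gamma(x),\ k : \neg(\tau \to \tau) \vdash x\,k : \bot & \\
\Gamma(x) \vdash \lambda k.x\,k : \neg\neg(\tau \to \tau) & \\
\Gamma(x) \vdash \lambda k.x\,k : \tau & \text{by definition of } \tau
\end{array}
\]

We see that the abstraction $\underline{x} = \lambda k.x\,k$ is of some type $\sigma \to \bot$, so $\underline{x} \in \lambda^\top$.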

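Inductive case 1: $\lambda x.M$.

\[
\begin{array}{ll}
\Gamma(\lambda x.M),\ x : \tau \vdash \underline{M} : \tau & \text{induction hypothesis, } \Gamma(M) = \Gamma(\lambda x.M) \cup \{x : \tau\} \\
\Gamma(\lambda x.M),\ x : \tau \vdash \underline{M} : \neg\neg(\tau \to \tau) & \text{by definition of } \tau \\
\Gamma(\lambda x.M),\ x : \tau,\ \kappa : \neg(\tau \to \tau) \vdash \underline{M}\,\kappa : \bot & \\
\Gamma(\lambda x.M) \vdash \lambda x\kappa.\underline{M}\,\kappa : \tau \to \neg\neg(\tau \to \tau) & \\
\Gamma(\lambda x.M) \vdash \lambda x\kappa.\underline{M}\,\kappa : \tau \to \tau & \text{by definition of } \tau \\
\Gamma(\lambda x.M),\ k : \neg(\tau \to \tau) \vdash k\,(\lambda x\kappa.\underline{M}\,\kappa) : \bot & \\
\Gamma(\lambda x.M) \vdash \lambda k.k\,(\lambda x\kappa.\underline{M}\,\kappa) : \neg\neg(\tau \to \tau) & \\
\Gamma(\lambda x.M) \vdash \lambda k.k\,(\lambda x\kappa.\underline{M}\,\kappa) : \tau & \text{by definition of } \tau
\end{array}
\]

We have seen that $\lambda x\kappa.\underline{M}\,\kappa : \tau_1 \to \tau_2 \to \bot$ and $\lambda k.k\,(\lambda x\kappa.\underline{M}\,\kappa) : \tau_3 \to \bot$ for some $\vec\tau$. These type judgements, together with the induction hypothesis, show that $\underline{\lambda x.M}$ as well as all contained abstractions are in $\lambda^\top$.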

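Inductive case 2: $M\,N$.

\[
\begin{array}{ll}
\Gamma(M) \vdash \underline{M} : \tau & \text{induction hypothesis} \\
\Gamma(N) \vdash \underline{N} : \tau & \text{induction hypothesis} \\
\Gamma(N),\ k : \neg(\tau \to \tau),\ m : \tau \to \tau \vdash m\,\underline{N} : \tau = \neg\neg(\tau \to \tau) & \\
\Gamma(N),\ k : \neg(\tau \to \tau),\ m : \tau \to \tau \vdash m\,\underline{N}\,k : \bot & \\
\Gamma(N),\ k : \neg(\tau \to \tau) \vdash \lambda m.m\,\underline{N}\,k : \neg(\tau \to \tau) & \\
\Gamma(M\,N),\ k : \neg(\tau \to \tau) \vdash \underline{M}\,(\lambda m.m\,\underline{N}\,k) : \bot & \Gamma(M\,N) = \Gamma(M) \cup \Gamma(N) \\
\Gamma(M\,N) \vdash \lambda k.\underline{M}\,(\lambda m.m\,\underline{N}\,k) : \neg\neg(\tau \to \tau) & \\
\Gamma(M\,N) \vdash \lambda k.\underline{M}\,(\lambda m.m\,\underline{N}\,k) : \tau & \text{by definition of } \tau
\end{array}
\]

We have seen that $\lambda m.m\,\underline{N}\,k : \tau_1 \to \bot$ and $\lambda k.\underline{M}\,(\lambda m.m\,\underline{N}\,k) : \tau_2 \to \bot$ for some $\vec\tau$. These type judgements, together with the induction hypothesis, show that $\underline{M\,N}$ as well as all contained abstractions are in $\lambda^\top$.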

Call-by-value. After the call-by-name CPS translation in the previous paragraph, we give a similar CPS translation $\overline{\,\cdot\,} : \lambda \to \lambda^\top$ that simulates terms as call-by-value.

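\[
\begin{array}{rcl}
\overline{x} &=& \lambda k.k\,x \\
\overline{\lambda x.M} &=& \lambda k.k\,(\lambda x\kappa.\overline{M}\,\kappa) \\
\overline{M\,N} &=& \lambda k.\overline{M}\,\bigl(\lambda m.\overline{N}\,(\lambda n.m\,n\,k)\bigr)
\end{array}
\]

For each application of this function, all of $k$, $\kappa$, $m$, $n$ and the other bound variables should be taken fresh. This CPS transformation is $\beta\eta$-convertible to Plotkin's CPS transformation for call-by-value, for variables, abstraction, and application.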
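As a small worked instance of the clauses for variables and application, consider again one variable applied to another. The overline notation for the call-by-value transform and the indices on $k_1$, $k_2$ (used only to keep the fresh names apart) are conventions of this illustration:

\[
\overline{x\,y} \;=\; \lambda k.\overline{x}\,\bigl(\lambda m.\overline{y}\,(\lambda n.m\,n\,k)\bigr) \;=\; \lambda k.(\lambda k_1.k_1\,x)\,\bigl(\lambda m.(\lambda k_2.k_2\,y)\,(\lambda n.m\,n\,k)\bigr)
\]

In contrast with the call-by-name case, the argument is run first, and only its value $n$ is passed to the function $m$.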

Remark 47. The author has not found a concrete definition of $\overline{C}$ yet that is compatible with the next theorem. Such a definition of $\overline{C}$ is necessary to make a transformation from $\lambda C$ to $\lambda^\top$. However, the author is convinced of its existence through similar results by others [12].

Theorem 48. The image of $\overline{\,\cdot\,}$ is in $\lambda^\top$.

The proof is similar to the proof of Theorem 46. We type the terms using recursive typing.Assume a base type , and a recursive type = . Then we claim M : for all closedM. Furthermore, we claim that all the lambda abstractions in the image are in .

Proof. We will use a typing context (M) = {x : |x fv(M)}: all free variables of M have type.

The proof goes by induction on the size of M. We use the following induction hypothesis:(M) M : M all abstractions inside M are in .. This implies that M : for

closed M.

Base case.

(x), k : k x :

(x) k.k x :

We have seen that x = k.k x is of some type 1 2 , so x .

Inductive case 1: x.M.

(x.M), x : M : induction hypothesis,

(M) = (x.M) {x : }

(x.M), x : , : M k : (x.M) x.M k :

(x.M) x.M k : by definition of

(x.M), k : k

x.M k

:

(x.M) k.k

x.M k

:

We have seen that x.M = k.k

x.M k

: 1 2 and x.M k : 3 4 for

some . These type judgements, together with the induction hypothesis, show that x.M aswell as all contained abstractions are in .


Inductive case 2: M N.

(M) M : induction hypothesis

(N) N : induction hypothesis

m : , n : , k : m : by definition of

m : , n : , k : m n :

m : , n : , k : m n k :

m : , k : n.m n k :

(N), m : , k : N (n.m n k) :

(N), k : m.N (n.m n k) :

(M N), k : M

m.N (n.m n k)

: (M N) = (M) (N)

(M N) k.M m.N (n.m n k) : We have seen three type judgements: M N = k.M

m.N (n.m n k)

: 1 2 ,

m.N (n.m n k) : 3 , and n.m n k : 4 , for some . The three type judgements,together with the induction hypothesis, show that M N as well as all contained abstractionsare in .

4.1.3 Supercombinator transformation

The second step in simulating lambda calculus with continuation calculus is to make data flow explicit, or in $\lambda$ concepts: we make sure that abstractions have no free variables. This notion is captured by supercombinators, which were originally conceived by Hughes [16] as a useful subset of terms to perform optimizations on.

Definition 49. A supercombinator $S$ of arity $n$ is a lambda expression of the form

\[ \lambda x_1 x_2 \ldots x_n.E \qquad (n \ge 0) \]

where $E$ is a variable or an application, such that

1. $S$ has no free variables, and

2. All lambda abstractions inside $E$ are supercombinators.

By lambda abstractions inside $E$, we mean the lambda abstractions that are not direct children of other abstractions in $E$.

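We rephrase a transformation by Bird [5] that transforms lambda terms to a supercombinator; this supercombinator is a $\beta$ expansion of the lambda term.

\[
\begin{array}{rcl}
\mathrm{SC}(x) &=& x \\
\mathrm{SC}(M\,N) &=& \mathrm{SC}(M)\,\mathrm{SC}(N) \\
\mathrm{SC}(\lambda x.M) &=& (\lambda \vec y.\lambda x.\mathrm{SC}(M))\,\vec y \qquad \text{where } \vec y \text{ are the free variables of } \lambda x.M
\end{array}
\]

Conjecture 50. If $M \in \lambda^\top$, then $\mathrm{SC}(M) \in \lambda^\top$. Furthermore, if $M$ is closed, then $\mathrm{SC}(M)$ is too.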
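The three clauses of SC translate directly into a recursive function. The sketch below is illustrative only: the tuple encoding of terms and the names `sc` and `outer_abstractions_closed` are hypothetical, and the free variables are abstracted in sorted order purely to make the output reproducible. The checker covers only the closedness conditions of Definition 49.

```python
# Terms are tuples: ("var", x) | ("lam", x, body) | ("app", fun, arg).

def free_vars(t):
    if t[0] == "var":
        return {t[1]}
    if t[0] == "lam":
        return free_vars(t[2]) - {t[1]}
    return free_vars(t[1]) | free_vars(t[2])

def sc(t):
    """SC(x) = x;  SC(M N) = SC(M) SC(N);
    SC(lam x. M) = (lam y1...ym. lam x. SC(M)) y1 ... ym,
    where y1...ym are the free variables of lam x. M."""
    if t[0] == "var":
        return t
    if t[0] == "app":
        return ("app", sc(t[1]), sc(t[2]))
    ys = sorted(free_vars(t))        # fixed order, for reproducibility
    result = ("lam", t[1], sc(t[2]))
    for y in reversed(ys):           # abstract over the free variables ...
        result = ("lam", y, result)
    for y in ys:                     # ... and apply to them again
        result = ("app", result, ("var", y))
    return result

def outer_abstractions_closed(t):
    """True if every abstraction that is not the direct body of another
    abstraction is closed, and the same holds inside its innermost body."""
    if t[0] == "var":
        return True
    if t[0] == "app":
        return (outer_abstractions_closed(t[1])
                and outer_abstractions_closed(t[2]))
    body = t
    while body[0] == "lam":
        body = body[2]
    return not free_vars(t) and outer_abstractions_closed(body)

# lam x. f (lam z. x z), with f free
term = ("lam", "x",
        ("app", ("var", "f"),
         ("lam", "z", ("app", ("var", "x"), ("var", "z")))))
lifted = sc(term)

print(outer_abstractions_closed(term))       # False
print(outer_abstractions_closed(lifted))     # True
print(free_vars(lifted) == free_vars(term))  # True
```

The transformation leaves the free variables of the whole term unchanged; it only moves them out of the abstractions and supplies them as arguments.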

4.1.4 Defunctionalization

The resulting lambda term $\mathrm{SC}(M)$ after two steps is a supercombinator in $\lambda^\top$. The third and final step in our simulation is to convert $\mathrm{SC}(M)$ to a CC program and CC term. This practice is closely related to lambda lifting [8].


The procedure goes as follows. Let $A$ be the set of lambda abstractions within $\mathrm{SC}(M)$ that are not direct children of another lambda abstraction. By the supercombinator assumption, all such $m \in A$ are closed. We generate a name $N_m$ for each $m \in A$. We use the following translation from lambda terms to CC terms. (From here on, $M$ will refer to any lambda term, not specifically the original input lambda term.)

\[
\begin{array}{rcll}
\mathrm{ccify}(x) &=& x & \\
\mathrm{ccify}(M\,N) &=& \mathrm{ccify}(M).\mathrm{ccify}(N) & \\
\mathrm{ccify}(\lambda x_1 \cdots x_n.M) &=& N_{\lambda x_1 \cdots x_n.M} & \text{if } M \text{ is not an abstraction}
\end{array}
\]

We take the following program:

\[ P = \{\, N_{\lambda x_1 \cdots x_n.M}.x_1.\cdots.x_n \to \mathrm{ccify}(M) \mid \lambda x_1 \cdots x_n.M \in A \text{ and } M \text{ not an abstraction} \,\} . \]

Note that because all terms in $A$ are closed, all CC variables in the right hand sides also occur in their left hand sides. Furthermore, because the term obtained from the previous step is a closed supercombinator, it follows straightforwardly that there are no variables in its image under ccify, so that image is a well-formed CC term. (Recall that CC terms do not contain variables.)
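The translation and the generated program can be sketched as follows. This is an illustration under stated assumptions, not the thesis's implementation: the tuple encodings, the generated names `N0`, `N1`, ... and the helper `cc_step` are hypothetical, the input is assumed (not checked) to be a supercombinator, and a CC step is taken to fire only when the head name receives exactly as many arguments as its rule has parameters.

```python
# Lambda terms: ("var", x) | ("lam", x, body) | ("app", fun, arg).
# CC terms:     ("var", x) | ("name", n)      | ("dot", left, right).

def ccify(t, names, program):
    """Translate t to a CC term. `names` maps each abstraction to its
    generated name; `program` maps names to (parameters, right-hand side)."""
    if t[0] == "var":
        return t
    if t[0] == "app":
        return ("dot", ccify(t[1], names, program),
                ccify(t[2], names, program))
    params, body = [], t
    while body[0] == "lam":          # split lam x1...xn. M, M not a lam
        params.append(body[1])
        body = body[2]
    if t not in names:
        names[t] = f"N{len(names)}"  # hypothetical naming scheme
        program[names[t]] = (params, ccify(body, names, program))
    return ("name", names[t])

def cc_subst(t, env):
    if t[0] == "var":
        return env.get(t[1], t)
    if t[0] == "dot":
        return ("dot", cc_subst(t[1], env), cc_subst(t[2], env))
    return t                         # names are constants

def cc_step(t, program):
    """One CC step at the top level, or None if the term does not reduce."""
    head, args = t, []
    while head[0] == "dot":
        args.append(head[2])
        head = head[1]
    args.reverse()
    if head[0] != "name":
        return None
    params, rhs = program[head[1]]
    if len(params) != len(args):     # too few or too many arguments
        return None
    return cc_subst(rhs, dict(zip(params, args)))

K = ("lam", "x", ("lam", "y", ("var", "x")))
I1 = ("lam", "z", ("var", "z"))
I2 = ("lam", "w", ("var", "w"))
E = ("app", ("app", K, I1), I2)     # (lam x y. x) (lam z. z) (lam w. w)

names, program = {}, {}
cc_term = ccify(E, names, program)
print(cc_term)                      # ('dot', ('dot', ('name', 'N0'), ('name', 'N1')), ('name', 'N2'))
print(program["N0"])                # (['x', 'y'], ('var', 'x'))
print(cc_step(cc_term, program))    # ('name', 'N1')
```

The single CC step yields the name generated for `lam z. z`, which is the ccify image of the term that the lambda term reaches by consuming both arguments at once. A lone name such as `N1` has no arguments and therefore does not reduce.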

Theorem 51. For any supercombinator lambda term $E$, continuation calculus reduction on $\mathrm{ccify}(E)$ closely simulates $\beta^\top$ reduction on $E$. By closely simulates, we mean that for all $E$, one of two cases applies. Either $E \to_{\beta^\top} E' \iff \mathrm{ccify}(E) \to_{CC} \mathrm{ccify}(E')$, or both $\mathrm{ccify}(E) \not\to_{CC}$ and $E \to_{\beta^\top} E'$ for some $E'$ in normal form.

In the first case (bi-implication), we see that CC simulates $\beta^\top$-reduction in a single step. In the second case, both the $\beta^\top$ and CC executions terminate, but the CC execution terminates one step earlier.

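Proof. The proof will have the following structure. In the first part, we assume that $E \to_{\beta^\top} E'$ for some $E'$. We distinguish two cases, and prove either $\mathrm{ccify}(E) \to_{CC} \mathrm{ccify}(E')$ (and thus the right-implication), or that $\mathrm{ccify}(E)$ does not reduce and $E'$ is in normal form (and thus the right disjunct). In the second part, we assume $\mathrm{ccify}(E) \to_{CC} \mathrm{ccify}(E')$ for some $E'$ and prove $E \to_{\beta^\top} E'$. The two parts together suffice to prove the theorem.

So for the first part, let us further examine the $\Rightarrow$ direction. A lambda term reduces using $\beta^\top$ if the term is of the following shape:

\[ E = (\lambda x_1 \cdots x_n.M)\,N_1 \cdots N_n \;\to_{\beta^\top}\; E' = M[\vec N/\vec x] \qquad (n \ge 1) \]

We distinguish two cases.

Case 1. $M = \lambda x_{n+1}.M'$. Then there is no $\beta^\top$-reduction possible on $M[\vec N/\vec x] = \lambda x_{n+1}.M'[\vec N/\vec x]$. In continuation calculus, $\mathrm{arity}(N_{\lambda x_1 \cdots x_{n+1}.M'}) \ge n+1$, so already $\mathrm{ccify}(E) = N_{\lambda x_1 \cdots x_{n+1}.M'}.\mathrm{ccify}(N_1).\cdots.\mathrm{ccify}(N_n)$ halts. Continuation calculus thus halts one step earlier, and CC reduction on $\mathrm{ccify}(E)$ closely simulates $\beta^\top$ reduction on $E$.

Case 2. $M$ is not an abstraction. Then $\mathrm{ccify}(E) = N_{\lambda x_1 \cdots x_n.M}.\mathrm{ccify}(N_1).\cdots.\mathrm{ccify}(N_n)$, and there is a rule $N_{\lambda x_1 \cdots x_n.M}.x_1.\cdots.x_n \to \mathrm{ccify}(M)$. Continuation calculus reduces the term:

\[ N_{\lambda x_1 \cdots x_n.M}.\mathrm{ccify}(N_1).\cdots.\mathrm{ccify}(N_n) \;\to_{CC}\; \mathrm{ccify}(M)[\mathrm{ccify}(N_1)/x_1, \cdots, \mathrm{ccify}(N_n)/x_n] \]

With Lemma 52 below, we get the following equivalent proposition:

\[ N_{\lambda x_1 \cdots x_n.M}.\mathrm{ccify}(N_1).\cdots.\mathrm{ccify}(N_n) \;\to_{CC}\; \mathrm{ccify}(M[\vec N/\vec x]) \]

We thus have what was to be proven:

\[ \mathrm{ccify}(E) \to_{CC} \mathrm{ccify}(E') \]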


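For the $\Leftarrow$ direction, first note that $\mathrm{ccify}(\cdot)$ is injective. By virtue of being a lambda term, $E$ is of the general form $E = (\lambda x_1 \cdots x_k.M)\,N_1 \cdots N_n$ with $k \ge 0$, $n \ge 0$ and $M$ not an abstraction. CC reduction of $\mathrm{ccify}(E)$ is only possible when $k = n$: $\mathrm{ccify}(E) \to_{CC} \mathrm{ccify}(M)[\mathrm{ccify}(N_1)/x_1, \cdots, \mathrm{ccify}(N_n)/x_n]$. But if $k = n$, then $E \to_{\beta^\top} M[\vec N/\vec x]$. Lemma 52 will now prove $\mathrm{ccify}(M)[\mathrm{ccify}(N_1)/x_1, \cdots, \mathrm{ccify}(N_n)/x_n] = \mathrm{ccify}(M[\vec N/\vec x])$. This concludes the proof, on the premise of Lemma 52.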

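Lemma 52. For any lambda term $M$ with $\mathrm{fv}(M) \subseteq \{x_1, \cdots, x_n\}$ such that all abstractions inside $M$ are supercombinators$^1$, and for all lambda terms $\vec N$:

\[ \mathrm{ccify}(M)[\mathrm{ccify}(N_1)/x_1, \cdots, \mathrm{ccify}(N_n)/x_n] = \mathrm{ccify}(M[\vec N/\vec x]). \]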

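Proof. By induction on $M$.

Base case: $M$ = some $x_i$. Then $\mathrm{ccify}(x_i) = x_i$. Consequently,

\[
\begin{array}{rl}
 & \mathrm{ccify}(x_i)[\mathrm{ccify}(N_1)/x_1, \cdots, \mathrm{ccify}(N_n)/x_n] \\
= & x_i[\mathrm{ccify}(N_1)/x_1, \cdots, \mathrm{ccify}(N_n)/x_n] \\
= & \mathrm{ccify}(N_i) \\
= & \mathrm{ccify}(x_i[\vec N/\vec x]).
\end{array}
\]

Inductive case 1: $M = \lambda y_1 \cdots y_m.N$. As $M$ is a supercombinator, it has no free variables, and as substitution replaces only free variables, any substitution applied to $M$ equals $M$. Furthermore, $\mathrm{ccify}(M) = N_M$, which has no variables, so any substitution applied to $\mathrm{ccify}(M)$ equals $\mathrm{ccify}(M)$. It remains to be trivially proven that $\mathrm{ccify}(M) = \mathrm{ccify}(M)$.

Inductive case 2: $M = t\,u$. Then $\mathrm{fv}(t) \cup \mathrm{fv}(u) \subseteq \{x_1, \cdots, x_n\}$, and we have the following derivation.

\[
\begin{array}{rll}
 & \mathrm{ccify}(t\,u)[\mathrm{ccify}(N_1)/x_1, \cdots, \mathrm{ccify}(N_n)/x_n] & \\
= & (\mathrm{ccify}(t).\mathrm{ccify}(u))[\mathrm{ccify}(N_1)/x_1, \cdots, \mathrm{ccify}(N_n)/x_n] & \text{by definition of ccify} \\
= & \mathrm{ccify}(t)[\mathrm{ccify}(N_1)/x_1, \cdots, \mathrm{ccify}(N_n)/x_n] \,.\, \mathrm{ccify}(u)[\mathrm{ccify}(N_1)/x_1, \cdots, \mathrm{ccify}(N_n)/x_n] & \text{by definition of substitution} \\
= & \mathrm{ccify}(t[\vec N/\vec x]) \,.\, \mathrm{ccify}(u[\vec N/\vec x]) & \text{by induction} \\
= & \mathrm{ccify}(t[\vec N/\vec x]\;u[\vec N/\vec x]) & \text{by definition of ccify} \\
= & \mathrm{ccify}((t\,u)[\vec N/\vec x]) & \text{by definition of substitution}
\end{array}
\]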

4.2 Embedding continuation calculus in lambda calculus

In this section, we explain how to simulate continuation calculus using lambda calculus. Our embedding will be simple (dot is translated to $\lambda$-application), but the embedding will be partial, due to a fundamental difference between lambda and continuation calculus: application of too many arguments must stop execution in continuation calculus. Checking whether the right number of arguments is applied would involve additional bookkeeping to keep track of the number of arguments. We feel that such bookkeeping is often unjustified, so we give the simple embedding of CC in $\lambda$.

The embedding consists of two phases: first, we e
