formal construction of a set theory in coq · a formalisation of intuitionistic zermelo-fraenkel...

Saarland UniversityFaculty of Natural Sciences and Technology I

Department of Computer Science

Masters Thesis

Formal Constructionof a Set Theory in Coq

submitted by

Jonas Kaiser

on November 23, 2012

Supervisor

Prof. Dr. Gert Smolka

Advisor

Dr. Chad E. Brown

Reviewers

Prof. Dr. Gert Smolka

Dr. Chad E. Brown

Eidesstattliche Erklarung

Ich erklare hiermit an Eides Statt, dass ich die vorliegende Arbeit selbststandigverfasst und keine anderen als die angegebenen Quellen und Hilfsmittel verwendethabe.

Statement in Lieu of an Oath

I hereby confirm that I have written this thesis on my own and that I have not usedany other media or materials than the ones referred to in this thesis.

Einverstandniserklarung

Ich bin damit einverstanden, dass meine (bestandene) Arbeit in beiden Versionenin die Bibliothek der Informatik aufgenommen und damit veroffentlicht wird.

Declaration of Consent

I agree to make both versions of my thesis (with a passing grade) accessible to thepublic by having them added to the library of the Computer Science Department.

Saarbrucken,(Datum/Date) (Unterschrift/Signature)

iii

Acknowledgements

First of all I would like to express my sincerest gratitude towards my advisor, ChadBrown, who supported me throughout this work. His extensive knowledge andinsights opened my eyes to the beauty of axiomatic set theory and foundationalmathematics. We spent many hours discussing the minute details of the variousconstructions and he taught me the importance of mathematical rigour.

Equally important was the support of my supervisor, Prof. Smolka, who firstintroduced me to the topic and was there whenever a question arose. During hiscourse on formal logic I first experienced the delight of working with the proofassistant Coq.

I am also massively grateful for the countless more or less formal discussionswith my colleagues at the Programming Systems Lab and throughout the univer-sity. They helped to keep up my spirits over the longer stretches and providedvaluable second perspectives on my work.

Last but not least I would like to thank my friends and family for their unwaver-ing support throughout the writing of this thesis.

iv

Abstract

In this thesis we present a formalisation of Tarski-Grothendieck set theory, thatis, Zermelo-Fraenkel set theory with Grothendieck universes. The developmentis executed in Coq, extended with the law of the excluded middle and a powerfulchoice principle. This yields a powerful metatheory where the inhabitance of everytype is decidable. From the axiomatisation we develop the usual theory of sets,including restricted comprehension, ordered pairs, functions and finite ordinals.

The formalisation is designed to support the construction of proof-irrelevant,classical models for type theories like the Calculus of (co)inductive Constructions.The actual model constructions are left for future work.

v

Contents

1 Introduction 1

2 Metatheory 52.1 CiC and ECC . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62.2 Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92.3 Classical Logic . . . . . . . . . . . . . . . . . . . . . . . . . . . 102.4 Choice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112.5 Decidability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

3 Axiomatisation of Tarski-Grothendieck 153.1 Sets and Membership . . . . . . . . . . . . . . . . . . . . . . . . 163.2 Inclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173.3 Extensionality . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183.4 ∈-Induction & Regularity . . . . . . . . . . . . . . . . . . . . . . 193.5 The Empty Set . . . . . . . . . . . . . . . . . . . . . . . . . . . 193.6 Unordered Pairs . . . . . . . . . . . . . . . . . . . . . . . . . . . 213.7 Unions of Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223.8 Powerset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233.9 Replacement . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233.10 Grothendieck Universes . . . . . . . . . . . . . . . . . . . . . . . 243.11 Infinity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263.12 Choice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

4 Development of the TG set theory 314.1 Singleton and Basic Sets . . . . . . . . . . . . . . . . . . . . . . 314.2 Binary Unions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354.3 Union over a Family of Indexed Sets . . . . . . . . . . . . . . . . 36

vii

4.4 Separation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 384.5 Ordered Pairs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 434.6 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 474.7 Finite Ordinals & Natural Numbers . . . . . . . . . . . . . . . . 55

5 Conclusion 635.1 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . 645.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

viii

CHAPTER 1

Introduction

The topic of this thesis is the formalisation of a set theory which admits the con-struction of set-theoretic models for type theories.

Both type theory as well as set theory concern themselves with the foundationsof logic and mathematics albeit from rather different points of view. Before weintroduce the material of this work in any detail it seems prudent to briefly surveythe developments, formalisms and philosophies of the two camps. This overviewis certainly not exhaustive and there is a wealth of secondary literature availablefor the interested reader. Our intention is simply to provide a small amount ofhistorical context for the developments presented herein.

Set theory can be traced back as far as 1874 to the work of Cantor [Can74] andthe mantra of this camp is “everything is a set”. All the constructions familiar frommathematics can be built from sets and all operations on them can be encodedas operations on sets. A major achievement in this line of work was Zermelo’sfirst axiomatisation in 1908 [Zer08]. Fuelled by this result, the first third of the20th century saw a lot of work around the construction of the best possible ax-iomatisation. The goal was to maximise the expressiveness of the theory withoutintroducing any of the known inconsistencies like the Russell set [Rus02], and withas few axioms as possible. Beyond these mutually agreed upon criteria there wasa lot of debate on which group of axioms was the “right” one. This debate andthe studies it entailed led to the extensive knowledge we posses today about therelative strengths and weaknesses of the various axiomatisations. A set theory thatstood the test of time and is still in use today is Zermelo-Fraenkel set theory withthe Axiom of Choice (ZFC). It is a cleaned-up variant of Zermelo’s original workwith a major modification introduced by Fraenkel in 1922 [Fra22]. The axiom of

1

Chapter 1. Introduction

choice, although already part of Zermelo’s original collection of axioms, is men-tioned explicitly since it tends to be the most controversial one. Accepting it intothe theory has far reaching consequences and leads to rather unexpected effects,e.g., consider the Banach-Tarski paradox [Wag85]. When people speak about settheory without explicitly mentioning an axiomatisation they usually have ZFC inmind. In 1938 Tarski augmented ZFC with the assumption that there exist largeinaccessible cardinals [Tar38] and later Grothendieck formulated a set-theoreticnotion of universes, which when added to ZFC provided a theory equivalent toTarski’s variant, albeit without the need to explicitly talk about the concept of car-dinality [Gab62, Kru65, GV72]. The resulting theory is called Tarski-Grothendieckset theory (TG) and its formalisation makes up the bulk of this work.

Type theory on the other hand mostly focuses on the foundation of logic andconcerns itself with the question of “what exactly is a proof”. This question is ofcourse as old as logic itself, but new answers began to appear after Brouwer chal-lenged the justification for the law of the excluded middle [Bro23] and introducedintuitionism. Based on this, Kolmogorov and Heyting investigated the nature ofproofs further [Kol32, Hey74] and thus developed what is now known as the con-structive or intuitionistic approach to logic. Its key feature is the absence of thelaw of the excluded middle. The basic idea is that to give a proof one has to ex-plicitly construct a proof object, e.g., for a (binary) conjunction one has to providea pair of proof objects, one for each conjunct. The most influential insight wasthat the proof of an implication is a procedure that transforms any proof object forthe antecedent into a proof object for the conclusion, i.e., essentially a function onproof objects. What makes this development interesting is the close correspond-ence between intuitionistic logic and typed, computational calculi, like the simplytyped λ-calculus or Girard’s System F [GTL89]. The Curry-Howard correspond-ence [CF58, How80] or more generally the Propositions as Types principle statesthat theorems can be seen as types and proofs as terms. Then stating that a term hasa type is essentially the same as presenting the proof of a theorem. In other words,a theorem is proved by the proofs that inhabit it. The use of typed, functionalcalculi as a logic is then the core doctrine of type theory. There are other type the-ories next to System F, like Coquand’s and Huet’s Calculus of Constructions (CC)[CH85, CH88] or Martin-Lof’s predicative Type Theory (MLTT) [ML75]. Ideastaken from all of these resulted in Luo’s Extended Calculus of Constructions (ECC)[Luo94]. Then with the addition of inductive types and further modifications wearrive at the Calculus of (co)inductive Constructions (CiC) [PPM89]. It is the logicimplemented in the proof assistant Coq1, which has been used for several extens-ive, machine-verified proofs, not least among which are a fully verified C-compiler[BDL06, Ler09b, Ler09a] and the four colour theorem [Gon07].

It is natural to ask how these two distinct approaches relate to each other. Toanswer this question one usually takes one formalism and constructs a model foranother. Given that there are many formalisms and various kinds of models this

1A recent version and associated documentation can be found at http://coq.inria.fr/

2

http://coq.inria.fr/

is admittedly a rather broad statement. To make this more precise we have to askourselves what exactly we want to deduce from the construction of the model andthe answer will fall into one of two categories.

The first thing we may ask is: What is the proof-theoretic strength of one form-alism relative to another, i.e., is formalism A strong enough to model formalism B.These kinds of models are useful to investigate the relative consistency of a form-alism and in general one has to be very careful about the assumptions and axiomsused throughout the model construction.

Since the models which aim at proof-theoretic strength tend to be as minimal aspossible, they are usually ill suited to provide an intuition for, say, the set-theoreticsemantics of a type theory, or vice versa. To achieve the latter it is common tointroduce additional assumptions with the hope to construct a simple model witheasy and straightforward definitions. These are the second kind of models and tosome degree one can still reason about consistency but the obtained answers arenot as strong as with the first kind.

A lot of work has been conduct in both of these areas. A good example fora comparison of the first kind is the rigorous work of Aczel [Acz98] as well asthe work of Werner [Wer97] which investigate the relation between the predicativetype universes of CiC and ZFC with inaccessible cardinals. The latter is partiallyformalised in Coq. Barras’ development [Bar10], also formalised in Coq, on theother hand is a prime example for the second kind of investigation. There are ofcourse a lot more and we will see a few of them in the section on related work.

Why do we care about the formalisation of such model constructions? One be-nefit is that we cannot gloss over “supposedly trivial” sections of the developmentwhich can bring unforeseen and subtle problems in existing semantics to light. Aparticular such case was revealed by Miquel and Werner [MW02]. With respectto “simple” proof-irrelevant models (where the model interprets propositions withone of exactly two objects) of CC they observed that the combination of impredic-ativity and subtyping led to unforeseen complications. Lee and Werner present apartial solution to this issue, which relies on an altered notion of equality [LW11].

Motivated by the recent work of Barras [Bar10, Bar12] we formalise Tarski-Grothendieck set theory with the goal to eventually construct proof-irrelevant mod-els for the type theories ECC and later CiC. In his development Barras works witha formalisation of intuitionistic Zermelo-Fraenkel set theory (IZF) in Coq and con-structs models of CiC. We deviate from his work in that we not only use ZFCinstead of IZF but we assume a classical metatheory right from the start. Ourmetatheory is of course considerably stronger than the one Barras is using. Wehope that this will enable us to construct a simple and easy to understand, classicalmodel. In both cases one may wonder at the inherent circularity of the approach,i.e., we take an implementation of CiC and therein formalise a set theory which inturn is used to model CiC. To this we answer that the scenario is only problematic

3

Chapter 1. Introduction

if we were to make claims with respect to the proof-theoretic strength of the mod-elled type theories. Since this is not our goal we are not considering the issue anyfurther.

The proof-irrelevant, classical models we want to construct should allow us toreason about the mutual consistency of Coq’s standard library. This means thatwe want to construct models where the axioms found in the library are interpretedby inhabited sets. The actual model constructions are not part of the present textbut are left for future work. This main purpose is relevant though since it guidesseveral of the design decisions we make along the way.

From a logical point of view we start with a reasonably well understood intu-itionistic type theory, namely CiC. This type theory is then consistently extendedto a classical logic with the law of the excluded middle and then equipped with astrong choice principle. In combination this yields a fairly powerful metatheory,which by no means should be called computational anymore (the inhabitance ofevery type is decidable). The next major step is to formally introduce the axiomsof TG, which adds an internally untyped set theory into our framework. On top ofthis machinery we then perform the development of the remainder of the set theory.

The remainder of the document is structured as follows. In Chapter 2 we take acloser look at the metatheory underlying our development. Since we are workingwith the Coq proof assistant, this will include a brief discussion of the type theoriesCiC and ECC. We will also consider Coq’s basic logic together with the additionof the law of the excluded middle and a choice axiom. The chapter concludes witha discussion of the consequences of combining these two additional axioms.

In Chapter 3 we then motivate and present our axiomatisation of Tarski-Grothendieck set theory, both mathematically and on the level of our Coq form-alisation. Some of the more interesting axioms are also discussed in the light ofhistorical context.

With the axiomatisation laid out, Chapter 4 then develops further set-theoreticconstructions of which some are just generally useful, while others are providedand designed with respect to the main purpose of this development, namely theconstruction of models for a number of type theories. Of particular interest here areSection 4.6 on the Aczel function encoding as well as Section 4.7 which providesa construction of finite ordinals.

Finally, Chapter 5 concludes this work with a discussion of related efforts in thisfield and a brief outline of how we intend to make use of this formalisation in futuredevelopments.

4

CHAPTER 2

Metatheory

In this thesis we are going to provide a formal development of a set theory. For thisstatement to make any sense at all we first have to clearly pin down the metatheorywithin which we will be working. This will allow us to put the development incontext. We will later also discuss issues of consistency and for any reasonablyexpressive system we know that such arguments will always be of a relative nature.This provides another justification for making the background logic precise.

We will be formalising the development with the proof assistant Coq which im-plements the Calculus of (co)inductive Constructions (CiC for short). This logicis an intuitionistic type theory and we will discuss it first. We will augment thisbasic system with two additional axioms so that we effectively work with a fairlypowerful, classical logic.

The presentation of CiC serves another purpose as well. As we pointed out inthe introduction we design our set theory for a very particular purpose, namely theconstruction of a formal, set-theoretic model for exactly this calculus. Hence thefollowing presentation can also be understood as a “design guide” for the constructsand properties we eventually expect from our set theory. Especially the formaldefinition of the ECC fragment of CiC’s typing relation is instructive.

5

Chapter 2. Metatheory

2.1 CiC and ECC

The Calculus of (co)inductive Constructions is a logic build around the proposi-tions as types principle. As such the Calculus is more often referred to as a typetheory. CiC is built from Luo’s Extended Calculus of Constructions [Luo94] (here-after ECC) by adding inductive and coinductive types [PPM89]. We will only dis-cuss Luo’s presentation of ECC in any detail since the added power of inductivetypes is only marginally used. To be precise, we are going to use only the inductiveproposition inhabited T , strong sums and the inductive type of natural numbers,nat. Technically, the last could be avoided but it makes our life easier in the finalpart of the development, while the strong sums are already part of ECC, albeit notas an inductive type. ECC provides us with the following language of terms:

• The constants Prop and Type0, Type1, Type2, . . . are terms, called uni-verses, where the subscript of Typei is referred to as the universe index

• Variables (x, y, . . .) are terms.

• Let M, N, A and B be terms, then each of the following is a term:

– Πx : A, B, called dependent product

– λx : A, N, called abstraction

– M N, called application

– Σx : A, B, called strong sum

– pairΣx:A, B(M,N), called pair

– π1M, called first projection

– π2M, called second projection

In contrast to earlier theories (e.g., the simply typed lambda calculus or Girard’sSystem F [GTL89]) the calculus does not make a syntactic distinction betweenterms and types, which means that terms are used to play both roles. The theory isequipped with a ternary typing judgement of the form Γ ` a : A, which ascribesthe term A as a type to the term a under the typing environment Γ (when Γ isempty we simply write ` a : A). The judgement is build up inductively from therules in Figure 2.1 and the rules make use of a subtyping relation (�). The mostimportant fact about this relation is that it provides a total order for the universes:Prop � Type0 � Type1 . . . The remaining details can be found in Luo’s book[Luo94].

At its heart, the Coq proof assistant is simply a type checker that confirms orrejects the existence of such judgement triples with respect to the typing rules ofCiC. Practically we are also provided with a number of so-called tactics whichguide and simplify the construction of terms inhabiting certain other terms.

6

2.1. CiC and ECC

Ax` Prop : Type0

CΓ ` A : Type j

Γ, x : A ` Prop : Type0(x < FV(Γ))

TΓ ` Prop : Type0

Γ ` Type j : Type j+1VarΓ, x : A, Γ′ ` Prop : Type0

Γ, x : A, Γ′ ` x : A

ImpredΓ, x : A ` B : Prop

Γ ` Πx : A, B : PropProd

Γ ` A : Type j Γ, x : A ` B : Type j

Γ ` Πx : A, B : Type j

LamΓ, x : A ` M : B

Γ ` λx : A. M : Πx : A, BAppΓ ` M : Πx : A, B Γ ` N : A

Γ ` M N : [N/x]B

SigmaΓ ` A : Type j Γ, x : A ` B : Type j

Γ ` Σx : A, B : Type j

PairΓ ` M : A Γ ` N : [M/x]B Γ, x : A ` B : Type j

Γ ` 〈M,N〉Σx:A, B : Σx : A, B

Proj1Γ ` M : Σx : A, B

Γ ` π1(M) : AProj2

Γ ` M : Σx : A, B

Γ ` π2(M) : [π1(M)/x]B

SubΓ ` M : A Γ ` A′ : Type j

Γ ` M : A′(A � A′)

Figure 2.1: The typing rules of Luo’s ECC.

With this understanding of the basic machinery of Coq we are going to makeprecise the terminology we will use. So far we have only seen terms (and of coursecontexts) while the notion of a type was only mentioned loosely. So what exactlyis a type? We are going to call a term A a type if there exists a universe index isuch that the judgement ` A : Typei holds. Similarly we are going to call a termA a proposition if the judgement ` A : Prop holds. From the subtyping rulesit follows that a proposition is also a type, albeit a rather special one. When wesay something is a type we usually mean a non-propositional type, although wehighlight the distinction where necessary.

It is instructive to consider how these notions of type theory are related to con-cepts found in standard mathematics. There we have a number of objects like 0,1 or true and false. These objects are grouped together to form something that isloosely called a set or class, with varying degrees of formalism behind the respect-

7


ive expressions. We encounter things like “the set of natural numbers” or “false isa boolean value” and would like to make these notions precise.

This is achieved with types. As an example we consider the type of naturalnumbers N, i.e., we have ` N : Type0, which allows us to formally state that 0 is anatural number as ` 0 : N. Similarly we have ` B : Type0 for boolean values and` true : B.

Apart from these objects we are usually faced with statements or, to be precise,properties of these objects, e.g., 5 = 7 or true ∨ false, which may or may not beprovable.

In our system these statements are those special types we call propositions andtheir inhabitants are the corresponding proofs. A proposition that does not hold isof course empty, i.e., there are no inhabitants that act as proofs. So we have theformal judgements ` true∨ false : Prop and ` p : true∨ false (where p is a proofof the disjunction) and ` 5 = 7 : Prop but 6` q : 5 = 7 for any q.

As a consequence there are two separate ways to interpret the two terms in thejudgement ` a : A depending on whether ` A : Prop or ` A : Typei (one of thesehas to be the case, since otherwise ` a : A would not hold). In the first case we readA as some mathematical statement and a as a proof of this statement. In the secondcase it makes more sense to understand A as a class of mathematical objects and aas one particular such object.

We will illustrate this distinction with a small example. When working withCoq we are often talking about the inhabitance of types, i.e., we are dealing withthe question if some term s exists such that ` s : T for a given type T . Now T isa type but the statement about its inhabitance is a property of that type and shouldbe seen as a proposition. It is something we may be able to prove. Coq uses aninductive proposition to encode this:

Inductive inhabited (A : Type) : Prop B inhabits : A→ inhabited A.

So if we actually have a particular member s then the fact that T is inhabitedis expressed in Coq with the judgement s : T ` inhabits s : inhabited T , whichshould be read as “s is an object in the type T and therefore inhabits s is the proofof the proposition inhabited T”.

While this distinction is mostly cosmetic from a mathematical point of view, itleads to very real consequence in the context of type theories like those we have inCoq. At the heart of the problem lies the way implications or respectively functionspaces are expected to behave at Prop and at Typei. In type theory, implicationsand function spaces are the same thing. The problem though is that for any ` B :Prop we expect ` A → B : Prop. This property is called impredicativity and fora single universe this is perfectly fine but when we extend this to any of the otheruniverses the system is known to become inconsistent [Gir72, Hur95].

Another aspect of this division is that everything which lives in Typei can beextracted to executable code. Hence we tend to call this part of the theory thecomputational fragment.

8

2.2. Logic

To preserve this clear boundary between the logical, impredicative universeProp and the computational universes Typei Coq employs a mechanism whichwe are going to call the elim-restriction. The system does not allow us to eliminatean inductive proposition into a type. In other words, this means that in the pro-cess of constructing an element of a type Coq only allows us to use the fact that aparticular proposition holds, but not the details of its proof.1

In our development we are going to assume axioms that weaken this boundarysomewhat. As a consequence we will lose the capability to extract code, i.e., ourmetatheory cannot justly be called computational, but we retain consistency. Sincewe never planned to use the extraction feature this is harmless and we will notconcern ourselves with issues of computability in the remainder of this work.

2.2 Logic

Coq’s universe Prop is the impredicative universe of properties and we presentthe basic logical connectives here. Universal quantification and implication areprimitive in Coq and they come with corresponding abstraction and applicationmechanisms. They are the dependent and non-dependent function spaces in theuniverse Prop. We write ∀x : A, B and respectively A→ B for these.

With the help of inductive propositions we are then given the false proposi-tion, negation, conjunction, disjunction, existential quantification and equality aspresented here to complete our logical tool set.

Inductive ⊥ : Prop B .

Definition not : Prop→ Prop B λ A : Prop . A→ ⊥.

Inductive and (A B : Prop) : Prop B conj : A→ B→ A ∧ B.

Inductive or (A B : Prop) : Prop Bor introl : A→ A ∨ B | or intror : B→ A ∨ B.

Inductive ex (A : Type) (P : A→ Prop) : Prop Bex intro : forall x : A, P x→ ex P.

Inductive eq (A : Type) (x : A) : A→ Prop B eq refl : x = x

Each of these behaves exactly as one would expect, i.e., we are given appropriateintroduction and elimination rules. As an example we can consider a disjunctionA ∨ B among our assumptions. The elimination rule for disjunctions now allowsus to split the proof in two halves, one with the assumption A and one with theassumption B. Conversely, to show a disjunction, we can use either of the two

1The actual mechanism is slightly more involved and a certain number of propositions are exemptfrom this restriction. Further details can be found in the Coq Reference Manual, located at http://coq.inria.fr

9

http://coq.inria.fr

http://coq.inria.fr


constructors, which are the introduction rules. The other definitions are similar.The inductive equality presented here is provably equivalent to Leibniz equality. Itis an equivalence relation that allows to substitute equals for equals in the formaldevelopment.

It is interesting to observe that each of these constructions has a counterpartthat lives in the universe of types. We look at two of these in particular, namelysums which correspond to disjunctions and sigmas corresponding to existentialquantification.

Inductive sum (A B : Type) : Type Binl : A→ A + B | inr : B→ A + B.

Inductive sig (A : Type) (P : A→ Prop) : Type Bexist : forall x : A, P x→ sig P.

The crucial thing to note is that while A ∨ B is a proposition, A + B is a non-propositional type. And despite the syntactic similarity of their definitions they arenot the same. Assume we are constructing an inhabitant of some type. Then ifwe are given an inhabitant of A + B Coq allows us to perform a case distinctionon whether this inhabitant was constructed with inl or inr . If, on the other hand,we only have the assumption A ∨ B, such a case split is prevented by the elim-restriction, as outlined at the end of Section 2.1.

The reader may have noticed that we introduced a more convenient notation forthe defined logic operators. In particular A ∧ B, A ∨ B, A + B and x = y corres-pond to and A B, or A B, sum A B and eq x y respectively. For negation we will use¬ A instead of A→ ⊥. In addition we will also use a more convenient form forexistential quantification and sigma types, which each involve a binding operation.We write exists x : A, P for ex A (λ x . P), where P can have free occurrences ofx, and similarly for sigma types we have {x : A | P } in place of sig A (λ x . P).When the type of the bound variable can be inferred, we may drop it.

2.3 Classical Logic

The type theory CiC underlying Coq is an intuitionistic, or constructive logic. Thismeans that we are not by default given the law of the excluded middle. The lawstates that every proposition either holds or its negation holds. We should be carefulto distinguish this statement about the inherent truth-value of all propositions fromnotions of provability, i.e., it does explicitly not claim that all propositions areprovable or not. The inclusion of this axiom changes the intuitionistic logic to aclassical logic and there are good reasons for as well as against it.

On the one hand there seems to be a connection between the law of the excludedmiddle and computational control operators like call/cc and unguarded gotojumps [Gri90]. Since adding these operators to a programming language tends tohave subtle and far reaching consequences, some people are sceptical with respect

10

2.4. Choice

to the addition of the law on the logic side. The intuitionistic objection is that the(proven) absence of a proof object for ¬P does not “magically” generate a proofobject for P.

On the other hand it appears that this principle is an inherent ingredient in theway most people tend to reason in everyday situations, usually without noticingthe use of a logical axiom. Similarly most presentations of mathematical proofstacitly assume it. On these grounds it is fair to argue that classical reasoning ismore natural, or intuitive, than intuitionistic reasoning (despite the name).

Due to this disagreement about which kind of logic can be called the “right”logic, the Coq system was designed to accommodate both views. In its basic setup,Coq’s underlying theory is intuitionistic, but in such a way that it is compatiblewith the law of the excluded middle. This means that we can add the law as anaxiom without breaking the consistency of the system.

We decided to include the law with the hope to simplify some of the proofsand maybe discover helpful shortcuts. In the context of constructing a proof wecan now perform a case split of whether a particular proposition holds or does nothold, generating additional assumptions. Formally the axiom looks as follows.

Definition XM : Prop B forall P : Prop, P ∨ ¬P.

Axiom classic : XM.

This is not exactly the way the axiom is presented in the library, where a singlestatement is used, but it is useful to name the type ∀P : Prop, P ∨ ¬P, for reasonsof presentation.

In the following, many of the proofs are presented from a higher-level, math-ematical point of view where uses of this axiom are not explicitly mentioned. Thereader should be aware however that when a proof says “Either X holds or Y” weare often implicitly using it. If there is any doubt to whether it was used of not,then the Coq proof script should be considered as the authoritative source.

2.4 Choice

We will also require some form of choice principle since a few of the later con-structions require a case split already at the point of definition. In the next sectionwe will see how this is achieved, but let us first look at choice principles in general.

Intuitively, they allow us to extract an explicit witness satisfying a particularpredicate from the knowledge that some unspecified witness exists. In the light oftype theory this means that we can map existentials (which live in Prop) to sigmas/ strong sums (which live in Type).

For our development we select the strongest choice axiom, known as Hilbert’sepsilon statement, from the library:

11


Axiom ε statement : forall (A : Type) (P : A→Prop),inhabited A→ { x : A | (exists x, P x) → P x }.

So what does this give us? For any inhabited type A and predicate P on this typewe can obtain a particular dependent pair. Its first component is a member of A andits second component is a proof that the predicate holds for this particular member,as long as we can establish that the predicate holds at least for some member.

In practice we are usually working with the left and right projections of this pair,and not directly with the statement itself. The first projection is a higher-orderconstruct similar in flavour to the ε-operator introduced by Hilbert in the contextof first order logic [Ack25]:

Definition ε (A : Type) (i : inhabited A) (P : A→Prop) : AB proj1 sig (ε statement P i ).

The second projection is the proof method to establish that we are really obtain-ing something meaningful with our ε-definition:

Definition ε spec (A : Type) (i : inhabited A) (P : A→Prop) :(exists x, P x) → P (ε i P)B proj2 sig (ε statement P i ).

We will occasionally consider the scenario where we have some particulartype T for which the inhabitance has been established, i.e., we have some fixedi : inhabited T . In this case we can form the partial application ε i, which weare going to denote by εT

2. When we have established the existence of such aspecialised εT we will informally speak of having an “ε at type T”. Note thatCoq implicitly infers the first type argument from the inhabitance proof. Let us gothrough a few scenarios and see how the axiom behaves at different types.

Example 1 (T is empty). Since T is empty we cannot prove inhabited T, andtherefore we do not have an ε at T . This makes sense, as an ε at T would allow usto produce elements in our supposedly empty type T .

Example 2 (T B N). First of all we can set i : inhabited T B inhabits 5 (or useany other number), so we do have εN. We will now look at a number of predicatesP : N→ Prop.

Case 1: (P B λx. x = 2 ∗ x) It is clear that this predicate is satisfied only by0, i.e., we can prove P 0 and ∀x ∈ N, P x → x = 0. Looking at the statement ofthe axiom we know that εNP : N. Since the predicate is satisfiable we also knowfrom ε spec that P(εNP). With the information at hand we can make this even moreconcrete and infer εNP = 0.

Case 2: (P B λx. x ≤ 1) Here we have another satisfiable predicate, i.e., wecan prove P 0 and P 1. As before we have εNP : N and we also have P(εNP).Interestingly we will be able to prove εNP = 0∨εNP = 1, however neither εNP = 0

2In particular we will be looking at εset, the choice operator at the type of our formalised sets.

12

2.5. Decidability

nor εNP = 1 is provable on its own. This highlights the fact that εN does pick asatisfying witness, if possible, but it does not reveal which one it chose if there weremultiple choices.

Case 3: (P B λx. x = 2 ∗ x ∧ x , 0) It should be clear that this time there isno possible natural number satisfying the predicate. We still obtain εNP : N, butP(εNP) will not be provable. In fact its negation, ¬P(εNP), is provable.

2.5 Decidability

It is interesting to consider the consequences of assuming both the law of the ex-cluded middle and the choice principle presented above. Together they entail an in-formative variant of excluded middle. This means that we can lift the classical reas-oning to the level of types, i.e., we are at this point bypassing the elim-restriction.As a further consequence we also obtain a decision procedure on the inhabitance oftypes that is capable of providing a witness when the type is inhabited (effectivelyanother form of choice principle). Both results are similar in flavour to the basicXM:

Definition IXM : Type B forall P : Prop, P + ¬P.Definition DIT : Type B forall T : Type, T + (T→ ⊥ ).

In the context of Coq’s type theory, this is a serious increase of our reasoningstrength and we are now going to prove these claims.

Theorem 1 (Informative Excluded Middle). IXM is inhabited.

Proof. Given an arbitrary proposition P we want to establish, that P+¬P. We knowthat we are classical so we would like to perform a case split on whether P holdsor not. We note, however, that we are in the process of showing the inhabitance ofa (non-propositional) type, while excluded middle is a proposition and due to theelim-restriction we cannot (yet) perform the case split.

In this special case we can get around the restriction by first proving the pro-position inhabited (P + ¬P). If i is a proof of this proposition then the degenerateε-construction ε i (λ . >) is a member of the type P + ¬P.

So we are left with the goal inhabited (P + ¬P). This now is a proposition andCoq allows us to perform the case split on whether P holds or not. In either caseit is straightforward to obtain an element in P + ¬P, which in turn allows us toconclude the desired goal. �

Theorem 2 (Decidability of the Inhabitance of Types). DIT is inhabited.

13


Proof. Given an arbitrary type T , we wish to establish that T +(T → ⊥). As beforewe are constructing a member of a type, but now we can exploit Theorem 1 to doa case split on whether the proposition inhabited T holds.

If it does hold, i.e., we have i : inhabited T , then ε i (λ . >) : T and thereforeT + (T → ⊥).

Otherwise it is easy to establish T → ⊥. We simply assume T , which givesus inhabited T and thus a contradiction. The desired result again follows triviallyfrom T → ⊥. �

We are now able to translate mathematical definitions of the form

A B{

B if PC otherwise

into direct Coq code like so

Definition A : T B match (ixm P) with| inl H⇒ B| inr H⇒ Cend.

Here ixm is the name of the formal proof of IXM and we have assumed thatP : Prop. If we have to base the decision on the inhabitance of some T : Type (andpossibly also need the explicit witness) we would use dit T instead. Effectively wenow have an “if-then-else”-construction.

14

CHAPTER 3

Axiomatisation of Tarski-Grothendieck

We are formally presenting Tarski-Grothendieck set theory, i.e., ZFC with Grothen-dieck universes1, as a collection of axioms. We present these axioms here and willelaborate the role of the nontrivial ones.

Before we take a look at each of the axioms and basic operations let us brieflyreflect on why an axiomatisation is necessary. Cantor originally proposed a notionof set theory where it was possible to take an arbitrary predicate and form the set ofall things satisfying it [Can74]. Russell discovered an inherent contradiction in thisapproach [Rus02] which made the theory unusable as a foundation of mathemat-ics2. The problem was that Cantor’s theory permitted the formation of sets whichwere in some sense “too large”. The solution to this problem was to construct setsfrom a number of ground facts, or axioms, which were (mostly) considered non-controversial and which avoided the introduction of paradoxical sets like the onepresented by Russell into the theory. The first such axiomatisation was given byZermelo in 1908 [Zer08].

We want to formally embed this setup into the type theory of Coq. To achievethis we will pose a type set and a membership relation ∈. At this point we can ofcourse construct predicates of the form P : set → Prop, which we are going tocall classes. Then for a given set Y and a given class P, the statement “Y encodesP” corresponds formally to

∀z : set, z ∈ Y ↔ P z.1Technically we are working with ZFC + Grothendieck universes since we have a choice axiom

in our metatheory prior to the set theory axioms2To be precise, the contradiction appeared in the logic of Frege’s Begriffsschrift [Fre79] but it

carried over to set theory.

15

Chapter 3. Axiomatisation of Tarski-Grothendieck

When we define a new Y satisfying a statement of this form it is common touse the set builder notation Y B {z | P z}, known as unrestricted comprehension.The important thing for now to remember is that a Y defined in such a way mayor may not be a set, which brings us back to the point that some classes are “toolarge” to be encoded by sets. Our axiomatisation does not include this unrestrictedcomprehension so we do avoid these problems. Since this way of defining new setsis rather natural though, and since it yields well formed sets in many scenarios, wewill later derive a restricted form of comprehension, known as separation. Wepostpone further discussion of these issues to the corresponding section.

Before we take a look at the actual axioms, we have to take a small technicaldetour. In Coq, there are a number of instances where several keywords encodethe same underlying concepts, e.g., Coq does not distinguish between Lemma,Theorem or Corollary. The different keywords are there solely for structuring thepresentation of results. Another instance of this phenomenon are the keywordsParameter and Axiom. They are technically identical and both introduce a newterm at a given type into the general context of assumptions. In the following, weuse Axiom if the type lives in Prop and Parameter otherwise.

3.1 Sets and Membership

As mentioned above we will need a type for our sets and a corresponding mem-bership relation, so we start with these. First we augment our type theory with therequired type.

Parameter set : Type.

The next thing we require is the notion of sets being elements of other sets, givenas the binary relation ∈ on set. The existence of such a relation is also assumed:

Parameter In : set → set→ Prop.

Here In x X is understood to state that “the set x is an element in (or a member of)the set X”.

For convenience we are going to use the relation in its more common infix nota-tion, i.e., we are going to write x ∈ X instead of In x X and we will use x < X forthe negation of the statement. We will use the word “containment” with respect tothis relation, i.e., when we say “X contains x” we formally mean x ∈ X.

We will often work with properties of sets, which in Coq are given as predicatesof the form p : set → Prop. These predicates are technically well defined on thecomplete type set, but we often only want to look at a particular set or domainX and it is sufficient to establish that the predicates hold on all elements of thatdomain (and we do not care what happens elsewhere). So instead of the proposition

∀x : set, p x

16

3.2. Inclusion

we also use the following, where we add the assumption of being inside the desireddomain:

∀x : set, x ∈ X → p x,

or, expressed in a more convenient notation we will adopt in the following:

∀x ∈ X, p x.

Now the truth of this proposition only depends on how the predicate p acts on thedomain X. Accordingly we observe that for any x < X the body of the statement,namely x ∈ X → p x, is vacuously true.

3.2 Inclusion

It is now easy to define another very common concept from set theory, namely thebinary relation subset or set inclusion. As with membership we prefer the infixform X ⊆ Y over the prefix notation Subq X Y. It expresses that “the set X is asubset of (or included in) the set Y”, which can easily be stated in terms of themembership relation:

X ⊆ Y B ∀x ∈ X, x ∈ Y.

Note that the statement “X includes Y”, i.e., formally Y ⊆ X, is different fromthe statement “X contains Y”. In Coq we have the following definition:

Definition Subq : set → set→ Prop Bλ X Y . forall x:set, x ∈ X→ x ∈ Y.

The following two facts should be clear from the definition.

Lemma 1. ⊆ is reflexive, formally

∀A : set, A ⊆ A.

Proof. Immediate from the definition. �

Lemma 2. ⊆ is transitive, formally

∀A B C : set, A ⊆ B→ B ⊆ C → A ⊆ C.

Proof. Immediate from the definition. �

From Lemma 1 and Lemma 2 it follows that ⊆ is a preorder on set. In fact wewould like to have a partial order so we would additionally require antisymmetry:

X ⊆ Y → Y ⊆ X → X = Y. (3.1)

This, however, is a statement about the equality of sets, a notion that we shouldfirst make precise before continuing.

17


3.3 Extensionality

With any kind of theory it is useful to know when exactly two things are equal.Coq provides us with an inbuilt notion of equality, =, that has all the properties onewould expect from such a relation: it is an equivalence relation and we can use itto substitute equals for equals.

With respect to our sets we want something more, namely that two sets are equalexactly when they contain the same elements. In other words, our sets should beextensional. With the following axiom we augment Coq’s equality on the type setwith the capability to reason about elements.

Since we defined ⊆ in terms of ∈ we simply axiomatise the antisymmetry of⊆ (3.1), effectively turning ⊆ into a partial order on the type set. The inverseimplication X = Y → (∀x, x ∈ X ↔ x ∈ Y) follows trivially from the substitutionproperty of =.

Axiom 1 (Extensionality). Two sets X and Y are equal if they contain the sameelements, formally

(∀x, x ∈ X ↔ x ∈ Y)→ X = Y.

In the formalisation we have

Axiom set ext : forall X Y : set, X ⊆ Y→ Y ⊆ X→ X = Y.

It is interesting to compare the approach chosen here with the work of Barras[Bar10]. There we find an abstract equivalence relation instead of Coq’s inductiveequality. On the one hand this allows a greater number of possible models while onthe other it leads to a different form of the axiom of replacement (which we will seelater). On its own this alternate form of replacement is somewhat more complicatedthan our version, but it allows a very easy definition for the separation operation.For us, this is where a majority of the work has to be done. The price for workingwith an abstract equivalence is that compatibility of all the set constructions hasto be established explicitly. With Coq’s inbuilt equality this follows immediatelyfrom the definition of =.

Additionally, a set theory axiomatisation with an abstract equivalence can insome cases actually be implemented or modelled in Coq, (see Aczel’s sets-as-treesinterpretation [Acz78] and [Wer97] for such constructions). Since the addition ofGrothendieck universes later will in our case prevent such an endeavour we decidedto simply use =.

18

3.4. ∈-Induction & Regularity

3.4 ∈-Induction & Regularity

Sets are formed iteratively, so we can prove properties about all sets by inductionon membership. Suppose P is a property of sets. If we can prove P holds for Xfrom the inductive hypothesis that P holds for all members of X, then P must holdfor all sets.

Axiom 2 (∈-induction). The membership relation on sets satisfies the inductionprinciple

∀P : set→ Prop, (∀X : set, (∀x ∈ X, P x)→ P X)→ ∀X : set, P X.

In the formal development we have the following

Axiom In ind : forall P : set → Prop,( forall X : set, ( forall x, x ∈ X→ P x)→ P X)→forall X : set, P X.

The quantification over the predicate, or class, P places this statement firmly inthe domain of higher-order logic. In a first-order presentation this would be anaxiom schema, i.e., one axiom for each predicate P. In a first-order setting it ismore common, though, to use the logically equivalent axiom of regularity, namely

∀A : set, (∃B : set, B ∈ A)→∃B : set, B ∈ A ∧ (¬∃C : set, C ∈ A ∧C ∈ B),

which was presented by Zermelo as the axiom of foundation [Zer30]. It states thatevery non-empty set A contains an element B disjoint from A.

In our higher-order metatheory the induction principle shown above is the morepractical choice. Independently of the concrete form of the axiom we obtain theconsequence that no set is a member of itself and more generally the fact that themembership relation (∈) is well-founded, explaining the original name.

As it turns out, this axiom is not required for most of the material presentedhere, and neither was it part of Zermelo’s original theory [Zer08]. Only when weget to the construction of ordinals in the very end do we make use of the fact thattwo-cycles (two sets M and N containing each other) are prevented by this axiom.

3.5 The Empty Set

So far we have seen set membership, set inclusion, an induction principle and anotion of equality on sets, but up to this point we do not even know if our type setis inhabited, i.e., we have not yet claimed the existence of any actual sets. We willnow rectify this shortcoming by posing the existence of one fundamental set.

19


Axiom 3 (The empty set). There exists a set which does not contain any elements.We call this set the empty set and denote it by ∅. Formally we have

∀x : set, x < ∅.

On the Coq level this is reflected with two statements. The first introduces a newterm of type set and the second clearly pins down the behaviour of the new term.

Parameter Empty : set.Axiom EmptyE : forall x : set, x < Empty.

Now that we know of the existence of one set, all the previous definitions andaxioms start to become useful. In the following we will pose a number of opera-tions that build new sets from existing sets. This allows us to populate our type setfrom the ground up.

There is one aspect of the way the axiom above is presented that should give thecareful reader pause. We first claim that some set with a given property exists andthen immediately proceeded to talk about the set. Why is this ok? Especially sincethe informal, natural language usage of the notion of set allows us to express a setwith no elements in many different ways? Take the set of all €-banknotes with avalue strictly greater than 5 and strictly less than 10. Another example is the set ofall unicorns.

Traditionally, axioms like this were stated in first-order logic, and we could havetaken the same approach (in fact, even the axiom above starts with an existentialquantification). It is, however, much easier to talk about and read the definitionsand axioms when we assign a clear symbolic representation to the new construct.The justification of this relies on two principles we have already introduced. Thetype-theoretic choice principle allows us to extract an actual witness from an exist-ential quantification, while the axiom of extensionality then ensures the uniquenessof this witness. Why is that the case? Well, if we look at the actual definition, weobserve that it has the form ∀x, x ∈ Z ↔ P x for a particular P that characterisesZ. For the empty set, P is simply the constant-⊥ predicate so we only requiredthe forward implication, the other holds vacuously. The interesting part is thatthis characteristic property precisely describes the elements which the newly intro-duced set Z does contain. Hence all possible Z satisfying this definition containexactly the same elements, which by extensionality implies that they are in fact thesame unique set. This procedure is sometimes referred to as Skolemisation, andthe new parameters are called Skolem terms or Skolem constants.

Throughout the remainder of this work we will present most axioms and derivedconstructs in skolemised form. That is we first pose the existence of a new set-former (n-ary, type-theoretic function on type set) as a Parameter and then statea number of Axioms, which together reflect its characteristic, or defining property.Since it is easier to work with in the formal development, we prefer to decomposeequivalences into implications.

20

3.6. Unordered Pairs

When working with sets it is often useful to know whether a set is inhabited ornot, in much the same way we discussed with respect to types before. The easiestway to do this is to exploit the membership relation. We define the followingpredicate to express this.

Definition inhset B λ S: set . exists w, w ∈ S.

As an example, consider ¬ inhset ∅. This holds since assuming inhset ∅ imme-diately contradicts the axiom of the empty set. This is no surprise as the claimeffectively restates the axiom with (∀¬) replaced by the equivalent (¬∃). The ex-ample simply serves to illustrate our new definition. We also have the followingclassical result:

∀A : set, A = ∅ ∨ inhset A. (3.2)

Many of the results in the remainder depend on whether some of the sets in-volved are empty or not and, in the latter case, usually on witness of the inhab-itance. The property above tells us that we can safely perform case splits of thisform to establish the desired results. It is the most common form of classical reas-oning we will see in the context of sets. We have, of course, used Coq’s classicalreasoning facilities to establish (3.2) in the first place.

3.6 Unordered Pairs

We are going to need more than just one set and the intuitive next step is to put twothings together. So assume we are given two sets y and z, then we can construct anew set which we call an unordered pair.

Axiom 4 (Pairing). For all sets y and z there exists a set containing exactly y andz as elements. We call this set the unordered pair of y and z and denote it by {y, z}.Formally we have

∀x : set, x ∈ {y, z} ↔ x = y ∨ x = z,

which is coded in Coq as

Parameter UPair : set→ set→ set.Axiom UPairI1 : forall y z : set, y ∈ (UPair y z).Axiom UPairI2 : forall y z : set, z ∈ (UPair y z).Axiom UPairE : forall x y z : set, x ∈ (UPair y z) → x = y ∨ x = z.

Note that the axioms on the Coq level can be understood, as the terminologysuggests, as introduction and elimination rules with respect to membership in thenewly defined set {y, z} and we often use the axioms in this fashion. In this case wehave also simplified the introduction rules with respect to the substitution propertyof =, i.e., we used that x = y→ x ∈ {y, z} is equivalent to plain y ∈ {y, z}.

Now that we are able to produce sets which actually do contain elements it isa good time to verify that our unordered pairs are, as the name suggests, actuallyunordered.

21


Theorem 3. The axiomatic pairing of sets a and b is agnostic with respect to theirordering, formally

{a, b} = {b, a}.

Proof. Using set extensionality it is sufficient to show that {a, b} ⊆ {b, a}, as theother direction is symmetric. Taking an element x ∈ {a, b} and employing UPairEwe know that either x = a or x = b. In the first case, we have x ∈ {b, a} dueto UPairI2, while in the second the same result is obtained with UPairI1, thuscompleting the proof. �

3.7 Unions of Sets

So far we can produce sets which have exactly zero, one3 or two elements, regard-less of how we twist the machinery already defined. This is not good enough andwe would like to combine or merge smaller sets to obtain bigger sets of arbitrarysize. This is achieved by forming the union of sets. The reader will likely be fa-miliar with the binary union operator. Theoretically we could use it to express theunion of any finite number of sets and with the help of the other axioms we couldeven deal with some infinite unions but this is unnecessarily complicated. Insteadof providing a union over two sets we will axiomatise an operator that performsa union over an arbitrary collection of sets. To some degree this can be seen as amore uniform or generic way to deal with this kind of construction.

A collection is, just like everything else, a set whose elements are sets them-selves. The terminology is meant to highlight that the union operation mainly af-fects the elements of the argument and more importantly their respective elements.With this in mind we pose the following.

Axiom 5 (Union). Given a collection of sets X, there exists a set whose elementsare exactly those which are a member of at least one of the sets in the collection X.We call this set the union over X and denote it by

∪X. Formally we have

∀x : set, x ∈∪

X ↔ ∃Y ∈ X, x ∈ Y,


Parameter Union : set→ set.Axiom UnionI : forall X x Y : set, x ∈ Y→ Y ∈ X→ x ∈ (Union X).Axiom UnionE : forall X x : set, x ∈ (Union X)→

exists Y : set, x ∈ Y ∧ Y ∈ X.

It should be clear that the binary union falls out as a special case but we willdefer the formal construction until we have seen all the axioms.

3An unordered pair of the form {a, a}, more on that in Section 4.1.

22

3.8. Powerset

3.8 Powerset

We are able to express that some set is a subset of another and all the sets we haveseen so far have at least two such subsets: ∅ and the set itself; ∅ is of course anexception which is itself its sole subset. It would be very useful if we could formthe set of all subsets of a set. This is known as forming (or taking) the powerset ofa set and we postulate the existence of this operation.

Axiom 6 (Powerset). Given a set X, there exists a set which contains as its elementsexactly those sets which are the subsets of X. We call this set the powerset of X anddenote it by P(X). Formally we have

∀Y : set, Y ∈P(X)↔ Y ⊆ X,


Parameter Power : set→ set.Axiom PowerI : forall X Y : set, Y ⊆ X→ Y ∈ (Power X).Axiom PowerE : forall X Y : set, Y ∈ (Power X)→ Y ⊆ X.

This axiom is special in that it produces “large” sets at a much higher rate thanwhat we have seen so far. If we take a set S with N elements, than P(S ) will have2N elements, i.e., the sets grow exponentially. Later we are going to provide setswith certain closure properties and to comprehend the “size” of these it helps tokeep this growth rate in mind.

3.9 Replacement

We are working in the context of type theory where higher-order functions are notuncommon. One such function is map which lifts functions on particular objectsto lists of such objects, i.e., given a function F that can transform objects of onetype to other objects of a possibly different type we can substitute every element ina list by its image under F. Can we do something similar with sets?

We consider mapping a function F over a set X, i.e., we are going to apply F toevery element of X. Since the elements of X have type set we know that we haveto restrict the argument type of F to set. Also, if the collection of things we obtainfrom F should form a set, the result type should be set as well. In conclusion weare looking for an F : set → set, i.e., a unary set former. We now pose that whatwe get back after this mapping operation is still a set.

Axiom 7 (Replacement). Given a unary set former F and a set X, there exists aset which contains exactly those elements obtained by applying F to each elementin X. We denote this construction with {F x | x ∈ X}. Formally we have

∀y : set, y ∈ {F x | x ∈ X} ↔ ∃x ∈ X, y = F x,


23


Parameter Repl : (set→ set) → set→ set.Axiom ReplI : forall X : set, forall F : set → set, forall x : set,

x ∈ X→ (F x) ∈ (Repl F X).Axiom ReplE : forall X : set, forall F : set → set, forall y : set,

y ∈ (Repl F X)→ exists x : set, x ∈ X ∧ y = F x.

This axiom was a later addition by Fraenkel in 1922 [Fra22]. He noticed an upto that point undiscovered gap in Zermelo’s foundational set theory that could beclosed with the addition of a new axiom4. In short, certain large but well-behavedsets could not be formed in Zermelo’s axiomatisation. In place of the higher-orderfunction argument in our type-theoretic approach the original text employs therather vague notion of replacing an element of a set with another thing from thebasic domain5. One should also mention that the same issue was discovered andrepaired, also in 1922, by Skolem [Sko22] who later established that his solutionwas equivalent to Fraenkel’s (pp. 7 - 9, [Sko29]).

It is interesting to compare our replacement axiom with the one found in [Bar10].In our case we perform the substitution with a (total) function of type set → setwhile Barras takes a functional relation of type set→ set→ Prop. The importantthing here is that his relations are not required to be total, thus admitting a trivialdefinition of separation. On the downside, Barras has to deal with the fact that theprovided relation may not be a morphism with respect to his abstract equality. Thisleads to extra side conditions in his replacement axiom.

As it turns out this new axiom, in combination with the axiom of the empty set,implies and hence supersedes Zermelo’s axiom of separation. Based on this we donot include the latter in our axiomatisation and instead present the separation oversets as a formally derived construction in Section 4.4.6

3.10 Grothendieck Universes

We have finally arrived at the point where we are departing considerably fromordinary ZFC. We are postulating the existence of a sequence of sets with certainclosure properties. Even the smallest (nontrivial) of these will be considerablybigger than anything we have seen so far. Due to this and the role they usually playin formal developments we are going to call them universes.

The definition of these universes relies on the notion of a transitive set, whichwe need to make precise. We have seen a transitive relation before, namely ⊆, but

4Original: “Diese bisher nicht bemerkte Lucke in der Zermeloschen Begrundung ist durch Hin-zufugen eines neuen Axioms [. . . ] auszufullen” (p. 231, [Fra22])

5Original: “. . . und wird jedes Element von M durch ein ‘Ding des Bereichs B’ [. . . ] ersetzt, . . . ”(p. 231, [Fra22])

6There seems to be no canonical source for this result and the evident redundancy of having bothreplacement and separation as axioms is considered folklore, at least according to private commu-nication with Prof. Martin Hyland, DPMMS, Cambridge.

24

3.10. Grothendieck Universes

despite the same name and a similar logical structure of the definition, a transit-ive set should not be confused with a transitive relation. We say that a set U istransitive, when the following property holds:

∀x X : set, x ∈ X → X ∈ U → x ∈ U.

Now a Grothendieck universe is a transitive set which is closed under the setoperations of unordered pairing, power set, union and replacement.7 We will nowpostulate the existence of a sequence of such universes.

Axiom 8 (Grothendieck Universes). For every set X there exists a least transitiveset containing X which is closed under the basic set operations, namely pairing,power set, union, and replacement. We denote these universes by GUX . In Coq weencode this as

Parameter GU : set→ set.Axiom GUIn : forall N : set, N ∈ (GU N).Axiom GUTrans : forall N X Y : set,

X ∈ (GU N)→ Y ∈ X→ Y ∈ (GU N).

The following Coq statements establish the closure properties of GU.

Axiom GUUPair : forall N X Y : set,X ∈ (GU N)→ Y ∈ (GU N)→ (UPair X Y) ∈ (GU N).

Axiom GUPower : forall N X : set,X ∈ (GU N)→ (Power X) ∈ (GU N).

Axiom GUUnion : forall N X : set,X ∈ (GU N)→ (Union X) ∈ (GU N).

Axiom GURepl : forall N X : set, forall F : set → set, X ∈ (GU N)→( forall x : set, x ∈ X→ (F x) ∈ (GU N))→ (Repl F X) ∈ (GU N).

Clearly, GUX is a Grothendieck universe by construction. To be able to talk aboutthe Grothendieck universe containing X we are also going to enforce minimalitywith respect to any other Grothendieck universe U containing X.

Axiom GUMin : forall N U : set, N ∈ U→ ( forall X Y : set, X ∈ U→ Y ∈ X→ Y ∈ U)→ ( forall X Y : set, X ∈ U→ Y ∈ U→ (UPair X Y) ∈ U)→ ( forall X : set, X ∈ U→ (Power X) ∈ U)→ ( forall X : set, X ∈ U→ (Union X) ∈ U)→ ( forall X : set, forall F : set → set, X ∈ U→

( forall x : set, x ∈ X→ (F x) ∈ U)→ (Repl F X) ∈ U)→ (GU N) ⊆ U.

7The definition can equivalently be given in terms of closure under the union over a family ofindexed sets, in place of union and replacement. In our formalisation this alternative set operation isa derived construction we will see in Section 4.3.

25


It is interesting to take a brief detour and look at the history of this axiom andhow we arrived at the present formulation of Tarski-Grothendieck set theory. Tarskiobserved that ZFC cannot prove the existence of strongly inaccessible cardinals(§2, [Tar38]). He assumed the existence to be undecidable and it is now knownto be independent of ZFC. He rectified this shortcoming by adding his Axiom A.Apart from being able to construct inaccessible cardinals, he observes that the newaxiom allows to derive several of the original axioms, most notably the axiom ofchoice.

Motivated by the simplification of his work in algebraic geometry and cat-egory theory Grothendieck defined the notion of universes we have seen above[Gab62, Kru65, GV72]. It turned out that these Grothendieck universes are re-lated to the existence of inaccessible cardinals [Wil69], to be precise, “the set ofordinals of a Grothendieck universe is an inaccessible cardinal” (p. 22, [Bar12])8.In fact we can use the formulation with Grothendieck universes to obtain a settheory that is equivalent to Tarski’ ZFC+A, although without the need to havean axiomatised notion of cardinal numbers. This is the approach that Barras haschoseen [Bar10, Bar12] and that we follow as well. For comparison, the variantwith Tarski’s Axiom A is the formalism used in the Mizar project [Try90].

One may ask, why we include such a powerful construction into our formalisa-tion, apart from the fact that Barras is following the same course of action. Thecrucial aspect is the correspondence to inaccessible cardinals. In his work, Wernerdemonstrates that the minimally required number of inaccessible cardinals and thenumber of type universes interleave when modelling ZFC in CiC and vice versa[Wer97]. With the definitions above, primarily with GUIn, we have introduced aninfinite number of Grothendieck universes into our theory which should be morethan sufficient to interpret the full, infinite type hierarchy of CiC.

3.11 Infinity

Normally, ZFC also comes with an axiom of infinity, i.e., there exists a set withinfinitely many elements. In our case, this is a corollary. We can explicitly nameone infinite set: GU∅.

Corollary 1 (Infinity). There exists an infinite set, namely GU∅.

Proof. We will show that our candidate is infinite by exhibiting a function on itwhich is injective but not surjective. We have already seen the function on setswhich we will use here: P(X), the powerset construction.

Let us first check that we really have a function on GU∅. Take any A ∈ GU∅, thenP(A) ∈ GU∅ by GUPower. Next we have to establish that P(X) is not surjective,i.e., we have to provide an element in GU∅ that is never produced by the powerset

8The correspondence given here relies on the presence of the axiom of choice, which is satisfiedin our setting.

26

3.12. Choice

..

T

.

X1

.

Xn

.

S 1

.

D(X1)

.

D(Xn)

Figure 3.1: Illustration of Zermelo’s axiom of choice

construction. Take ∅ where we have ∅ ∈ GU∅ from GUIn. Now it should be clearthat ∅ ⊆ A for all sets A and hence, by PowerI, ∅ ∈ P(A). This tells us that wemust have P(A) , ∅ for all A, since otherwise we would obtain the contradiction∅ ∈ ∅. So we have established that the powerset construction is not surjective onGU∅, but what about injectivity?

Assume P(A) = P(B), then we have to show A = B, which can be reducedto A ⊆ B ∧ B ⊆ A. We are going to show the left conjunct, the other half issymmetric. We clearly have A ⊆ A, and therefore we obtain A ∈ P(A) usingPowerI. Our assumption now tells us that we also have A ∈P(B), from which wecan infer the desired result with PowerE. This establishes the injectivity of P(X)and completes the proof. �

3.12 Choice

We remarked before that we are axiomatising ZFC together with Grothendieckuniverses but so far we are missing a set-theoretic choice principle. In its originalform this axiom was stated as follows:

For every collection T of non-empty, disjoint sets, there exists at least one subsetof S 1 ⊆

∪T such that each set in T has exactly one element in common with S 1.

This can alternatively be seen as the ability to choose one particular element froma number of sets and combine these choices into a single set.9

9Original: “Ist T eine Menge, deren samtliche Elemente von 0 verschiedene Mengen und un-tereinander elementarfremd sind, so enthalt ihre Vereinigung ST mindestens eine Untermenge S 1,welche mit jedem Element von T ein und nur ein Element gemein hat. [. . . ] Man kann das Ax-iom auch so ausdrucken[,] daß man sagt, es sei immer moglich, aus jedem Element M,N,R, · · ·von T ein einzelnes Element m, n, r, · · · auszuwahlen und alle diese Elemente zu einer Menge S 1 zu

27


We do already posses the choice principle introduced in Section 2.4 and it isstrong enough to give us a choice principle or “axiom” on the level of set theory.It is in general easier, though, to work with the type-theoretic choice principle inthe formalisation so the material presented in this section is neither part of theaxiomatisation itself nor is it actually formalised. Regardless of these aspects it isstill instructive to see how the two notions of choice relate.

Figure 3.1 illustrates Zermelo’s form of the axiom. We have taken the liberty tomark the chosen “representative” for each Xk as the result of a metafunction D thatis capable of selecting default elements for arbitrary, non-empty sets. The existenceof such a D is what lies at the heart of the axiom and, as we will see shortly, arisesfrom our type-theoretic choice principle.

We have introduced the empty set, ∅, above which guarantees that our type setis inhabited, i.e., we have ∅ : set ` inhabits ∅ : inhabited set. With this we candefine an ε-operator at the type set, namely εset B ε (inhabits ∅).

This choice operator at type set now enables us to define the metafunction D :set→ set as follows:

D(X) B εset(λx : set. x ∈ X). (3.3)

Additionally, whenever we have inhset X, ε spec allows us to conclude D(X) ∈X. Zermelo’s choice set S 1 can then be defined with a replacement operation overT :

S 1 B {D(X) | X ∈ T }.

We, of course, have to verify that we actually constructed the set requested by theoriginal axiom. Since the following result is only established on paper, we take theliberty to write X∩S 1 = {x} as a short hand notation for ∀y, y ∈ X∧y ∈ S 1 ↔ y = x.

Theorem 4. For any collection T of non-empty, disjoint sets, the set S 1 B{D(X) | X ∈ T }, where D is defined in (3.3), satisfies Zermelo’s properties of thechoice set, namely

1. S 1 ⊆∪

T,

2. ∀X ∈ T, ∃x, X ∩ S 1 = {x}.

Proof. We show that S 1 B {D(X) | X ∈ T } has the required properties to be con-sidered a choice set with respect to Zermelo’s original formulation.

1. Assume an arbitrary z ∈ S 1, then ReplE tells us that there is an X ∈ Tsatisfying D(X) = z. Since X is non-empty we have D(X) ∈ X and thereforez ∈ X. UnionI now allows us to conclude z ∈ ∪T , as required.

vereinigen.” (p. 266, [Zer08])

28

3.12. Choice

2. Here the existential witness can only be D(X) reducing our goal to X ∩ S 1 =

{D(X)}. We show inclusion both ways. First, it is easy to establish thatD(X) ∈ X since X is non-empty and D(X) ∈ S 1 only relies on the given factthat X ∈ T , thus establishing {D(X)} ⊆ X ∩ S 1.

The other direction is slightly more involved. We assume some x ∈ X ∩ S 1,i.e., x ∈ X and x ∈ S 1 and have to establish x = D(X). Since x is a memberof the choice set we know that there has to be a Y ∈ T satisfying x = D(Y).We now establish X = Y by contradiction. Assume X , Y and also note thatsince Y is non-empty we have D(Y) ∈ Y and therefore x ∈ Y . This, however,contradicts X and Y being disjoint, hence X = Y and x = D(X). We haveestablished X ∩ S 1 ⊆ {D(X)} which completes the proof of the equality.

�

29

CHAPTER 4

Development of the TG set theory

There are a number of derived constructions that show up repeatedly or are ingeneral considered to be an integral part of a set theory and hence deserve theirown name and/or notation. Providing these and establishing that their associatedproperties hold will get us from a basic axiomatisation to a theory that is actuallyuseful. The first few are fairly simple and are basically just short hand notationsfor useful sets, while the later constructions will become quite involved.

4.1 Singleton and Basic Sets

First of all we will look at singleton sets, i.e., sets containing exactly one element.When we introduced unordered pairs we never considered the scenario where thetwo components of the pair are equal. Any set theory requires that sets do not con-tain duplicates, hence informally {x, x} and {x} should denote the same singleton setcontaining exactly x. This basically motivates the following definition of singletonsets:

{x} B {x, x},or respectively in Coq

Definition Sing : set → set B λ X : set . UPair X X.

The defining property of singleton sets is

∀x : set, x ∈ {y} ↔ x = y.

In the formal development we have these two corresponding lemmas:

31

Chapter 4. Development of the TG set theory

Lemma SingI : forall X, X ∈ (Sing X).Lemma SingE : forall X Y, Y ∈ (Sing X)→ Y = X.

They are easily derivable as special instances of the respective rules for un-ordered pairs. It should be clear that it is equally straightforward to ascertain theclosure of GUN under singleton formation.

Apart from ∅ there are two other sets that will later play a special role in the the-ory and that we can now define, namely {∅} and {∅, {∅}}. The reader may recognisethese as the common encoding of the first and second ordinal number (with ∅ beingthe zeroth) and we will accordingly refer to them as 1 and 2, respectively. We post-pone a general discussion of ordinals to the point where we construct the naturalnumbers. We observe that ∅ ∈ 1 ∈ 2 (also ∅ ∈ 2) as well as ∅ ⊆ 1 ⊆ 2. In fact itis no coincidence but a carefully balanced design decision to have x ∈ y ↔ x ( yfor x, y ∈ {∅, 1, 2}. The reason for introducing them here is that 2 can be viewedas a type for propositions with ∅ and 1 representing the false and true proposition,respectively. We will come back to this idea when we deal with functions andfunction spaces. First however it is instructive to see how the various axioms andconstructions behave with respect to these particular sets.

Let us look at inhabitance first. Here we quickly obtain inhset 1 and inhset 2. Toprove these we can use ∅ as the required witness in both cases.

When we consider set inclusion it is useful to note that the subset of a singletonset is either ∅ or the singleton set itself, and in particular

A ⊆ 1→ A = ∅ ∨ A = 1. (4.1)

It is worth pointing out that these results are classical, i.e., we crucially rely onthe law of the excluded middle to prove this.

Meanwhile, with respect to 2, two somewhat related results are useful as well.Assume we know S ∈ 2 for some S . Then we can use an additional piece ofinformation to establish that S = 1 or equivalently ∅ ∈ S , depending on the cir-cumstances. The two formal lemmas are

∀E O : set, E ∈ O→ O ∈ 2→ E = ∅ (4.2)

and∀S ∈ 2, inhset S → S = 1.

The somewhat asymmetric shapes of the two statements are due to the way theyare usually employed. The first is mostly used in forward reasoning to augment anexisting context with additional information, while the second is commonly usedin a backwards fashion and attempts to produce the simplest possible subgoals formost scenarios.

What happens when we take the union over one of our predefined sets, namely,what are the sets

∪ ∅, ∪ 1, and∪

2? As it turns out ∅, ∅ and 1 respectively but letus consider the cases carefully, one at a time.

32

4.1. Singleton and Basic Sets

Theorem 5. We have the following properties of the union operation with respectto the defined sets ∅, 1 and 2:

1.∪ ∅ = ∅

2.∪{X} = X

3.∪

1 = ∅

4. X ∈ 2 −→ ∪ X = ∅

5.∪

2 = 1

Proof. We prove the five results in order.

1. To show that some set is equal to the empty set it is sufficient to assume thatwe have a member of the set in question and then produce a contradiction.This technique is a one-sided form of set extensionality, the other half isalways satisfied since ∅ is included in every set.

Now we assume some x ∈ ∪ ∅, but then UnionE tells us that we have, forsome Y , x ∈ Y and moreover Y ∈ ∅. The latter is the desired contradiction.

2. We use set extensionality for this. First we show X ⊆ ∪{X}, i.e., for a givenx ∈ X we establish x ∈ ∪{X}. From SingI we know that X ∈ {X} and fromthis and x ∈ X we can derive the desired result with the help of UnionI.

For the other direction,∪{X} ⊆ X, we start with the elimination principle

for unions and obtain, for some Y and x, x ∈ Y and Y ∈ {X}. Elimination ofsingletons allows us to conclude Y = X from the latter, and therefore x ∈ Xas required.

3. This is simply a special case of the second property where we have instanti-ated X with ∅.

4. This fact is an immediate corollary of the definition of 2 and properties oneand three.

5. This last one is slightly more involved. When exactly is a set equal to theset 1? It has to satisfy two criteria. Firstly, it must contain ∅ and secondly itshould only contain ∅. The first is easy, as ∅ ∈ 1 and 1 ∈ 2, but what aboutthe second? Assume we have some arbitrary x ∈ ∪ 2, then from UnionE weobtain x ∈ Y and Y ∈ 2 for some Y . At this point we can exploit result (4.2),to ensure that x = ∅, as required.

�

With respect to the powerset we can state and prove the following two interestingresults.

33


Theorem 6. We have the following properties of the powerset operation with re-spect to the defined sets ∅, 1 and 2:

1. P(∅) = 1

2. P(1) = 2

Proof. We prove the results in order.

1. We follow the same scheme we used before to establish that a set is equalto 1. We noted before that ∅ is a subset of every set, and concretely ∅ ⊆ ∅,from which ∅ ∈ P(∅) follows with PowerI. Next we assume an arbitraryelement X ∈ P(∅) or, via PowerE, X ⊆ ∅. This of course implies X = ∅,and together the two results tell us that P(∅) = 1.

2. For this proof we use set extensionality. The easier of the two is 2 ⊆ P(1).2 contains the sets ∅ and 1 and for each it is easy to establish that they arecontained in P(1) using PowerI. For the first we again appeal to the fact that∅ is a subset of everything, while for the second we can exploit the reflexivityof ⊆.

The more interesting direction is P(1) ⊆ 2 since it only holds classically.We assume an X ∈ P(1) and hence X ⊆ 1. Intuitionistically, this is all wecan say, but classically we know that X = ∅ ∨ X = 1 (see result (4.1)) andfrom there the proof is trivial.

�

To complete this section we look at the axiom of replacement, where we haveanother interesting result. It is useful in several cases.

Theorem 7. Applying the replacement operation with an arbitrary metafunctionF on ∅ always yields ∅, formally

{F y | y ∈ ∅} = ∅.

Proof. If we assume an x ∈ {F y | y ∈ ∅} then ReplE tells us that we must have ay ∈ ∅ with F y = x, but the existence of such a y is a contradiction, which in turnmakes it impossible for the x to exist. Hence the result of the replacement is ∅. �

34

4.2. Binary Unions

UnionIa ∈ A

UPairI1A ∈ {A, B}

a ∈ A ∪ BUnionI

b ∈ BUPairI2

B ∈ {A, B}b ∈ A ∪ B

Figure 4.1: Derivations of BinUnionI1 and BinUnionI2

4.2 Binary Unions

We indicated before that we can have the binary union of two sets A and B, writtenA ∪ B, as a derived construct in the present axiomatisation. We make this concretehere with the following definition:

A ∪ B B∪{A, B}.

In Coq we have

Definition BinUnion : set → set→ set B λ A . λ B . Union (UPair A B).

The defining property of the binary union is

∀x : set, x ∈ A ∪ B↔ x ∈ A ∨ x ∈ B,

which decomposes into a number of lemmas in the formalisation.

Lemma BinUnionI1 : forall A B a: set, a ∈ A→ a ∈ BinUnion A B.Lemma BinUnionI2 : forall A B b: set, b ∈ B→ b ∈ BinUnion A B.Lemma BinUnionE : forall A B x, x ∈ BinUnion A B→ x ∈ A ∨ x ∈ B.

The first two are basically pieced together from the corresponding introductionaxioms of the constituent components of the definition. The derivations are out-lined in Figure 4.1. For the third lemma we reason similarly but the consequenceof UnionE is an existentially quantified expression that we need to take apart form-ally. We consider an existing Y ∈ {A, B} such that x ∈ Y . Using UPairE on thefirst component yields Y = A ∨ Y = B, which can be combined with the secondcomponent to produce the desired goal.

The fact that our binary union is assembled from constructs under which ourGrothendieck universes are closed suggests that this property might also hold forthe new construct. Let us go ahead and prove this.

Lemma 3 (GU closed under Binary Union). Given any Grothendieck universe ofthe form GUN and a two sets A and B contained in this universe, then their binaryunion A ∪ B is also contained in GUN .

Proof. We know that GUN is closed under pairing, hence we have {A, B} ∈ GUN .From this and the closure property under taking the union we obtain

∪{A, B} ∈GUN , that is A ∪ B ∈ GUN as required. �

35


UnionI

ReplIx ∈ X

Fx ∈ {F x | x ∈ X} y ∈ Fx

y ∈∪x∈X

Fx

Figure 4.2: Derivation of FamUnionI

4.3 Union over a Family of Indexed Sets

We have seen the union over an arbitrary collection of sets. In many situations,however, these collections are not arbitrary but exhibit some underlying structure,i.e. all the elements of the collection are similar but depend on a particular in-dex, which is again a set taken from a so-called index set. We are looking at setexpressions of the form

Fx ∪ Fy ∪ Fz ∪ . . . ,where x, y, z . . . are elements in some set X. In this special case we call F a familyof sets indexed by X and formally write the union over this family as∪

x∈X

Fx =∪{Fx, Fy, Fz, . . .}.

The right hand side of this equation already suggests a way to define this usefulconstruct with the help of our existing axiomatic constructions. There is nothingspecial about the index set but for the family we will use a metafunction, that isan F : set → set, which takes the index to the corresponding member of thefamily. In this setup, the set {Fx, Fy, Fz, . . .} can be produced with the replacementoperation, mapping F over the index set X. Hence our formal definition is∪

x∈X

Fx B∪{F x | x ∈ X}.

We are going to stick with the subscript notation for the indexing but we shouldkeep in mind that under the hood we are working with function applications. Thecorresponding definition in the formalisation is

Definition FamUnion : set→ (set → set) → set B λ X F . Union (Repl F X).

Mathematically we expect the new construct to satisfy its defining property,namely

∀y : set, y ∈∪x∈X

Fx ↔ ∃x ∈ X, y ∈ Fx,

which is achieved by proving the following lemmas in Coq:

Lemma FamUnionI : forall X F x y, x ∈ X→ y ∈ (F x) → y ∈ (FamUnion X F).Lemma FamUnionE : forall X F y, y ∈ (FamUnion X F)→ exists x, x ∈ X ∧ y ∈ (F x).

36

4.3. Union over a Family of Indexed Sets

For the derivation of FamUnionI we refer to Figure 4.2. The elimination lemmahas an assumption which hides two existential quantifications, one for the unionoperation and one for replacement. The following proof sketch illustrates how wedeal with these.

We assume some y ∈ ∪x∈X Fx and hence y ∈ Y for some Y ∈ {F x | x ∈ X}. Thisin turn tells us that we have Y = Fx for some x ∈ X. From this it follows that thereis some x ∈ X such that y ∈ Fx, justifying the conclusion of the lemma.

As with the binary union we have only used components for which our universesare closed, again suggesting a corresponding closure lemma.

Lemma 4 (GU closed under Union over Families of Indexed Sets). Given anyGrothendieck universe of the form GUN , a set X contained in this universe and anindexed family F satisfying ∀x ∈ X, Fx ∈ GUN , then the union over the family,∪

x∈X Fx is also contained in GUN .

Proof. The proof goes as before using GURepl and GUUnion. The given premisesallow us to conclude {F x | x ∈ X} ∈ GUN from which

∪{F x | x ∈ X} ∈ GUN fol-lows. The latter is the desired result,

∪x∈X Fx ∈ GUN . �

For this slightly more involved construction we have a number of interestingresults which derive from our knowledge about the underlying components.

Theorem 8. Properties of the union over families of indexed sets.

1.∪

x∈∅ Fx = ∅

2. (∀x ∈ X, Fx ∈ 2) −→ (∃x ∈ X, Fx = 1) −→ ∪x∈X Fx = 1

3. inhset X −→ (∀x ∈ X, Fx = C) −→ ∪x∈X Fx = C

4. (∀x ∈ X, Fx = ∅) −→∪

x∈X Fx = ∅

5. (∀x ∈ X, Fx ∈ 2) −→ ∪x∈X Fx ∈ 2

Proof. We prove the five results in order.

1.∪

x∈∅ Fx =∪{Fx | x ∈ ∅} =

∪ ∅ = ∅. The second to last equality followsfrom Theorem 7, while the last is Theorem 5, Part 1.

2. Here we again have to establish that a particular set equals 1 so we employour tried and tested method for these cases. The second assumption tellsus that there is an x ∈ X such that Fx = 1, and therefore ∅ ∈ Fx. FromFamUnionI we conclude ∅ ∈ ∪x∈X Fx.

Now consider some arbitrary y ∈ ∪x∈X Fx, i.e., there is some x ∈ X such thaty ∈ Fx. For this particular x our first assumption tells us that Fx ∈ 2. Nowthe chain y ∈ Fx ∈ 2 is an instance of (4.2), which tells us that y = ∅ andcompletes the proof.

37


3. For this result itself we need set extensionality as C is arbitrary. First weassume some arbitrary y ∈ ∪x∈X Fx, i.e., we have y ∈ Fx for some x ∈ X.Our second premise now tells us that Fx = C and hence y ∈ C, giving∪

x∈X Fx ⊆ C. Conversely we have some y ∈ C and can use the first premiseto get hold of some arbitrary x ∈ X. For this x the second premise tells usthat Fx = C. Hence we can conclude y ∈ Fx. At this point, all we need to dois use FamUnionI to obtain y ∈ ∪x∈X Fx and hence C ⊆ ∪x∈X Fx.

4. We simply perform a case split on whether X is empty or inhabited. In thefirst case we have exactly Part 1 of this theorem, while in the second we havean instance of Part 3.

5. The premise tells us that all Fx are members of 2, i.e., they are all either ∅ or1. Now we split on whether there is at least one Fx = 1. If this is the casewe can use Part 2 to establish our goal. Otherwise we know that all Fx = ∅which gives us an instance Part 4, again proving our claim.

�

The significance of these results comes to light when we view ∅ as logical falsityand 1 as truth, as indicated earlier. In this light, Part 2 and Part 4 reflect existentialreasoning. The metafunction F plays the role of the predicate and the index setthat of the argument domain. An existential quantification is true if there exists awitness, which is expressed by Part 2, and it is false if the predicate fails on allarguments, reflected in Part 4.

Finally, Part 5 will be relevant when we get to the construction of functionspaces, where it plays a major role in establishing a desired closure property. Thisin turn will allow us to interpret the impredicative universe Prop correctly in ourfuture model construction. We will continue this investigation once we define func-tions later.

4.4 Separation

In the introduction and in Section 3.12 we mentioned that the naive approach to settheory, i.e., in the tradition of Cantor [Can74], leads to sets which are “too big”.In other words, there are classes P where the unrestricted comprehension {x | P x}does not qualify as a set.

The canonical example is the Russell class P B λz, z < z. If we assume unres-tricted comprehension we can form the set R B {x | P x} = {x | x < x}, i.e., the setof all sets that do not contain themselves [Rus02]. The contradiction now ariseswhen we ask whether R does contain itself, leading to the paradoxical

R ∈ R↔ R < R.

38

4.4. Separation

Another class that cannot be encode is the universal class, i.e., there is no set ofall sets. We will justify this statement later, once we have derived the correctnessof a restricted form of comprehension.

To motivate the construction of separation let us first consider how we would goabout deriving comprehension and where we get stuck. Assume we want to definea set Y in terms of a given defining predicate P with comprehension. That is, Yshould be defined such that it exactly contains those elements satisfying P, i.e.,mathematically

∀x : set, x ∈ Y ↔ P x.

When we abstract over Y we can define a new predicate Q, thus

QP B λZ : set. ∀x : set, x ∈ Z ↔ P x. (4.3)

At this point we recall our choice operator εset from Section 3.12 and suggestthe following definition:

Y B εset QP,

where QP only depends on the defining property P and εset QP is the comprehen-sion construction. Summing up, we have defined the set Y to be one of those setswhich satisfy the defining property. There may be none, so up to this point theconstruction is completely vacuous. What we are really after is QP(Y), i.e., the factthat our newly defined set really does satisfy its specification. The means to provethis is ε spec, which reduces the goal to

∃Z : set, QP(Z) = ∃Z : set, ∀x : set, x ∈ Z ↔ P x.

This however will not be provable for all predicates P as the Russell predicateillustrates. We observe, though, that for those cases where it is provable, the definedset is unique due to extensionality.

There is a way to make a fairly similar construction work for all P if we assumethe axioms of the empty set and the axiom of replacement. The idea is to strengthenthe comprehension predicate QP from (4.3) with a bounding set X as follows:

Q′P,X B λZ : set. ∀x : set, x ∈ Z ↔ x ∈ X ∧ P x.

When we write εset Q′P,X in set builder notation we get {x ∈ X | P x}. This iscommonly known either as restricted comprehension or as separation on sets andit allows us to generate a subset of X where all elements satisfy a given predicateP. In contrast to the unrestricted form, this is known to be well behaved and oftentaken as axiomatic (e.g., Axiom III in [Zer08]). In our case we are able to define theconstruction and follow the approach outlined above. This leads to the followingdefinition in the formalisation:

Definition Sep : set → (set → Prop)→ set B λ X P .ε ( inhabits Empty) (λ Z . forall x, x ∈ Z↔ x ∈ X ∧ P x).

39


We have claimed that this construction “is ok” and we will now make this state-ment concrete, i.e., we prove that every set build with this set former actually sat-isfies its specification.

Theorem 9 (Definition of Separation is correct). For all bounding sets X and forall predicates on sets P, the set Sep X P, mathematically {x ∈ X | P x}, is exactly thesubset of X where all elements satisfy P, formally

∀x : set, x ∈ {x ∈ X | P x} ↔ x ∈ X ∧ P x.

Proof. We have used εset to define {x ∈ X | P x} and following the argument aboveit is sufficient to show

∃Z, ∀x, x ∈ Z ↔ x ∈ X ∧ P x.

We will explicitly construct Z by mapping a function F over X that will act likethe identity on those x which do satisfy the property and replace all others by aparticular default element xdef for which the predicate P is known to hold. Thisalready highlights the first problem, namely the case where no such xdef exists. Wedeal with this corner case first.

Case 1: ¬∃x ∈ X, P x. It is clear that in this case ∅ is the witness we require,reducing the goal to

∀x, x ∈ ∅ ↔ x ∈ X ∧ P x,

where one direction is trivial while the other produces a contradiction with ourassumption.

Case 2: For the remainder of the proof we can now assume that we have xdef ∈ Xavailable and also P xdef. All the ingenuity now goes into the construction of theappropriate function to map over X:

F x B εset (λy⇒ P x ∧ x = y ∨ ¬P x ∧ xdef = y).

We have again employed the same trick that we used to define Sep for this auxil-iary function, and of course we obtain a similar obligation before we can continue,namely

∀x, P x ∧ x = F x ∨ ¬P x ∧ xdef = F x (4.4)

which, by the property of ε, again depends on an existence proof for an arbitrary x:

∃y, P x ∧ x = y ∨ ¬P x ∧ xdef = y.

At this point we can simply do a case split on whether P x holds or not. If itdoes, x will satisfy the goal, and if it does not we can use xdef instead. We willshortly make use of two facts that follow from (4.4), namely

∀x, P x→ x = F x (4.5)

40

4.4. Separation

and∀x, ¬P x→ xdef = F x. (4.6)

With this additional structure we can now get back to our main goal and use theset {F x | x ∈ X} as our witness. The goal then becomes

∀x, x ∈ {F x | x ∈ X} ↔ x ∈ X ∧ P x.

The backward direction is simple using ReplI and (4.5), while the forward dir-ection requires yet another case split. Since x is a member of {F x | x ∈ X} theremust be an x′ ∈ X such that x = F x′. Now P x′ may or may not hold. In the firstcase we can use (4.5) to obtain x′ = F x′ and hence x = x′. This case now followsfrom our assumptions. If it does not hold, it follows from (4.6) that xdef = F x′

and hence x = xdef, which again reduces the goal to known assumptions. Thiscompletes the proof.

�

At this point it is interesting to note that we deviated from our usual approach,i.e., we proved the defining property of the separation on sets directly. In generalthis tends to be the easier approach for definitions based on εset. In the followingit is still easier though to work with separate introduction and elimination lemmasand we state them here without proof. They follow immediately from the correct-ness result we have just proved.

Lemma SepI : forall X, forall P: set → Prop, forall x,x ∈ X→ P x→ x ∈ (Sep X P).

Lemma SepE1 : forall X P x, x ∈ (Sep X P)→ x ∈ X.Lemma SepE2 : forall X P x, x ∈ (Sep X P)→ P x.Lemma SepE : forall X P x, x ∈ (Sep X P)→ x ∈ X ∧ P x.

As before we also have a couple of additional facts about this construction whichare useful in later developments.

First of all, it should be clear that any separation is a subset of its bounding set.This follows immediately from SepE1:

{x ∈ X | P x} ⊆ X, (4.7)

which can also be expressed as

{x ∈ X | P x} ∈P(X). (4.8)

At this point we can also extend our knowledge about the closure of our Grothen-dieck universes to sets formed by separation.

Lemma 5 (GU closed under Separation). Given any Grothendieck universe of theform GUN and a set X contained in this universe, then any separation over X isalso contained in GUN .

41


Proof. We know that GUN is closed under taking the powerset of X, i.e., P(X) ∈GUN . Combining this with (4.8) and the fact that GUN is a transitive set we obtainthe desired result, {x ∈ X | P x} ∈ GUN �

Next we have a few results about particular bounding sets, namely

{x ∈ ∅ | P x} = ∅ (4.9)

and the somewhat more useful

{x ∈ 1 | P x} = ∅ ∨ {x ∈ 1 | P x} = 1,

which follows from the fact that subsets of 1 are either ∅ or 1 (see (4.1)). Since 1 isjust a special singleton set the following more general variant also is of some use:

({x ∈ {x} | P x} = {x} ∧ P x) ∨ ({x ∈ {x} | P x} = ∅ ∧ ¬P x).

And finally, if we actually know that a particular separation yields the empty set,it is easy to argue that none of the members of the bounding satisfy the property:

{x ∈ X | P x} = ∅ → ∀x ∈ X,¬P x.

Now that we know about separation, it is interesting to think about the universalclass again, i.e., a class containing all sets, and whether it can be encoded as a set.We bring this up as it is relevant in the context of set intersection, the constructionwe will be dealing with next. To see why a universal set cannot exist, considerthe following contradiction. Assume a universal set U exists, then the separation{x ∈ U | x < x} over this universal set U is a set encoding the Russell class. As weargued above, however, there is no such set and therefore U cannot exist.

What does this have to do with intersection? Well, the class intersection of a setM is easily defined as

C∩M B {x | ∀A ∈ M, x ∈ A},

but when we consider the case for M = ∅ we observe that∩C ∅ is the universal

class: the defining condition is vacuously true for all sets. This poses a difficulty toencoding

∩C as a set.We can work around this issue with a separation over a sufficiently large bound-

ing set, allowing correct behaviour of the intersection operation over any non-empty set M. To obtain this consider the following. Any element x that is partof the intersection is a member of all the sets A in the non-empty set M. Wecan weaken this specification to say that there is a set A ∈ M which contains x.We recognise the latter as the union over M, which makes sense since we expect∩

M ⊆ ∪M. The idea is taken from Fraenkel’s historical introduction in [Ber58]and can also be found in the recent work of Barras [Bar12]. Our actual definitionis then ∩

M B {x ∈∪

M | ∀A ∈ M, x ∈ A},which is formally given as

42

4.5. Ordered Pairs

Definition Inter : set → set B λ M .Sep (Union M) (λ x : set . forall A : set, A ∈ M→ x ∈ A).

The defining property is of course

inhset M −→ (∀x, x ∈∩

M ↔ ∀A ∈ M, x ∈ A),

which is established by proving the following two results in Coq.

Lemma InterI : forall x M, inhset M→ (forall A, A ∈ M→ x ∈ A)→ x ∈ Inter M.Lemma InterE : forall x M, x ∈ Inter M→ inhset M ∧ forall A, A ∈ M→ x ∈ A.

We point out that the given definition leads to the following somewhat counter-intuitive result: ∩

∅ = ∅.

It is an immediate combination of Theorem 5, Part 1 and Equation (4.9). Thisis technically unproblematic since intersection is now defined on all sets, but weexpect correct behaviour only from the nonempty ones. Finally we look at universeclosure.

Lemma 6 (GU closed under Intersection). Given any Grothendieck universe of theform GUN and a set X contained in this universe, then the intersection over X isalso contained in GUN .

Proof. Follows immediately from Lemma 5 and the closure under the union oper-ation. �

As a last remark we could also consider binary intersection, which can be ob-tained from this arbitrary intersection along the same lines we employed for binaryunions earlier. This, however, has not been formalised.

4.5 Ordered Pairs

In the context of set theory there are two notions of pairing, namely unorderedpairs and ordered pairs.

The unordered version we have seen above is axiomatic in our development.They are explicitly not equipped with a notion of component ordering, i.e.,

{a, b} = {b, a}

and concepts like first and second component do not even make sense. Apart frombeing a fundamental building block of our theory, they are occasionally quite usefulin higher-level developments. They should, however, not be confused with theconcept most people tend to have in mind when talking about pairs.

43


Ordered pairs, on the other hand, are exactly what most mathematicians think ofwhen talking about pairs and with these we are able to precisely identify the firstand second component. Apart from the additional ordering, these pairs also comewith projection operations which provide access to the two components of a pair.For the ordered pair of the sets a and b and its projections we write 〈a, b〉 and

π1〈a, b〉 = a,

π2〈a, b〉 = b.

This notation is used to clearly distinguish the two versions of pairs and to avoidconfusing their different properties.

In most mathematics texts, ordered pairs are just assumed to exist without fur-ther elaboration, effectively turning them into axiomatic building blocks. Here,however, we are carefully building up the foundations and it is not necessary to as-sume the existence of ordered pairs. They can be a defined entity instead, given themachinery we already have. There are several ways to accomplish this. Regardlessof the encoding we choose, we should be able to prove the characteristic propertyof ordered pairs:

〈a, b〉 = 〈c, d〉 ↔ a = c ∧ b = d. (4.10)

We use a construction which was suggested by Kuratowski in 1921 [Kur21]. Ithas the nice property that it is self-contained. It does not rely on some form ofmarker sets (as in {{a,O}, {b, 1}}) to distinguish the components. Due to this andother favourable properties1, these so-called Kuratowski Pairs have become the defacto standard construction in many texts on the foundation of mathematics and settheory.2

We define ordered pairs as

〈a, b〉 B {{a}, {a, b}},

with the following projection operations:

π1 p B∪∩

p

π2 p B∪{x ∈∪

p | x ∈∩

p→∪

p =∩

p}

The corresponding formal encoding is

Definition op k : set → set→ set B λ x y: set . UPair (Sing x) (UPair x y).Definition π1 : set → set B λ p . Union (Inter p).Definition π2 : set → set B λ p . Union (Sep (Union p)

(λ x : set . x ∈ Inter p→ Union p = Inter p)).

1Another self contained construction is {a, {a, b}}. This version, however, requires the Axiom ofRegularity in order to obtain the characteristic property of pairs (4.10), while Kuratowski Pairs donot.

2See, for example, Chapter 2.1 in [HJ99], although they do not call them by this name.

44

4.5. Ordered Pairs

Let us take a minute and motivate these definitions. First of all we remindourselves that taking the union over a singleton set produces exactly that singleelement, i.e., speaking informally, this allows us to strip one layer of set braces.This is exactly the purpose of the outer union operation in both definitions. Then,for the definitions to make sense, the sets over which these unions are taken shouldbe singletons sets, more precisely, when P is the ordered pair 〈a, b〉, we want toobtain the sets {a} and {b}, respectively.

Furthermore, looking at the next building blocks in our definition, let us see whathappens when we take the union or the intersection over an ordered pair.∪

〈a, b〉 = {a} ∪ {a, b} = {a, b}∩〈a, b〉 = {a} ∩ {a, b} = {a} (4.11)

At this point it should be intuitively clear that the definition of the first projectionworks as intended, as long as we apply it to actual pairs.

Lemma 7. For all sets x and y

π1〈x, y〉 = x

Proof. Trivial using (4.11). �

For the second projection the situation is slightly more complicated. The readermight expect something of the form π2P B

∪{x ∈ ∪ P | x < ∩ P} but this breaksdown in the case where the two components are equal. Hence we need to expressthe fact that the component we are looking at is either not in the intersection, inwhich case we have our hands on the correct one, or, if it happens to be in there butthe components are equal, then there effectively is only one component, preciselythe one we are looking for. Note that inside the definition we do not have directaccess to the two components (we are defining projections for generic set P), butstating that the union and the intersection coincide amounts to the same thing, atleast for our definition of pairs.

Lemma 8. For all sets x and y

π2〈x, y〉 = y

Proof. We have to consider two cases here, namely whether x and y are equal ornot. Let us deal with the inequality first.

Case 1: Take x , y, then∪〈x, y〉 = {x, y} , {x} = ∩〈x, y〉. It is clear that

(x ∈ {x} → {x, y} = {x}) is false, while (y ∈ {x} → {x, y} = {x}) is vacuously true.The expression becomes π2〈x, y〉 =

∪{y} = y, as required.Case 2: Now assume x = y, i.e., we are dealing with pairs of the form 〈y, y〉

and note that∪〈y, y〉 = {y} = ∩〈y, y〉. In this scenario, the only instance of the

property, namely (y ∈ {y} → {y} = {y}), is trivially true. Hence we again obtainπ2〈y, y〉 =

∪{y} = y, as required.�

45


Now that we have established that our projection operations behave as desiredwe can use this information to finally obtain a proof of (4.10)

Theorem 10 (Characteristic Property of Ordered Pairs). For all sets a, b, c and d,

〈a, b〉 = 〈c, d〉 ↔ a = c ∧ b = d.

Proof. Let us look at the two directions separatelyCase← : Trivial using the substitution property of equality.Case → : We have π1〈a, b〉 = a from Lemma 7 and therefore π1〈c, d〉 = a by

assumption. Again from Lemma 7 we obtain the desired equality c = a (up tosymmetry). The proof of b = d is analogue, using π2 in place of π1. �

Formally it is helpful to be able to classify, which of the sets in our universe ofdiscourse are well-formed ordered pairs, and which are not. Among other thingsthis helps us with the clean definition of functions later in Section 4.6.

Here is the corresponding extract from the formal development, where (x,y)k isjust notation for op k x y.

Definition is pair : set → Prop B λ p . exists x, exists y, p = (x,y)k.

It is now easy to express properties which require their arguments to be pairs.Take eta expansion as an example:

Lemma op k eta : forall p, is pair p ↔ p = (π1 p, π2 p)k.

We are normally only concerned with the forward direction of this result, butnote that this provides us with an alternative classification of which sets are pairs,i.e., all those where forming a pair out of the first and second projection producesthe original set.

A concept that is closely related to ordered pairs is the notion of a CartesianProduct of two sets, which provides a means to express sets of ordered pairs.3

Using the axioms of replacement and the union over a family of sets we can definethem as

A × B B∪a∈A

{〈a, b〉 | b ∈ B}.

Formally we have

Definition CProd : set → set→ set Bλ A B . FamUnion A (λ a . Repl (λ b . (a,b)k ) B).

We prove

Lemma CProdI : forall a A b B, a ∈ A→ b ∈ B→ (a,b)k ∈ CProd A B.Lemma CProdE1 : forall p A B, p ∈ CProd A B→ π1 p ∈ A ∧ π2 p ∈ B.Lemma CProdE2 : forall p A B, p ∈ CProd A B→ is pair p.

3For those familiar with the concept of typing: a Cartesian Product is the type of an ordered pair

46

4.6. Functions

and thus establish the defining property of the cartesian product, namely

∀p : set, p ∈ A × B↔ π1 p ∈ A ∧ π2 p ∈ B ∧ is pairp.

There are two simple properties that will be helpful in later developments. Inboth cases the proofs use extensionality of sets, resulting in contradictions of theform a ∈ ∅.

Lemma 9. For all sets B, ∅ × B = ∅.

Lemma 10. For all sets A, A × ∅ = ∅.

We have now constructed everything which is necessary to start talking about anotion of functions, where ordered pairs play a central role.

4.6 Functions

In this section we will discuss three interlocking constructions that equip our settheory with typed functions. Firstly there are abstractions or λ-terms which rep-resent the functions themselves, written as λx ∈ X.ex. The subscript indicates thepossibility of free occurrences of x in e. These abstractions will be applied to ar-guments so the next thing we require is an application mechanism ( f x) and finallywe want to be able to express function spaces, or in other words, function types. Infact we will be dealing with dependent functions and for these one usually speaksof dependent products instead. These are written as Πx ∈ X.Yx. For the specialcase where x does not occur free in Y we use the more common notation X −→Y .

Formally we want three new set formers at the following types,

ap : set→ set→ set,

lam : set→ (set→ set)→ set,

Pi : set→ (set→ set)→ set.

The Pi provides dependent function spaces but in several cases a non-dependentversion is sufficient and easier to work with. We give the definition here, but theconstruction is formally not very interesting so we will not discuss it at any depth:

Arr : set→ set→ set B λX Y. Pi X (λ . Y).

What is interesting though are the types of the new constants. The simplest oneis function application. It takes two sets and returns the set that is the result ofapplying the first to the second. We observe that it is technically possible to feedarbitrary sets to this set former but we only expect sensible behaviour if the first setactually encodes a function and the second is a suitable argument. We are going

47


to investigate to some degree what happens beyond these implicit boundaries as ithelps us with some of the proofs, but we will get to that later.

For abstractions and products the types are slightly more involved since we haveto deal with a variable binding operation. In both cases the first set given to the setformers is the argument domain of the encoded function or product. The secondcomponent is in both cases a metalevel function which encodes the body with afree occurrence of a variable. In other words, we use Coq’s metalevel bindingmechanism to encode the binding operation for abstractions and products.

We call these new constructions functions but to justify this name they shouldsatisfy a number of properties. We state these properties loosely to get an intuition,i.e., we avoid being too pedantic about side conditions for the sake of a simplepresentation. We will of course later formally prove that the properties hold in fulldetail for the actual definitions of our new constructs.

The key defining property we expect from functions is β-reduction which relatesabstraction and application:

ap (lam X e) x = ex.

In addition to that we have the dependent products which should behave like thetype of abstractions. In the language of sets this is a statement about set contain-ment for suitably related e and Y:

lam X e ∈ Pi X Y.

The last core property of interest is the fact that everything taken from a productcan be used as a function with respect to application and return something sensible:

f ∈ Pi X Y → ap f x ∈ Yx.

If our only goal would be a formalisation of set theory we could simply usethe well known standard graph encoding and finish the section at this point. Wehave however mentioned a number of times already that we plan to use this settheory for the construction of models for type theory, which poses an additionalconstraint and forces us to seek an alternative solution. To see where the defaultsolution breaks down and to motivate both the extra constraint and the solution wehave to take a small detour.

In the standard graph encoding, a function or abstraction is represented by a setof ordered pairs where the first components belong to the domain of the functionand the second components to its co-domain. This is a binary relation which hasto be functional to be considered a valid encoding. That is, for each element x inthe domain there is exactly one pair of the form 〈x, y〉 in the encoding set, with ybeing a member of the co-domain. In fact, any functional binary relation (i.e., setof ordered pairs) represents some function and its domain can be reconstructed by

48

4.6. Functions

..x .y . x.

e1

.e2

.

en

.

y

Figure 4.3: A function mapping the value x to the value y as represented in the standardgraph encoding (left) and in the Aczel encoding (right), respectively.

just gathering up all first components.4 The corresponding application operator iseasy. We simply have to find the pair in the set where the first component matchesthe argument of the function application and project out the corresponding secondcomponent. This encoding is clean and simple but unfortunately insufficient forthe following reason.

In Section 4.1 we mentioned that 2 can be viewed as the type of propositions5

and in this light it is interesting to look at logical implications. We are going tobuild on Kolmogorov’s idea that function types and logical implications are thesame [Kol32, Hey74]. In particular this entails that functions inhabiting functionspaces are proofs of the corresponding implications.

We would expect that a logical implication constructed from two propositionsis itself a proposition, i.e., we are looking for a closure property of 2 under theconstruction of function spaces, formally6

∀A B ∈ 2, A −→B ∈ 2. (4.12)

This, however, is not satisfied when we use the standard encoding. For a counter-example, take the function type 1 −→1. Read as a logical implication we wouldexpect this to be true, i.e., equal to the set 1. Looking at the (informal) definitionsabove it is easy to see that the constructed function space will contain exactly onefunction, namely the one with the single mapping ∅ 7→ ∅. Hence with the standardencoding we would have 1 −→1 = {{〈∅, ∅〉}} < 2.

The solution is an encoding of functions which was envisioned by Aczel[Acz98], explicitly designed to satisfy (4.12) in the context of modelling impre-dicative type universes. The idea is to still encode abstractions as sets of orderedpairs but instead of representing the mapping x 7→ y with a single pair 〈x, y〉 we

4This is relevant since we do not have this property for the encoding we actually choose.5To be precise, we intend to use 2 as the interpretation of ECC’s lowest, impredicative type

universe Prop in our planned model construction.6We use the non-dependent version. The dependent version is mostly the same, but hides the

concept behind a less clear expression.

49


include one pair 〈x, ei〉 for each ei ∈ y. The resulting relation can still be consideredas a graph albeit in most cases a non-functional one. In general though, this ana-logy should be avoided since the semantics of an edge have changed considerably.Figure 4.3 illustrates the difference with a function that has as its only mappingx 7→ y.

Clearly we have to adjust the application mechanism. Given an encoded func-tion, i.e., a set of ordered pairs, and an argument set, we now have to collect all theelements associated with the argument and gather them up into a set. This set isthen returned as the result of the application.

With this principle in mind we pose the following definitions:

ap f x B{π2 p | p ∈ {p ∈ f | π1 p = x ∧ is pair p}}lam X F B

∪x∈X

{〈x, y〉 | y ∈ F x}

Pi X Y B{ f ∈P(X ×∪

(∪x∈X

Yx)) | ∀x ∈ X, ap f x ∈ Yx}

At a first glance it is of course neither obvious that these definitions give usAczel-encoded functions nor that they satisfy any of the properties we required inthe beginning. In the remainder of this section we will work through a number ofproofs that establish the necessary properties and further related results and thusshow that we actually have a workable encoding of set-theoretic functions in ourdevelopment. In comparison to Barras [Bar10] we construct these functions dir-ectly while he first produces the standard encoding and then proceeds to embedthis into the Aczel representation. This primarily affects the definition of productswhile abstractions and applications are essentially defined in the same way we did.

To get started with the formal results we present a rather curious observation,namely that Cartesian products and constant abstractions coincide in our set theory.We were not able to find a real use case for this but think that it is interestingnonetheless.

Lemma 11. The Cartesian product of sets A and B coincides with the constantlyB abstraction on domain A:

A × B = lam A (λ . B).

Proof. Immediate from the definitions. �

We begin our main analysis by showing that abstractions only contain pairs.

Lemma 12. Given a p ∈ lam X F there exist sets x ∈ X and y ∈ F x such thatp = 〈x, y〉.

50

4.6. Functions

Proof. Unfolding the definition of abstractions reveals p ∈ ∪x∈X{〈x, y〉 | y ∈ F x}.From the elimination principle FamUnionE we obtain an x ∈ X for which p ∈{〈x, y〉 | y ∈ F x}. At this point ReplE tells us that there exists a y ∈ F x such thatp = 〈x, y〉, completing the proof. �

A related result concerns the behaviour of application with arbitrary sets.

Lemma 13. Given an arbitrary set f , then f contains a pair 〈x, y〉 iff applying fto x yields a set containing y, formally

y ∈ ap f x↔ 〈x, y〉 ∈ f .

Proof. We look at each of the implications in turn. For the forward direction wecan use ReplE to obtain y = π2 p for some p ∈ {p ∈ f | π1 p = x ∧ is pair p}. Thistells us that p ∈ f and π1 p = x and additionally that p is in fact a pair, which meansthat we can η-expand it to 〈π1 p, π2 p〉. We can now substitute to obtain the desiredresult.

For the other direction we have a pair 〈x, y〉 ∈ f and it is obvious thatπ1〈x, y〉 = x as well as is pair 〈x, y〉. We can therefore deduce that 〈x, y〉 ∈{p ∈ f | π1 p = x ∧ is pair p}. If we now replace everything in the latter with thesecond projection π2 we have established that π2〈x, y〉 ∈ ap f x. The desired resultthen follows from Lemma 8. �

Now, before we look at any of the subtler results we are going to verify that ourconstructions exhibit the fundamental property of functions: an abstraction appliedto an argument (of its domain) β-reduces.

Theorem 11 (β-reduction of Aczel-encoded functions). Given a domain set X anda metafunction on sets F, then for every x ∈ X, the following reduction equationholds:

ap (lam X F) x = F x.

Proof. We show inclusion both ways. Assume we have a z ∈ ap (lam X F) x,then ReplE and Theorem 9 allow us to conclude that there is an ordered pair p ∈lam X F for which π1 p = x and π2 p = z. FamUnionE now tells us that p is obtainedby replacing some y ∈ F x with 〈x, y〉. Hence we can conclude y = z and thereforez ∈ F x as required.

For the other direction we take an arbitrary y ∈ F x, which we can equallyexpress as π2〈x, y〉 ∈ F x. To show that π2〈x, y〉 ∈ ap (lam X F) x, it is sufficientto establish that is pair 〈x, y〉 and π1〈x, y〉 = x, both of which are trivial, as wellas 〈x, y〉 ∈ lam X F. As we have seen several times now we can establish themembership in a union over an indexed family, if we can present a witnessingindex, where we choose x of course, and establish that our target set, here 〈x, y〉,is contained in the indexed member, here {〈x, y〉 | y ∈ F x}. This follows from ourassumption and completes the proof. �

51


The reader may have noticed that the result we just proved relies on the argumentbeing contained in the declared domain of the abstraction. This immediately bringsup the question of what happens if we we know that the argument is not in thedomain. As it turns out we obtain the empty set.

Lemma 14. Applying a given abstraction lamX F to an argument x outside of itsdomain X, that is x < X, always yields the empty set, formally

x < X −→ ap (lam X F) x = ∅.

Proof. We assume that the redex is non-empty, i.e., we have a w ∈ ap (lam X F) x,and produce a contradiction. From Lemma 13 it follows that 〈x,w〉 ∈ lam X F andfrom Lemma 12 we can further conclude that there is a pair 〈z, y〉 = 〈x,w〉 suchthat z ∈ X. From the characteristic property of ordered pairs we deduce that z = xand therefore x ∈ X which contradicts our initial assumption and completes theproof. �

Another interesting form of redex involves the function encoded by the emptyset, namely the constantly ∅ function, which is agnostic with respect to domains.

Lemma 15. When we apply the empty set to any set, we always obtain ∅, formally

ap ∅ x = ∅.

Proof. We have ap ∅ x = {π2 p | p ∈ {p ∈ ∅ | π1 p = x ∧ is pair p}} = {π2 p | p ∈ ∅} =∅ where we have used the definition of ap, Equation (4.9) and Theorem 7 for theequalities, respectively. �

Now that we have looked at β-reduction and various associated results it is inter-esting to move to the issue of typing. We begin with the most important propertyfirst, namely that products are the types of abstractions. In the context of sets typingis represented by set membership, yielding the following lemma.

Theorem 12. Given a domain set X and two metafunctions F and Y such that∀x ∈ X, F x ∈ Yx, we have

lam X F ∈ Pi X Y.

Proof. To establish this result we have to show two things. First the abstraction hasto be contained in the bounding set of the separation which defines Pi and secondlywe must satisfy the condition of that separation. As it turns out, the second is theeasier of the two and this is where we begin.

Our goal then is ∀x ∈ X, ap (lam X F) x ∈ Yx. Since x ∈ X we can β-reduce theredex to F x which is contained in Yx by assumption.

For the bounding set we have to show that lamX F ∈ P(X × ∪(∪

x∈X Yx)) orequivalently lamX F ⊆ X×∪(

∪x∈X Yx). From Lemma 12 we know that everything

contained in the abstraction is a pair 〈x, y〉 such that x ∈ X and y ∈ F x. Establishing

52

4.6. Functions

that 〈x, y〉 is also a member of the given Cartesian product reduces to showing thatthe components of the pair are in their respective domains. For x this is trivial. Forthe second component we have to find a witness in

∪x∈X Yx containing y. This is

clearly going to be F x. But we still need F x ∈ ∪x∈X Yx which again demandsa witness, this time the index for the family Y . This obviously is x and F x ∈ Yx

follows from our assumption. �

We have established that abstractions are contained in products. What else dothey contain? In fact we are not really concerned about the exact shape of theirelements but about their behaviour under application. Loosely speaking, a productshould only contain functions and applying these functions to sensible argumentsshould produce well-typed results, i.e., things contained in the dependent result-type domain. Formally we want the following.

Theorem 13. Given a set f ∈ Pi X Y and an argument x ∈ X, whenever we applyf to x we obtain something in the set Yx, i.e.,

ap f x ∈ Yx.

Proof. This one is actually rather straightforward as we simply have to use SepE2to project out the condition of the separation that encodes the product, immediatelyyielding the desired result. �

We also have the non-dependent case as a corollary.

Corollary 2. Given a set f ∈ X −→Y and an argument x ∈ X, whenever we applyf to x we obtain something in Y, i.e.,

ap f x ∈ Y.

Proof. Trivial. �

The last of the typing results we will look at is also the closure property thatforced us to use Aczel’s non-standard function encoding. The result is proved herein its most general, dependent case.

Theorem 14 (The set 2 is closed under the construction of function spaces). Givena domain X and a co-domain Y, then the construction of their product Pi X Ysatisfies the following closure property:

(∀x ∈ X, Yx ∈ 2)→ Pi X Y ∈ 2.

53


Proof. To prove this we assume ∀x ∈ X, Yx ∈ 2 and recall Theorem 8 wherePart 5 allows us to conclude

∪x∈X Yx ∈ 2. Then from Theorem 5, Part 4, we obtain∪∪

x∈X Yx = ∅. This tells us that Pi X Y = { f ∈P(X × ∅) | ∀x ∈ X, ap f x ∈ Yx} ={ f ∈ 1 | ∀x ∈ X, ap f x ∈ Yx}. Now any separation is a subset of its bounding set(4.7), hence we know that Pi X Y ⊆ 1. Using the introduction rule for the powersetconstruction we have Pi X Y ∈P(1), which we have shown earlier to be equivalentto Pi X Y ∈ 2. This completes the proof. �

We have established a number of necessary results of our function representa-tion to ensure that it behaves sensibly and we will now complete the picture with anumber of equational properties. First we will see that both the abstraction and theproduct construction exhibit extensional equality when provided with metafunc-tions that are extensionally equal on a given domain.

Lemma 16. Given a domain set X and two metafunctions F1 and F2 on sets whichsatisfy ∀x ∈ X, F1 x = F2 x, then the sets lam X F1 and lam X F2 are equal.

Proof. This proof is trivial. We simply unfold the definitions and see that F1 andF2 are only ever applied to arguments from the domain set X. Since we know thatF1 and F2 agree on X, we immediately obtain the desired equality. �

Lemma 17. Given a domain set X and two dependent co-domain metafunctionson sets Y1 and Y2 which satisfy ∀x ∈ X, Y1 x = Y2 x, then the sets Pi X Y1 andPi X Y2 are equal.

Proof. The key idea is the same we used above. After unfolding definitions weobserve two occurrences of Y1, and respectively two of Y2. They are again onlyapplied to arguments from X so we know that the posed equality holds. �

These proofs look almost trivial from the mathematical point of view, while theyare in fact rather involved in the Coq formalisation. The reason for this discrepancyis an inherent notion of extensionality used by mathematicians that is not guaran-teed in type theory. The heart of the problem relates to the fact that we cannotproof-theoretically “see” that the function arguments in question all belong to theappropriate domain. This fact is only uncovered after disassembling all the in-volved set constructions. At that point the replacement can be performed and thesets have to be reassembled again. The full details can be found in the correspond-ing proof script.

To conclude we finally show that we have extensionality for all sets that behavefunctionally.

Theorem 15. Given two functions f , g ∈ Pi X Y, we have

(∀x ∈ X, ap f x = ap g x)→ f = g.

54

4.7. Finite Ordinals & Natural Numbers

Proof. Since f and g are sets we use set extensionality to show inclusion in bothdirections. The proofs are symmetric so we only show f ⊆ g. Assume we have aset p ∈ f and we want to establish that p ∈ g. The first step towards this goal is toshow that p is a pair.

To achieve this we look at f ∈ Pi X Y , from which we know that f is partof a separation and therefore also of the corresponding bounding set: f ∈ P(X ×∪

(∪

x∈X Yx)). This is equivalent to f ⊆ X×∪(∪

x∈X Yx). Now since p ∈ f it followsthat also p ∈ X ×∪(

∪x∈X Yx). At this point we can use CProdE2 to conclude that

p = 〈x, y〉 for some x ∈ X and y ∈ ∪(∪

x∈X Yx).

Now that we have 〈x, y〉 ∈ f , we can exploit Lemma 13 to derive y ∈ ap f x.With the initial assumption we therefore also know that y ∈ ap g x. UsingLemma 13 again we obtain 〈x, y〉 ∈ g which is the same as p ∈ g, completingthe proof. �

4.7 Finite Ordinals & Natural Numbers

To conclude the development we describe the construction of natural numbers inour set theory. In Coq, the natural numbers are built up inductively from a constantO : N and a successor function S : N→ N according to the following rules:

NatOO : N

NatSn : N

S n : N

As such, these natural numbers are part of our metatheory as well as the type the-ory we eventually intend to model. The former means that we can (and will) usenatural numbers in the process of the set constructions, while the latter suggests ablueprint for the shape our set construction will require. The fact that we can con-struct natural numbers entails two things. Firstly it is an example for a structuredset of infinite size, i.e., we will in principle be able to model infinite types. Thetype of natural numbers is considered the canonical, non-trivial inductive type inCiC and the fact that we can model it suggests that we will be able to model a largenumber of “simple” inductive types.

We will first construct the finite ordinals in our set theory and then define an(almost) isomorphism between the finite ordinals we have constructed and Coq’stype N. This ensures that our finite ordinals are a suitable representation of naturalnumbers.

Before we start with the ordinals we recall that our sets are well founded sincewe have the induction principle Ind. We will make use of the principle to establishthat so-called one-cycles (X ∈ X) and two-cycles (X ∈ Y ∧ Y ∈ X) are impossible.We use the latter to prove that the successor ordinal construction we will see belowis injective.

55


Lemma 18. The TG set theory does neither contain one-cycles nor two-cycles,formally

∀X, X < X, (4.13)

∀X Y, X ∈ Y → Y < X. (4.14)

Proof. We start with (4.13) and immediately apply Ind, which reduces the goal to

∀X, (∀x ∈ X, x < x)→ X < X

We proceed by assuming an X such that X ∈ X as well as ∀x ∈ X, x < x and pro-duce a contradiction by instantiating the universal quantification with X, yieldingX < X. This immediately contradicts our assumption of X ∈ X and completes theproof of (4.13).

For (4.14) we again immediately apply the induction principle on the member-ship in set X to reduce the goal to

∀X, (∀x ∈ X, ∀Y, x ∈ Y → Y < x)→ ∀Y, X ∈ Y → Y < X

The proof proceeds by contradiction under the following assumptions. We havesome X and Y such that X ∈ Y and Y ∈ X as well as our induction hypothesis

∀x ∈ X, ∀Y, x ∈ Y → Y < x

The trick now is to instantiate x with Y (which we know is in X) and Y with X,which allows us to produce a contradiction from Y ∈ X and X ∈ Y , both of whichwe have assumed. This completes the proof. �

We need one more auxiliary construct before we can start the constructionproper, namely the iterator function on Coq’s type N. It is recursively definedas

Fixpoint iter (n:nat) {X:Type} (f : X→ X) (x: X) Bmatch n with| O⇒ x| S n’ ⇒ f ( iter n’ f x)end.

The iterator takes a natural number n and applies a given function f n times to abase value x. If we instantiate f and x with the constructors S and O respectively,we are just reconstructing the original natural number. We use this idea to mirrorthe structure of a natural number with our constructors for ordinals. That is, we willnow pose a base ordinal and a successor function, and feed them to the iterator.

We basically require three things, the two constructors we already mentionedand the set of all finite ordinals. The latter is an infinite set and we will describeit with separation so we require a sufficiently large bounding set. As it turns out,

56


the set of all hereditarily finite sets, in our development GU∅, does the trick. Thisleads us to the following definitions.

ordO B ∅ordS N B N ∪ {N}FinOrd B {N ∈ GU∅ | ∃n : N, iter n ordS ordO = N}

We have defined the sets ∅, 1 and 2 as the first three ordinals and it should nowbe clear how their definitions arise from applying ordS to ordO 0, 1 and 2 timesrespectively.

Before we go any further, we should briefly discuss the definition of FinOrd, asit makes use of a new component in our metatheory, namely the inductive type N.This makes several of the following proofs and constructions quite convenient, butone may justly ask whether we could have done it without resorting to inductivetypes. In fact we can, forming the least set closed under ordO and ordS by repres-enting all sets satisfying the necessary closure properties and then intersecting overthem. Further details can be found in the introduction of [Ber58]. Establishing anequivalence between the two encodings should be straightforward but we have notincluded it in our formalisation.

We are now going to establish a few results about these definitions which ascer-tain their suitability to encode the inductive type of natural numbers. For a startwe point out that our Grothendieck universes are closed under taking the successorordinal, which follows immediately from the closure results for singleton sets andbinary unions. Next we want to look at some “typing” results. For each of theconstructs above we show in which set they are contained. For ordO and ordS Nthis will be FinOrd, while the latter itself lives in GUGU∅ .

For ordO = ∅, the proof reduces to showing that ∅ ∈ GU∅, which is trivial, andto provide a natural number witness that collapses the iterator to ordO, which ofcourse is 0.

Lemma 19. For any finite ordinal N ∈ FinOrd, its successor ordinal is also inFinOrd, ordS N ∈ FinOrd.

Proof. From our assumption we know that N ∈ GU∅ and via the closure propertyoutlined above ordS N ∈ GU∅. Furthermore there is a natural number n such thatiter n ordS ordO = N. To establish that ordS N ∈ FinOrd we again have to providetwo things, namely that ordS N ∈ GU∅, which we have already established, andwe have to provide a witness for the iterator, which of course is S n. �

With a simple induction on natural numbers7 and the results we have just shownwe can also establish that

∀n : N, iter n ordS ordO ∈ FinOrd. (4.15)7Technically we only require the induction to establish that iter n ordS ordO ∈ GU∅, but since

we have the preceding results both the base case and the induction step follow immediately if weperform the induction as suggested.

57


Finally, we have FinOrd ⊆ GU∅ and therefore FinOrd ∈P(GU∅). From GU∅ ∈GUGU∅ it additionally follows that P(GU∅) ∈ GUGU∅ and using transitivity weobtain FinOrd ∈ GUGU∅ as claimed above.

This leads us to one of the more relevant results especially since we want toencode an inductive type with our ordinals. The set ordO will never coincide witha set that has ordS as its topmost set former, i.e., they produce distinct sets. SinceordO is ∅ it is sufficient to show that everything produced with ordS is inhabited.From the fact that nothing can be both inhabited and not inhabited at the same timeit then follows as a simple corollary that the sets cannot coincide.

Lemma 20. For any set N, ordS N is inhabited.

Proof. We observe that ordS N = N ∪ {N} and pose N as the witness of the inhab-itance. N ∈ N ∪ {N} then follows from N ∈ {N} by BinUnionI2, while the latter isan instance of SingI, completing the proof. �

Corollary 3. Sets produced by ordO and ordS are distinct, formally

∀N : set, ordS N , ordO.

Proof. Trivial. �

When we look at our ordinals in the light of an inductive type we also requireinjectivity of the constructors, which in our case means that we want ordS to beinjective.

Lemma 21. The set former for the successor ordinal ordS is injective,

ordS N = ordS M → N = M.

Proof. We assume ordS N = ordS M and can easily establish that N ∈ ordS Nand hence N ∈ ordS M. And similarly M ∈ ordS N. Now N ∈ ordS M entailsN = M ∨ N ∈ M and symmetrically M = N ∨ M ∈ N. For the left disjunctsrespectively the result is immediate, so we are left with the case where N ∈ M andM ∈ N. This however contradicts Lemma 18, completing the proof. �

At this point we can present the isomorphism between our finite ordinals andCoq’s natural numbers. An isomorphism consists of two mappings, one from Coq’sN to set and one in the other direction. To be an isomorphism these two mappingshave to be inverses of each other, i.e., composing the two in either way shouldyield the identity. At this point we have to be a bit careful. The forward functionwill always be applied to natural numbers, and by definition, will only yield sets inFinOrd. For the other direction, however, we observe that our set construction isinherently untyped and hence the function will be defined on the whole type set.We do however only expect correct behaviour for those sets taken from FinOrd.We should kept this in mind when we call the pair of functions an isomorphism.

58


Given an arbitrary set, then projecting out a natural number and embedding thatagain as a set is certainly not guaranteed to behave as the identity. Without furtherado, here are the definitions:

Definition FinOrd embed (n: nat) : set B iter n ordS ordO.Definition FinOrd proj (N: set) : nat B

match dit {n: nat | iter n ordS ordO = N} with| inl ( exist n ) ⇒ n| inr ⇒ 0

end.

We use the mathematical notations embed n and proj N for FinOrd embed n andFinOrd proj N, respectively.

While the embedding operation is straightforward, the projection requires somediscussion. Morally we know that we only want to project natural numbers out ofsets that are actual embeddings of these but technically the Coq argument type doesnot provide this information. To get our hands on this information we first constructthe strong sum {n: nat | iter n ordS ordO = N}, which is a form of informativeexistential quantification. This states that there exists a natural number n whichyields the embedding N. This is what we expect, but N could have come fromanywhere else, in which case the sum type would be empty. We can now performa case analysis on the inhabitance of the constructed sum with the help of theassumed decision procedure outlined in Section 2.5. When the sum is inhabitedand we were given a sensible set N, we are also given the witness n as well as aproof of the body of the sum. We simply return the witness and ignore the proof.If the sum turns out to be empty we just default to 0. We require this branch of thecase split to have a fully defined function, but the reader should be aware that inthe remainder we never expect to actually enter it.

Before we can show that these two functions yield an isomorphism we requirethe following inversion lemma of the embedding.

Lemma 22. The embedding of the natural numbers n and m is injective, formally

embed n = embed m→ n = m.

Proof. We proof this by induction on n and a case analysis on m, yielding fourcases.

The first case is trivial since 0 = 0. For the second case we have embed 0 =embed (S m′) which is equivalent to ordO = ordS (iter m′ ordS ordO) . This iscontradictory as we have seen in Corollary 3.

We now look at the inductive step. When m = 0 we have the same contradictionas in the second case so we are left with the following scenario. Our inductionhypothesis states ∀m : N, embed n = embed m → n = m and we also knowfor some m′ that ordS (embed n) = ordS (embed m′). What we have to showis S n = S m′. For the latter it is sufficient to show n = m′. Lemma 21 allows

59


us to conclude embed n = embed m′. The desired result then follows from ourinduction hypothesis (with m instantiated to m′), completing the proof. �

We now show that embedding and projection form an isomorphism,

Lemma 23. The function proj ◦ embed is the identity at N, formally

∀n : N, proj (embed n) = n.

Proof. We unfold the definition of proj and perform a case split on the underly-ing inhabitance test. In the positive case the function collapses to the witness k,reducing the goal to k = n. While the function ignored the embedded proof of theequality iter k ordS ordO = embed n, we still have that knowledge as an assump-tion from the case analysis. From this we conclude the desired result with the helpof Lemma 22.

If the sum is empty however, it is sufficient to show inhabitance and produce acontradiction. This means that we have to find a k such that iter k ordS ordO =

embed n. This is simply n, completing the proof. �

The shape of the proof we have just seen illustrates the dummy role of the secondbranch of the projection function. We split on whether the particular strong sumwas inhabited or not. In the first case the proof could go through cleanly. Assumingthe sum was not inhabited led to a contradiction.

Lemma 24. The function embed ◦ proj is the identity on sets encoding finiteordinals, formally

∀N ∈ FinOrd, embed (proj N) = N.

Proof. From the definition of FinOrd we know that there is a natural num-ber k such that N = embed k. Substituting for N reduces the claim toembed (proj (embed k)) = embed k. Using Lemma 23 we reduce this to thetrivial embed k = embed k. �

We complete this section with a recursion principle for our finite ordinals thatis closely modelled on the recursion principle we have for Coq’s natural numbers.There are a number of properties of our projection and embedding functions thatare used in the correctness proof of this principle. We state these here without proofbut the inclined reader can find the details in the corresponding proof scripts.

proj ordO = 0 (4.16)

N = embed n→ ordS N = embed (S n) (4.17)

N ∈ FinOrd→ proj (ordS N) = S (proj N) (4.18)

N ∈ FinOrd→ 0 = proj N → N = ordO (4.19)

N ∈ FinOrd→ S m = proj N → ∃M ∈ FinOrd, N = ordS M (4.20)

60


We pose the following definition where nat rect is the recursion principle forCoq’s type N with the following type.

nat rect : forall P : nat → Type,P 0→ ( forall n : nat, P n→ P (S n)) → forall n : nat, P n

Definition FinOrd rect (z f N: set) : set Bnat rect (λ . set) z

(λ n: nat . λ R: set . ap (ap f (FinOrd embed n)) R ) (FinOrd proj N).

The mathematical notations for these two recursion principles are rectnat andrectord respectively.

This definition is designed to outwardly appear to only operate on sets so wehave to use our function encoding from Section 4.6 to represent and use the binarystep function f. Its first argument is the current natural number, while the secondrepresents the recursive argument. The actual recursion is performed with Coq’snatural numbers. Since the numerical argument is an ordinal though we first haveto project out the corresponding number. Then at every step we are given a naturalnumber which we have to embed into the ordinals to provide the first argument tof in the correct format.

Our recursion principle comes with suitable computation rules that are easy toshow with the properties of our projection function mentioned above.

rectord z f ordO = z

N ∈ FinOrd −→ rectord z f (ordS N) = ap (ap f N) (rectord z f N)

For the first we simply unfold the definition of rectord. This reveals an occur-rence of proj ordO which we know to be equal to 0. The instance of rectnat thenjust collapses down to z as required. For the second we again unfold rectord onthe left and then exploit (4.18) to pull the ordS through the projection operation.Applying the computation rules of rectnat once leads to the desired equality.

Lastly we show a typing result for our recursor on ordinals and thus completethe correctness of the construct.

Theorem 16. For any unary metafunction A on sets and encoded ordinal N ∈FinOrd, and for any z ∈ A ordO and f ∈ ΠN ∈ FinOrd, A N −→A (ordS N) thedefinition of rectord satisfies

rectord z f N ∈ A N.

Proof. In principle we would like to prove this result by induction on N but wedo not even have a suitable induction principle for our ordinals. The use of rectnat

in the definition of rectord however suggests that we may delegate the induction toCoq’s natural numbers instead. Unfolding the definition of rectord yields

rectnat (λ .set) z (λn.λR.ap (ap f (embed n)) R) (proj N) ∈ A N.

61


Before we can perform the induction we have to represent N as the corres-ponding natural numbers and we have to take care of both occurrences. SinceN ∈ FinOrd we can expand the second occurrence of N to embed (proj N). Wenow have two occurrences of proj N which we are going to call k. Our goal now is

rectnat (λ .set) z (λn.λR.ap (ap f (embed n)) R) k ∈ A (embed k)

and we can finally perform the induction on k.The base case is fairly straightforward. For k = 0, the instance of rectnat col-

lapses down to simply z, reducing the goal to z ∈ A (embed 0) or equivalentlyz ∈ A ordO, which is one of our assumptions.

The step with k = S k′ is more involved. The induction hypothesis is

rectnat (λ .set) z (λn.λR.ap (ap f (embed n)) R) k′ ∈ A (embed k′)

and our goal reduces to

ap (ap f (embed k′))

(rectnat (λ .set) z (λn.λR.ap (ap f (embed n)) R) k′) ∈ A (embed (S k′))

At this point we recall that f ∈ ΠN ∈ FinOrd, A N −→A (ordS N) and clearlyembed k′ ∈ FinOrd. We apply f to (embed k′) and obtain (ap f (embed k′)) ∈A (embed k′) −→A (ordS (embed k′)). We can push the one occurrence of ordS

into the embedding and then apply the result to the induction hypothesis, yieldingthe final result. �

We are still missing one fact to be sure that our construction of FinOrd can serveas the interpretation of the inductive type N, namely analysis by cases. In our casethis means that we know that a set in FinOrd is always either ordO or ordS M forsome M ∈ FinOrd.

Theorem 17. Every set N ∈ FinOrd is a well-formed ordinal, i.e.,

N = ordO ∨ ∃M ∈ FinOrd, N = ordS M.

Proof. Since N ∈ FinOrd we know that we have an n : N such that N =

(iter n ordS ordO). We can now perform a case analysis on n. For n = 0 the iteratorreduces to ordO, in which case the goal holds. If instead we have n = S m for somem : N. This entails that N = (iter (S m) ordS ordO) = ordS (iter m ordS ordO).We will now prove the right disjunct and pose (iter m ordS ordO) as the re-quired witness. We have just established that the requested equality holds, while(iter m ordS ordO) ∈ FinOrd follows from (4.15), completing the proof. �

With the knowledge that we will be able to interpret basic inductive types weconclude the presentation of the technical development. As mentioned in the be-ginning and throughout the text, any technical omissions and further details can befound in the proof scripts available on the author’s web page8.

8http://www.ps.uni-saarland.de/˜jkaiser/

62

http://www.ps.uni-saarland.de/~jkaiser/

CHAPTER 5

Conclusion

In this work we have presented the formal development of a set theory. We beganwith an introduction to the metatheory, a considerably strengthened variant of theCalculus of Constructions, equipped with classical reasoning and a powerful choiceprinciple. On top of this foundation we have layered an axiomatisation of Tarski-Grothendieck set theory and then developed the set theory to include a number ofinteresting constructs. The most prominent among these is likely our developmentof typable functions and function spaces.

We would like to point out the development presented here is not a general pur-pose set library. Consider for example the absence of binary intersection or setdifference. Instead, what we present here is a tool box developed with a particularpurpose in mind. As with any formal development there is always the questionabout how to set up the intermediate lemmas and, more generally, how to structurethe overall development. Depending on the overall goal there are a number of waysto address these issues. We do not claim that the approach chosen here is optimalwith respect to our plans but it is the result of several iterations and so far it appearsto have reached a stable point that can hold up to our needs.

We have omitted a number of smaller intermediate constructions. The readerwould not have gained additional insights from these but they were necessary non-etheless. We want to mention one of these here, namely strong sums and dependentordered pairs, which will become important in the long run. The interested readercan find the development in the Coq source files.

This development can be seen as a larger exercise in formal classical reasoning.Up to this point, the inclusion of the law of the excluded middle did not lead toground breaking short cuts or improvements in the formal derivations but all in

63

Chapter 5. Conclusion

all it appears that the proofs could be kept reasonably small and at a level thatis easy to grasp. It is rather unclear whether this would have been possible ina purely intuitionistic setting. Constructions which benefited in particular werethe separation or restricted comprehension over sets, as well as the constructionof ordinals and the isomorphism to Coq’s natural numbers. The latter relied onthe decidability of inhabitance of every type, which in turn is a consequence ofclassical reasoning in the presence of choice.

5.1 Related Work

Before we close we want to briefly discuss other results in the field of this work.First and foremost is the work of Barras who recently constructed a formalised,

intuitionistic model of CCω (which is ECC without strong sums) [Bar10] and hisrecent extensions to CiC [Bar12]. He constructed the model in Coq as well and ourwork was largely motivated by his developments. In contrast to his work we are ofcourse classical which at times allows us to reason more directly.

With respect to the formalisation of set theory in general we of course have tomention Paulson’s extensive work on Isabelle/ZF [Pau93, Pau95, Pau08] as well asthe Mizar project [Rud92]. The latter employs Tarski-Grothendieck set theory as itsbasic formalism, albeit axiomatised with Tarski’s Axiom A instead of Grothendieckuniverses [Try90]. In addition to these large developments there are various setlibraries available in Coq, each of which was designed with a particular applicationin mind. And of course each of the formalised model constructions we look at next,comes with its own inbuilt set theory as well.

One of the early insights with respect to the construction of set-theoretic mod-els is Reynolds’ discovery that an impredicative, polymorphic universe like Proponly admits trivial classical models [Rey84]. With respect to such proof-irrelevantmodels we have the ongoing work of Werner et al. [Wer97, MW02, LW11]. Theydiscovered that the combination subtyping and impredicativity in theories like ECCresult in unforeseen complications. Part of their work has been formalised in Coqand partial solutions have been suggested. Similar to our work, Gordon sketchesan embedding of ZFC into higher-order logic and vice versa, albeit with the goalof combining the benefits of both formalisms [Gor96]. And finally we have arigorous approach by Aczel which relates type and set theories with respect toproof-theoretic strength [Acz98].

5.2 Future Work

As we have mentioned early on, we would like to construct a formal, set-theoreticmodel for type theories. In this light the development presented here forms thefoundation of this goal. The type theories we aim for are Luo’s ECC [Luo94] andthe Calculus of (co)inductive Constructions [PPM89] underpinning Coq. We are

64

5.2. Future Work

going to use the ordinal 2 to interpret the lowest type universe Prop, which leads toproof irrelevant models, similar to the models in [Acz98] and [MW02]. With thisin mind we have chosen Aczel’s function representation which exhibits a certainclosure property for function spaces mapping into 2. This step should allow usto interpret the function spaces of the type theory directly with our set-theoreticfunctions. For the latter we proved a number of results that will closely map to thetyping rules of the target type theory. With respect to the non-propositional typeuniverses we are going to rely on our set-theoretic Grothendieck universes. As ofyet it is still unclear though which set-theoretic universe will be used to interpretwhich type universe. There seems to be a certain degree of freedom in this selectionand preliminary observations hint at a number of possible independence results wecould obtain from this.

All in all we have taken some initial steps towards the construction of an ECCmodel and there should not be too many surprises along the way. With respect toCiC it is still unclear, though, how to best deal with inductive types and in par-ticular inductive propositions. For inductive (non-propositional) types the ordinalconstruction presented here suggests at least one possible approach, while induct-ive propositions are somewhat mysterious. In most models they are tacitly ignoredand it is not clear for what reason. Given that the impredicativity of Prop is knownto complicate the construction of set-theoretic models, it appears worthwhile toinvestigate their nature.

65

Bibliography

[Ack25] Wilhelm Ackermann. Begrundung des “tertium non datur” mittels derHilbertschen Theorie der Widerspruchsfreiheit. Mathematische An-nalen, 93:1–36, 1925.

[Acz78] Peter Aczel. The type theoretic interpretation of constructive set theory.In Angus Macintyre, Leszek Pacholski, and Jeff Paris, editors, LogicColloquium ’77, volume 96 of Studies in Logic and the Foundations ofMathematics, pages 55–66. Elsevier, 1978.

[Acz98] Peter Aczel. On relating type theories and set theories. In TYPES, pages1–18, 1998.

[Bar10] Bruno Barras. Sets in Coq, Coq in sets. Formalized reasoning, 3(1),2010.

[Bar12] Bruno Barras. Semantical investigations in intuitionistic set theory andtype theories with inductive families. Habilitation Thesis manuscript,received in private communication, 2012.

[BDL06] Sandrine Blazy, Zaynah Dargaye, and Xavier Leroy. Formal verificationof a C compiler front-end. In Jayadev Misra, Tobias Nipkow, and EmilSekerinski, editors, FM, volume 4085 of Lecture Notes in Computer Sci-ence, pages 460–475. Springer, 2006.

[Ber58] Paul Bernays. Axiomatic set theory : with a historical introduction byAbraham A. Fraenkel. Amsterdam : North-Holland Publ. Co., 1958.

[Bro23] Luitzen E. J. Brouwer. On the significance of the principle of excludedmiddle in mathematics, especially in function theory, pages 334–345. Invan Heijenoort [vH00], 1923.

67

Bibliography

[Can74] Georg Cantor. Uber eine Eigenschaft des Inbegriffs aller reellen algeb-raischen Zahlen. Journal fur die reine und angewandte Mathematik,77:258–262, 1874.

[CF58] Haskell B. Curry and Robert Feys. Combinatory logic, volume 1. North-Holland Publ. Co., 1958.

[CH85] Thierry Coquand and Gerard P. Huet. Constructions: A higher orderproof system for mechanizing mathematics. In Bruno Buchberger, ed-itor, European Conference on Computer Algebra (1), volume 203 of Lec-ture Notes in Computer Science, pages 151–184. Springer, 1985.

[CH88] Thierry Coquand and Gerard P. Huet. The calculus of constructions.Information and Computation, 76(2/3):95–120, 1988.

[Fra22] Abraham A. Fraenkel. Zu den Grundlagen der Cantor-ZermeloschenMengenlehre. Mathematische Annalen, 86:230–237, 1922.

[Fre79] Gottlob Frege. Begriffsschrift, a formula language, modeled upon thatof arithmetic, for pure thought, pages 1–82. In van Heijenoort [vH00],1879.

[Gab62] Pierre Gabriel. Des categories abeliennes. Bulletin de la SocieteMathematique de France, 90:323–448, 1962.

[Gir72] Jean-Yves Girard. Interpretation fonctionelle et elimination descoupures de larithmetique dordre superieur. PhD thesis, Universite ParisVII, 1972.

[Gon07] Georges Gonthier. The four colour theorem: Engineering of a formalproof. In Deepak Kapur, editor, ASCM, volume 5081 of Lecture Notesin Computer Science, page 333. Springer, 2007.

[Gor96] Mike Gordon. Set theory, higher order logic or both? In Gerhard Goos,Juris Hartmanis, Jan Leeuwen, Joakim Wright, Jim Grundy, and JohnHarrison, editors, Theorem Proving in Higher Order Logics, volume1125 of Lecture Notes in Computer Science, pages 191–201. SpringerBerlin / Heidelberg, 1996.

[Gri90] Timothy G. Griffin. A formulae-as-type notion of control. In Proceed-ings of the 17th ACM SIGPLAN-SIGACT symposium on Principles ofprogramming languages, POPL ’90, pages 47–58, New York, NY, USA,1990. ACM.

[GTL89] Jean-Yves Girard, Paul Taylor, and Yves Lafont. Proofs and types. Cam-bridge University Press, New York, NY, USA, 1989.

68

Bibliography

[GV72] Alexander Grothendieck and Jean-Louis Verdier. Expose I: Prefais-ceaux. In Michael Artin, editor, Theorie des Topos et Cohomologie Etaledes Schemas, volume 269 of Lecture Notes in Mathematics. SpringerBerlin / Heidelberg, 1972.

[Hey74] Arend Heyting. Mathematische Grundlagenforschung IntuitionismusBeweistheorie. Springer Berlin / Heidelberg, 1974.

[HJ99] Karel Hrbacek and Thomas J. Jech. Introduction to set theory. Pureand applied mathematics. CRC Press, 3. ed., rev. and expanded edition,1999.

[How80] William A. Howard. The formulae-as-types notion of construction. InJ. Roger Hindely and Jonathan P. Seldin, editors, To H. B. Curry: Essayson Combinatory Logic. Academic Press, Boston, MA, 1980.

[Hur95] Antonius J.C. Hurkens. A simplification of Girard’s paradox. In Mari-angiola Dezani-Ciancaglini and Gordon Plotkin, editors, Typed LambdaCalculi and Applications, volume 902 of Lecture Notes in ComputerScience, pages 266–278. Springer Berlin / Heidelberg, 1995.

[Kol32] Andrej N. Kolmogorov. Zur Deutung der intuitionistischen Logik.Mathematische Zeitschrift, 35:58–65, 1932.

[Kru65] Arthur H. Kruse. Grothendieck universes and the super-complete mod-els of Shepherdson. Compositio Mathematica, 17:96–101, 1965.

[Kur21] Casimir Kuratowski. Sur la notion de l’ordre dans la theorie des en-sembles. Fundamenta Mathematicae, 2(1), 1921.

[Ler09a] Xavier Leroy. Formal verification of a realistic compiler. Communica-tions of the ACM, 52(7):107–115, 2009.

[Ler09b] Xavier Leroy. A formally verified compiler back-end. Journal of Auto-mated Reasoning, 43(4):363–446, 2009.

[Luo94] Zhaohui Luo. Computation and reasoning: a type theory for computerscience. Oxford University Press, Inc., New York, NY, USA, 1994.

[LW11] Gyesik Lee and Benjamin Werner. Proof-irrelevant model of CC withpredicative induction and judgmental equality. Logical methods in com-puter science, 7(4), 2011.

[ML75] Per Martin-Lof. An intuitionistic theory of types: Predicative part. Stud-ies in Logic and the Foundations of Mathematics, 80:73–118, 1975.

[MW02] Alexandre Miquel and Benjamin Werner. The not so simple proof-irrelevant model of CC. In TYPES, pages 240–258, 2002.

69

Bibliography

[Pau93] Lawrence C. Paulson. Set theory for verification: I. from foundations tofunctions. Journal of Automated Reasoning, 11(3):353–389, 1993.

[Pau95] Lawrence C. Paulson. Set theory for verification. II: Induction and re-cursion. Journal of Automated Reasoning, 15(2):167–215, 1995.

[Pau08] Lawrence C. Paulson. The relative consistency of the axiom of choice–mechanized using Isabelle/ZF. Logic and Theory of Algorithms, pages486–490, 2008.

[PPM89] Frank Pfenning and Christine Paulin-Mohring. Inductively defined typesin the calculus of constructions. In Michael G. Main, Austin Melton, Mi-chael W. Mislove, and David A. Schmidt, editors, Mathematical Found-ations of Programming Semantics, volume 442 of Lecture Notes in Com-puter Science, pages 209–228. Springer, 1989.

[Rey84] John C. Reynolds. Polymorphism is not set-theoretic. In Gilles Kahn,David B. MacQueen, and Gordon Plotkin, editors, Semantics of DataTypes, volume 173 of Lecture Notes in Computer Science, pages 145–156. Springer Berlin / Heidelberg, 1984.

[Rud92] Piotr Rudnicki. An overview of the Mizar project. In Proceedings ofthe 1992 Workshop on Types for Proofs and Programs, pages 311–330,1992.

[Rus02] Bertrand Russell. Letter to Frege, pages 124–125. In van Heijenoort[vH00], 1902.

[Sko22] Thoralf Skolem. Some remarks on axiomatized set theory, pages 290–301. In van Heijenoort [vH00], 1922.

[Sko29] Thoralf Skolem. Uber einige Grundlagenfragen der Mathematik.Skrifter utgitt av Det Norsk videnskaps-akademi i Oslo, I. Matematisk-naturvidenskapelig klasse, no. 4. I kommisjon hos J. Dybwad, Oslo,1929.

[Tar38] Alfred Tarski. Uber unerreichbare Kardinalzahlen. Fundamenta Math-ematicae, 30(1):68–89, 1938.

[Try90] Andrzej Trybulec. Tarski Grothendieck set theory. Journal of Formal-ized Mathematics, 1(1):9–11, 1990.

[vH00] Jean van Heijenoort, editor. From Frege to Godel: a sourcebook inmathematical logic. toExcel, Lincoln, NE, USA, 2000.

[Wag85] Stan Wagon. The Banach-Tarski Paradox. Cambridge University Press,Cambridge, UK, 1985.

70

Bibliography

[Wer97] Benjamin Werner. Sets in types, types in sets. In TACS, pages 530–346,1997.

[Wil69] Neil H. Williams. On Grothendieck universes. Compositio Mathemat-ica, 21:1–3, 1969.

[Zer08] Ernst Zermelo. Untersuchungen uber die Grundlagen der Mengenlehre.I. Mathematische Annalen, 65:261–281, 1908.

[Zer30] Ernst Zermelo. Uber Grenzzahlen und Mengenbereiche. FundamentaMathematicae, 16(1):29–47, 1930.

71

formal construction of a set theory in coq · a formalisation of intuitionistic zermelo-fraenkel...

Documents