
Java(X): A Type-Based Program Analysis Framework

Markus Degen

Dissertation for the attainment of the doctoral degree of the Faculty of Engineering (Technische Fakultät) of the Albert-Ludwigs-Universität Freiburg im Breisgau

2011


Dean: Prof. Dr. Bernd Becker
First reviewer: Prof. Dr. Peter Thiemann, Universität Freiburg
Second reviewer: Prof. Dr. Olivier Danvy, Universität Aarhus
Date of examination: 24.06.2011


Abstract

There is an increasing need for reliable software. Nowadays, almost every application area needs complex software: train control systems, mobile phones, power plants, and many more. Many of these applications have to meet strict safety requirements to avoid unexpected behavior.

Programmers take these requirements into account during the design phase of projects and translate them into invariants and protocols that the software needs to implement.

Still, most mainstream programming languages offer few ways to express or statically enforce these constraints. Therefore, programmers have to rely on detailed documentation to state the intended use of code libraries. This also results in the need for excessive, but still incomplete, testing to avoid failure.

This thesis presents the static type-based program analysis framework Java(X), which is capable of enforcing user-defined protocols. The framework Java(X) extends the type language of Java with annotations drawn from a user-defined annotation set X. Each instance of X yields a different refinement type system with guaranteed soundness.

The thesis introduces the framework with some motivating examples, formalizes the core language of Java(X), provides a generic soundness proof, and presents a modular type checking algorithm that also infers the optimal capability distribution.

The main novelty of Java(X) is its concept of activity annotations coupled with a notion of droppability. The activity annotations provide capability tracking that can grant exclusive write access to a resource and therefore allow the system to track typestate changes exactly. The notion of droppability enables the system to prevent resources from being dropped without previously cleaning them up. Java(X) handles aliases with a novel splitting relation. To ease the use of the system, the type checker automatically infers the splitting to enable typing whenever possible without further user interaction. Therefore, the user only needs to provide the annotation set and the intended protocol. In return, the generic type soundness guarantees that the stated constraints are not violated.


Summary

There is a growing need for safe software. Nowadays, complex software is used in almost all areas, for example in power plants, autopilots in rail and air traffic, mobile phones, and so on. Many of these applications have to meet special safety requirements to avoid unexpected failures or accidents.

During the design phase of software, programmers analyze these requirements and encode them as invariants and protocols.

Unfortunately, most programming languages in use offer little or no support for expressing these protocols or enforcing them statically. Programmers therefore have to fall back on detailed documentation and, in addition, test the software intensively.

This dissertation presents the static program analysis framework Java(X). Java(X) is able to check such protocols and invariants statically and to guarantee that they are obeyed. Java(X) extends the type system of Java with annotations that are supplied individually by a programmer. Each annotation set X yields a new instance of an extended type system. Java(X) comes with a generic soundness proof, so that type soundness of the system is ensured for every instance X.

The thesis introduces the framework Java(X) with some motivating examples. In addition, it formalizes a core system and presents a generic proof of type soundness that is independent of the particular instance X. Furthermore, it presents an algorithm for a modular type checker, including an optimal inference of the permissions for individual fields.

The novelty of Java(X) lies, among other things, in the way permissions for fields are modeled by activity annotations while at the same time preventing fields whose contents are still needed from being discarded. The activity annotations allow Java(X) to grant exclusive write access to individual references. This makes it possible to statically check protocols by means of a mutable typestate. Aliases are handled with a new splitting relation. This relation ensures that at any point in a program only one reference holds the exclusive write access. To relieve the programmer, the type checker infers this splitting relation optimally, so that the program is type correct whenever possible. The ability to prevent references from being discarded can be used to ensure that resources are properly shut down and cleaned up before they become unreachable.


Acknowledgments

This thesis would not have been finished without the help and support of many people, and, in fact, writing it would not have been nearly as interesting without their company.

First, I want to thank Peter Thiemann, who raised my interest in functional languages right from the start of my studies in the first semester. He lured me with several lectures, an interesting and fun project (JDance), and his confidence in my abilities into one of the most interesting topics in computer science. Finally, his advice, guidance, and patience enabled me to write this thesis.

Next, Olivier Danvy has a large share in this thesis. He managed to be there twice at the right time when there was more pain than desire: during my kickoff in programming languages and during the completion of the proofs. And of course, everybody should attend one of his electrifying talks!

My colleagues Matthias Neubauer, Stefan Wehr, Annette Bieniusa, and Phillip Heidegger all had an impact. We wrote papers, discussed the latest results, drank gallons of tea and coffee, instructed tutors and students ... Still, they did not only provide me with fruitful discussions; they also took care of the plants and shared a wonderful time with me in Canada, climbing, and in our department band. Thank you all for a wonderful time.

Many thanks to my parents for their everlasting support for whatever their children are up to. Many thanks to Paola; I would still be writing this thesis without her being right there with me, supporting me with advice, love, patience, and the final proofreading.

Finally, thanks to all my friends and to my colleagues from the emergency medical services in Freiburg for giving me a good time outside computer science.


Contents

1 Introduction
1.1 Contributions
1.2 Outline
1.3 Background
1.3.1 Preliminaries and Notations
1.3.2 Principles of Induction and Coinduction

2 Java(X) An Informal Account
2.1 File Handle Analysis
2.1.1 Value Annotations
2.1.2 Droppability
2.1.3 Activity Annotations
2.1.4 Summary Value Annotation
2.2 JDOM Type Analysis

3 Formal System
3.1 Syntax
3.2 Instances of Java(X)
3.3 Dynamic Semantics
3.4 Static Semantics

4 Type Soundness
4.1 Preliminaries
4.1.1 Free Variables
4.1.2 Alpha Conversion
4.1.3 Inversion
4.1.4 Effect Application
4.2 Additional Relations
4.2.1 Subdroppability
4.2.2 Join Free Expressions and null in Join
4.3 Coinductive Definitions
4.4 Basic Properties
4.5 Environmental Lemmas
4.6 Auxiliary Lemmas
4.7 Properties of Droppability
4.8 Typing Consumes Activities
4.9 Joining Variables
4.10 Additional Lemmas
4.11 Preservation
4.12 Progress
4.13 Soundness

5 Modular Type Checker
5.1 Constraint System
5.2 An Informal Account on Constraint Solving
5.3 Constraint Solving
5.4 Well-Formedness

6 Extensions
6.1 Annotation Subtyping
6.2 Polymorphism
6.3 Inheritance

7 Related Work

8 Conclusion
8.1 Summary
8.2 Future Work

Bibliography

Index


List of Figures

2.1 Simple Automaton for the File Handle

3.1 Syntax for CoreJava(X). Type Syntax in Fig. 3.3
3.2 Lookup functions
3.3 Type Syntax
3.4 Dynamic semantics
3.5 Dynamic semantics; dereferencing null
3.6 Droppability relation
3.7 Splitting relation
3.8 Fully active types
3.9 Well-formedness of types
3.10 Subactive relation
3.11 Effect Application
3.12 Type access
3.13 Type update
3.14 Auxiliaries
3.15 Typing rules for expressions
3.16 Typing rules for intermediate expressions
3.17 Typing rules for programs

4.1 Definition of the Subdroppable Relation
4.2 Definition of JF
4.3 Definition of null-free-join
4.4 Initial type relations
4.5 Type relations with new variables

5.1 Extended type syntax
5.2 Constraint syntax
5.3 Encoding and method lookup
5.4 Effect application and environment equivalence
5.5 Constraint Generating Expression Typing
5.6 Variable access or fresh variable generation
5.7 Incompatible types
5.8 User-defined droppability and null
5.9 Constraint rewriting for droppability constraints
5.10 Constraint rewriting for effect application constraints
5.11 Constraint rewriting for field update constraints
5.12 Constraint rewriting for field access
5.13 Constraint rewriting for full active type constraints
5.14 Constraint rewriting for splitting constraints
5.15 Helper function to rewrite activity annotations (for consistent value annotations only)
5.16 Constraint rewriting for activity splitting constraints
5.17 Constraint rewriting for null constraints
5.18 Constraint solving for equivalence constraints
5.19 Well-Formedness for the Constraint System

6.1 Subsumption rule for Java(X)
6.2 Type cast rule for Java(X)


1 Introduction

In recent times, safety considerations for software have become more and more important. On the one hand, more complex software is built in safety-critical areas such as train control systems, steer-by-wire and auto-piloted airplanes, power plants, and so on; on the other hand, well-known areas such as mobile phones and web applications reach a wide range of programmers who participate in writing code and offer programs to many users. It is often crucial that the software for such applications does not crash, as failures could result in a non-functional mobile phone in the less critical case or, far more seriously, in power loss or severe accidents. These applications involve embedded systems, the coordination of several microcontrollers, or standard computers with corresponding programming. In addition, even smaller embedded systems are nowadays programmed in high-level languages like Java.

As testing is expensive and may miss some errors, there is a need for tools and frameworks to develop reliable software with guaranteed safety. Furthermore, these tools should be easy to use and adaptable for the programmer to increase acceptance. This thesis presents the static type-based program analysis [35] framework Java(X), which is capable of improving software reliability.

A programming language with a static type system eliminates some common programming errors right from the beginning and rejects several potentially erroneous programs at compile time. For instance, type soundness for such systems assures that operations receive no illegal, type-incorrect arguments. Whenever a static type system cannot prove the absence of type errors during the compilation of a program, it rejects the program and tries to provide a meaningful error message to the user. As these checks occur at compile time, the type system introduces abstractions to handle them statically. Due to such abstractions, static type systems may reject some programs that would run without type errors.

However, there remain many properties which, even though apparent at compile time, cannot be checked using standard type systems. For example, taking the head of an empty list and other null-pointer exceptions cause a run-time error, even if some of them are already observable at compile time.

A related problem arises from non-trivial object life cycles [37]. Many objects progress through distinct states during their lifetime, where state changes are caused by method calls. In some states, certain methods are disabled and calling them causes a run-time error. A standard type system cannot avoid such run-time errors because it is not aware of the evolving object states and the intended protocols. To check such properties, the type system needs either some extra intentional information or some refined information about the use of a resource.

Enhancing a type system to track these state changes is not straightforward because it requires assigning the same variable different types throughout the program. In addition, such protocol checking causes problems in the presence of aliases that potentially keep obsolete type assumptions. The main challenge here is to keep track of aliasing to the extent that the change of types is possible without losing exact state information.

A type system with additional structure can supply the information needed to track the resources such that the system is aware of state changes throughout the program. Type refinement restricts the semantics of programs by incorporating explicit tests for predicates that refine the underlying types [23, 50]. The type soundness property for such an extended system becomes more expressive because it guarantees that these predicates are always satisfied. Further, an extension of refinement typing with accurate state tracking is provided, for example, by a linear type system [47]. Unfortunately, most systems do not provide a seamless integration, let alone migration, between standard types and linear types.

The framework Java(X) provides a flexible way to perform type-based protocol checking. It is capable of statically tracking resources and their aliases and thereby shows that user-defined safety conditions are not violated throughout the program. Java(X) introduces a family of annotated type systems. Annotated type systems extend the type language of some existing system with annotations. These annotations can be seen as a pluggable, optional type system [10]. Java(X)'s value annotations restrict the meaning of the corresponding type and thus enable the type soundness proof to express additional properties. As refinements are domain specific, they are not hardwired into the system. Instead, Java(X) is parameterized over a partially ordered set X of value annotations. This annotation set is defined by the refinement designer, who can easily provide a different annotation set for every resource he wants to track with Java(X).

The annotations in Java(X) are flow-sensitive and change for an object throughout the program. To enable a change of these annotations throughout a program and to track the resources, the refinement designer annotates methods to restrict the call of the respective method to resources that carry certain annotations and therefore fulfill the preconditions of this method. In the same way, the designer states that a method changes the annotations and therefore that it consumes some resources or changes their state. It suffices if the refinement designer annotates the providing interface, that is, the method signatures, to track the resources and to make it possible to check the intended protocol of an object.
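To make the life-cycle problem concrete, the following minimal sketch shows a file handle whose protocol is enforced only at run time. The class `FileHandle`, its methods, and the dummy return value are hypothetical and chosen for illustration only; they are not taken from a Java(X) library. Plain Java accepts the misuse through the alias, and the violation surfaces only as an exception.

```java
final class ProtocolDemo {

    // Hypothetical file handle; the open/closed protocol is checked only at run time.
    static final class FileHandle {
        private boolean open = false; // current protocol state

        void open() {
            if (open) throw new IllegalStateException("already open");
            open = true;
        }

        int read() {
            // read() is only enabled in the "open" state
            if (!open) throw new IllegalStateException("read on closed handle");
            return 42; // dummy payload
        }

        void close() {
            if (!open) throw new IllegalStateException("already closed");
            open = false;
        }
    }

    // Returns true if the protocol violation was detected at run time.
    static boolean readAfterCloseFails() {
        FileHandle h = new FileHandle();
        FileHandle alias = h; // the alias keeps an obsolete view of the state
        h.open();
        h.close();
        try {
            alias.read();     // well-typed in plain Java, but violates the protocol
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("misuse detected at run time: " + readAfterCloseFails());
    }
}
```

With suitable annotations on the signatures of `open`, `read`, and `close`, a type-based analysis is intended to reject such a call sequence at compile time instead of leaving it to a run-time check.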

An alternative approach might rely on logical formulas which state the intended properties. For Java(X), the designer chooses a set of predicates on objects and abstracts them to a partially ordered set X. This set can be tailored to the needs of a particular application domain. Thus, the annotations correspond to domain-specific, shrink-wrapped combinations of predicates that are lightweight and ready to use for the programmer, who has to understand the annotations but does not have to be an expert in logic.

In addition to the value annotations, Java(X) has a built-in notion of capability tracking. The flow-sensitive tracking of capabilities is independent of the chosen value annotation set X. To track the capabilities, we introduce activity annotations for every field of a type. These activity annotations are closely related to ownership types [5] and grant exclusive write permissions and therefore a strong update [11] for a field. There are three different activity annotations:

• When an object has an active annotation for a field (and potentially subsequent fields), the program may update the field through this reference. The active activity annotation indicates that this reference has detailed state information. To this end, every active annotation carries an additionally attached value annotation.

• On the other hand, inactive fields may not be updated and provide only generalized summary information about their state via the user-defined value annotation.

• A third activity annotation, semiactive, provides a way to opt out of the exact tracking of resources. This annotation allows updates from any reference at the cost of having no exact state information about the corresponding field. Resources with a semiactive annotation implement the same semantics as resources in Java.

Java(X) ensures without further user interaction that there exists at most one active reference to any field at any location in the program. This property is essential to provide strong update without losing the exact state information of the object in the presence of aliases. The last activity annotation, semiactive, enables the user to focus on safety-critical resources, as he may ignore irrelevant fields and objects.
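The three activity annotations and the uniqueness invariant can be sketched in plain Java as follows. The enum `Activity` and the helper `atMostOneActive` are illustrative names and not part of Java(X). In Java(X) the invariant is established statically by the type system, whereas this sketch merely checks it on an explicit list of the annotations held by the aliases of one field.

```java
import java.util.List;

final class ActivityDemo {

    // The three activity annotations described above (names are illustrative).
    enum Activity {
        ACTIVE, SEMIACTIVE, INACTIVE;

        // Active and semiactive references may update the field; inactive ones may not.
        boolean mayUpdate() { return this != INACTIVE; }

        // Only an active reference carries exact state information.
        boolean hasExactState() { return this == ACTIVE; }
    }

    // Checks the stated invariant: among all aliases of a field,
    // at most one holds the ACTIVE annotation.
    static boolean atMostOneActive(List<Activity> aliases) {
        long active = aliases.stream().filter(a -> a == Activity.ACTIVE).count();
        return active <= 1;
    }

    public static void main(String[] args) {
        List<Activity> ok = List.of(Activity.ACTIVE, Activity.INACTIVE, Activity.INACTIVE);
        List<Activity> bad = List.of(Activity.ACTIVE, Activity.ACTIVE);
        System.out.println(atMostOneActive(ok));  // true
        System.out.println(atMostOneActive(bad)); // false
    }
}
```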

Beyond the update privilege, an active capability carries the most accurate value annotation for the current contents and status of the object. Hence, active capabilities are well suited for typestate changes. An update only changes the field type for the access path with the active capability. The types of the other aliases do not have to change because they carry sufficiently less accurate information and are not aware of any change in the object state.

Thus, active capabilities enable the refinement designer to track object state changes in reaction to method invocations. However, as described up to now, the user may discard any reference at any time, possibly losing the unique active reference. Some protocols, though, require that a resource is cleaned up before it is dropped.

For this reason, Java(X) includes the notion of droppability. The refinement designer declares certain states (that is, subsets of value annotations) as droppable. If an object is in a droppable state, then its reference can be discarded regardless of its capabilities. Conversely, an object that is not in a droppable state must not be discarded. This enables the refinement designer to enforce cleanup of certain objects. Whenever no cleanup is needed and the user is allowed to drop the object at any time, the set of droppable states is equal to the complete state set. An object may switch between droppable and nondroppable states during its lifetime. For example, consider a file handler that must not be discarded as long as it is open (see Section 2.1). A programmer may open and close the handler arbitrarily often, but may only drop the active reference when the handler is closed.
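The idea of declaring a subset of states as droppable can be sketched as follows. The enum `State`, the set `DROPPABLE`, and the method `mayDrop` are hypothetical names for this illustration; the sketch performs the check at run time, whereas Java(X) is described as enforcing it statically.

```java
import java.util.EnumSet;
import java.util.Set;

final class DroppabilityDemo {

    // Hypothetical value annotations for a file handler.
    enum State { OPEN, CLOSED }

    // Assumption: the refinement designer declares CLOSED as the only droppable state.
    static final Set<State> DROPPABLE = EnumSet.of(State.CLOSED);

    // A reference may be discarded only if the object is in a droppable state.
    static boolean mayDrop(State s) {
        return DROPPABLE.contains(s);
    }

    public static void main(String[] args) {
        State s = State.OPEN;   // the handler has been opened
        System.out.println("drop while open allowed:   " + mayDrop(s)); // false
        s = State.CLOSED;       // the handler has been closed again
        System.out.println("drop while closed allowed: " + mayDrop(s)); // true
    }
}
```

If no cleanup is required, the designer would simply declare every state droppable, for example with `EnumSet.allOf(State.class)`.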

Any programmer who uses a Java(X) library that has been annotated by a refinement designer writes normal Java code. The programmer himself does not need to provide any annotations. Java(X) checks all protocols that are implemented by the refinement designer and provides useful error messages for the programmer as soon as he violates a protocol or makes illegal use of aliases that cannot be tracked.

1.1 Contributions

This dissertation is based on joint work with Peter Thiemann and Stefan Wehr [14] and mainly extends it with a complete soundness proof and a constraint system to perform a modular type inference on the distribution of capabilities.

Java(X) is an extension of Java 1.4¹ with a parameterized annotated type system. Its annotations are drawn from a partially ordered set X of value annotations.

The main contributions of this dissertation are the following:

• We introduce, motivate, and describe Java(X) with its features. These include

  - a concept of an activity annotation as a capability for updating a field in an object. These activity annotations enable the properties described by the value annotation set X to be tracked accurately and facilitate typestate change.

  - alias handling with a novel splitting relation. This relation splits the capability for a resource between different access paths to it on a per-field basis.

  - a facility to prevent resources from being dropped. This makes it possible to enforce the cleanup of objects before they go out of scope or are otherwise lost.

• We present a fully formalized subset CoreJava(X) with a parameterized soundness proof for the small-step semantics of CoreJava(X). The proof has to meet two challenges. First, in addition to standard progress and preservation, we have to prove that the invariants and properties of Java(X) hold. The second challenge arises from the coinductive type syntax, which has a great impact and demands coinductive proofs of the properties of these types. Once a refinement designer supplies a new partially ordered annotation set X, a programmer can immediately take advantage of the new invariants guaranteed through these annotations.

• We introduce a constraint system including a constraint solving algorithm to perform a modular inference of the capability distribution and type checking. The main point here is that the modular inference is capable of determining the optimal, that is, the most permissive, distribution of capabilities. We only infer the optimal capability distribution and not the principal type in general, as this could bypass the restrictions on types intended by the user.

• We have built a proof-of-concept implementation² of a modular type checker for CoreJava(X). According to the constraint system, this type checker is capable of automatically inferring the optimal distribution of capabilities to all aliases in order to obtain a successful typing for each method.

¹ Throughout this dissertation, Java refers to Java 1.4.

1.2 Outline

Next, we provide the mathematical notations and background for the rest of this thesis. Chapter 2 motivates and introduces Java(X) with some simple examples. Chapter 3 provides the formal system for CoreJava(X), including the dynamic and static semantics. Chapter 4 proves that the presented system is sound. In addition to standard soundness, this chapter also shows that the stated properties of CoreJava(X) hold. We continue in Chapter 5 with the presentation of a constraint system for a type checker that also infers the capability distribution to reduce the time and effort a user has to put into the use of Java(X). Chapter 6 provides the extensions for polymorphism and inheritance, which we omitted in CoreJava(X) to avoid cluttering the proofs and to keep them manageable. Chapter 7 presents the related work that influenced the development of Java(X). Finally, Chapter 8 concludes and discusses future work that this thesis leaves open.

² http://proglang.informatik.uni-freiburg.de/projects/access-control/

1.3 Background

1.3.1 Preliminaries and Notations

Sets. Let A be a set; then P(A) denotes the power set, the set of all subsets of A. We use A ⊆ B to state that A is a subset of B and A ⊇ B to state that A is a superset of B. The standard notations A ∪ B, A ∩ B, and A \ B denote the union, intersection, and difference of sets. ∅ represents the empty set; A × B := {(a, b) | a ∈ A, b ∈ B} denotes the Cartesian product of A and B.

Multisets. Contrary to sets, multisets may contain the same entry several times. We use the notation {| ... |} to build multisets.

Sequences. We use the notation $\overline{x_i}$ for the ordered sequence $x_1, \ldots, x_n$, where n usually arises from the context. Where needed, we explicitly restrict the set of indices, for example $\overline{x_i}^{\,i \in \{2,\ldots,n\}}$, or we state connected indices like $\overline{(x_i, y_i)} \equiv (x_1, y_1), \ldots, (x_n, y_n)$. The empty sequence is denoted by ε. Further, we omit i and simply write $\overline{x}$ whenever i is not important.

Relations. A binary relation R ⊆ A × A is reflexive if for every a ∈ A it holds that (a, a) ∈ R. A relation R ⊆ A × A is symmetric if for every (a, b) ∈ R it holds that (b, a) ∈ R. In contrast, the relation is antisymmetric if for all (a, b) ∈ R with a ≠ b it holds that (b, a) ∉ R. The relation is transitive if, whenever (a, b) ∈ R and (b, c) ∈ R hold, then (a, c) ∈ R holds, too.

A partial order ≤ for a set X, denoted (X, ≤), is a reflexive, antisymmetric, and transitive binary relation over X. A subset Y of a given partially ordered set (X, ≤) is upward closed if

∀x ∀y [x ≤ y ∧ x ∈ Y ⇒ y ∈ Y]

In the same way, a subset Y of a given partially ordered set (X, ≤) is downward closed if

∀x ∀y [x ≤ y ∧ y ∈ Y ⇒ x ∈ Y]
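For a finite partially ordered set, the two closure conditions can be checked directly by enumerating all pairs. The following sketch does this; the class and method names are hypothetical, and the order is passed in as a predicate `leq` that is assumed to be a partial order on the given universe.

```java
import java.util.List;
import java.util.Set;
import java.util.function.BiPredicate;

final class ClosureDemo {

    // Y is upward closed in (X, leq) if x <= y and x in Y imply y in Y.
    static <T> boolean isUpwardClosed(List<T> universe, Set<T> subset, BiPredicate<T, T> leq) {
        for (T x : universe) {
            for (T y : universe) {
                if (leq.test(x, y) && subset.contains(x) && !subset.contains(y)) {
                    return false;
                }
            }
        }
        return true;
    }

    // Y is downward closed in (X, leq) if x <= y and y in Y imply x in Y.
    static <T> boolean isDownwardClosed(List<T> universe, Set<T> subset, BiPredicate<T, T> leq) {
        for (T x : universe) {
            for (T y : universe) {
                if (leq.test(x, y) && subset.contains(y) && !subset.contains(x)) {
                    return false;
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<Integer> universe = List.of(1, 2, 3, 4);
        BiPredicate<Integer, Integer> leq = (a, b) -> a <= b;
        System.out.println(isUpwardClosed(universe, Set.of(3, 4), leq));   // true
        System.out.println(isDownwardClosed(universe, Set.of(3, 4), leq)); // false
    }
}
```

In the example, {3, 4} is upward closed in ({1, 2, 3, 4}, ≤) but not downward closed, because 1 ≤ 3 and 1 is not a member.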


Functions. For sets A and B we write A → B for the set of functions from A to B. The notation f : A → B indicates that f is a function with domain A and range B. A total function f : A → B is well-defined, that is, for every a ∈ A we have f(a) = b for an unambiguous b ∈ B.

1.3.2 Principles of Induction and Coinduction

Most proofs in this dissertation rely on the principle of induction or coinduction. While finite structures can, in general, be given by an inductive definition and properties of relations over these structures can be proved by well-known induction, infinite structures have to be defined coinductively, and their properties therefore have to be proved by coinduction. This section provides an introduction to inductively and coinductively defined structures, relations, and proofs. It is based on several similar presentations in the literature [32, 36].

Throughout this section we assume a fixed universe U as domain. First, most of the definitions and theorems rely on monotone functions.

Definition 1.1 (Monotone Function). A function F ∈ P(U) → P(U) is monotone if X ⊆ Y implies F(X) ⊆ F(Y).

The principles of induction and coinduction rely on the definitions of closed sets, consistent sets, and fixed-points of a function.

Definition 1.2. Let F be a function and X ⊆ U, then

1. X is F-closed if and only if F(X) ⊆ X

2. X is F-consistent if and only if X ⊆ F(X)

3. X is a fixed-point of F if and only if F(X) = X

Definition 1.3 (Least / Greatest Fixed-Point). Let F be a function and X be a fixed-point of this function, then

• X is the least fixed-point of F if for all fixed-points Y of F it holds that X ⊆ Y

• X is the greatest fixed-point of F if for all fixed-points Y of F it holds that Y ⊆ X

Now, the theorem of Knaster-Tarski provides the least and greatest fixed-point for a monotone function.

Theorem 1.4 (Knaster-Tarski [45]). Let F be a monotone function.

1. The intersection of all F-closed sets is the least fixed-point of F.

2. The union of all F-consistent sets is the greatest fixed-point of F.

Definition 1.5. For a function F, we denote the

1. least fixed-point ⋂{X | F(X) ⊆ X} with µF. It is the set inductively defined by F.

2. greatest fixed-point ⋃{X | X ⊆ F(X)} with νF. It is the set coinductively defined by F.

With these definitions and the Knaster-Tarski Theorem we are able to provide the theoretical basis for the induction and coinduction principle.

Corollary 1.6.

1. Principle of induction: If X is F -closed, then µF ⊆ X.

2. Principle of coinduction: If X is F -consistent, then X ⊆ νF .
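The definitions above can be tried out on a finite universe. The following sketch computes a least and a greatest fixed-point by plain iteration and checks closedness and consistency. It relies on a standard fact that is not part of this thesis: for a monotone function on a finite powerset, iterating from the empty set reaches the least fixed-point and iterating from the whole universe reaches the greatest one. The class name FixedPoints, the cut-off universe {0, 1, 2, 3, ∞}, the encoding of ∞ as -1, and the function step are assumptions made only for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.function.UnaryOperator;

final class FixedPoints {
    static final int INF = -1;  // hypothetical encoding of ∞
    static final int BOUND = 3; // assumption: cut the naturals off at 3 to stay finite
    static final Set<Integer> UNIVERSE = new HashSet<>(Arrays.asList(0, 1, 2, 3, INF));

    // Monotone example: step(R) = {0} ∪ {i + 1 | i ∈ R}, restricted to UNIVERSE, with ∞ + 1 = ∞.
    static Set<Integer> step(Set<Integer> r) {
        Set<Integer> out = new HashSet<>();
        out.add(0);
        for (int i : r) {
            if (i == INF) {
                out.add(INF);
            } else if (i + 1 <= BOUND) {
                out.add(i + 1);
            }
        }
        return out;
    }

    // Applies f repeatedly until the set no longer changes.
    static Set<Integer> iterate(UnaryOperator<Set<Integer>> f, Set<Integer> start) {
        Set<Integer> current = new HashSet<>(start);
        while (true) {
            Set<Integer> next = f.apply(current);
            if (next.equals(current)) {
                return current;
            }
            current = next;
        }
    }

    // Least fixed-point: iterate upwards from the empty set.
    static Set<Integer> least(UnaryOperator<Set<Integer>> f) {
        return iterate(f, new HashSet<>());
    }

    // Greatest fixed-point: iterate downwards from the whole universe.
    static Set<Integer> greatest(UnaryOperator<Set<Integer>> f) {
        return iterate(f, UNIVERSE);
    }

    // X is F-closed iff F(X) ⊆ X.
    static boolean isClosed(UnaryOperator<Set<Integer>> f, Set<Integer> x) {
        return x.containsAll(f.apply(x));
    }

    // X is F-consistent iff X ⊆ F(X).
    static boolean isConsistent(UnaryOperator<Set<Integer>> f, Set<Integer> x) {
        return f.apply(x).containsAll(x);
    }
}
```

For this particular step function, least yields {0, 1, 2, 3} while greatest additionally contains the encoded ∞, so the inductive and the coinductive reading of the same function differ.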

Definition 1.7 (Generating Function). A generating function is a monotone function.

Remark 1.8. The monotonicity of the generating functions is crucial. Still, throughout this thesis we only use functions that are trivially monotone. Therefore, we omit the monotonicity argument in all following proofs.

Example 1.9. Consider a fixed universe U that we obtain as the following disjoint union of the natural numbers with zero and infinity

U = {0} ∪ N ∪ {∞}

and a plus operator + with the standard interpretation including ∞ + 1 = ∞.

Now, let us consider the generating function N : P(U) → P(U) with

N(R) = {0} ∪ {i + 1 | i ∈ R}

Its inductive interpretation µN is the set of natural numbers with zero, N_0. The coinductive interpretation νN yields the natural numbers with infinity, N_0^∞. To demonstrate the coinductive proof principle, we prove, as a simple example, that ≤ is transitive for N_0^∞. First, we define ≤ for the greatest fixed-point by a generating function M≤ : P(U × U) → P(U × U) with

M≤(R) = {(0, n) | n ∈ N_0^∞} ∪ {(n + 1, m + 1) | (n, m) ∈ R}


As we are interested in the coinductive case, we have to use the greatest fixed-point, that is, here we have to show that ≤ is equivalent to νM≤. Pierce [36] and others define another generating function for the transitivity:

TR(R) = {(a, c) | ∃b : (a, b) ∈ R ∧ (b, c) ∈ R}

With this definition, a relation M is transitive if it satisfies TR(M) ⊆ M, that is, if it is TR-closed. For our example, one would have to prove TR(M≤(R)) ⊆ M≤(TR(R)) for any R, which yields by [36, Lemma 21.3.6] that transitivity of ≤ holds for νN. Instead of proving this generalized property, we directly define the transitivity specialized for the given relation, here νM≤. This approach yields

TR = {(a, c) | ∃b : (a, b) ∈ νM≤ ∧ (b, c) ∈ νM≤}

This second definition uses the known definition of ≤. To show that this directly defined property, TR, holds for ≤ over νN, it suffices by Corollary 1.6 to show that TR is M≤-consistent, that is, we simply have to prove TR ⊆ M≤(TR):

Proof. Let (a, c) ∈ TR be arbitrary. Then, by definition of TR, there exists b such that

(a, b) ∈ νM≤

(b, c) ∈ νM≤

First, we assume by case distinction that neither a, b, nor c is equal to zero. As none of the variables is zero, we get by the definition of M≤ that

(a− 1, b− 1) ∈ νM≤

(b− 1, c− 1) ∈ νM≤

Then, by definition of TR, we have

(a− 1, c− 1) ∈ TR

Recall that M≤(R) = {(0, n) | n ∈ N_0^∞} ∪ {(n + 1, m + 1) | (n, m) ∈ R}. Therefore, by definition of M≤

(a, c) ∈M≤(TR)

As the choice of (a, c) is arbitrary, we have TR ⊆ M≤(TR). The cases for a, b, or c equal to zero are trivial. With Corollary 1.6, it holds that ≤ is transitive for N_0^∞.

This example illustrates how most of the later coinductive proofs in this thesis are carried out. The proof schema is to unroll the definitions, show the property for the resulting parts, and pack them together again to show that the consistency for the property (here TR ⊆ M≤(TR)) holds. Basically, this course of action corresponds to the use of the induction hypothesis in standard inductive proofs. Once understood, this technique makes coinductive proofs quite intuitive. Another advantage of this approach is that the direct definition of the property without a generating function allows us to easily state properties that combine several relations. These relations may even have different arities, which makes it difficult to come up with a correct generating function in the first place, as the first approach requires.

Notation We omit existential quantifiers whenever no ambiguities arise. That is, we write

TR(R) = {(a, c) | (a, b) ∈ R ∧ (b, c) ∈ R}

instead of

TR(R) = {(a, c) | ∃b : (a, b) ∈ R ∧ (b, c) ∈ R}

Notation In our proofs we use the notation (∀i) to state that the following equations hold for all i of the index set, which is usually clear from the context.

Inference System For readability, we present most relations by an inference system. An inference system is defined by a set of inference rules. An inference rule is an ordered pair (P, c) where P is a potentially empty set of premises P1 ... Pn and c is the conclusion. We write the rules as follows:

    P1 ... Pn
    ---------
        c

The intuitive interpretation of such a rule is that whenever all premises P1 ... Pn are satisfied, we may infer c. In addition, we use the inference rules to introduce readable notations.

Example 1.10. The following inference rules may be used to present the generating function N(R) for the natural numbers from Example 1.9:

                 a ∈ N
    -----      ---------
    0 ∈ N      a + 1 ∈ N

The left rule has no premises. If we interpret the system inductively, we obtain µN; for the coinductive interpretation we get νN.

Similarly, the inference system for ≤ can be written as

                    n ≤ m
    -----      ---------------
    0 ≤ n       n + 1 ≤ m + 1

A more detailed introduction to induction and coinduction, including some formal accounts on the equivalence of the inference system and the set-based notation of relations, is provided by Leroy and Grall [29].


2 Java(X) An Informal Account

This chapter introduces the features of Java(X) with two examples. The first one, a file handle, introduces the mode of operation bit by bit. The second one, the analysis of a property of JDOM, presents a more realistic example with some additional aspects.

2.1 File Handle Analysis

The file handle class has two methods: close() to close the file handle properly and read() to access the file. In addition, it provides a constructor which, given a file name as String, creates an open file. We model the file handle in Java. For brevity, we only present the method signatures:

class File{

File(String name){ ... } // constructor

String read() { ... }

void close(){ ... }

}

The correct life cycle generates an open file handle, after that the file may be read several times, and finally the file handle closes the file. The protocol of the file handle states that it must not read from a closed file. This property may be stated in the documentation; however, a standard type system is not able to enforce this protocol at compile-time.

Figure 2.1 Simple Automaton for the File Handle

[Automaton with the states Open and Closed: new(...) enters Open, read() loops on Open, close() leads from Open to Closed.]

The constructor returns an open file. Given this class definition, a programmer may use the file handle. Still, whenever the programmer violates the protocol stated above, Java raises a run-time exception:

File f = new File("log.txt") ; // creates an open file

f.read() ; // accesses the file

f.close() ; // closes the file

f.read() ; // run-time exception!!

The static type system of Java is not aware of any state of f and hence has no possibility to prevent such run-time errors.

A closer look at the File protocol yields the simple automaton of Figure 2.1. Here, the state transitions between Open and Closed are labeled with the method names.
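Read as a dynamic check, the automaton amounts to a state field that every method inspects before it runs. The following plain-Java sketch shows this run-time variant of the protocol, that is, the behaviour that a static analysis is meant to make unnecessary. The class name CheckedFile, the State enum, and the choice of IllegalStateException are assumptions for illustration; no real file is opened.

```java
final class CheckedFile {
    enum State { OPEN, CLOSED }

    private final String name;
    private State state = State.OPEN; // the constructor enters the state Open

    CheckedFile(String name) {
        this.name = name;
    }

    // read() is only allowed in state Open and stays in Open.
    String read() {
        if (state != State.OPEN) {
            throw new IllegalStateException("read() on closed file " + name);
        }
        return ""; // placeholder content, no real I/O in this sketch
    }

    // close() is only allowed in state Open and moves to Closed.
    void close() {
        if (state != State.OPEN) {
            throw new IllegalStateException("close() on closed file " + name);
        }
        state = State.CLOSED;
    }

    State state() {
        return state;
    }
}
```

With this class, the sequence read(), close(), read() fails only when the last call executes, which is exactly the kind of late failure discussed above.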

2.1.1 Value Annotations

Java(X) introduces annotations to model such protocols. The annotation set can be individually defined by the refinement designer. The refinement designer provides the upcoming annotation set and method signatures. This file access example is modeled with a simple annotation set extracted from the automaton or directly from the description. We choose the value annotation set XFStat = P({O, C}) for the file access example, where O stands for the above state Open and C for the state Closed. We abbreviate {O} to O, {C} to C, and {O, C} to OC to increase the readability. The formal system that we introduce in Chapter 3 requires the annotations to be attached to a field. While this choice is not essential, as we could also attach the annotations directly to the class itself, it enables the system to track different fields with different annotations and facilitates the other features of Java(X). Here, we can attach these annotations to any field of the class; for the example, we explicitly add a status field with type FStat to the class File. We attach the above annotation set XFStat to this field to make Java(X) aware of the intended protocol. Next, the refinement designer has to adapt the method signatures:

class File {

FStat status ;

File{status : 〈△(O), FStat〉}(String name) { ... } // constructor

String [File{status : 〈△(O), FStat〉}] read() { ... }

void [File{status : 〈△(O) ⇝ △(C), FStat〉}] close() { ... }

}

The system Java(X) extends the Java-type. A type in Java(X) contains the Java-type and a field map, with annotated types for the fields. The type

File{status : 〈△(O), FStat〉}

denotes a File whose field status has a field type with annotation △(O) and underlying Java-type FStat. For now, we ignore the △ and concentrate on the value annotation, here O.

The constructor states that the field status has an O annotation at the beginning. Next, the read() method demands that the status field carries an "open" annotation whenever it is invoked. We state this requirement for the caller in square brackets; arguments of a method may have similar conditions. The requirement applies to the invoking object and may restrict arbitrary field annotations. Finally, the method close() introduces a type change, which is denoted with ⇝. Here, close() requires the status field to carry an O annotation prior to the invocation of the method and changes this field annotation to C. Such type changes only apply to the annotations; the underlying Java-type always remains unchanged.

Throughout the examples, we abbreviate some of the type syntax for readability. In the formal system, the method signature, which defines a change of the type, has to duplicate the full type. Additionally, the annotations induced by the constructor are defined separately and may depend on the location where the object is created. This enables tracking of different protocols for a single field, for example a read-only file handle versus a normal one. We return to this issue in more detail at the end of the file handle example.

The code that a user of this file handle writes stays the same. Java(X) only requires the refinement designer of modules to provide annotations and to modify the method signatures. The programmer does not even need to know the details; still, useful error messages are reported whenever the protocol for such a tracked resource is violated in the code. Revisiting the above example that throws a Java exception now yields a compiler error message:

File f = new File("log.txt") ;

// type of f: File{status : 〈△(O), FStat〉}

f.read() ; // ok, O provided

// type of f: File{status : 〈△(O), FStat〉}

f.close() ;

// type of f: File{status : 〈△(C), FStat〉}

f.read() ; // type error: C provided, O needed!
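The way the comments in this listing evolve can be mimicked by threading one value annotation through a straight-line sequence of calls. The sketch below does this for a single reference under strong simplifications (no aliasing, no branching, no method bodies); it is not the type checking algorithm of Java(X). The names ProtocolChecker, Signature, FILE, and check are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ProtocolChecker {
    enum Status { O, C } // value annotations: Open and Closed

    // A method signature reduced to its type change: required annotation and resulting annotation.
    static final class Signature {
        final Status required;
        final Status result;

        Signature(Status required, Status result) {
            this.required = required;
            this.result = result;
        }
    }

    // Signatures of the file handle: read() keeps O, close() changes O into C.
    static final Map<String, Signature> FILE = new HashMap<>();
    static {
        FILE.put("read", new Signature(Status.O, Status.O));
        FILE.put("close", new Signature(Status.O, Status.C));
    }

    // Returns the index of the first call that violates the protocol, or -1 if there is none.
    static int check(List<String> calls) {
        Status current = Status.O; // the constructor yields an O annotation
        for (int i = 0; i < calls.size(); i++) {
            Signature sig = FILE.get(calls.get(i));
            if (sig == null || sig.required != current) {
                return i;
            }
            current = sig.result;
        }
        return -1;
    }
}
```

For the call sequence read, close, read the function reports index 2, which corresponds to the last line of the listing.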

2.1.2 Droppability

So far, we did not take into account that a file should be closed before we discard the corresponding reference. In addition to the protocol stated above, the file handle has another important condition: Open file handles must not be dropped. Every file handle must be cleaned up to prevent a program from blocking resources unnecessarily. Again, the Java type system and other standard type systems do not provide sufficient support to statically guarantee such properties. Java(X) is capable of preventing a program from dropping certain variables which carry annotations that indicate that the resources have not been used or have not been cleaned up yet. The refinement designer of Java(X) specifies a set of droppable annotations that are allowed to be discarded in the program. Droppability of files is defined in terms of a droppability predicate, ρFStat ⊆ XFStat. As an open file must not be discarded, we define ρFStat = {∅, C}. Whenever a variable carries an annotation that is not part of ρFStat, Java(X) prevents the program from discarding this variable, which has not been cleaned up by the user yet. As a result, Java(X) prohibits code that does not clean up a generated file handle. As an example, we present a short method that returns before cleaning up the file handle:

String get_file_content(String name){

File f = new File(name) ;

// f has type File{status : 〈△(O), FStat〉}

return f.read() ;

} // type error: dropping O annotation
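Since the droppability predicate is a subset of the annotation set, it can be represented as a plain membership test. The following sketch encodes ρFStat = {∅, C} in this way; the class name Droppability and the representation of annotations as enum sets are assumptions for illustration only.

```java
import java.util.EnumSet;
import java.util.HashSet;
import java.util.Set;

final class Droppability {
    enum Status { O, C } // value annotations: Open and Closed

    // The droppable annotations: the empty set and {C}.
    static final Set<Set<Status>> RHO = new HashSet<>();
    static {
        RHO.add(EnumSet.noneOf(Status.class));
        RHO.add(EnumSet.of(Status.C));
    }

    // A reference may be discarded only if its current annotation is droppable.
    static boolean droppable(Set<Status> annotation) {
        return RHO.contains(annotation);
    }
}
```

In the method above, f still carries the annotation O when the method returns, so a check of this kind fails, which matches the reported type error. Note that OC is not droppable either, because the file might still be open.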

2.1.3 Activity Annotations

The most urgent question that arises now is how aliases are handled. To this end, we have another type of annotation, the activity annotation, which manages alias handling. These activity annotations prevent two different aliases from changing the state of an object or field at the same time. The activity annotations range over the set {△(va), ▽, ♦}. Only the active annotation △(va) carries a value annotation va that models the state of an object. This annotation allows a strong update for such annotated fields. Thus the value annotation may change with an update of the field. An inactive annotation ▽ indicates that another alias is active and has the full write permission. Therefore, inactive fields must not be updated. A reference can read an inactive field but cannot rely on the exact state of the object. The semiactive annotation ♦ allows for unrestricted assignment but it does not provide the exact state as an active annotation does. For the example, we focus on active and inactive; we will come back to semiactive later.

The enclosed value annotation va in an active annotation models the state of the object it is attached to. It is flow-sensitive and thus different program locations may carry a different annotation for the same reference. For this simple file handle, the activity annotation tracks whether the file is open or closed.

The above method signatures already take into account the activity annotations, as we already used △. To further illustrate the inactive annotation, we add an additional method to the file handle that returns the name of the file. This method does not change the state of the object, hence an inactive field is sufficient to return the file name, as stated by the signature:

class File {

...

String [File{status : 〈▽, FStat〉}] getFileName() { ... }

...

}

Whenever a program accesses a variable, Java(X) splits the corresponding type and therefore the activity annotations such that no active annotation is duplicated.

This splitting is set up in a way so that whenever we initially have an active field, it splits into an active and an inactive one. Inactive (and semiactive) references split into two inactive (semiactive) ones. That way we prevent two active references to the same field from existing at any program point. Active references are only generated for new variables by a constructor. That is, whenever we have an active field inside an alias, this active annotation has to be split off from the initial capabilities.

The following example illustrates the activity annotations and some of the common errors they prevent which arise from aliasing.

File f = new File("log.txt") ;

// f has type File{status : 〈△(O), FStat〉}

f1 = f ; // introducing an alias

// here f : File{status : 〈▽, FStat〉}

// and f1 : File{status : 〈△(O), FStat〉}

f1.read() ; // ok, △(O) provided

f1.close() ; // fine with active reference

f.read() ; // type error: ▽ provided, △ needed

...

Still, we may invoke the method getFileName() from any alias, as it does not need an active reference. To enable active references to call this method, Java(X) is capable of lending capabilities and rejoining them after their use. That is, we may split off some activities of a reference, pass them to a method, and rejoin all capabilities afterwards to the original reference. The type system ensures that this neither duplicates active capabilities nor discards nondroppable annotations. In this simple example, where we have only one annotated field, the system keeps the active capability for the actual reference and passes the split-off inactive one to the method getFileName(). Afterwards we rejoin this inactive capability to the remaining active one and therefore keep the original active capability. In fact, the above method invocations already made extensive use of capability lending, as they borrowed the active capability and potentially even changed it, before rejoining it to the remaining inactive annotation. Now, we see how this enables us to invoke the method getFileName() from both the active and the inactive reference:

File f = new File("log.txt") ;

f1 = f ; // introducing an alias

// here f : File{status : 〈▽, FStat〉}

// and f1 : File{status : 〈△(O), FStat〉}

f1.getFileName() ;

f.getFileName() ; // both ok, f1 is still active

f1.read() ; // ok, △(O) provided

f1.close() ; // closing ok, △

f1.getFileName() ; // uses the split-off inactive reference

The active and inactive annotations do not allow the update of a field through multiple different references. As this would exclude many realistic programs, we include the third activity annotation, semiactive. This annotation states that there is no unique write permission and therefore allows unrestricted assignment to the field. Still, as the field may also get updated through other references, we cannot provide any further information on the exact state of the object. In fact, semiactive fields behave like standard Java fields: You may update them any time, but there is no state information of the object available.
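The splitting and rejoining of activities described in this section can be summarized in a few lines. The following sketch is only an approximation under stated assumptions: it ignores value annotations, it always leaves the active part with the first component although Java(X) may place it on either side, and it rejects the combination of semiactive with anything but semiactive, which is an assumption of this sketch rather than a rule quoted from the formal system. The names ActivitySplit, Kind, split, and join are hypothetical.

```java
final class ActivitySplit {
    enum Kind { ACTIVE, INACTIVE, SEMIACTIVE }

    // Splits one activity into two parts so that no active capability is duplicated.
    static Kind[] split(Kind k) {
        switch (k) {
            case ACTIVE:
                // Sketch choice: the first part keeps the active capability.
                return new Kind[] { Kind.ACTIVE, Kind.INACTIVE };
            case INACTIVE:
                return new Kind[] { Kind.INACTIVE, Kind.INACTIVE };
            default:
                return new Kind[] { Kind.SEMIACTIVE, Kind.SEMIACTIVE };
        }
    }

    // Rejoins two parts, for example after a lent capability is returned.
    static Kind join(Kind a, Kind b) {
        if (a == Kind.ACTIVE && b == Kind.ACTIVE) {
            throw new IllegalArgumentException("two active capabilities for one field");
        }
        if ((a == Kind.SEMIACTIVE) != (b == Kind.SEMIACTIVE)) {
            // Assumption of this sketch: semiactive only joins with semiactive.
            throw new IllegalArgumentException("semiactive only joins with semiactive");
        }
        if (a == Kind.ACTIVE || b == Kind.ACTIVE) {
            return Kind.ACTIVE;
        }
        return a; // both inactive or both semiactive
    }
}
```

Splitting an active capability and joining the two parts again returns the active capability, which mirrors the lending of an inactive part to getFileName() in the listing above.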


Semiactive fields also enable the incremental transition of standard Java programs towards annotated ones which use the full abilities of Java(X).

It is important to notice that neither the programmer nor the refinement designer needs to specify splitting anywhere in the program code. While the type system is able to pass the annotations to either the first or the second alias, the framework Java(X) is capable of efficiently inferring the optimal, that is, the most permissive, distribution of capabilities.

2.1.4 Summary Value Annotation

Java(X) provides another annotation to allow inactive or semiactive fields to get a summary of the possible states the object passes through during its lifetime. This annotation is drawn from the value annotation set and attached to the Java-type. It does not change throughout the program.

We omit this additional value annotation throughout the examples to avoid cluttering. Adapted to the file handle, the summary annotation can provide a possibility to generate files that are read-only, write-only, or both. So far, the summary annotation always states that the file passes both possible states O and C, that is, the summary annotation is OC.

To this end, we expand the value annotation set introduced above with R and W for read-only and write-only files. As soon as the programmer provides the corresponding method signatures, there exist two more summary annotations for files: RC and WC. Now, even inactive fields may figure out whether the file is opened in read-only, write-only, or full-access mode. Of course, the method signatures may guarantee that no file handle with annotation R writes its file.

2.2 JDOM Type Analysis

In previous work [46], Thiemann proposed a type system for DOM. This earlier work is limited in the respect that it can only track one state of a resource. That is, the resource is either in one special state, or not. Real tracking of state change is therefore not possible. This work inspired the upcoming example that we present analogously to our paper [14]. While it does not introduce new features of Java(X), it provides a more realistic example and once more demonstrates the mode of operation of Java(X) from another point of view.

JDOM (http://www.jdom.org) is a popular Java API for manipulating XML. It views an XML document as a tree composed of nodes of types like Element and Attribute. Each node (except the root) has a parent field p indicating the element that it is attached to. JDOM's Element type provides a number of operations for manipulating the tree structure. The method Element setAttribute(Attribute attr), which attaches an attribute node to an element node, serves as a typical example.

JDOM informally imposes a number of invariants on its XML representation. One of them is that "JDOM nodes may not be shared". JDOM enforces this invariant dynamically by checking a detachment property: If the attribute node has a non-null parent field then the setAttribute method throws an IllegalAddException. This exception occurs in the last line of the following example because it attempts to attach the node attr a second time.

Element p1 = new Element("a");

Element p2 = new Element("a");

Attribute attr = new Attribute("href", "http://www.jdom.org");

p1.setAttribute(attr); // consumes attr; now attached

p2.setAttribute(attr); // raises IllegalAddException

To statically track the detachment property, we provide a new instance of Java(X). First, we define the value annotation set, similar to the file handle example. We use the partially ordered set XElement = (P({N, D}), ⊆). The elements of the partially ordered set abstract from the possible states of an Element reference. In XElement, N stands for "is null" and D for "defined" (is not null). Instead of introducing an additional status field, we attach this annotation set to the parent field p of the Attribute.

Again, we need to provide the method signature that specifies the effect on the activity annotations:

Element setAttribute(Attribute{p : 〈△(N) ⇝ △(D), Element〉} attr).

Here, we restrict and state a type change for the argument of the function. The signature states that setAttribute needs a detached Attribute, indicated by the annotation on field p, and changes it into an attached one. Again, the effect only applies to the active annotation, as only active references allow strong update.

Type checking the example from the beginning of this section with the introduced signature for setAttribute leads to a type error. The typing assumes that new Attribute(...) creates an attribute node without a parent, that is, its p field has the annotation △(N). The comments indicate the typing after execution of the respective statement.

Element p1 = ...;

Element p2 = ...;

Attribute attr = new Attribute(...);

// attr : Attribute{p : 〈△(N), Element〉}

p1.setAttribute(attr); // attr : Attribute{p : 〈△(D), Element〉}

p2.setAttribute(attr); // type error: N required, D given


JDOM also has API methods that introduce aliasing. For example, the detach() method removes an attribute from the element it is attached to (if any) and leaves it in a detached state. The method modifies its receiver object and returns it, too. One possible type signature is

Attribute{p : 〈△(N), Element〉} [Attribute{p : 〈△(D) ⇝ ▽, Element〉}] detach()

The type change in the square brackets again specifies the effect of a method invocation on the receiver type. Before calling detach, the receiver object must have an active parent field and it must be attached. After the call, the receiver's parent field type is inactive. The method returns a detached active reference.

This type is not the only possible choice. We could just as well leave the receiver active and make the return type inactive. Each choice fixes a particular usage pattern, but there is no reason to prefer one over the other. We refer to this example in Chapter 6, where we discuss polymorphism for Java(X).


3 Formal System

Java(X) is closely related to Java. It is an object-oriented language with classes and methods and models the core language of Java. In this chapter we consider a core language that omits inheritance, casts, interfaces, and abstract methods. The formalization of Java(X) is inspired by ClassicJava [21]. To keep the formalization and, later on, the proofs manageable, we concentrate on CoreJava(X), a suitable subset of Java(X) that comprises all properties mentioned in the previous chapters.

Chapter Outline This chapter lays the formal foundation of CoreJava(X). We start with the syntax of programs, types, and annotations. Next, we describe the formal requirements for the annotation instances and define the dynamic semantics with a small-step operational semantics. Finally, we provide a detailed static semantics with several auxiliary functions for the type system.

3.1 Syntax

Figure 3.1 defines the syntax of CoreJava(X) and Figure 3.2 defines some lookup functions to access different parts of the syntax.

A program P consists of a list of class definitions and a main expression e. A class is defined by its header with name, a list of named field declarations, and a list of methods. We defer the explanation for method declarations until we introduce the type syntax for t and type changes ⇝.

Values and Expressions A value v is either null or a variable x, drawn from a variable set VarName. To simplify the formalization, we present the expressions in a restricted, A-normal form [20]. This simplifies the proofs and the formal system.

Figure 3.1 Syntax for CoreJava(X). Type Syntax in Fig. 3.3

    P    ::= defn e
    defn ::= class c {c f ; meth}
    meth ::= t [t ⇝ t] m(t ⇝ t x) { e }
    v    ::= x | null
    e    ::= v | new^ℓ c(v) | v.m(v)
           | let x = v.f in e | set v.f = v in e
           | let x = e in e | if v then e else e
           | join v = v.f from e

    c ∈ ClassName, f ∈ FieldName, m ∈ MethodName, x ∈ VarName, ℓ ∈ Label

Figure 3.2 Lookup functions

    class c {c f ; meth} ∈ P
    ------------------------
    fieldsP(c) = c f

    class c {c f ; meth} ∈ P    t [t0 ⇝ t0′] m(ti ⇝ ti′ xi) { e } ∈ meth
    ---------------------------------------------------------------------
    mbodyP(c, m) = xi {e}

    class c {c f ; meth} ∈ P    t [t1 ⇝ t1′] m(ti ⇝ ti′ xi, i∈{2,...,n}) { e } ∈ meth
    ----------------------------------------------------------------------------------
    mtypeP(〈va, c{f : s}〉, m) = ti ⇝ ti′ → t

Figure 3.3 Type Syntax

    field type           s  ::= 〈aa, t〉     (coinductively)
    annotated type       t  ::= 〈va, u〉     (coinductively)
    simple type          u  ::= c{f : s}     (coinductively)
    activity annotation  aa ::= △(va) | ♦ | ▽
    value annotation     va ∈ Xc
    environment          A  ::= ∅ | A, x : t

The A-normal form makes all sequencing explicit by using let and set, respectively, for field access and modification.

The programs we presented in Chapter 2 and any program with the same scope of operation can be transformed into this A-normal form.

Expressions comprise values, generating objects, method calls, field access, field modification, a standard let expression, and a conditional. In addition, there exists an intermediate join expression join v = v.f from e which does not occur in user code but which arises during execution.

A new expression carries a label ℓ. This enables the initial annotations of a new object to depend on the location where the object is created in the source code. Adapted to the file handle, the user may specify new^ℓ File(...) such that, depending on the label ℓ, it generates either a read-only file or one that is both writable and readable, provided the annotations support such a differentiation. This label enables Java(X) to perform more complex program analysis, as it enables different objects of the same Java-type to pass through completely different object life cycles.

Field access let x = v.f in e is combined with variable binding to increase the precision of the system. The idea is that v.f lends capabilities to the binding of x while evaluating e. Afterwards, the lent capabilities are joined back to the ones of v.f using the mentioned intermediate join expression.

Field update set v.f = v′ in e first sets the field and then evaluates e. It does not return a result because doing so would create an alias for v′. We want to avoid this, as additional aliases mean additional, here unessential, effort to manage them, which would complicate the corresponding typing rule.

The conditional expression if v then e else e′ decides according to v which branch to take. If v ≡ null, evaluation continues with e′, otherwise with e. Due to the A-normal form, a let-expression has to precede the if and bind the result of the expression to a variable whenever an expression should trigger the branching.

Finally, the intermediate join v = v′.f from e states that the capabilities of v and v′.f have to be rejoined after the evaluation of e. This expression arises solely during the execution of a program and serves as a memento for the aforementioned capability lending. The join expression remembers which two variables participate in the lending and thereby enables the backwards transfer of these capabilities.

The sets for variable, class, field, and method names have to be finite. For the formal system we further assume that all these names are distinct and unique throughout the system.

Type and Annotation Syntax The type and annotation syntax is shown in Figure 3.3. We first describe the type syntax and then explain the different annotations in more detail.


The type syntax is defined coinductively and has three levels:

• A field type, s, attaches an activity annotation aa to an annotated type t. This annotation evolves by the use of the corresponding variable.

• An annotated type, t, attaches a value annotation va to a simple type u. This annotation describes a persistent property of the objects of type u in a summary approximation. Once set, this annotation may not change any more throughout a program execution.

• A simple type, u, packages a class name with a field environment. This field environment matches every field of the class with a field type s.

Java(X) uses two kinds of annotations.

The value annotation va is drawn from a class-specific partially ordered annotation set (Xc, ≤) where c is the corresponding class. The instantiation of the framework determines Xc. The instances of Xc are further described in Section 3.2. The value annotations are used first for the annotated type t, where they represent the persistent property of an object. This value annotation is rather inexact and only provides some rudimentary information to the user; for example, it abstracts all states an object passes through in its lifetime. The second usage, inside the activity annotation (see next paragraph), represents a more exact state of the object.

In addition to the value annotation, we have activity annotations aa. The activity annotations state whether a field may be modified through this reference. Java(X) supports three activity annotations:

Active M A field that has an active annotation may be changed through this reference. In addition, the active annotation carries a precise value annotation that represents the current state of the object, denoted by M(va).

Inactive O An inactive annotation states that this reference can only read the field. No changes may be applied through this reference. Still, the reference may use the general value annotation provided for the object.

Semiactive ♦ Finally, the semiactive activity annotation ♦ allows updates but does not yield a refined value annotation. Like inactive fields, semiactive fields may only use the summary value annotations. A semiactive field behaves like a standard Java field, as it does not impose any restrictions on aliases or updates.
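The three type levels and the three activity annotations can be pictured as plain data. The following Python sketch is only an illustration under stated assumptions: all class and function names (`Active`, `SimpleType`, `may_update`, ...) are hypothetical, value annotations are modelled as frozen sets of state names, and the coinductive types of the formal system are approximated by finite trees.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Union

ValueAnn = FrozenSet[str]        # assumption: a value annotation is a set of states


@dataclass(frozen=True)
class Active:                    # M(va): may update, carries the exact state va
    va: ValueAnn


@dataclass(frozen=True)
class Inactive:                  # O: read-only
    pass


@dataclass(frozen=True)
class Semiactive:                # ♦: may update, but no refined state
    pass


Activity = Union[Active, Inactive, Semiactive]


@dataclass(frozen=True)
class SimpleType:                # u: class name plus field environment
    cls: str
    fields: Dict[str, "FieldType"] = field(default_factory=dict)


@dataclass(frozen=True)
class AnnType:                   # t = <va, u>: summary value annotation on u
    va: ValueAnn
    u: SimpleType


@dataclass(frozen=True)
class FieldType:                 # s = <aa, t>: activity annotation on t
    aa: Activity
    t: AnnType


def may_update(aa: Activity) -> bool:
    """Active and semiactive references may modify the field."""
    return isinstance(aa, (Active, Semiactive))


def visible_state(s: FieldType) -> ValueAnn:
    """Exact state for active fields, summary annotation otherwise."""
    return s.aa.va if isinstance(s.aa, Active) else s.t.va


OPEN, CLOSED = frozenset({"O"}), frozenset({"C"})
fstat = AnnType(OPEN | CLOSED, SimpleType("FStat"))
status = FieldType(Active(OPEN), fstat)   # an active field known to be open
```

In this encoding, only an active field type exposes the refined annotation; inactive and semiactive field types fall back to the summary annotation of the annotated type.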

An activity annotation acts locally on a single field. It affects neither sibling fields nor descendants: their annotations are completely independent. The activity annotation is also reference specific: each alias for the same object may have a different (but compatible) activity annotation on its fields. To ensure that the value annotation inside an active field is exact and up-to-date, there is at most one active reference for a field of a variable at any program location. This ensures that no other reference changes the field while another relies on (or also changes) the corresponding annotation. In addition to the unique active activity annotation, there may be arbitrarily many inactive references. Whenever the environment contains a semiactive reference, no active reference is allowed for the same field in this environment.

To simplify the use of the type annotation, we abbreviate s = 〈M(va), t〉 = 〈M(va), 〈va′, u〉〉 = 〈M(va), va′, u〉, and analogously for the other type forms.

For any type s = 〈M(va), va′, u〉, Java(X) ensures that the value annotation va is at least as precise as the summary annotation va′ because assignments change va, but leave va′ constant. The well-formedness predicate on types (see Figure 3.9) ensures this property via va ≤ va′.

Because the annotations are part of the type, any use of a variable may change the type of this variable. Java(X) denotes this type change by a serrated arrow ⇝. Such type changes only affect the annotations; the underlying class type remains unchanged.

Method Declaration A method definition t [t′ ⇝ t′′] m(ti ⇝ t′i xi) { e } specifies the return type t, the type t′ of its receiver, and the method name m. The type of the receiver may change from t′ to t′′ when a method is executed, denoted by a type change [t′ ⇝ t′′] in square brackets. The arguments of the method follow the method name in parentheses (ti ⇝ t′i xi). Calling a method may also yield a type change of these arguments, again denoted by ti ⇝ t′i. Finally, the method declaration contains an expression as the body of the method. The type system ensures that all type changes only change the annotations; the underlying Java type remains untouched.

3.2 Instances of Java(X)

The refinement designer of Java(X) specifies a value annotation set for every class in the program. To generate a correct instance, the user defines for each class c

• a partially ordered set (Xc, ≤) with greatest element for the value annotations,

• a non-empty predicate ρc ⊆ Xc of droppable annotations, and

• predicates $R^{new}_{\ell,c}, R^{null}_c \subseteq X_c$, for each label ℓ.

For the formal system, we assume that c ≠ c′ implies Xc ∩ Xc′ = ∅ and we set ρ = ∪c ρc.


We need the partial order to ensure that the summary value annotations are always more general than the ones inside the activity annotations.

The predicates $R^{null}_c$ and $R^{new}_{\ell,c}$ provide the persistent annotations for the null reference and for objects created at program location ℓ, respectively. Indeed, the motivation for including ℓ in the formal presentation at all is the ability to define predicates that depend on the creation location, as seen above for the file handler.

Example 3.1. To illustrate the formal instances we revisit the file handler example. For this example we defined the partially ordered set

$$X_{FStat} = (\mathcal{P}(\{O, C\}), \subseteq)$$

The annotation O models an open file and C a closed file. Even though our example does not use the annotations ∅ and {O, C}, we need at least the second one for the summary value annotation.

Because open files should not be discarded, we define the droppability set ρFStat = {{C}, ∅}. A file with annotation ∅ is not open, and is therefore droppable, too.

The two predicates for null and new files are defined as

$$R^{null}_{FStat}(va) \Leftrightarrow \mathit{False} \qquad\text{and}\qquad R^{new}_{\ell,FStat}(va) \Leftrightarrow (\ell = \ell_o \Rightarrow \{O\} \le va) \wedge (\ell = \ell_c \Rightarrow \{C\} \le va)$$

where ℓo and ℓc are the program locations where the FStat objects for the states open and closed are defined, respectively.

These definitions satisfy the requirements stated above and enable Java(X) to verify the file handler example from Section 2.1. That is, a file handler is droppable as long as its status cannot be open, and CoreJava(X) ensures that the user may only read open file handlers.
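The instance of Example 3.1 is small enough to write down directly. The sketch below is an illustration, not part of the framework: the names `X_FSTAT`, `DROPPABLE`, `r_null`, `r_new` and the label strings `"l_o"` and `"l_c"` are hypothetical stand-ins for the annotation set, the droppability set, the two predicates, and the creation labels ℓo and ℓc.

```python
from itertools import combinations


def powerset(base):
    """All subsets of base, as frozensets."""
    items = sorted(base)
    return {frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)}


X_FSTAT = powerset({"O", "C"})             # carrier of X_FStat


def leq(a, b):
    """The partial order of X_FStat is set inclusion."""
    return a <= b


TOP = frozenset({"O", "C"})                # greatest element, used as summary
DROPPABLE = {frozenset({"C"}), frozenset()}   # rho_FStat

L_OPEN, L_CLOSED = "l_o", "l_c"            # hypothetical creation labels


def r_null(va):
    """R^null_FStat(va) <=> False: null never has type FStat."""
    return False


def r_new(label, va):
    """R^new_{l,FStat}(va): the creation site constrains the annotation."""
    ok_open = label != L_OPEN or leq(frozenset({"O"}), va)
    ok_closed = label != L_CLOSED or leq(frozenset({"C"}), va)
    return ok_open and ok_closed
```

The checks that an instance has to satisfy (a greatest element, a non-empty droppability predicate inside the carrier) can then be stated as simple assertions over these values.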

3.3 Dynamic Semantics

Figure 3.4 defines the dynamic semantics of Java(X) as a small-step operational semantics. Figure 3.5 contains the reductions for erroneous programs. The judgment P ⊢ 〈e; S〉 ↪ 〈e′; S′〉 describes a single evaluation step of an expression e under store S governed by program P. The evaluation step returns a reduced expression e′ and an updated store S′.

A value is either a location l or null; the set of locations is a subset of the variable names. A field map F maps field names to values, and a store S is a mapping from locations l to objects 〈c, ℓ, F〉, where c is the class of the object, ℓ is the label from the object creation, and F is the field map that records the values w of its instance fields. The notation S, l ↦ 〈c, ℓ, F〉 assumes that S does not bind l, whereas F[f ↦ w] implies that F contains a binding for f, which is updated to w.


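Figure 3.4 Dynamic semantics.

Definitions:

$$\begin{array}{rcl}
\mathit{Value} \ni w &::=& l \mid \mathsf{null} \\
l &\in& \mathit{Loc} \subseteq \mathit{VarName} \\
F &\in& \mathit{FieldMap} = \mathit{FieldName} \to \mathit{Value} \\
S &\in& \mathit{Store} = \mathit{Loc} \to \mathit{ClassName} \times \mathit{Label} \times \mathit{FieldMap}
\end{array}$$

Reduction rules:

RNew
$$\frac{\mathit{fields}_P(c) = c_i\ f_i \qquad \mathit{fresh}(l)}{P \vdash \langle \mathsf{new}^{\ell}\ c(w_i); S \rangle \hookrightarrow \langle l; S, l \mapsto \langle c, \ell, f_i \mapsto w_i \rangle \rangle}$$

RAcc
$$\frac{S(l) = \langle c, \ell, F \rangle \qquad F(f) = w}{P \vdash \langle \mathsf{let}\ x = l.f\ \mathsf{in}\ e; S \rangle \hookrightarrow \langle \mathsf{join}\ w = l.f\ \mathsf{from}\ [w/x]e; S \rangle}$$

RJoinLoc
$$P \vdash \langle \mathsf{join}\ w = l.f\ \mathsf{from}\ w'; S \rangle \hookrightarrow \langle w'; S \rangle$$

RSet
$$P \vdash \langle \mathsf{set}\ l.f = w\ \mathsf{in}\ e; S, l \mapsto \langle c, \ell, F \rangle \rangle \hookrightarrow \langle e; S, l \mapsto \langle c, \ell, F[f \mapsto w] \rangle \rangle$$

RCall
$$\frac{S(l) = \langle c, \ell, F \rangle \qquad \mathit{mbody}_P(c, m) = x_i\{e\}}{P \vdash \langle l.m(w_i); S \rangle \hookrightarrow \langle \mathsf{let}\ \mathsf{this}, x_i = l, w_i\ \mathsf{in}\ e; S \rangle}$$

RLetLoc
$$P \vdash \langle \mathsf{let}\ x = w\ \mathsf{in}\ e; S \rangle \hookrightarrow \langle [w/x]e; S \rangle$$

RCondLoc
$$P \vdash \langle \mathsf{if}\ l\ \mathsf{then}\ e_1\ \mathsf{else}\ e_2; S \rangle \hookrightarrow \langle e_1; S \rangle$$

RCondNull
$$P \vdash \langle \mathsf{if}\ \mathsf{null}\ \mathsf{then}\ e_1\ \mathsf{else}\ e_2; S \rangle \hookrightarrow \langle e_2; S \rangle$$

RLetExp
$$\frac{P \vdash \langle e_1; S_1 \rangle \hookrightarrow \langle e_2; S_2 \rangle}{P \vdash \langle \mathsf{let}\ x = e_1\ \mathsf{in}\ e; S_1 \rangle \hookrightarrow \langle \mathsf{let}\ x = e_2\ \mathsf{in}\ e; S_2 \rangle}$$

RJoinExp
$$\frac{P \vdash \langle e_1; S_1 \rangle \hookrightarrow \langle e_2; S_2 \rangle}{P \vdash \langle \mathsf{join}\ w = l.f\ \mathsf{from}\ e_1; S_1 \rangle \hookrightarrow \langle \mathsf{join}\ w = l.f\ \mathsf{from}\ e_2; S_2 \rangle}$$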


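Figure 3.5 Dynamic semantics; dereferencing null.

Reduction rules, dereferencing null:

RAccNull
$$P \vdash \langle \mathsf{let}\ x = \mathsf{null}.f\ \mathsf{in}\ e; S \rangle \hookrightarrow \langle \text{error: dereferenced null}; S \rangle$$

RSetNull
$$P \vdash \langle \mathsf{set}\ \mathsf{null}.f = w\ \mathsf{in}\ e; S \rangle \hookrightarrow \langle \text{error: dereferenced null}; S \rangle$$

RNullCall
$$P \vdash \langle \mathsf{null}.m(w); S \rangle \hookrightarrow \langle \text{error: dereferenced null}; S \rangle$$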

The reduction for new, RNew, introduces a new object into the store and returns the corresponding, sufficiently fresh, location. The reduction rules for the standard let, RLetExp and RLetLoc, and for the conditional if, RCondLoc and RCondNull, are standard.

The reduction rules RAcc for field access, let x = v.f in e, and RJoinExp for variable joining, join w = l.f from e, belong together. They implement the aforementioned lending of the field capabilities to x. Reducing the let with RAcc leaves behind a join expression that remembers the lending for the duration of e's evaluation. Once the body of the let/join is reduced to a value, the join reduces. The reduction RJoinLoc simply reduces to its body, here a value. The join expression therefore has no operational significance. We need the join only to satisfy the type system and to keep the soundness proof manageable.

RSet, the reduction for set, is standard except that it is sequenced with the evaluation of another expression to avoid returning an explicit value from set.

A method invocation reduces via RCall to the corresponding method body wrapped in let expressions that bind the formal parameters to the actual ones. Here, let xi = wi in e denotes a sequence of standard let expressions. Operationally, this wrapping is not necessary, but it simplifies the soundness proof by separating concerns.

While the reduction of let for locations, RLetLoc, is completely standard, our type system allows it to perform a capability lending similar to that of the field access. The same therefore holds for method calls, which reduce to a sequence of let expressions. Capabilities are only lent to the method body and returned afterwards, unless they are consumed by the method body.

The reduction rules in Figure 3.5 capture the different possibilities of an access to null. The static type system is responsible for preventing all other possible errors that would lead to a stuck execution.
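The reduction rules can be read as a small interpreter. The following Python sketch covers the rules for new, field access, join, set, let, and if; method calls (RCall) are omitted to keep it short. The encoding is an assumption made for illustration only: expressions are tagged tuples, variables are strings, `Loc` objects are locations, `None` stands for null, and the null-dereference configurations are modelled by the hypothetical exception `NullDereference`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Loc:                       # store location; None plays the role of null
    name: str


class NullDereference(Exception):
    """Stands for the 'error: dereferenced null' configurations."""


def is_value(e):
    return e is None or isinstance(e, Loc)


def subst(e, x, w):
    """[w/x]e, assuming bound names are unique (as the formal system does)."""
    if isinstance(e, str):                       # a variable
        return w if e == x else e
    if not isinstance(e, tuple):                 # a value
        return e
    tag = e[0]
    if tag == "new":                             # ("new", label, cls, args)
        return (tag, e[1], e[2], tuple(subst(a, x, w) for a in e[3]))
    if tag == "acc":                             # ("acc", y, v, f, body)
        return (tag, e[1], subst(e[2], x, w), e[3], subst(e[4], x, w))
    if tag == "join":                            # ("join", w', v, f, body)
        return (tag, subst(e[1], x, w), subst(e[2], x, w), e[3],
                subst(e[4], x, w))
    if tag == "set":                             # ("set", v, f, w', body)
        return (tag, subst(e[1], x, w), e[2], subst(e[3], x, w),
                subst(e[4], x, w))
    if tag == "let":                             # ("let", y, e1, body)
        return (tag, e[1], subst(e[2], x, w), subst(e[3], x, w))
    if tag == "if":                              # ("if", v, e1, e2)
        return (tag, subst(e[1], x, w), subst(e[2], x, w), subst(e[3], x, w))
    raise ValueError(f"unknown expression: {e!r}")


def step(e, store, fields):
    """One reduction step; returns the new expression and the new store."""
    if not isinstance(e, tuple):
        raise ValueError(f"stuck on {e!r}")
    tag = e[0]
    if tag == "new":                                         # RNew
        _, label, cls, args = e
        loc = Loc(f"l{len(store)}")                          # fresh: store only grows
        return loc, {**store, loc: (cls, label, dict(zip(fields[cls], args)))}
    if tag == "acc":
        _, x, v, f, body = e
        if v is None:
            raise NullDereference                            # RAccNull
        w = store[v][2][f]
        return ("join", w, v, f, subst(body, x, w)), store   # RAcc
    if tag == "join":
        _, w, v, f, body = e
        if is_value(body):
            return body, store                               # RJoinLoc
        body, store = step(body, store, fields)              # RJoinExp
        return ("join", w, v, f, body), store
    if tag == "set":
        _, v, f, w, body = e
        if v is None:
            raise NullDereference                            # RSetNull
        cls, label, fmap = store[v]
        return body, {**store, v: (cls, label, {**fmap, f: w})}   # RSet
    if tag == "let":
        _, x, e1, body = e
        if is_value(e1):
            return subst(body, x, e1), store                 # RLetLoc
        e1, store = step(e1, store, fields)                  # RLetExp
        return ("let", x, e1, body), store
    if tag == "if":
        _, v, e1, e2 = e
        return (e2 if v is None else e1), store              # RCondNull / RCondLoc
    raise ValueError(f"stuck on {e!r}")


def run(e, fields):
    store = {}
    while not is_value(e):
        e, store = step(e, store, fields)
    return e, store


# Hypothetical program: create two status objects and a file, then swap the status.
fields = {"FStat": (), "File": ("status",)}
prog = ("let", "o", ("new", "l_o", "FStat", ()),
        ("let", "c", ("new", "l_c", "FStat", ()),
         ("let", "file", ("new", "l_f", "File", ("o",)),
          ("set", "file", "status", "c",
           ("acc", "s", "file", "status", "s")))))
result, store = run(prog, fields)
```

As in the formal rules, the join produced by the field access disappears as soon as its body is a value, so it never influences the computed result.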


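Figure 3.6 Droppability relation.

DropAct
$$\frac{va \in \rho \qquad \vdash_t \rho(t)}{\vdash_s \rho(\langle \mathrm{M}(va), t \rangle)}$$

DropSemiact
$$\vdash_s \rho(\langle \blacklozenge, t \rangle)$$

DropInact
$$\frac{\vdash_t \rho(t)}{\vdash_s \rho(\langle \mathrm{O}, t \rangle)}$$

DropType
$$\frac{(\forall i)\ \vdash_s \rho(s_i)}{\vdash_t \rho(\langle va, c\{f_i : s_i\} \rangle)}$$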

3.4 Static Semantics

The static semantics ensures the properties of CoreJava(X). The three main guarantees are that active fields are only generated by a new expression, that no capabilities are illegally duplicated, and that no reference with undroppable annotations is lost. We first describe the essential relations droppability and splitting; the latter is responsible for alias handling without violating the constraints we described for the capabilities and the corresponding activity annotations. Next, before actually defining the typing relation, we introduce several auxiliary relations, including effect application for type updates and merging of capabilities.

Boxed premises in the rules serve as extension points provided by the user via an instance of the framework.

Droppability The droppability rules in Figure 3.6 prevent the program from dropping any references that still hold undroppable annotations. As the examples in Section 2.1 showed, some references need to be cleaned up properly to avoid malfunctioning. All the fields of a reference have to be checked for droppability; therefore $\vdash_t \rho(t)$ first decomposes the reference with DropType, and then $\vdash_s \rho(s)$ checks the droppability of the activity annotations involved. An inactive reference is not responsible for the field itself and therefore may drop it at any time. Still, the underlying fields have to be checked separately because these fields may again carry an active reference. An active field is only droppable if the associated value annotation is droppable and all subsequent fields are, too. Whether a value annotation is droppable is defined by the user via the instance for CoreJava(X) and the droppable set ρc. Finally, any semiactive field is droppable. Well-formedness of types ensures that a semiactive reference does not carry undroppable capabilities. The reason is that semiactive fields may be updated at any time, so we would potentially lose access to subsequent capabilities. To avoid the explicit need of ρc for every class c, we assume for the formal system that all annotations throughout the system are distinct and therefore combine the droppable relations of all classes into ρ.
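The four droppability rules translate into two mutually recursive checks. The sketch below is an illustration under assumptions that are not part of the formal system: activity annotations are encoded as the tuples `("M", va)`, `("O",)`, and `("S",)` for active, inactive, and semiactive; an annotated type is a finite tuple `(va, cls, fields)`; and the annotation `FILE_VA` for the class File is hypothetical.

```python
def drop_s(s, rho):
    """Droppability of a field type s = (aa, t)."""
    aa, t = s
    if aa[0] == "S":                          # DropSemiact: always droppable
        return True
    if aa[0] == "O":                          # DropInact: only nested fields count
        return drop_t(t, rho)
    return aa[1] in rho and drop_t(t, rho)    # DropAct


def drop_t(t, rho):
    """Droppability of an annotated type t = (va, cls, fields) (DropType)."""
    _va, _cls, fields = t
    return all(drop_s(s, rho) for s in fields.values())


OPEN, CLOSED = frozenset({"O"}), frozenset({"C"})
RHO = {CLOSED, frozenset()}                   # droppable annotations of Example 3.1
FSTAT = (OPEN | CLOSED, "FStat", {})
FILE_VA = frozenset({"F"})                    # hypothetical annotation for File


def file_type(status_aa):
    return (FILE_VA, "File", {"status": (status_aa, FSTAT)})


open_file = file_type(("M", OPEN))
closed_file = file_type(("M", CLOSED))
```

With these definitions, `drop_t(open_file, RHO)` fails because the active status is open, whereas the closed file and the files whose status field is inactive or semiactive may be dropped. An inactive reference to the open file itself is still not droppable, since the nested active field is checked separately.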


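Figure 3.7 Splitting relation.

SplitType
$$\frac{(\forall i)\ \vdash_s s_i \succeq s'_i \mid s''_i}{\vdash_t \langle va, c\{f_i : s_i\} \rangle \succeq \langle va, c\{f_i : s'_i\} \rangle \mid \langle va, c\{f_i : s''_i\} \rangle}$$

SplitSType
$$\frac{\vdash_{aa} aa \succeq aa' \mid aa'' \qquad \vdash_t t \succeq t' \mid t''}{\vdash_s \langle aa, t \rangle \succeq \langle aa', t' \rangle \mid \langle aa'', t'' \rangle}$$

SplitActFst
$$\vdash_{aa} \mathrm{M}(va) \succeq \mathrm{M}(va) \mid \mathrm{O}$$

SplitActSnd
$$\vdash_{aa} \mathrm{M}(va) \succeq \mathrm{O} \mid \mathrm{M}(va)$$

SplitSemiact
$$\vdash_{aa} \blacklozenge \succeq \blacklozenge \mid \blacklozenge$$

SplitInact
$$\vdash_{aa} \mathrm{O} \succeq \mathrm{O} \mid \mathrm{O}$$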

Figure 3.8 Fully active types.

TypeFullAct
$$\frac{(\forall i)\ \mathrm{M} \vdash_s s_i}{\mathrm{M} \vdash_s \langle \mathrm{M}(va), va', c\{f_i : s_i\} \rangle}$$

TypeFullSemiact
$$\frac{(\forall i)\ \blacklozenge \vdash_s s_i}{\blacklozenge \vdash_s \langle \blacklozenge, va, c\{f_i : s_i\} \rangle}$$

TypeFullNonact
$$\frac{aa \in \{\mathrm{O}, \blacklozenge\} \qquad (\forall i)\ \mathrm{O}/\blacklozenge \vdash_s s_i}{\mathrm{O}/\blacklozenge \vdash_s \langle aa, va, c\{f_i : s_i\} \rangle}$$

Figure 3.9 Well-formedness of types.

WFType
$$\frac{va \in X_c \qquad \mathit{fields}_P(c) = c_i\ f_i \qquad (\forall i)\ s_i = \langle aa_i, va_i, c_i\{f_{ij} : s_{ij}\} \rangle \qquad P \vdash_s \mathit{wf}(s_i)}{P \vdash_t \mathit{wf}(\langle va, c\{f_i : s_i\} \rangle)}$$

WFAct
$$\frac{va \le va' \qquad P \vdash_t \mathit{wf}(\langle va', u \rangle)}{P \vdash_s \mathit{wf}(\langle \mathrm{M}(va), va', u \rangle)}$$

WFSemiact
$$\frac{\blacklozenge \vdash_s \langle \blacklozenge, t \rangle \qquad P \vdash_t \mathit{wf}(t)}{P \vdash_s \mathit{wf}(\langle \blacklozenge, t \rangle)}$$

WFInact
$$\frac{P \vdash_t \mathit{wf}(t)}{P \vdash_s \mathit{wf}(\langle \mathrm{O}, t \rangle)}$$

Splitting We want to allow the programmer to introduce aliases to any reference, yet we need to ensure the above-mentioned properties in order to track the object state exactly for the active reference. To this end, we introduce a splitting relation in Figure 3.7 that splits capabilities between each alias and every use of a resource. Whenever a program uses the same variable multiple times, each use of the variable may receive a different type. Splitting ensures that there exists only one active activity annotation for any location. This activity annotation has to be split off from the initial capabilities of the corresponding field. The splitting relation takes a type, splits its capabilities, and returns the two new types. As the underlying structure and Java type of any type remain unchanged by splitting, it suffices to define splitting directly on the level of activity annotations. To reach the activity annotations, SplitType and SplitSType first decompose the type. Then, SplitInact splits an inactive field into two inactive references. Similarly, SplitSemiact simply propagates the semiactive capability. The only two splitting rules that really split capabilities are SplitActFst and SplitActSnd. These rules split an active annotation into one active and one inactive annotation, giving the active one to the first or the second alias, respectively. The value annotation inside stays untouched. Thus, splitting ensures that at most one type for a field reference receives an active annotation and that this active annotation is never lost accidentally. Whenever we split $\vdash_t t \succeq t' \mid t''$, all active references of t are distributed between t′ and t′′. Different fields may pass their capabilities to different aliases. The splitting relation is used for every access to a variable or a corresponding field, no matter whether we only lend capabilities, pass them to a method call, or really introduce an additional alias. As we will see, splitting is also capable of merging lent capabilities back together again. To do so, we apply the splitting backwards. Whenever we use the splitting relation to merge two references back into one, the type soundness proof has to ensure that we never combine incompatible annotations, like two active ones or a semiactive annotation with an inactive one.
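Because splitting is a relation rather than a function, a direct way to illustrate it is to enumerate all results. The sketch below does this for finite types and also shows the backwards reading used for merging. It rests on assumptions made only for this illustration: activity annotations are the tuples `("M", va)`, `("O",)`, and `("S",)`, annotated types are finite tuples `(va, cls, fields)`, and the function names are hypothetical.

```python
from itertools import product


def split_aa(aa):
    """All pairs (aa', aa'') into which the activity annotation aa can split."""
    if aa[0] == "M":
        return [(aa, ("O",)), (("O",), aa)]   # SplitActFst, SplitActSnd
    return [(aa, aa)]                         # SplitSemiact, SplitInact


def split_s(s):
    """All splittings of a field type s = (aa, t) (SplitSType)."""
    aa, t = s
    return [((a1, t1), (a2, t2))
            for a1, a2 in split_aa(aa)
            for t1, t2 in split_t(t)]


def split_t(t):
    """All splittings of an annotated type t = (va, cls, fields) (SplitType)."""
    va, cls, fields = t
    names = list(fields)
    result = []
    for choice in product(*(split_s(fields[f]) for f in names)):
        left = {f: c[0] for f, c in zip(names, choice)}
        right = {f: c[1] for f, c in zip(names, choice)}
        result.append(((va, cls, left), (va, cls, right)))
    return result


def merge_aa(a1, a2):
    """Splitting read backwards: recombine two annotations or fail."""
    if a1 == a2 and a1[0] in ("O", "S"):
        return a1
    if a1[0] == "M" and a2 == ("O",):
        return a1
    if a2[0] == "M" and a1 == ("O",):
        return a2
    raise ValueError(f"incompatible annotations: {a1!r} and {a2!r}")


OPEN = frozenset({"O"})
FSTAT = (frozenset({"O", "C"}), "FStat", {})
open_file = (frozenset({"F"}), "File", {"status": (("M", OPEN), FSTAT)})
splits = split_t(open_file)
```

For the file type with an active status field there are exactly two splittings, and in each of them exactly one alias keeps the active annotation. `merge_aa` rejects the combinations named in the text, for example two active annotations or a semiactive one with an inactive one.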

Fully Active Types Figure 3.8 provides three relations to ensure that a type is fully active, fully semiactive, or carries no active activity annotation at all. Each relation checks this property corecursively, such that it holds for all subsequent fields, too. The first two, TypeFullAct and TypeFullSemiact, are needed to type updates and for the well-formedness of types. The third one, TypeFullNonact, is not used in the static type system and is only needed in the later soundness proofs of the formal system.

Well-formedness The well-formedness relation shown in Figure 3.9 ensures several properties of types:

• First, WFType checks the basic properties of types. That is, the value annotation is taken from the user-defined annotation set and the field environment is correct with respect to the field declarations of the corresponding class, including the use of these classes inside every field type si.

• WFAct ensures that the value annotation inside an active annotation of a field type is not weaker than the value annotation of the annotated type t. This guarantees that the general value annotation is always a summary of all value annotations inside an active annotation throughout the program.

• Well-formedness is the central point for ensuring the correct use of semiactive references. With WFSemiact we strictly separate semiactive from any active references, and therefore from inactive references, too. Whenever we have a semiactive field type s, we ensure with $\blacklozenge \vdash_s s$ that any subsequent field, meaning any field 'below' in the structure, is semiactive, too. The reason for this choice lies in the intended use of semiactive fields: the user may change, update, or drop semiactive references at any time. At first sight, droppability of semiactive fields only has to ensure that all subsequent capabilities are droppable. But DropSemiact does not even check the subsequent fields. The reason is that an update of a semiactive field could destroy an access path to a subsequent active field, which must be prevented as this could cause an unintended loss of undroppable references. Therefore, we ensure that the subsequent fields are not only droppable, but fully semiactive. We additionally exclude inactive references, as inactive fields should not get updated, especially because there may be another access path through this location that assumes it holds the active, and therefore exact, reference to this object.

• Besides the general restrictions on types, inactive references impose no further restrictions.
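The well-formedness rules can be sketched as a recursive check over finite types. The encoding is an assumption for illustration only: activity annotations are the tuples `("M", va)`, `("O",)`, and `("S",)`, annotated types are tuples `(va, cls, fields)`, the partial order ≤ is set inclusion as in Example 3.1, and the hypothetical tables `ann_sets` and `decls` stand for the annotation sets Xc and the declared field classes of each class.

```python
def fully_semiactive(s):
    """TypeFullSemiact: s and all fields below it are semiactive."""
    aa, (_va, _cls, fields) = s
    return aa == ("S",) and all(fully_semiactive(x) for x in fields.values())


def wf_t(t, ann_sets, decls):
    """WFType: annotation from X_c, fields match the class declaration."""
    va, cls, fields = t
    if va not in ann_sets[cls] or set(fields) != set(decls[cls]):
        return False
    return all(s[1][1] == decls[cls][f] and wf_s(s, ann_sets, decls)
               for f, s in fields.items())


def wf_s(s, ann_sets, decls):
    """WFAct, WFSemiact, WFInact on a field type s = (aa, t)."""
    aa, t = s
    if aa[0] == "M" and not aa[1] <= t[0]:          # WFAct: va <= va'
        return False
    if aa[0] == "S" and not fully_semiactive(s):    # WFSemiact
        return False
    return wf_t(t, ann_sets, decls)                 # WFInact adds nothing else


OPEN, CLOSED = frozenset({"O"}), frozenset({"C"})
ANY, FILE_VA = OPEN | CLOSED, frozenset({"F"})      # FILE_VA is hypothetical
ann_sets = {"FStat": {frozenset(), OPEN, CLOSED, ANY}, "File": {FILE_VA}}
decls = {"FStat": {}, "File": {"status": "FStat"}}

good = (FILE_VA, "File", {"status": (("M", OPEN), (ANY, "FStat", {}))})
bad = (FILE_VA, "File", {"status": (("M", OPEN), (CLOSED, "FStat", {}))})
```

The type `good` is well-formed because the exact annotation {O} lies below the summary {O, C}; `bad` is rejected because {O} is not below {C}. A semiactive field type wrapped around `good` is rejected as well, since the status field below it is active.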

Subactivity The subactivity relation in Figure 3.10 is a simple subtyping on activity annotations. Similar to splitting, we first break types down to activity annotations via SubactType and SubactSType. Then, SubactAct states that an active activity annotation is subactive to any other activity annotation, even if the value annotation inside changes; SubactSemiact and SubactInact complete the subactivity on activity annotations. Finally, SubactEmptyEnv and SubactEnv extend the subactivity to environments. This relation is crucial for the soundness proof. In the type system itself, we need it only to type the intermediate join expression.
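Read as a decision procedure, subactivity compares two types or environments of the same shape annotation by annotation. The following sketch uses the same illustrative encoding as an assumption: activity annotations are the tuples `("M", va)`, `("O",)`, and `("S",)`, annotated types are finite tuples `(va, cls, fields)`, environments are dictionaries, and all function names are hypothetical.

```python
def sub_aa(a, b):
    """Is activity annotation a subactive to b?"""
    if a[0] == "M":
        return b[0] in ("M", "S", "O")      # SubactAct: the inner va may differ
    if a[0] == "S":
        return b[0] in ("S", "O")           # SubactSemiact
    return b[0] == "O"                      # SubactInact


def sub_s(s1, s2):
    """SubactSType on field types (aa, t)."""
    return sub_aa(s1[0], s2[0]) and sub_t(s1[1], s2[1])


def sub_t(t1, t2):
    """SubactType: same annotation and class, fields compared pointwise."""
    (va1, c1, f1), (va2, c2, f2) = t1, t2
    return (va1 == va2 and c1 == c2 and f1.keys() == f2.keys()
            and all(sub_s(f1[f], f2[f]) for f in f1))


def sub_env(env1, env2):
    """SubactEmptyEnv and SubactEnv on environments."""
    return (env1.keys() == env2.keys()
            and all(sub_t(env1[x], env2[x]) for x in env1))


OPEN, CLOSED = frozenset({"O"}), frozenset({"C"})
FSTAT = (OPEN | CLOSED, "FStat", {})


def file_type(status_aa):
    return (frozenset({"F"}), "File", {"status": (status_aa, FSTAT)})
```

For example, a file type with an active status is subactive to the same file type with an inactive status, and also to one whose active status carries a different value annotation, but not the other way round.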

Figure 3.10 Subactive relation.

SubactType
$$\frac{(\forall i)\ \vdash_s s_i \preceq s'_i}{\vdash_t \langle va, c\{f_i : s_i\} \rangle \preceq \langle va, c\{f_i : s'_i\} \rangle}$$

SubactSType
$$\frac{\vdash_t t \preceq t' \qquad \vdash_{aa} aa \preceq aa'}{\vdash_s \langle aa, t \rangle \preceq \langle aa', t' \rangle}$$

SubactAct
$$\frac{aa \in \{\mathrm{M}(va'), \blacklozenge, \mathrm{O}\}}{\vdash_{aa} \mathrm{M}(va) \preceq aa}$$

SubactSemiact
$$\frac{aa \in \{\blacklozenge, \mathrm{O}\}}{\vdash_{aa} \blacklozenge \preceq aa}$$

SubactInact
$$\vdash_{aa} \mathrm{O} \preceq \mathrm{O}$$

SubactEmptyEnv
$$\vdash_A \emptyset \preceq \emptyset$$

SubactEnv
$$\frac{\vdash_A A \preceq A' \qquad \vdash_t t \preceq t'}{\vdash_A A, x : t \preceq A', x : t'}$$

Figure 3.11 Effect Application.

EANull
$$\vdash_A A := A \downarrow \mathsf{null} : t_i \leadsto t'_i$$

EAVar
$$\frac{v_j = x \qquad \vdash_t t := t' \downarrow t_j \leadsto t'_j \qquad \vdash_A A := A', x : t \downarrow (v_i : t_i \leadsto t'_i)^{i \ne j}}{\vdash_A A := A', x : t' \downarrow v_i : t_i \leadsto t'_i}$$

EAType
$$\frac{(\forall i)\ \vdash_s s_i := s'_i \downarrow s''_i \leadsto s'''_i}{\vdash_t \langle va, c\{f_i : s_i\} \rangle := \langle va, c\{f_i : s'_i\} \rangle \downarrow \langle va, c\{f_i : s''_i\} \rangle \leadsto \langle va, c\{f_i : s'''_i\} \rangle}$$

EASType
$$\frac{\vdash_{aa} aa := aa' \downarrow aa'' \leadsto aa''' \qquad \vdash_t t := t' \downarrow t'' \leadsto t'''}{\vdash_s \langle aa, t \rangle := \langle aa', t' \rangle \downarrow \langle aa'', t'' \rangle \leadsto \langle aa''', t''' \rangle}$$

EAActive
$$\frac{aa \ne \blacklozenge}{\vdash_{aa} aa := aa' \downarrow \mathrm{M}(va) \leadsto aa}$$

EANonactive
$$\frac{aa' \in \{\blacklozenge, \mathrm{O}\}}{\vdash_{aa} aa := aa \downarrow aa' \leadsto aa''}$$

Figure 3.12 Type access.

TypeAccess
$$\frac{s_j = \langle aa, t \rangle \qquad \vdash_t t \succeq t_1 \mid t_2 \qquad s'_j = \langle aa, t_1 \rangle}{\vdash_t \langle va, c\{f_i : s_i\} \rangle = \langle va, c\{f_j : s'_j; f_i : s_i\} \rangle \mid_{f_j} t_2}$$

Figure 3.13 Type update.

TypeUpdateAct
$$\frac{\mathrm{M} \vdash_s s_j \qquad \vdash_t \rho(\langle va_j, u_j \rangle) \qquad s_j = \langle aa_j, va_j, u_j \rangle \qquad s'_j = \langle \mathrm{M}(va), va, u \rangle \qquad \vdash_s \mathit{wf}(s'_j)}{c\{f_i : s_i\} \vdash f_j \leftarrow \langle va, u \rangle \triangleright c\{f_j : s'_j; (f_i : s_i)^{i \ne j}\}}$$

TypeUpdateSemiact
$$\frac{\blacklozenge \vdash_s s_j \qquad s_j = \langle \blacklozenge, t \rangle}{c\{f_i : s_i\} \vdash f_j \leftarrow t \triangleright c\{f_i : s_i\}}$$

Effect application The effect application relation $\vdash_A A := A' \downarrow v_i : t_i \leadsto t'_i$ (Figure 3.11) serves two purposes: it is used in the rules for method application and for a restricted version of the let expression. The intuitive use arises from the method application, where the effect application states the type change on the annotation level. Here, $\vdash_A A := A' \downarrow v : t \leadsto t'$ defines the new environment A that results from applying the type change t ⇝ t′ to variable v's type in A′. As a method call may involve several arguments and therefore several type changes at once, we combine all these changes in one notation, especially as two or more changes to the same variable can occur. This potentially multiple change of one variable is the reason why the types t′ and t′′ can differ in the judgment $\vdash_t t := t' \downarrow t'' \leadsto t'''$. It is part of the type system and the type soundness proof to ensure that no active references get lost with this effect application.

The above definition of effect application allows us to transfer the type state changes from one alias that goes out of scope to another. Therefore, we are able to use it for the capability lending, too. For example, the expression let x = y in e introduces a new alias x for a given variable y. Inside e, the same object can be used through both references, x and y, which also may change their types. As soon as the expression leaves the scope of x, we have to merge the type change of x back into y to avoid the loss of such a change, or worse, the loss of capabilities that are not droppable in x. That is, the effect application is able to merge the two new types of x and y even if both of them changed during the evaluation of the expression e. This also reveals that the effect application is related to the splitting relation introduced above, as both take part in the capability lending. Basically, splitting separates the capabilities, and the effect application is able to return the lent activities, even if they changed on the way. The type soundness proof will take a closer look at how these two relations work together.

EANull handles the effect application for null. If vj is a variable x, then the new type t for x is defined by EAVar as $\vdash_t t := A'(x) \downarrow t_j \leadsto t'_j$. The effect application relation on types, $\vdash_t t := t' \downarrow t'' \leadsto t'''$, changes at most the annotations of t′ for which the corresponding annotation of t′′ is active but leaves the other annotations of t′ intact (EAActive). This enables us to use this relation even if lending of capabilities did not happen, i.e., the original variable kept the capabilities. In addition, as we strictly separate semiactive references from references whose capabilities are actually tracked with CoreJava(X), we need for technical reasons that the corresponding field in t′′′ is not semiactive. That is, no type change from an active reference towards a semiactive reference may take place. Whenever the activity annotation for a field in t′′ is inactive or semiactive, we keep the capabilities of t′, as stated in EANonactive. Again, the formal system has to ensure that a change from inactive towards active does not occur.

That is, the effect application imposes several prerequisites on its use. The most important and basic one is that whenever we have $\vdash_t t := t' \downarrow t'' \leadsto t'''$, all participating types may only differ in the activity annotations; the underlying Java type and the summary value annotations have to be the same for every type. Next, a type change from active to semiactive is prohibited, and of course any change from inactive or semiactive towards active is excluded, too. The type soundness proof will take all these properties of the effect application and its use in CoreJava(X) into account and prove them.
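Effect application can be sketched as a function that takes the current type of a variable together with a type change and computes the resulting type. The encoding below is an assumption for illustration: activity annotations are the tuples `("M", va)`, `("O",)`, and `("S",)`, annotated types are finite tuples `(va, cls, fields)` of identical shape, environments are dictionaries, `None` stands for null, and the function names are hypothetical. The prerequisites that the formal system proves separately (for example, that no change from inactive towards active occurs) are not checked here.

```python
def effect_aa(current, before, after):
    """New annotation when the change before -> after is applied to current."""
    if before[0] == "M":                      # EAActive: take over the change
        if after[0] == "S":
            raise ValueError("an active field must not become semiactive")
        return after
    return current                            # EANonactive: keep current


def effect_s(current, before, after):
    """EASType on field types (aa, t)."""
    return (effect_aa(current[0], before[0], after[0]),
            effect_t(current[1], before[1], after[1]))


def effect_t(current, before, after):
    """EAType: apply the change field by field."""
    va, cls, fields = current
    return (va, cls, {f: effect_s(s, before[2][f], after[2][f])
                      for f, s in fields.items()})


def effect_env(env, changes):
    """Apply a list of changes (v, before, after) one after the other."""
    env = dict(env)
    for v, before, after in changes:
        if v is not None:                     # EANull: null changes nothing
            env[v] = effect_t(env[v], before, after)   # EAVar
    return env


OPEN, CLOSED = frozenset({"O"}), frozenset({"C"})
FSTAT = (OPEN | CLOSED, "FStat", {})


def file_type(status_aa):
    return (frozenset({"F"}), "File", {"status": (status_aa, FSTAT)})


# y lent its active status to an alias x, which closed the file in its scope.
env = {"y": file_type(("O",))}
env_after = effect_env(env, [("y", file_type(("M", OPEN)),
                              file_type(("M", CLOSED)))])
```

In this scenario `env_after["y"]` regains the active annotation, now with the value annotation {C}. If the alias only held an inactive annotation, the type of y is left as it is.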

Expression Typing Finally, the typing judgment for expressions, $P; A \vdash_e e : t \triangleright A'$, assigns a type t and an updated environment A′ (after e's evaluation) to expression e in the context of program P and environment A. Figure 3.15 contains the corresponding rules, which we describe in the following. Figure 3.16 contains some additional rules to type the intermediate expressions that arise during the evaluation of an expression.

The typing rules use several additional auxiliary judgments that we describe together with the corresponding inference rule.

• In the variable rule, TVar, each use of a variable splits off the needed capabilities and passes the remaining capabilities on to subsequent uses. As every variable access uses this rule, every use of a variable potentially changes its type.

• The rule for null relies on an auxiliary judgment $P \vdash_{null} t$ (Figure 3.14). NullType ensures that t is well-formed and carries a suitable value annotation.


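Figure 3.14 Auxiliaries.

NullType
$$\frac{t = \langle va, c\{\dots\} \rangle \qquad R^{null}_c(va) \qquad P \vdash_t \mathit{wf}(t)}{P \vdash_{null} t}$$

TMultiVal
$$\frac{(\forall i)\ P; A_{i-1} \vdash_e v_i : t_i \triangleright A_i}{P; A_0 \vdash_e v_i : t_i \triangleright A_n}$$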

• The rule TNew determines an annotation for the newly created object with $R^{new}$. The judgment $P; A \vdash_e v : t \triangleright A'$ is an abbreviation for the sequenced typing of each vi. Its definition in Figure 3.14 takes into account that the sequence v possibly uses the same variable multiple times. The activity annotations for every field may be chosen freely, apart from the restrictions imposed by well-formedness. This design choice simplifies the proofs and avoids another user interaction. Whenever the object is used within any method call, the annotations are implicitly defined. Therefore, these free activity annotations are only unrestricted if we introduce an object and directly drop it again without using it.

• The rule TAcc for accessing field f performs the already mentioned lending of capabilities. The type of the dereferenced object lends its capabilities at field f to the extracted value through the type access judgment $\vdash_t t_y = t'_y \mid_f t_x$ (Figure 3.12). TypeAccess models splitting on one field of the given type. It is basically an abbreviation and is used similarly to splitting. After typing the body expression e with the resulting types, the rule merges the final types back into the type of the original reference. To do so, it uses TypeAccess, and therefore the splitting relation, backwards.

This rule has a number of related rules in Figure 3.16: TAccNull, TJoin, and TJoinNull. They treat the case that the dereferenced object is null and the intermediate join expression that arises from reducing the field access. There is a special rule for a join expression where the extracted value is null, too.

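Figure 3.15 Typing rules for expressions.

TVar
$$\frac{\vdash_t t \succeq t_1 \mid t_2}{P; A, x : t \vdash_e x : t_1 \triangleright A, x : t_2}$$

TNull
$$\frac{P \vdash_{null} t}{P; A \vdash_e \mathsf{null} : t \triangleright A}$$

TNew
$$\frac{P; A \vdash_e v_i : t_i \triangleright A' \qquad R^{new}_{\ell,c}(va) \qquad t = \langle va, c\{f_i : \langle aa_i, t_i \rangle\} \rangle \qquad P \vdash_t \mathit{wf}(t)}{P; A \vdash_e \mathsf{new}^{\ell}\ c(v_i) : t \triangleright A'}$$

TAcc
$$\frac{\vdash_t t_y = t'_y \mid_f t_x \qquad P; A, y : t'_y, x : t_x \vdash_e e : t \triangleright A', y : t''_y, x : t'_x \qquad \vdash_t t'''_y = t''_y \mid_f t'_x}{P; A, y : t_y \vdash_e \mathsf{let}\ x = y.f\ \mathsf{in}\ e : t \triangleright A', y : t'''_y}$$

TSet
$$\frac{P; A \vdash_e v : t \triangleright A', x : \langle va, u \rangle \qquad u \vdash f \leftarrow t \triangleright u' \qquad P; A', x : \langle va, u' \rangle \vdash_e e : t' \triangleright A''}{P; A \vdash_e \mathsf{set}\ x.f = v\ \mathsf{in}\ e : t' \triangleright A''}$$

TCall
$$\frac{P; A \vdash_e v_i : t_i \triangleright A' \qquad \mathit{mtype}_P(t_1, m) = t_i \leadsto t'_i \to t \qquad \vdash_A A'' := A \downarrow v_i : t_i \leadsto t'_i}{P; A \vdash_e v_1.m(v_i^{\,i \in \{2, \dots, n\}}) : t \triangleright A''}$$

TLetExp
$$\frac{P; A \vdash_e e_1 : t_1 \triangleright A_1 \qquad P; A_1, x : t_1 \vdash_e e_2 : t_2 \triangleright A_2, x : t'_1 \qquad \vdash_t \rho(t'_1)}{P; A \vdash_e \mathsf{let}\ x = e_1\ \mathsf{in}\ e_2 : t_2 \triangleright A_2}$$

TLetVar
$$\frac{P; A \vdash_e v : t_1 \triangleright A_1 \qquad P; A_1, x : t_1 \vdash_e e : t \triangleright A_2, x : t_2 \qquad \vdash_A A_3 := A_2 \downarrow v : t_1 \leadsto t_2}{P; A \vdash_e \mathsf{let}\ x = v\ \mathsf{in}\ e : t \triangleright A_3}$$

TCond
$$\frac{P; A \vdash_e v : t' \triangleright A'' \qquad P; A \vdash_e e_1 : t \triangleright A' \qquad P; A \vdash_e e_2 : t \triangleright A'}{P; A \vdash_e \mathsf{if}\ v\ \mathsf{then}\ e_1\ \mathsf{else}\ e_2 : t \triangleright A'}$$

Figure 3.16 Typing rules for intermediate expressions.

TAccNull
$$\frac{P; A, x : t_x \vdash_e e : t \triangleright A', x : t'_x \qquad P \vdash_t \mathit{wf}(t_x)}{P; A \vdash_e \mathsf{let}\ x = \mathsf{null}.f\ \mathsf{in}\ e : t \triangleright A'}$$

TJoin
$$\frac{\begin{array}{c}
\vdash_t t_z \preceq t'''_z \qquad \vdash_t t_x \preceq t'_x \\
\vdash_t t_y = t'_y \mid_f t_x \qquad \vdash_t t'''_y = t''_y \mid_f t'_x \\
\vdash_t t'_z \succeq t_z \mid t_x \qquad \vdash_t t''_z \succeq t'''_z \mid t'_x \\
P; A, z : t'_z, y : t'_y \vdash_e e : t \triangleright A', z : t''_z, y : t''_y
\end{array}}{P; A, z : t_z, y : t_y \vdash_e \mathsf{join}\ z = y.f\ \mathsf{from}\ e : t \triangleright A', z : t'''_z, y : t'''_y}$$

TJoinNull
$$\frac{\vdash_t t_y = t'_y \mid_f t_x \qquad \vdash_t t'''_y = t''_y \mid_f t'_x \qquad \vdash_t t_x \preceq t'_x \qquad P; A, y : t'_y \vdash_e e : t \triangleright A', y : t''_y}{P; A, y : t_y \vdash_e \mathsf{join}\ \mathsf{null} = y.f\ \mathsf{from}\ e : t \triangleright A', y : t'''_y}$$

TSetNull
$$\frac{P; A \vdash_e v : t' \triangleright A' \qquad P; A' \vdash_e e : t \triangleright A''}{P; A \vdash_e \mathsf{set}\ \mathsf{null}.f = v\ \mathsf{in}\ e : t \triangleright A''}$$

Figure 3.17 Typing rules for programs.

$$\frac{P = \mathit{defn}_i\ e \qquad \text{join-free}(e) \qquad (\forall i)\ P \vdash \mathit{defn}_i \qquad \text{class names in } \mathit{defn}_i \text{ disjoint} \qquad P; \emptyset \vdash_e e : t \triangleright \emptyset}{\vdash P}$$

$$\frac{(\forall i)\ \mathsf{class}\ c_i\ \{c'\ f'; \mathit{meth}'\} \in P \qquad (\forall j)\ P; c \vdash \mathit{meth}_j \qquad \text{field names } f_i \text{ disjoint} \qquad \text{method names in } \mathit{meth}_j \text{ disjoint}}{P \vdash \mathsf{class}\ c\ \{c_i\ f_i; \mathit{meth}_j\}}$$

$$\frac{t_0 = \langle va, c\{f : s\} \rangle \qquad (\forall i \in \{0, \dots, n\})\ P \vdash_t \mathit{wf}(t_i) \qquad P; \mathsf{this} : t_0, x_i : t_i \vdash_e e : t \triangleright \mathsf{this} : t'_0, x_i : t'_i}{P; c \vdash t\ [t_0 \leadsto t'_0]\ m(t_i \leadsto t'_i\ x_i)\ \{\, e \,\}}$$

• Field assignment set x.f = v in e (TSet) changes the type of field f in x's type by using the type update judgment $P; u \vdash f \leftarrow t \triangleright u'$ (Figure 3.13), which states that field f of an object with type u can be assigned a value of type t while modifying the object's type to u′. Two rules define this update judgment:

– The first rule, TypeUpdateAct, allows a strong update of f, which can change its type. It requires the previous type of field f to be fully active (judgment $\mathrm{M} \vdash_s s$), that is, all subsequent fields have to carry an active annotation, too. Whenever there is one inactive annotation for any of the subsequent fields, the corresponding field possibly gets an update through another reference, which would invalidate the guarantees of CoreJava(X). The same holds for a semiactive annotation.

– The second rule deals with ordinary updates outside the capability and exact resource tracking of CoreJava(X). It requires that the type of f is fully semiactive (judgment $\blacklozenge \vdash_s s$) because overwriting an inactive field would result in an invalid typing assumption about a reference carrying an active annotation for this field. Notice that the fully semiactive requirement is already ensured by the well-formedness judgment.

TSet has one related rule in Figure 3.16: TSetNull. This rule handles the case where a field of null is set. As user code may not access null, this is only an intermediate expression.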

• The rule for method calls, TCall, uses the above-mentioned effect application relation to propagate the type changes of the method signature to the resulting type environment. The type changes imposed are taken from the method lookup. Again, TMultiVal and the definition of the effect application allow the same variable to be used multiple times in one method call, provided that all capabilities necessary for this method call can be retrieved. Similar to TAcc, the capabilities are only lent to the method call. Afterwards, they are transferred back into the environment if they are not consumed by the method body, as indicated by the corresponding type change. The value annotations inside an active annotation may of course change through the method call.

• There are two rules for standard let expressions. TLetExp ensures that the type of the let-bound variable is droppable after evaluating the body of the let expression. Therefore, we do not lend the capabilities to the let-bound variable; it is dropped afterwards. The rule TLetVar requires a value in its header, that is, this restricted let creates an alias of a variable. In this case, the rule implements the lending of capabilities just as described for the field access rule. It uses effect application to merge the changes of the alias back into the type of the original reference. TLetExp is not able to provide this capability lending, as we do not explicitly know where the capabilities come from. Therefore, we ensure that the corresponding capabilities are droppable after the typing of the body e2. The proof ensures that, where applicable, both rules can be used.

• Finally, we describe the rule TJoin of Figure 3.16. This rule handles the join expression. As user code must not contain any join, this is only an intermediate expression that arises during the execution of a program. Its purpose is to enable the type soundness proof to show that variable lending works in the intended way. In fact, the join expression simply remembers which variables participated in a capability lending via TAcc. The first two premises, which check the subactivity, only serve as invariants for the type soundness and are further explained in the proofs. Informally, the other premises introduce a virtual type that is denoted by tx in the inference rule. This type manages the transfer of capabilities among the participating variables. First, we transfer capabilities from ty to tz; then, after typing the body e, we transfer the remaining capabilities backwards from t′′z to t′′y.
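The two type update rules used by TSet can be sketched as a single function on the type of the object being updated. The encoding is an assumption for illustration: activity annotations are the tuples `("M", va)`, `("O",)`, and `("S",)`, annotated types are finite tuples `(va, cls, fields)`, and the names `fully`, `update`, and `FILE_VA` are hypothetical. The well-formedness premise of the strong update is omitted, and a failed update is reported with `TypeError`.

```python
def fully(kind, s):
    """TypeFullAct (kind "M") or TypeFullSemiact (kind "S") on a field type."""
    aa, (_va, _cls, fields) = s
    return aa[0] == kind and all(fully(kind, x) for x in fields.values())


def drop_s(s, rho):
    aa, t = s
    if aa[0] == "S":
        return True
    if aa[0] == "O":
        return drop_t(t, rho)
    return aa[1] in rho and drop_t(t, rho)


def drop_t(t, rho):
    return all(drop_s(s, rho) for s in t[2].values())


def update(t_obj, f, t_new, rho):
    """Type of the object after assigning a value of type t_new to field f."""
    va_obj, cls, fields = t_obj
    old = fields[f]
    if fully("S", old):                              # TypeUpdateSemiact
        if old[1] != t_new:
            raise TypeError("a semiactive update must keep the field type")
        return t_obj
    if fully("M", old) and drop_t(old[1], rho):      # TypeUpdateAct
        new_field = (("M", t_new[0]), t_new)         # s' = <M(va), va, u>
        return (va_obj, cls, {**fields, f: new_field})
    raise TypeError("field is neither fully active nor fully semiactive")


OPEN, CLOSED = frozenset({"O"}), frozenset({"C"})
RHO = {CLOSED, frozenset()}
ANY, FILE_VA = OPEN | CLOSED, frozenset({"F"})       # FILE_VA is hypothetical


def file_type(status):
    return (FILE_VA, "File", {"status": status})


open_file = file_type((("M", OPEN), (ANY, "FStat", {})))
closed = update(open_file, "status", (CLOSED, "FStat", {}), RHO)
```

The strong update replaces the status field type as a whole, so `closed` records the active annotation {C}. The same call on a file whose status field is inactive fails, which mirrors the requirement that only fully active or fully semiactive fields may be assigned.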


4 Type Soundness

In this chapter we prove that the introduced system CoreJava(X) is sound and satisfies the claimed properties. That is, we show that whenever we have a well-typed expression in a well-typed program, the intended properties of CoreJava(X) hold: We prove that there never exists more than one active reference for a field, that no reference that is not droppable is lost during execution, and that a well-typed program does not get stuck during execution. We prove type soundness using the standard syntactic technique [48], by proving preservation and progress. Here, we extend preservation to ensure the aforementioned properties of CoreJava(X).

Chapter Outline This chapter first provides some more notations and definitions. Several additional relations that we only need for the proofs are presented in Section 4.2. Given all prerequisites, we can provide the set-based notation of every coinductively defined relation of the formal system. Then, we start the proofs. We use several lemmas until we can state the preservation and progress theorems, which we finally combine into the soundness theorem for CoreJava(X).

4.1 Preliminaries

First, we define some additional notations and state some general assumptions to facilitate the proofs.

Notation If A1 and A2 are two environments with dom(A1) ⊆ dom(A2), then A2↾dom(A1) denotes the environment that restricts A2 to the variables in the domain of A1.

To simplify the proofs and the coinductive definitions, we flatten our type hierarchy and use the following syntax for types, which is equivalent to the one introduced in Figure 3.3:

t ::= 〈va, c{f : 〈aa, t〉}〉


Notation To avoid cluttering the proofs, we use the following abbreviation for all types used. Whenever we introduce some t′1, we implicitly define it as

t′1 = 〈va ′1, c{fi : 〈aa ′1i, t′1i〉}〉

for appropriate c and fi, though the index set for i may be empty. Whenever needed and defined, we further expand aa′1i = M(va′1i). The same holds analogously for other type variables t0, . . . , t′′3 and so on.

Notation According to the flattened definition of types above, we obtain an index set for i. Throughout the proofs we use the notation (∀i) whenever we talk about all indices at once; every other use of the index i refers to one arbitrary i from the index set.

Definition 4.1 (Compatible Types). We say two types are compatible whenever the underlying Java types and summary value annotations are identical. That is, they differ only in their activity annotations.

Remark 4.2. Whenever we have t1 � t2, then t1 and t2 are compatible. As a result, we also obtain the same index set for t1 and t2:

t1 = 〈va1, c{fi : 〈aa1i , t1i〉}〉
t2 = 〈va1, c{fi : 〈aa2i , t2i〉}〉

Therefore, by the definition of subactivity and the notation introduced above, we also get

(∀i)  aa1i � aa2i

For all other relations on types of CoreJava(X) we get the analogous properties.

Remark 4.3. Whenever we use several variable names in one lemma, we potentially get an environment like A, a : ta, b : tb. As every variable is unique in the environment, in the case a = b we directly get that ta = tb holds.

4.1.1 Free Variables

We provide the usual definition of free variables. It is standard except for the join case.

Definition 4.4 (Free Variables fv(e)).

fv(e) = fv(e1) ∪ · · · ∪ fv(en)

fv(e) =
    {x}                            if e ≡ x
    ∅                              if e ≡ null
    fv(v)                          if e ≡ new` c(v)
    fv(v) ∪ fv(v)                  if e ≡ v.m(v)
    fv(v) ∪ fv(e′) \ {x}           if e ≡ let x = v.f in e′
    fv(v) ∪ fv(v′) ∪ fv(e′)        if e ≡ set v.f = v′ in e′
    fv(e′) ∪ fv(e′′) \ {x}         if e ≡ let x = e′ in e′′
    fv(v) ∪ fv(e′) ∪ fv(e′′)       if e ≡ if v then e′ else e′′
    fv(v) ∪ fv(v′) ∪ fv(e′)        if e ≡ join v = v′.f from e′
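As an illustration, the case analysis of Definition 4.4 can be transcribed into a small recursive function. The following Python sketch is not part of the formal system: the tuple encoding of expressions and all tag names are assumptions made for this example, and we assume that a let-bound variable is removed only from the free variables of the let body.

```python
# Expressions are encoded as tagged tuples (an encoding assumed for this sketch):
#   ("var", x)   ("null",)   ("new", c, [v, ...])   ("call", v, m, [v, ...])
#   ("letacc", x, v, f, body)    let x = v.f in body
#   ("set", v, f, v2, body)      set v.f = v2 in body
#   ("let", x, e1, e2)           let x = e1 in e2
#   ("if", v, e1, e2)            if v then e1 else e2
#   ("join", v, v2, f, body)     join v = v2.f from body


def fv_all(exprs):
    """Free variables of a sequence: the union over all elements."""
    result = set()
    for e in exprs:
        result |= fv(e)
    return result


def fv(e):
    """Free variables of one expression, following the cases of Definition 4.4."""
    tag = e[0]
    if tag == "var":
        return {e[1]}
    if tag == "null":
        return set()
    if tag == "new":
        return fv_all(e[2])
    if tag == "call":
        return fv(e[1]) | fv_all(e[3])
    if tag == "letacc":
        # x is bound in the body only (assumption of this sketch).
        return fv(e[2]) | (fv(e[4]) - {e[1]})
    if tag == "set":
        return fv(e[1]) | fv(e[3]) | fv(e[4])
    if tag == "let":
        return fv(e[2]) | (fv(e[3]) - {e[1]})
    if tag == "if":
        return fv(e[1]) | fv(e[2]) | fv(e[3])
    if tag == "join":
        # The non-standard case: join binds nothing, so both values stay free.
        return fv(e[1]) | fv(e[2]) | fv(e[4])
    raise ValueError(f"unknown expression tag: {tag!r}")
```

In this encoding, a join over x and y with body x has both x and y free, whereas a let that binds x removes x from the free variables of its body.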

4.1.2 Alpha Conversion

To facilitate the proofs, we only consider expressions up to alpha conversion [36, Chapter 5.3]. That is, we rename variables until no two let-expressions bind the same variable name. Substitution is only defined on variables.

Notation [y/x]e denotes the renaming of x to y inside the expression e, without capturing bound variables.

4.1.3 Inversion

We use inversion throughout our proofs. There are only a few cases where inversion is ambiguous; we handle them by case distinction over all possibilities. As an example, we prove an inversion lemma for the typing relation.

Lemma 4.5 (Inversion of Typing). Let P ;A e e : t � A′. Then we know, depending on e, which typing rule was used to type this expression:

e ≡ x                           ⇒ TVar
e ≡ null                        ⇒ TNull
e ≡ new` c(v)                   ⇒ TNew
e ≡ v.m(v)                      ⇒ TCall
e ≡ let x = x′.f in e′          ⇒ TAcc
e ≡ set x.f = v′ in e′          ⇒ TSet
e ≡ let x = v in e′             ⇒ TLetVar or TLetExp
e ≡ let x = e′ in e′′           ⇒ TLetExp if e′ ≠ v
e ≡ if v then e′ else e′′       ⇒ TCond
e ≡ join x = v′.f from e′       ⇒ TJoin
e ≡ let x = null.f in e′        ⇒ TAccNull
e ≡ join null = v.f from e′     ⇒ TJoinNull
e ≡ set null.f = v in e′        ⇒ TSetNull

Proof. As v.f is not an expression, TAcc is unambiguous. All other cases are trivial.

4.1.4 Effect Application

We notice that effect application, as defined in Figure 3.11, is well-defined with respect to t1 and t2 whenever we have t := t′ ↓ t1 t2. That is, we may replace t′ with any compatible type t′ and obtain an explicit type t with t := t′ ↓ t1 t2. This observation holds because the decision which rule is used for effect application on the level of activity annotations depends only on t1. For the other levels of our type hierarchy (s, t, and u) we have a deterministic choice. For rule EAActive we potentially have a restriction on t2, which we take into account by fixing t2 as well.

4.2 Additional Relations

For type soundness we need some more relations on types to establish properties or parts of the proof.

4.2.1 Subdroppability

Figure 4.1 introduces subdroppability. This relation is closely related to subactivity. In addition to subactivity, subdroppability ensures that whenever t′ is droppable and t ρ� t′ holds, then t is droppable, too. We prove this property in Lemma 4.14 as soon as we have set up all prerequisites for the proof.


Figure 4.1 Definition of ρ�, the Subdroppable Relation

SDAct
    aa M(va) ρ� M(va)

SDActDrop
    ρ(va)
    ------------------
    aa M(va) ρ� O

SDSemiact
    aa ♦ ρ� ♦

SDInact
    aa O ρ� O

SDType
    (∀i) aa aai ρ� aa′i     t ti ρ� t′i
    ------------------------------------------------------
    t 〈va, c{fi : 〈aai, ti〉}〉 ρ� 〈va, c{fi : 〈aa′i, t′i〉}〉

SDEmpty
    A ∅ ρ� ∅

SDEnv
    A A ρ� A′     t t ρ� t′
    ----------------------------
    A A, x : t ρ� A′, x : t′

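Figure 4.2 Definition of JF.

JF(x)     JF(null)     JF(new` c(v))     JF(v.m(v))

join-free(e)
----------------------
JF(let x = v.f in e)

join-free(e)
------------------------
JF(set v.f = v′ in e)

JF(e)     join-free(e′)
-----------------------
JF(let x = e in e′)

join-free(e)     join-free(e′)
------------------------------
JF(if v then e else e′)

JF(e)
---------------------------
JF(join v = v′.f from e)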

4.2.2 Join Free Expressions and null in Join

For preservation and progress we need two properties about the use of the intermediate join-expressions that are based on the fact that user code is free of any join.

Figure 3.17 introduces the join-free relation, which states that an expression contains no join at all. During a reduction sequence, intermediate join-expressions may of course occur. Still, our proof relies on the fact that these expressions are introduced in the intended way. We introduce the relation JF in Figure 4.2 to ensure that no join arises inside the body of any let, set, or if. The body of a join-expression can have an additional join only at the beginning of the body.
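The difference between join-free and JF can be made concrete with two small predicates. The following Python sketch is only an illustration: the tuple encoding of expressions is an assumption of this example, and join-free is implemented as the plain absence of any join, which is how the text describes Figure 3.17.

```python
# Expressions as tagged tuples (an encoding assumed for this sketch):
#   ("var", x)  ("null",)  ("new", c, args)  ("call", v, m, args)
#   ("letacc", x, v, f, body)   ("set", v, f, v2, body)
#   ("let", x, e1, e2)          ("if", v, e1, e2)
#   ("join", v, v2, f, body)


def subexpressions(e):
    """Direct subexpressions of e (values are not expressions here)."""
    tag = e[0]
    if tag in ("letacc", "set", "join"):
        return [e[4]]
    if tag in ("let", "if"):
        return [e[2], e[3]]
    return []


def join_free(e):
    """True if e contains no join at all."""
    return e[0] != "join" and all(join_free(s) for s in subexpressions(e))


def jf(e):
    """The JF relation, one branch per rule of Figure 4.2."""
    tag = e[0]
    if tag in ("var", "null", "new", "call"):
        return True
    if tag in ("letacc", "set"):
        return join_free(e[4])
    if tag == "let":
        # A join may occur in the header position, but not in the body.
        return jf(e[2]) and join_free(e[3])
    if tag == "if":
        return join_free(e[2]) and join_free(e[3])
    if tag == "join":
        return jf(e[4])
    raise ValueError(f"unknown expression tag: {tag!r}")
```

With this encoding, a join in the header of a let satisfies JF, while the same join in the body of the let does not.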

The following lemma states the corresponding invariant, i.e., the JF property is preserved along a reduction sequence.

Figure 4.3 Definition of null-free-join

null-free-join(x)     null-free-join(null)     null-free-join(new` c(v))     null-free-join(v.m(v))

null-free-join(e)
----------------------------------
null-free-join(let x = v.f in e)

null-free-join(e)
------------------------------------
null-free-join(set v.f = v′ in e)

null-free-join(e)     null-free-join(e′)
----------------------------------------
null-free-join(let x = e in e′)

null-free-join(e)     null-free-join(e′)
----------------------------------------
null-free-join(if v then e else e′)

null-free-join(e)     v′ ≠ null
---------------------------------------
null-free-join(join v = v′.f from e)

Lemma 4.6 (Reduction Preserves JF). Let JF(e1) and P ⊢ 〈e1;S1〉 ↪→ 〈e2;S2〉. Then JF(e2).

Proof. By induction over the reduction relation. Only RAcc introduces a join expression. Still, as the body of let x = v.f in e has to be join-free, JF(e) holds, too, and therefore JF(join v′ = v.f from e) for the corresponding variables. All other cases follow directly from the induction hypothesis.

The second property is defined in Figure 4.3. It states that any introduced join references a concrete location and not null. Similar to JF, we provide a lemma that states that null-free-join is a valid invariant for any reduction step.

Lemma 4.7 (Reduction Preserves null-free-join). Let null-free-join(e1) and P ⊢ 〈e1;S1〉 ↪→ 〈e2;S2〉. Then null-free-join(e2).

Proof. By induction on the reduction relation. A location that has once been allocated in the store cannot be removed (i.e., turned into null). Therefore, we only have to consider rules that introduce a join expression; all existing join expressions trivially satisfy the invariant, as locations may not turn into null. The only rule that introduces a join is RAcc. The premise of RAcc states that the corresponding location l has to exist in the store. Therefore, the invariant holds.

4.3 Coinductive Definitions

Now we provide the set-based definitions of all used coinductive relations of Java(X). As our types are defined coinductively, all relations that involve direct access on types are coinductive. These set-based definitions are built using the schema defined in Section 1.3.2. Therefore, the relations obtained by the definitions in this section are all equivalent to the relations defined by the corresponding inference rules.

We define all relations directly on annotated types t.

Subactive First we define subactivity � as νSA, where

SA(R) = {(〈va, c{fi : 〈aai, ti〉}〉, 〈va, c{fi : 〈aa′i, t′i〉}〉)
         | (∀i) aai � aa′i ∧ (ti, t′i) ∈ R}
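For regular types, that is, types with only finitely many distinct subterms, the greatest fixed point νSA can be computed by starting from all pairs of compatible types and repeatedly discarding pairs that violate the condition of SA. The following Python sketch illustrates this under several assumptions of our own: types are nodes of a finite graph, the node names and the tuple layout are hypothetical, and activity annotations are reduced to the letters "A" (active, value annotation omitted), "S" (semiactive), and "I" (inactive), with subactivity on annotations read off the case distinction in the proof of Lemma 4.8.

```python
# Subactivity on activity annotations (value annotations omitted in this sketch).
SUBACT = {("A", "A"), ("A", "S"), ("A", "I"), ("S", "S"), ("S", "I"), ("I", "I")}

# A finite presentation of (possibly cyclic) annotated types:
#   name -> (value annotation, class, {field: (activity annotation, child name)})
TYPES = {
    "own":   ("v", "Node", {"next": ("A", "own")}),
    "semi":  ("v", "Node", {"next": ("S", "semi")}),
    "none":  ("v", "Node", {"next": ("I", "none")}),
    "mixed": ("v", "Node", {"next": ("A", "none")}),
}


def same_shape(t1, t2):
    """Same value annotation, class, and field names (top level only)."""
    return t1[:2] == t2[:2] and t1[2].keys() == t2[2].keys()


def in_sa(types, rel, a, b):
    """Does the pair (a, b) belong to SA(rel)?"""
    fields_a, fields_b = types[a][2], types[b][2]
    return all(
        (aa, fields_b[f][0]) in SUBACT and (child, fields_b[f][1]) in rel
        for f, (aa, child) in fields_a.items()
    )


def nu_subactive(types):
    """Greatest fixed point of SA, restricted to the given finite set of types."""
    rel = {(a, b) for a in types for b in types if same_shape(types[a], types[b])}
    while True:
        kept = {(a, b) for (a, b) in rel if in_sa(types, rel, a, b)}
        if kept == rel:
            return rel
        rel = kept
```

On the four sample types, the fully active list type "own" is related to every other type, whereas "mixed" is not related to "own", because its tail "none" is not subactive to "own".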

Subdroppability Subdroppability ρ� is defined as νSD, where

SD(R) = {(〈va, c{fi : 〈aai, ti〉}〉, 〈va, c{fi : 〈aa′i, t′i〉}〉)
         | (∀i) aai ρ� aa′i ∧ (ti, t′i) ∈ R}

Splitting Next, splitting · � · | · is defined as νSP, where

SP(R) = {(〈va, c{fi : 〈aai, ti〉}〉, 〈va, c{fi : 〈aa′i, t′i〉}〉, 〈va, c{fi : 〈aa′′i, t′′i〉}〉)
         | (∀i) aai � aa′i | aa′′i ∧ (ti, t′i, t′′i) ∈ R}

Effect Application The effect application · := · ↓ · · is provided by νEA, where

EA(R) = {(〈va, c{fi : 〈aai, ti〉}〉,
          〈va, c{fi : 〈aa′i, t′i〉}〉,
          〈va, c{fi : 〈aa′′i, t′′i〉}〉,
          〈va, c{fi : 〈aa′′′i, t′′′i〉}〉)
         | (∀i) aai := aa′i ↓ aa′′i aa′′′i ∧ (ti, t′i, t′′i, t′′′i) ∈ R}

Droppability The droppable relation ρ is given by νDP, where

DP(R) = {〈va, c{fi : 〈aai, ti〉}〉| (∀i) ρ(aai) ∧ ti ∈ R}

Nonactive Further, nonactive O/♦ t is de�ned as νNA, where

NA(R) = {〈va, c{fi : 〈aai, ti〉}〉| (∀i) aai ∈ {O,♦} ∧ ti ∈ R}


Equality Additionally, we provide equality on types as νEQ, where

EQ(R) = {(〈va, c{fi : 〈aai, ti〉}〉, 〈va, c{fi : 〈aa ′i, t′i〉}〉)| (∀i) aai = aa ′i ∧ (ti, t′i) ∈ R}

4.4 Basic Properties

Now that we have defined all prerequisites, we start by proving some basic properties: subactivity is transitive, splitting is partially commutative and associative, and finally the subdroppable relation defined above preserves droppability. These lemmas already show that the relations enjoy some expected properties.

The proof of transitivity of the subactive relation follows the method we introduced in Section 1.3.2. The main change with respect to the simple example in Section 1.3.2 is that we have to take all fields into account. To do so, we first prove the transitivity of the subactive relation on activity annotations by case distinction.

Lemma 4.8 (Subactive Transitivity for Activity Annotations). Let aa � aa′ and aa′ � aa′′. Then aa � aa′′.

Proof. By case distinction.

Case distinction over aa.

• Case aa = M(va): By inversion of aa � aa′ we obtain with SubactAct aa′ ∈ {M(va′),O,♦}. Either case allows us to conclude with another application of Subactive and aa′ � aa′′ that aa � aa′′.

• Case aa = O: By SubactInact, aa � aa′, and aa′ � aa′′ we deduce aa′ = O and then aa′′ = O, therefore aa � aa′′ holds.

• Case aa = ♦: Similarly, with SubactSemiact and aa � aa′ we obtain aa′ ∈ {O,♦} and conclude with SubactInact, SubactSemiact, and aa′ � aa′′ that aa � aa′′ holds.

End case distinction over aa.
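Because there are only finitely many shapes of activity annotations, the case distinction above can also be replayed mechanically. The following Python sketch enumerates all triples of annotations and searches for a counterexample to transitivity. The encoding is an assumption of this example: the subactivity rules are reconstructed from the case distinction of this proof, and the two value annotations "x" and "y" are placeholders.

```python
from itertools import product

# Active annotations carry a value annotation; "x" and "y" are placeholders.
ANNOTATIONS = [("active", "x"), ("active", "y"), ("semiactive",), ("inactive",)]


def subactive(aa, aa2):
    """Subactivity on activity annotations, as used in the proof of Lemma 4.8."""
    kind, kind2 = aa[0], aa2[0]
    if kind == "active":          # SubactAct: active is subactive to everything
        return True
    if kind == "semiactive":      # SubactSemiact
        return kind2 in ("semiactive", "inactive")
    return kind2 == "inactive"    # SubactInact


def transitivity_counterexamples():
    """All triples that would violate the claim of Lemma 4.8."""
    return [
        (a, b, c)
        for a, b, c in product(ANNOTATIONS, repeat=3)
        if subactive(a, b) and subactive(b, c) and not subactive(a, c)
    ]
```

An empty result of transitivity_counterexamples() corresponds to the statement of the lemma for this finite encoding; it does not replace the proof for arbitrary value annotations.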

Lemma 4.9 (Transitivity of Subactive). Let t � t′ and t′ � t′′. Then t � t′′.

Proof. By coinduction. We define the transitivity for subactivity with

TR = {(t, t′′) | (t, t′) ∈ νSA ∧ (t′, t′′) ∈ νSA}

and prove TR ⊆ SA(TR). Let arbitrary (t, t′′) ∈ TR, then, by definition of TR, there exists t′ such that

(t, t′) ∈ νSA (4.1)

(t′, t′′) ∈ νSA (4.2)


This yields, by the definition of SA and Remark 4.2,

(∀i)  aai � aa′i            (4.3)
      (ti, t′i) ∈ νSA        (4.4)
      aa′i � aa′′i           (4.5)
      (t′i, t′′i) ∈ νSA      (4.6)

Next, by (4.3), (4.5), and Lemma 4.8 we get

(∀i)  aai � aa′′i            (4.7)

In addition, with (4.4), (4.6), and the definition of TR we get

(∀i)  (ti, t′′i) ∈ TR        (4.8)

Finally, with (4.7), (4.8), and SA, we conclude that (t, t′′) ∈ SA(TR) and hence TR ⊆ SA(TR). By the coinduction principle, this yields TR ⊆ νSA. Therefore, subactivity is a transitive relation.

Corollary 4.10 (Subactive Transitivity for Environments). Let A1 � A2 and A2 � A3. Then A1 � A3.

Proof. A direct consequence of SubactEnv and Lemma 4.9.

Lemma 4.11 (Splitting Commutativity). Let t � t′ | t′′. Then t � t′′ | t′.

Proof. We first define the commutativity of splitting with

CM = {(t, t′, t′′) | (t, t′′, t′) ∈ νSP}

and prove CM ⊆ SP(CM). Let arbitrary (t, t′, t′′) ∈ CM, then

(t, t′′, t′) ∈ νSP

and therefore by definition of SP

(∀i)  aai � aa′′i | aa′i         (4.9)
      (ti, t′′i, t′i) ∈ νSP       (4.10)

By the definition of CM and (4.10) we deduce

(∀i)  (ti, t′i, t′′i) ∈ CM        (4.11)

The rules SplitInact and SplitSemiact are obviously commutative. In addition, we obtain from (4.9) by case distinction over all possible inference rules for aai � aa′′i | aa′i that

aai � aa′i | aa′′i                (4.12)


This statement is trivial for rules SplitInact and SplitSemiact, and the rules SplitActFst and SplitActSnd are each other's commutative counterparts.

Finally, with (4.11) and (4.12) and the definition of νSP we obtain

(t, t′, t′′) ∈ SP(CM)

Therefore CM ⊆ SP(CM), which concludes the proof.

Lemma 4.12 (Splitting Associativity for Activity Annotations). Let aa � aa12 | aa3 and aa12 � aa1 | aa2. Then there exists exactly one aa23 such that aa � aa1 | aa23 and aa23 � aa2 | aa3 hold.

Proof. By case distinction over the inversion of aa � aa12 | aa3. For readability, we use the following labels:

aa � aa12 | aa3 (4.13)

aa12 � aa1 | aa2 (4.14)

aa � aa1 | aa23 (4.15)

aa23 � aa2 | aa3 (4.16)

We omit the value annotation inside active activity annotations, as splitting never changes this annotation.

Case distinction over splitting (4.13).

• Case SplitActFst: That is, aa = aa12 = M and aa3 = O. Further, by aa12 = M we know that (4.14) uses rule SplitActFst or SplitActSnd. The first case yields aa1 = M and aa2 = O, so the only aa23 that satisfies (4.15) and (4.16) is aa23 = O. The second case yields aa1 = O and aa2 = M, so the only aa23 that satisfies (4.15) and (4.16) is aa23 = M.

• Case SplitActSnd: Then aa = aa3 = M and aa12 = O. In addition, via (4.14) we get aa1 = aa2 = O. That is, only aa23 = M satisfies both (4.15) and (4.16).

• Case SplitInact (SplitSemiact): Then, by (4.13) and (4.14) all corresponding activity annotations are O (♦), thus aa23 = O (♦) is the only possibility to satisfy (4.15) and (4.16).

End case distinction over splitting (4.13).
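The splitting rules on activity annotations form a small finite relation, so commutativity (the annotation-level part of Lemma 4.11) and the uniqueness claim of Lemma 4.12 can be checked by enumeration. The following Python sketch does this; the letters "A", "S", and "I" for active, semiactive, and inactive and the omission of value annotations are assumptions of this example, and the four triples are read off the case distinction above.

```python
from itertools import product

ANN = ("A", "S", "I")  # active, semiactive, inactive (value annotations omitted)

# (aa, aa_left, aa_right) means: aa splits into aa_left | aa_right.
SPLIT = {
    ("A", "A", "I"),  # SplitActFst
    ("A", "I", "A"),  # SplitActSnd
    ("S", "S", "S"),  # SplitSemiact
    ("I", "I", "I"),  # SplitInact
}


def is_commutative():
    """Swapping the two parts of a split yields a split again."""
    return all((aa, right, left) in SPLIT for (aa, left, right) in SPLIT)


def associativity_witnesses():
    """For every instance of the premises of Lemma 4.12, list all suitable aa23."""
    result = {}
    for aa, aa12, aa3, aa1, aa2 in product(ANN, repeat=5):
        if (aa, aa12, aa3) in SPLIT and (aa12, aa1, aa2) in SPLIT:
            result[(aa, aa12, aa3, aa1, aa2)] = [
                aa23
                for aa23 in ANN
                if (aa, aa1, aa23) in SPLIT and (aa23, aa2, aa3) in SPLIT
            ]
    return result
```

Every list returned by associativity_witnesses() should contain exactly one annotation, which mirrors the "exactly one aa23" of the lemma.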

Lemma 4.13 (Splitting Associativity). Let t � t12 | t3 and t12 � t1 | t2. Then there exists t23 such that t � t1 | t23 and t23 � t2 | t3 hold.


Proof. We prove this lemma by coinduction. Still, this proof differs from the previous coinductive proofs in this thesis. The reason is that we have to prove the existence of a suitable t23. As soon as we have proved the existence of this type, it turns out that the desired splitting property is easy to show. In fact, we prove that there exists exactly one t23 that satisfies the splitting as postulated in the lemma.

Let T be the set of all annotated types t. We prove the existence of the unique t23 with a corecursive function f : T^5 → T which takes a 5-tuple of annotated types and returns an annotated type. In addition, we restrict the domain of f such that whenever we call f(t, t12, t1, t2, t3) the following two preconditions hold:

(t, t12, t3) ∈ νSP (4.17)

(t12, t1, t2) ∈ νSP (4.18)

Now, we define f as

f(t, t12, t1, t2, t3) = 〈va, c{f : 〈aa23i , f(ti, t12i , t1i , t2i , t3i )〉}〉

where for each i, aa23i satisfies both

(∀i)  aai � aa1i | aa23i        (4.19)
      aa23i � aa2i | aa3i       (4.20)

We show that f is a total, well-defined function, that is, for every call in the domain we get a single result. Let f(t, t12, t1, t2, t3) be an arbitrary call, then by the domain conditions (4.17) and (4.18) and the definition of SP we get

(∀i)  aai � aa12i | aa3i        (4.21)
      aa12i � aa1i | aa2i       (4.22)

Now, with (4.21), (4.22), and Lemma 4.12, we get for every i a unique aa23i that satisfies (4.19) and (4.20).

Next, we show that the recursive call f(ti, t12i , t1i , t2i , t3i ) is correct, i.e., the domain condition holds. We get the needed splittings

(ti, t12i , t3i ) ∈ νSP
(t12i , t1i , t2i ) ∈ νSP

for all i directly by unrolling the definition of SP on (4.17) and (4.18). Therefore, f is a well-defined corecursive function with respect to its domain restrictions as stated above.


To complete the proof, we have to show that the t23 obtained in this way satisfies the required splitting conditions and is therefore suitable for the lemma. We define

M = {(t, t1, t23), (t23, t2, t3) | (t, t12, t3) ∈ νSP ∧ (t12, t1, t2) ∈ νSP ∧ t23 = f(t, t12, t1, t2, t3)}

Here, we omit a detailed proof of M ⊆ SP(M), as upon a closer look it is trivial: By f(t, t12, t1, t2, t3), we already have (4.19) and (4.20), which (separately) form the main part in the definition of SP, and the second part, (ti, t1i , t23i ) ∈ M and (t23i , t2i , t3i ) ∈ M, follows by unrolling the definitions and types once. In addition, as f is well-defined and the call inside the definition of M obviously satisfies the domain restrictions, M is not empty. Therefore, whenever arbitrary (t, t1, t23) ∈ M and (t23, t2, t3) ∈ M, we have (t, t1, t23) ∈ SP(M) and (t23, t2, t3) ∈ SP(M), too, which yields M ⊆ SP(M).

Lemma 4.14 (Subdroppable preserves Droppability). Let ρ(t′) and t ρ� t′. Then ρ(t).

Proof. By coinduction. We define

M = {t | t′ ∈ νDP ∧ (t, t′) ∈ νSD}

and prove M ⊆ DP(M). Let arbitrary t ∈ M , then, by definition of M , DP, and SD, there exists a t′ such that

(∀i)  ρ(aa′i)                (4.23)
      t′i ∈ νDP              (4.24)
      aai ρ� aa′i            (4.25)
      (ti, t′i) ∈ νSD        (4.26)

The definition of M yields with (4.24) and (4.26) that

(∀i)  ti ∈ M                 (4.27)

Next, we make a case distinction over aa′i for arbitrary i to show that ρ(aai) holds for the corresponding i.

Case distinction over aa′i.

• Case aa′i = M(va′i): Then, by (4.23), we have ρ(va′i). In addition, rule SDAct has to be used to satisfy (4.25), that is aai = aa′i and therefore ρ(aai).

• Case aa′i = O: Then, only SDActDrop or SDInact satisfy (4.25). In either case we obtain that ρ(aai) holds.

• Case aa′i = ♦: Via the single rule to obtain (4.25), SDSemiact, we directly get ρ(aai).

End case distinction over aa′i. That is, we have for every i the desired droppability for aai:

(∀i)  ρ(aai)                 (4.28)

Now, we may combine (4.27) and (4.28) with the definition of DP and get t ∈ DP(M) and therefore, as the choice for t was arbitrary, we get M ⊆ DP(M).

4.5 Environmental Lemmas

Next, we provide two lemmas that show that the environment does not change arbitrarily when typing an expression. That is, the domain does not change, and untouched variables do not change either when we type an expression.

Lemma 4.15 (Stable Domain). Let P ;A e e : t � A′. Then dom(A) = dom(A′).

Proof. By induction over the typing relation. All cases are trivial with an application of the induction hypothesis and the fact that effect application in TCall and TLetVar does not change the domain of the environment either.

The following lemma is in principle extendable to arbitrary expressions. Still, as we do not need it for our proofs, we omit this extension.

Lemma 4.16 (Stable Environment). Let P ;A e y : t � A′. Then ∀x ∈ dom(A) : x ≠ y ⇒ A(x) = A′(x).

Proof. This lemma states its claim only for variable typing. Therefore, P ;A e y : t � A′ is typed with rule TVar, which does not change any variable in the environment but y. Hence, the lemma is valid.

4.6 Auxiliary Lemmas

Lemma 4.17 (Type Access Correlates to Splitting). Let ty = t′y |fj txj . Then ∃ tx : ty � t′y | tx with tx = t |fj txj and O/♦ t t, and vice versa.

Proof. By inversion of TypeAccess we get, according to our naming conventions, tyj � t′yj | txj . To obtain the correct tx we extend txj such that we get a compatible type for ty. Therefore, we only need to find suitable annotations for tx. We keep the value annotations from ty for compatibility. We choose the unset activity annotations according to the ones in ty: whenever the corresponding part in ty is active or inactive, we set this activity annotation to inactive for tx; otherwise it becomes semiactive. Thereby txj stays unchanged and may therefore hold active references. Still, as all possible active activity annotations of tx are part of txj , we directly obtain that t, as specified in the lemma, satisfies O/♦ t t. tx = t |fj txj holds trivially, too.

Lemma 4.18 (Weakening). Let P ;A1 e e : t � A′1 and dom(A1) ⊆ dom(A2) and A A2↾dom(A1) = A1. Then P ;A2 e e : t � A′2 with A A′2↾dom(A1) = A′1 and ∀x ∈ dom(A2), x ∉ dom(A1) : A′2(x) = A2(x).

Proof. Trivial, as none of the new variables is used by e. The proof is by induction on the typing derivation and uses the two basic environment properties, Lemma 4.15 and Lemma 4.16.

Lemma 4.19 (Activity Transfer). Let t1 � t′1 | tx, t′2 � t2 | tx and t3 � t1 | t2. Then t3 � t′1 | t′2.

Proof. By coinduction. We define

M = {(t3, t′1, t′2) | (t1, t′1, tx) ∈ νSP ∧ (t′2, t2, tx) ∈ νSP ∧ (t3, t1, t2) ∈ νSP}

We have to show that M ⊆ SP(M) holds. Let arbitrary (t3, t′1, t′2) ∈ M. Then, by definition of M and splitting

(∀i)  (t1i , t′1i, txi) ∈ νSP       (4.29)
      aa1i � aa′1i | aaxi           (4.30)
      (t′2i, t2i , txi) ∈ νSP       (4.31)
      aa′2i � aa2i | aaxi           (4.32)
      (t3i , t1i , t2i) ∈ νSP       (4.33)
      aa3i � aa1i | aa2i            (4.34)

First, by definition of M and (4.29), (4.31), and (4.33), we get

(∀i)  (t3i , t′1i, t′2i) ∈ M        (4.35)

Next, we make a case distinction over (4.34) for arbitrary i. As the value annotations stay unchanged in this lemma, we omit them throughout the proof.

Case distinction over aa3i � aa1i | aa2i .

• Case SplitActFst: That is, we have aa1i = M and therefore by (4.30), with SplitActFst or SplitActSnd, either aa′1i = M or aaxi = M. With aa′1i = M we have aaxi = O and, via (4.32) and aa2i = O, that aa′2i = O by SplitInact. Therefore, with SplitActFst, aa3i � aa′1i | aa′2i holds. For the other case, that is aaxi = M and aa′1i = O, we get analogously via (4.32) with aa2i = O and SplitActSnd that aa′2i = M. Then with aa′1i = O and SplitActSnd, aa3i � aa′1i | aa′2i holds, too.


• Case SplitActSnd: We have aa1i = O and aa2i = M; the second yields with (4.32) aaxi = O and aa′2i = M. With (4.30), aa1i = O, and SplitInact we get aa′1i = O. Therefore, aa3i � aa′1i | aa′2i holds.

• Case SplitInact: With aa1i = aa2i = aa3i = O we first obtain by (4.30) aaxi = O, then (4.32) yields aa′2i = O, both with SplitInact. Then, aa3i � aa′1i | aa′2i holds with SplitInact, too.

• Case SplitSemiact: Analogous to SplitInact.

End case distinction over aa3i � aa1i | aa2i . That is, we have

(∀i)  aa3i � aa′1i | aa′2i

Together with the definition of SP and (4.35) we obtain (t3, t′1, t′2) ∈ SP(M) and therefore M ⊆ SP(M), which concludes the proof.
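On the level of activity annotations, the statement of Lemma 4.19 can again be checked by enumeration. The following Python sketch encodes the premises exactly as in the lemma statement (t1 splits into t′1 and tx, t′2 splits into t2 and tx, t3 splits into t1 and t2). The letters "A", "S", and "I" for active, semiactive, and inactive, the omission of value annotations, and the variable names are assumptions of this example.

```python
from itertools import product

ANN = ("A", "S", "I")  # active, semiactive, inactive (value annotations omitted)

# (aa, aa_left, aa_right) means: aa splits into aa_left | aa_right.
SPLIT = {("A", "A", "I"), ("A", "I", "A"), ("S", "S", "S"), ("I", "I", "I")}


def transfer_instances():
    """All annotation tuples that satisfy the three premises of Lemma 4.19."""
    return [
        (aa1, aa1p, aa2, aa2p, aa3, aax)
        for aa1, aa1p, aa2, aa2p, aa3, aax in product(ANN, repeat=6)
        if (aa1, aa1p, aax) in SPLIT      # t1  splits into t1' | tx
        and (aa2p, aa2, aax) in SPLIT     # t2' splits into t2  | tx
        and (aa3, aa1, aa2) in SPLIT      # t3  splits into t1  | t2
    ]


def transfer_counterexamples():
    """Instances of the premises for which t3 does not split into t1' | t2'."""
    return [
        inst
        for inst in transfer_instances()
        if (inst[4], inst[1], inst[3]) not in SPLIT
    ]
```

An empty list of counterexamples corresponds to the annotation-level case distinction of the proof; the coinductive part of the argument is not covered by this check.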

4.7 Properties of Droppability

This section provides several lemmas in the context of droppability and the subdroppable relation.

Lemma 4.20 (Split off Droppable Extension). Let t ρ� t′. Then there exists some tρ such that t � t′ | tρ and ρ(tρ).

Proof. This proof is similar to the one for splitting associativity and has the same main issue compared to the other coinductive proofs: the main obstacle is to prove the existence of a suitable tρ; the splitting property is then easy to show.

Let T be the set of all annotated types t. We define a well-defined corecursive function f : T^2 → T to obtain a suitable tρ. We restrict the domain of f to tuples which satisfy

(t, t′) ∈ νSD                       (4.36)

For any tuple in this domain, f is defined as

f(t, t′) = 〈va, c{f : 〈aaρi , f(ti, t′i)〉}〉

where for every i, aaρi satisfies

(∀i)  ρ(aaρi)                       (4.37)
      aai � aa′i | aaρi             (4.38)

Thereby, va is drawn from t, according to our naming conventions. Now we prove that f is well-defined for every call that satisfies the domain restriction. Let f(t, t′) be an arbitrary correct call, then we have by (4.36) and the definition of SD

(∀i)  aai ρ� aa′i                   (4.39)

We make a case distinction for arbitrary i over aai to show that there exists exactly one aaρi that satisfies (4.37) and (4.38) for this call of f.

Case distinction over aai.

• Case aai = M(vai): Then by (4.39), either aa′i = aai = M(vai) or aa′i = O with ρ(vai) and therefore ρ(aai). The first case yields that only aaρi = O satisfies (4.37) and (4.38). For the second case, only aaρi = M(vai) satisfies aai � aa′i | aaρi ; droppability holds by ρ(vai).

• Case aai = O (aai = ♦): Only aaρi = aai = aa′i = O (♦) satisfies (4.39), (4.37), and (4.38).

End case distinction over aai. That is, we have for every i exactly one aaρi that satisfies (4.37) and (4.38).

Next, we show that we call f within the corecursive definition only with tuples that fulfill the domain condition. By unrolling the definition of SD for (4.36) we obtain (∀i) (ti, t′i) ∈ νSD, which is all we need.

Therefore, f is a well-defined corecursive function. It remains to show that the type obtained in this way satisfies splitting and droppability as stated in the lemma. Again, similar to the proof for splitting associativity, we omit a detailed proof as this is a direct consequence of the definition of f.

Lemma 4.21 (Join Droppable Extension). Let t � t′ | tρ and ρ(tρ). Then t ρ� t′.

Proof. By coinduction. We define

M = {(t, t′) | (t, t′, tρ) ∈ νSP ∧ tρ ∈ νDP}

and prove M ⊆ SD(M). Let arbitrary (t, t′) ∈ M , then we have by definition of M , SP, and DP

(∀i)  aai � aa′i | aaρi             (4.40)
      (ti, t′i, tρi) ∈ νSP          (4.41)
      ρ(aaρi)                       (4.42)
      tρi ∈ νDP                     (4.43)

This time, we analyze the rule used to obtain (4.40) for arbitrary i.

Case distinction over splitting (4.40).

• Case SplitActFst, SplitInact, or SplitSemiact: Then aai = aa′i, therefore aai ρ� aa′i holds.

• Case SplitActSnd: Then aai = aaρi and aa′i = O; with (4.42) we get ρ(aai) and therefore aai ρ� aa′i holds by SDActDrop.

End case distinction over splitting (4.40). That is, we have

(∀i)  aai ρ� aa′i

In addition, (4.41) and (4.43) and the definition of M yield that

(∀i)  (ti, t′i) ∈ M

If we combine this with the definition of SD we conclude that (t, t′) ∈ SD(M). Therefore, M ⊆ SD(M) holds.
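Lemma 4.20 and Lemma 4.21 together characterize subdroppability as "splitting off a droppable rest". On the level of activity annotations this equivalence, the uniqueness of the rest, and the annotation-level part of Lemma 4.14 can be checked by enumeration. In the following Python sketch, the tuple encoding, the placeholder value annotations "d" (droppable) and "n" (not droppable), and the reading of ρ on annotations (inactive and semiactive are droppable, an active annotation is droppable if its value annotation is) are assumptions taken from the case distinctions in the surrounding proofs.

```python
from itertools import product

INACTIVE = ("inactive",)
SEMIACTIVE = ("semiactive",)
DROPPABLE_VALUES = {"d"}  # placeholder values: "d" is droppable, "n" is not
ANNOTATIONS = [("active", "d"), ("active", "n"), SEMIACTIVE, INACTIVE]


def droppable(aa):
    """rho on activity annotations."""
    return aa[0] != "active" or aa[1] in DROPPABLE_VALUES


def subdroppable(aa, aa2):
    """The rules SDAct, SDSemiact, SDInact, and SDActDrop of Figure 4.1."""
    if aa == aa2:
        return True
    return aa[0] == "active" and aa[1] in DROPPABLE_VALUES and aa2 == INACTIVE


def splits(aa, left, right):
    """Splitting on activity annotations; the value annotation is preserved."""
    if aa[0] == "active":
        return (left, right) in ((aa, INACTIVE), (INACTIVE, aa))
    return left == aa and right == aa


def droppable_rests(aa, aa2):
    """All droppable annotations r such that aa splits into aa2 | r."""
    return [r for r in ANNOTATIONS if splits(aa, aa2, r) and droppable(r)]


def lemmas_hold():
    pairs = list(product(ANNOTATIONS, repeat=2))
    # Lemmas 4.20 and 4.21: subdroppable iff exactly one droppable rest exists.
    equivalence = all(
        len(droppable_rests(a, b)) == (1 if subdroppable(a, b) else 0)
        for a, b in pairs
    )
    # Lemma 4.14: subdroppability preserves droppability.
    preservation = all(
        droppable(a) for a, b in pairs if droppable(b) and subdroppable(a, b)
    )
    return equivalence and preservation
```

For example, an active annotation with a droppable value is subdroppable to inactive, and the rest that is split off is the active annotation itself.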

Lemma 4.22 (Extend with Droppable Capabilities). Let t1 � t2 | t3 and t0 ρ� t1. Then there exists t4 such that t0 � t4 | t3 and t4 ρ� t2.

Proof. By coinduction. Similar to the proof for splitting associativity, we first prove the existence, given t0, t1, t2, and t3 as stated in the lemma, of a unique t4 that satisfies t0 � t4 | t3. To this end we provide a well-defined corecursive function f : T^4 → T, where T is the set of all annotated types t. The domain of f(t0, t1, t2, t3) for arbitrary t0 . . . t3 is, according to the preconditions of the lemma, restricted to types which satisfy

(t1, t2, t3) ∈ νSP (4.44)

(t0, t1) ∈ νSD (4.45)

Now we define

f(t0, t1, t2, t3) = 〈va0, c0{f0i : 〈aa4i , f(t0i , t1i , t2i , t3i)〉}〉

where aa4i satisfies

(∀i)  aa0i � aa4i | aa3i            (4.46)

We prove that f is well-defined for every correct call. Let arbitrary t0, t1, t2, and t3 satisfy (4.44) and (4.45). First, with the definition of νSP and νSD we get

(∀i)  (t1i , t2i , t3i) ∈ νSP       (4.47)
      aa1i � aa2i | aa3i            (4.48)
      (t0i , t1i) ∈ νSD             (4.49)
      aa0i ρ� aa1i                  (4.50)


Therefore, by (4.47) and (4.49) the recursive call f(t0i , t1i , t2i , t3i) satisfies the domain condition. For the uniqueness of aa4i for arbitrary i, we make a case distinction over aa0i . To avoid the same case distinction a second time, we additionally show that aa4i ρ� aa2i holds for this activity annotation.

Case distinction over aa0i .

• Case aa0i = M(va0i): Then we have two possibilities for aa1i .

Case distinction over aa0i ρ� aa1i .

◦ Case SDAct: Then aa1i = M(va0i) and therefore the unrolled part for (4.44) has to use rule SplitActFst or SplitActSnd. Both cases yield aa4i = aa2i as the only possibility to satisfy (4.46). As aa4i = aa2i we directly obtain aa4i ρ� aa2i .

◦ Case SDActDrop: Then aa1i = O and va0i ∈ ρ. By (4.48) we obtain with SplitInact aa2i = aa3i = O. Now, to satisfy (4.46), we need aa4i = aa0i . Then we get with va0i ∈ ρ by SDActDrop aa4i ρ� aa2i .

End case distinction over aa0i ρ� aa1i .

• Case aa0i = O or aa0i = ♦: Then by (4.50) aa1i = aa0i ; this yields with (4.48) aa3i = aa0i and then with SplitInact or SplitSemiact aa4i = aa0i . In addition, we similarly obtain aa4i = aa2i and therefore aa4i ρ� aa2i .

End case distinction over aa0i . Therefore, we have for every i an unambiguous aa4i with the additional property aa4i ρ� aa2i .

Now we define

M1 = {(t0, t4, t3) | t4 = f(t0, t1, t2, t3)}
M2 = {(t4, t2) | t4 = f(t0, t1, t2, t3)}

and prove M1 ⊆ SP(M1) and M2 ⊆ SD(M2). We omit the details, as this proof is analogous to the other proofs and all steps are already provided above.

4.8 Typing Consumes Activities

One of the most important properties of the type system is that typing consumes activities. That is, whenever P ;A e e : t � A′ then A′ does not have more capabilities than A. Before proving this statement, we extract several parts and prove them separately. In that way we show that splitting itself consumes capabilities and, more intricately, that effect application, even when applied multiple times, does not generate activities either.


Lemma 4.23 (Splitting Activities Consumes Activities). Let aa � aa′ | aa′′. Then aa � aa′ and aa � aa′′.

Proof. We solely prove aa � aa′; the second part aa � aa′′ follows analogously.

Case distinction over aa � aa′ | aa′′.

• Case SplitActFst: Then, by SubactAct aa � aa ′.

• Case SplitActSnd: Then, by SubactAct aa � aa ′.

• Case SplitSemiact: Then, by SubactSemiact aa � aa ′.

• Case SplitInact: Then, by SubactInact aa � aa ′.

End case distinction over aa � aa ′ | aa ′′.

Lemma 4.24 (Splitting Consumes Activities). Let t � t1 | t2. Then t � t1 and t � t2.

Proof. By coinduction. We first prove t � t1 and define M as

M = {(t, t′) | (t, t′, t′′) ∈ νSP}

We prove M ⊆ SA(M). If (t, t′) ∈ M , we know by (t, t′, t′′) ∈ νSP

(∀i)  aai � aa′i | aa′′i            (4.52)
      (ti, t′i, t′′i ) ∈ νSP

Via (ti, t′i, t′′i ) ∈ νSP and the definition of M we get

(ti, t′i) ∈ M                        (4.53)

Now, with multiple applications of Lemma 4.23 and (4.52) we have (∀i) aai � aa′i and conclude with (4.53) that (t, t′) ∈ SA(M) and therefore M ⊆ SA(M). The lemma finally follows with commutativity of the splitting relation (Lemma 4.11).

Lemma 4.25 (Single Effect Application in TCall Consumes Activities). Let t := t′ ↓ t1 t′1 and t′ � t1. Then t′ � t.

Proof. By coinduction. We define

M = {(t′, t) | (t, t′, t1, t′1) ∈ νEA ∧ (t′, t1) ∈ νSA}


and prove M ⊆ SA(M). Let (t′, t) ∈ M be arbitrary; then there exist (t, t′, t1, t′1) ∈ νEA and (t′, t1) ∈ νSA, and we have

(∀i)  aai := aa′i ↓ aa1i aa′1i   (4.54)
      (ti, t′i, t1i, t′1i) ∈ νEA   (4.55)
      aa′i � aa1i   (4.56)
      (t′i, t1i) ∈ νSA   (4.57)

Inversion of the effect application (4.54) yields two possibilities for arbitrary i:

Case distinction.

• Case EAActive: Then aa1i = M(va1i), which yields with (4.56) and the definition of subactivity, here SubactAct, aa′i = M(va′i), and therefore, again by SubactAct, aa′i � aai.

• Case EANonactive: Then we have aa′i = aai, which directly yields aa′i � aai.

End case distinction.

Therefore, we have

(∀i)  aa′i � aai   (4.58)

In addition, by (4.55) and (4.57) and the definition of M we get

(∀i)  (t′i, ti) ∈ M   (4.59)

Finally, with (4.58) and (4.59) we deduce (t′, t) ∈ SA(M) and therefore M ⊆SA(M).
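The two effect-application rules used in this proof can be written as a small function on activity annotations, which makes the annotation-level step of the lemma checkable by enumeration. This is a sketch under the assumption that EAActive returns the post annotation of the effect and that EANonactive leaves the annotation unchanged, as the case distinction above states; value annotations are omitted, and all names are chosen for this illustration.

```python
from itertools import product

# Activity annotations as printed in the text; value annotations are omitted.
ACTIVE, SEMI, INACTIVE = "M", "♦", "O"
ANNOTATIONS = (ACTIVE, SEMI, INACTIVE)


def subactive(a, b):
    """Return True if annotation a is subactive to annotation b."""
    if a == ACTIVE:          # SubactAct
        return True
    if a == SEMI:            # SubactSemiact
        return b in (SEMI, INACTIVE)
    return b == INACTIVE     # SubactInact


def apply_effect(current, pre, post):
    """Apply the effect (pre, post) to the annotation `current`."""
    if pre == ACTIVE:        # EAActive: the result is the post annotation
        return post
    return current           # EANonactive: the annotation stays unchanged


def check_lemma_4_25():
    """If current is subactive to pre, it is subactive to the effect result."""
    for current, pre, post in product(ANNOTATIONS, repeat=3):
        if not subactive(current, pre):
            continue         # precondition of the lemma is not met
        if not subactive(current, apply_effect(current, pre, post)):
            return False
    return True
```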

Theorem 4.26 (Multiple Effect Application Consumes Activities). Let, for even n ∈ N,

t0 � t1 | t2 (S0)

t2 � t3 | t4 (S2)

...

tk � tk+1 | tk+2 (Sk)

tk+2 � tk+3 | tk+4 (Sk+2)

...

tn � tn+1 | tn+2 (Sn)


and

t2 := t0 ↓ t1 t1 (EA0)

...

tk+2 := tk ↓ tk+1 tk+1 (EAk)

tk+4 := tk+2 ↓ tk+3 tk+3 (EAk+2)

...

tn+2 := tn ↓ tn+1 tn+1 (EAn)

and t0 = t0. Then t0 � tn+2.

Proof. We prove this theorem by induction on n and prove every step inside this induction by coinduction. As a convention for this proof, k is always restricted to even values. In addition, to avoid cluttering the proof, we omit any value annotations inside activity annotations, as splitting leaves these annotations unchanged and they do not matter for subactivity. Finally, we strictly use the variable naming of the theorem throughout the proof.

We first reduce the theorem and solely prove (∀k ≤ n) tk � tk+2; the desired t0 � tn+2 then holds by multiple applications of transitivity, Lemma 4.9. Further, we only prove via the induction that (∀k ≤ n) tk � tk+1 and tk � tk+2 hold. This suffices, as with (EAk) and Lemma 4.25, which shows that single effect application consumes activities, we get tk � tk+2 for every k ≤ n (always provided that the preconditions of this theorem hold).

As we have to use coinduction inside our induction to show that the above subactivities hold, we define sets Mk similar to the proof schema of the other proofs. Obviously, we need a new set for every k ≤ n. To increase readability, the definition of Mk uses the equations Sk and EAk defined above, including the variable naming, instead of νSP and νEA.

M0 = {(t0, t1), (t0, t2) | S0 ∧ EA0 ∧ t0 = t0}
(∀k ∈ {2..n})  Mk = {(tk, tk+1), (tk, tk+2) | S0 ∧ · · · ∧ Sk ∧ EA0 ∧ · · · ∧ EAk ∧ t0 = t0}

We have to prove Mn ⊆ SA(Mn), which we will do by induction on n. We are able to do so, as Mk only adds constraints compared to Mk−1.

Induction Hypothesis As stated above, for even k, our induction hypothesis is

(∀k ≤ n)  tk � tk+1   (4.60)
          tk � tk+2   (4.61)


Base Case For any (t0, t1) ∈ M0 we have t0 = t0 and S0. We apply Lemma 4.24 with S0 and obtain t0 � t1 and t0 � t2. With t0 = t0 this yields the desired t0 � t1 and t0 � t2, and therefore M0 ⊆ SA(M0) holds.

Induction Step for Mn+2 We have to show Mn+2 ⊆ SA(Mn+2) and may use, by induction, that Mn ⊆ SA(Mn) holds. Let (tn+2, tn+3) ∈ Mn+2 and (tn+2, tn+4) ∈ Mn+2 be arbitrary. Then we have, by the definition of Mn+2, effect application, and splitting, that there exist variables such that

(∀k ∈ {0..n+2}), (∀i)
      aa^k_i � aa^{k+1}_i | aa^{k+2}_i   (S^k_i)
      t^k_i � t^{k+1}_i | t^{k+2}_i   (4.62)
      aa^{k+2}_i := aa^k_i ↓ aa^{k+1}_i aa^{k+1}_i   (EA^k_i)
      t^{k+2}_i := t^k_i ↓ t^{k+1}_i t^{k+1}_i   (4.63)

First, with (4.62), (4.63), and t^0_i = t^0_i, and by the definition of Mk, including k = n+2,

(∀k ∈ {0..n+2}), (∀i)
      (t^k_i, t^{k+1}_i) ∈ Mk   (4.64)
      (t^k_i, t^{k+2}_i) ∈ Mk   (4.65)

The induction hypothesis (tk � tk+1 and tk � tk+2 for k ≤ n) yields with the definition of SA

(∀k ∈ {0..n}), (∀i)
      aa^k_i � aa^{k+1}_i   (4.66)
      aa^k_i � aa^{k+2}_i   (4.67)

Inversion of (EA^{n+2}_i) yields two possibilities for arbitrary i:

Case distinction: inversion of aa^{n+4}_i := aa^{n+2}_i ↓ aa^{n+3}_i aa^{n+3}_i.

• Case EAActive: Then aa^{n+3}_i = M. With (S^{n+2}_i), aa^{n+2}_i � aa^{n+3}_i | aa^{n+4}_i, and inversion of splitting we get

aa^{n+2}_i = M   (4.68)
aa^{n+4}_i = O   (4.69)

This yields with (S^n_i), aa^n_i � aa^{n+1}_i | aa^{n+2}_i, again with inversion of splitting, that aa^n_i = M and aa^{n+1}_i = O. Now we deduce by inversion of (EA^n_i), aa^{n+2}_i := aa^n_i ↓ aa^{n+1}_i aa^{n+1}_i, with EANonactive, that

aa^{n+2}_i = aa^n_i   (4.70)

Now, by inversion of the subactivity (4.67) of the induction hypothesis for k = n and (4.68) we get aa^n_i = M. This yields with (4.70) that aa^{n+2}_i = M, too, and therefore in any case

aa^{n+2}_i � aa^{n+3}_i

The second part of the invariant is trivial, as any activity annotation is subactive to O. Therefore, we directly get with (4.69)

aa^{n+2}_i � aa^{n+4}_i

• Case EANonactive: Then aa^{n+3}_i ∈ {O,♦}, which yields with (S^{n+2}_i), aa^{n+2}_i � aa^{n+3}_i | aa^{n+4}_i, and inversion of splitting aa^{n+2}_i = aa^{n+4}_i, either via SplitInact, SplitSemiact, or SplitActSnd. We make a case distinction over aa^{n+4}_i, which covers both cases for aa^{n+3}_i, too.

Case distinction over aa^{n+4}_i.

– Case M: Then, to satisfy splitting (S^{n+2}_i), we get aa^{n+3}_i = O and therefore, similar to above, trivially for any aa^{n+2}_i

aa^{n+2}_i � aa^{n+3}_i

In addition, splitting (S^{n+2}_i) also yields aa^{n+2}_i = M, and therefore by the induction hypothesis for k = n (4.67), aa^n_i � aa^{n+2}_i, and inversion of subactivity we get aa^n_i = M, too. Further, aa^{n+2}_i = M yields by (S^n_i) that aa^{n+1}_i = O, which again yields with inversion of (EA^n_i), aa^{n+2}_i := aa^n_i ↓ aa^{n+1}_i aa^{n+1}_i, that aa^{n+2}_i = aa^n_i. If we combine this equality with aa^n_i = M we get aa^{n+2}_i = M, and therefore as above

aa^{n+2}_i � aa^{n+4}_i

– Case O: Then we derive by aa^{n+3}_i ∈ {O,♦} with splitting (S^{n+2}_i) that aa^{n+3}_i = O, which again yields

aa^{n+2}_i � aa^{n+3}_i

And as aa^{n+4}_i = O, the second invariant holds, too:

aa^{n+2}_i � aa^{n+4}_i

– Case ♦: As above, via splitting (S^{n+2}_i) we derive aa^{n+3}_i = ♦ and aa^{n+2}_i = ♦, and further with (S^n_i) that aa^{n+1}_i = ♦, too. Then we get by (EA^n_i) with EANonactive aa^{n+2}_i = aa^n_i. As the induction hypothesis provides aa^n_i � aa^{n+1}_i and we have aa^{n+2}_i = aa^n_i and aa^{n+3}_i = aa^{n+1}_i, we also get

aa^{n+2}_i � aa^{n+3}_i

and analogously with aa^{n+1}_i = aa^{n+4}_i

aa^{n+2}_i � aa^{n+4}_i

End case distinction over aa^{n+4}_i.

End case distinction: inversion of aa^{n+4}_i := aa^{n+2}_i ↓ aa^{n+3}_i aa^{n+3}_i.

We sum these case distinctions up and get for every case aa^{n+2}_i � aa^{n+3}_i and aa^{n+2}_i � aa^{n+4}_i. Therefore, combined with (4.66) and (4.67), we get

(∀k ∈ {0..n+2}), (∀i)
      aa^k_i � aa^{k+1}_i
      aa^k_i � aa^{k+2}_i

This yields together with the definition of SA, (4.64), and (4.65) that (tn+2, tn+3) ∈ SA(Mn+2) and (tn+2, tn+4) ∈ SA(Mn+2). Therefore, Mn+2 ⊆ SA(Mn+2) holds.

This concludes the induction. We proved that, given (Sk) and (EAk) hold for k ∈ {0..n} and t0 = t0,

(∀k ≤ n)  tk � tk+1
          tk � tk+2

both hold. As mentioned in the introduction of this proof, this suffices, as from tk � tk+1 and (EAk) with Lemma 4.25 we get tk � tk+2, and then via multiple applications of transitivity, Lemma 4.9, t0 � tn+2. As we additionally have t0 = t0, we finally conclude t0 � tn+2. The second part, tk � tk+2, that we included in the induction, is only needed for proof-tactical reasons inside the coinductions.

Lemma 4.27 (Effect Application in TLetVar Consumes Activities). Let t0 � t1 | t′1, t1 � t2, t′1 � t′2, and t3 := t′2 ↓ t1 t2. Then t0 � t3.

Proof. By coinduction. We define

M = {(t0, t3) | (t0, t1, t′1) ∈ νSP ∧ (t1, t2) ∈ νSA ∧ (t′1, t′2) ∈ νSA ∧ (t3, t′2, t1, t2) ∈ νEA}

and prove M ⊆ SA(M). Let (t0, t3) ∈ M be arbitrary; then there exist (t0, t1, t′1) ∈ νSP and (t3, t′2, t1, t2) ∈ νEA. Therefore,

(∀i)  aa0i � aa1i | aa′1i   (4.71)
      (t0i, t1i, t′1i) ∈ νSP   (4.72)
      aa3i := aa′2i ↓ aa1i aa2i   (4.73)
      (t3i, t′2i, t1i, t2i) ∈ νEA   (4.74)

In addition, (t1, t2) ∈ νSA and (t′1, t′2) ∈ νSA, which yields

(∀i)  aa1i � aa2i   (4.75)
      (t1i, t2i) ∈ νSA   (4.76)
      aa′1i � aa′2i   (4.77)
      (t′1i, t′2i) ∈ νSA   (4.78)

Inversion of the effect application (4.73) yields two possibilities:

Case distinction.

• Case EAActive: Then aa3i := aa′2i ↓ M(va1i) aa2i with aa3i = aa2i. Now, with (4.71) and Lemma 4.23 we obtain aa0i � aa1i, which yields together with (4.75) and transitivity, Lemma 4.8, aa0i � aa3i.

• Case EANonactive: Then we have aa3i = aa′2i. Again, with (4.71) and Lemma 4.23 we obtain aa0i � aa′1i, which yields with (4.77) and transitivity aa0i � aa3i.

End case distinction.

Therefore, no matter which rule is used for the different i, we get

(∀i)  aa0i � aa3i   (4.79)

In addition, with (4.72), (4.74), (4.76), (4.78), and the definition of M we obtain

(∀i)  (t0i, t3i) ∈ M   (4.80)

Finally, by the definition of SA we have with (4.79) and (4.80)

(t0, t3) ∈ SA(M)

Therefore, M ⊆ SA(M).
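The annotation-level argument of this proof can be replayed by enumerating all combinations that satisfy the preconditions. The sketch below is illustrative only: it assumes the splitting, subactivity, and effect-application rules as they are used in this chapter, ignores value annotations, and uses names (`SPLIT`, `apply_effect`, `check_lemma_4_27`) invented for this example.

```python
from itertools import product

# Activity annotations as printed in the text; value annotations are omitted.
ACTIVE, SEMI, INACTIVE = "M", "♦", "O"
ANNOTATIONS = (ACTIVE, SEMI, INACTIVE)

# Splitting aa -> aa' | aa'' as triples (aa, aa', aa'').
SPLIT = {
    (ACTIVE, ACTIVE, INACTIVE),      # SplitActFst
    (ACTIVE, INACTIVE, ACTIVE),      # SplitActSnd
    (SEMI, SEMI, SEMI),              # SplitSemiact
    (INACTIVE, INACTIVE, INACTIVE),  # SplitInact
}


def subactive(a, b):
    """Return True if annotation a is subactive to annotation b."""
    if a == ACTIVE:
        return True
    if a == SEMI:
        return b in (SEMI, INACTIVE)
    return b == INACTIVE


def apply_effect(current, pre, post):
    """EAActive returns post; EANonactive leaves current unchanged."""
    return post if pre == ACTIVE else current


def check_lemma_4_27():
    """a0 splits into a1 | a1p, a1 <= a2, a1p <= a2p, a3 := a2p applied to (a1, a2)."""
    for (a0, a1, a1p), a2, a2p in product(SPLIT, ANNOTATIONS, ANNOTATIONS):
        if not (subactive(a1, a2) and subactive(a1p, a2p)):
            continue                 # preconditions (4.75) and (4.77) not met
        a3 = apply_effect(a2p, a1, a2)
        if not subactive(a0, a3):
            return False
    return True
```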

Lemma 4.28 (Update Consumes Activities). Let u ⊢ f ← t � u′. Then u � u′.

Proof. If (c{fi : si} ⊢ fj ← 〈va, u〉 � c{fj : s′j, fi : si (i ≠ j)}), then, by inversion, either rule TypeUpdateAct or rule TypeUpdateSemiact is used. For the latter case we know that u = u′, and therefore the claim u � u′ holds. If the update uses rule TypeUpdateAct, we know M s sj. That is, all involved activity annotations are active. Via SubactAct we know that an active activity annotation is subactive to any other activity annotation. It follows that for every matching and well-formed s′j we get with TypeUpdateAct and SubactAct that u � u′ holds.

Lemma 4.29 (Splitting Propagates Subactive). Let t1 � t2 | t3, t′1 � t′2 | t′3, t2 � t′2, and t3 � t′3. Then t1 � t′1.


Proof. By coinduction. We define

M = {(t1, t′1) | (t1, t2, t3) ∈ νSP ∧ (t′1, t′2, t′3) ∈ νSP ∧ (t2, t′2) ∈ νSA ∧ (t3, t′3) ∈ νSA}

Similar to the proofs above we show M ⊆ SA(M). Let (t1, t′1) ∈ M be arbitrary; then there exist (t1, t2, t3) ∈ νSP and (t′1, t′2, t′3) ∈ νSP, which yields with the definition of SP

(∀i)  aa1i � aa2i | aa3i   (4.81)
      (t1i, t2i, t3i) ∈ νSP   (4.82)
      aa′1i � aa′2i | aa′3i   (4.83)
      (t′1i, t′2i, t′3i) ∈ νSP   (4.84)

In addition we have (t2, t′2) ∈ νSA and (t3, t′3) ∈ νSA, and therefore by the definition of SA

(∀i)  aa2i � aa′2i   (4.85)
      (t2i, t′2i) ∈ νSA   (4.86)
      aa3i � aa′3i   (4.87)
      (t3i, t′3i) ∈ νSA   (4.88)

We use a case distinction over (4.81) to show aa1i � aa′1i for arbitrary i.

Case distinction over (4.81), aa1i � aa2i | aa3i.

• Case SplitActFst: That is, aa1i = aa2i = M(va1i). Then we know by SubactAct that aa1i � aa′1i holds for any aa′1i.

• Case SplitActSnd: Analogous to SplitActFst.

• Case SplitSemiact: This yields aa2i = aa3i = ♦. Of the four possibilities we have for the use of SubactSemiact to get aa′2i and aa′3i, only two lead to a possible splitting for (4.83): either aa′2i = aa′3i = ♦ with SplitSemiact and aa′1i = ♦, or aa′2i = aa′3i = O with SplitInact and aa′1i = O. Both satisfy aa1i � aa′1i by SubactSemiact.

• Case SplitInact: Here, as aa2i = aa3i = O, by rule SubactInact both aa′2i and aa′3i have to be inactive, too. Therefore, (4.83) has to use rule SplitInact, which yields aa′1i = O, and aa1i � aa′1i holds by SubactInact.

End case distinction over (4.81), aa1i � aa2i | aa3i.

Therefore, we have

(∀i)  aa1i � aa′1i   (4.89)

In addition, by (4.82), (4.84), (4.86), and (4.88) and the definition of M, we get

(∀i)  (t1i, t′1i) ∈ M   (4.90)

Finally, by (4.89) and (4.90) and the definition of SA we know (t1, t′1) ∈ SA(M) and therefore M ⊆ SA(M). This proves that M is SA-consistent and the lemma holds.
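The case distinction over (4.81) considers pairs of splittings whose components are related by subactivity. A brute-force check over all such pairs, sketched below in Python, confirms the annotation-level claim. The encoding of the rules follows their use in this chapter and omits value annotations; the names are assumptions of this sketch.

```python
from itertools import product

# Activity annotations as printed in the text; value annotations are omitted.
ACTIVE, SEMI, INACTIVE = "M", "♦", "O"

# Splitting aa -> aa' | aa'' as triples (aa, aa', aa'').
SPLIT = {
    (ACTIVE, ACTIVE, INACTIVE),      # SplitActFst
    (ACTIVE, INACTIVE, ACTIVE),      # SplitActSnd
    (SEMI, SEMI, SEMI),              # SplitSemiact
    (INACTIVE, INACTIVE, INACTIVE),  # SplitInact
}


def subactive(a, b):
    """Return True if annotation a is subactive to annotation b."""
    if a == ACTIVE:
        return True
    if a == SEMI:
        return b in (SEMI, INACTIVE)
    return b == INACTIVE


def check_lemma_4_29():
    """If both parts of two splittings are related, so are the split annotations."""
    for (a1, a2, a3), (b1, b2, b3) in product(SPLIT, repeat=2):
        if subactive(a2, b2) and subactive(a3, b3) and not subactive(a1, b1):
            return False
    return True
```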

Lemma 4.30 (Type Access Propagates Subactive). Let t1 = t2 |fj t3, t′1 = t′2 |fj t′3, t2 � t′2, and t3 � t′3. Then t1 � t′1.

Proof. Let t1 = 〈va, c{fi : si}〉, t2 = 〈va, c{fj : s′j ; fi : si}〉, sj = 〈aa, t1j〉, and s′j = 〈aa, t2j〉. Then, by inversion of TypeAccess, t t1j � t2j | t3. Similarly, we derive from t′1 = t′2 |fj t′3 that t t′1j � t′2j | t′3. In addition, simple inversion of SubactType yields together with t2 � t′2 that t2j � t′2j holds. This enables us, with t3 � t′3, to use Lemma 4.29, which yields t1j � t′1j. Together with SubactType we derive t1 � t′1.

Lemma 4.31 (Variable Typing Implies Splitting). Let P ;A, x : t1 e x : t2 � A, x : t3. Then t1 � t2 | t3.

Proof. This lemma is a simple application of inversion of TVar. No other typing judgment is able to type a single variable.

Theorem 4.32 (Typing Consumes Activities). Let P ;A e e : t � A′. Then A � A′.

Proof. We prove this property by induction over the expression typing judgments. Variables that are not touched by e stay unchanged (Lemma 4.16). Therefore, we only take into account the variables that are used by e.

Case distinction over the derivation of P ;A e e : t � A′.

• Case TVar: Here, we have e ≡ x and therefore P ;A, x : t′ e x : t � A, x : t′′ with t′ � t | t′′. By Lemma 4.31 and Lemma 4.24 we get t′ � t′′, and therefore the claim holds.

• Case TNull: Trivial, as the environment stays unchanged.

• Case TNew: By induction hypothesis.

• Case TAcc: With P ;A, y : ty e let x = y.f in e : t � A′, y : t′′′y we have

ty = t′y |f tx   (4.91)
t′′′y = t′′y |f t′x   (4.92)
P ;A, y : t′y, x : tx e e : t � A′, y : t′′y, x : t′x   (4.93)

With the induction hypothesis applied to (4.93) we get A � A′, tx � t′x, and t′y � t′′y. Finally, ty � t′′′y follows by application of Lemma 4.30 to these conclusions, (4.91), and (4.92).

• Case TSet: With P ;A e set x.f = v in e : t � A′′ we have

P ;A e v : t � A′, x : 〈va, u〉
u ⊢ f ← t � u′   (4.94)
P ;A′, x : 〈va, u′〉 e e : t′ � A′′

We get the claim, which is here A � A′′, by double application of the induction hypothesis. It remains to show 〈va, u〉 � 〈va, u′〉, which follows by Lemma 4.28 with (4.94) and the definition of subactivity, which ignores the value annotations.

• Case TCall: We have e ≡ v1.m(vi i∈{2,...n}). By the type judgment we get P ;A e vi : ti � A1. The main problem in this case is that several vi may be identical. Let vk be a variable that is used only once; then we know by Lemma 4.16 that A(vk) � tk | A1(vk), which yields with Lemma 4.24 that A(vk) � tk holds. Inversion of the effect application yields exactly one effect application on the annotated types for A(vk); therefore we may use Lemma 4.25 for this variable.

Whenever a variable is used more than once in a method call, as a parameter or as the object on which the method is called, several effect applications are applied to this variable. Still, the effect application takes care of this case and collects all the remaining or changed capabilities of every alias that is split off. Prior to the call, the capabilities are split by P ;A e vi : ti � A1. Let J be an index set for identical variables of this call, that is, (∀j, k ∈ J) vj = vk. Then we choose the smallest index j ∈ J and define t0 = A(vj). To obtain readable variable names we map the index set J to even numbers such that we get even indexes l with l ∈ {0..n} where n = 2 ∗ |J| − 2. Now we get by TCall the following splitting relations (l even, l ∈ {0..n})

t0 � t1 | t2
...
tl � tl+1 | tl+2
...
tn � tn+1 | tn+2

The method call, and therefore later on the effect application, gets the different tl for typing in the type judgment. We get the effect applications from mtype and denote them, according to the above variable naming, with tl+1 tl+1, again for even l only. That is, if we plug these effect applications into TCall and invert it with EAVar, we obtain with t1 = t1

t2 := t0 ↓ t1 t1
...
tl+2 := tl ↓ tl+1 tl+1
...
tn+2 := tn ↓ tn+1 tn+1

Given these properties, Theorem 4.26 proves that TCall consumes activities, too.

• Case TLetExp: By induction hypothesis, transitivity, and Corollary 4.10.

• Case TLetVar: Inversion of TLetVar for P ;A e let x = v in e : t � A′ yields P ;A e v : t1 � A1. We define A(v) = t0 and A1(v) = t′1; then we derive with Lemma 4.31 t0 � t1 | t′1. Inversion of TLetVar also yields P ;A1, x : t1 e e : t � A2, x : t2. With A2(v) = t′2 and the induction hypothesis we deduce t1 � t2 and t′1 � t′2. Inversion of EAVar for A A′ := A2 ↓ v : t1 t2 provides A′(v) := t′2 ↓ t1 t2. The claim then follows with Lemma 4.27.

• Case TCond: By induction hypothesis.

• Case TAccNull: By induction hypothesis.

• Case TJoin: With P ;A, z : tz, y : ty e join z = y.f from e : t � A′, z : t′′′z, y : t′′′y we obtain with TJoin and the induction hypothesis A � A′, tz � t′′′z, and t′y � t′′y. It remains to show that ty � t′′′y holds, which follows from ty = t′y |f tx, t′′′y = t′′y |f t′x, tx � t′x, and t′y � t′′y combined with Lemma 4.30.

• Case TJoinNull: We derive the claim similar to the proof of TJoin.

• Case TSetNull: By induction hypothesis.

End case distinction over the derivation of P ;A e e : t � A′.

4.9 Joining Variables

The next big step towards type preservation is to prove that we may join suitable variables, and therefore their capabilities, such that typing is still possible. This theorem enables us to show that lending capabilities and later rejoining them is possible and does not violate type preservation. Again, we first prove several lemmas to break the proof down into smaller pieces.

Lemma 4.33 (Join Variables for TVar). Let P ;A, w : tw, x : tx e w : t � A′, w : t′w, x : t′x and tc � tw | tx. Then P ;A, c : tc e c : t � A′, c : t′c with t′c � t′w | t′x.


Proof. Provided the prerequisites of the lemma hold, we know by Lemma 4.16 that tx = t′x holds. By inversion of TVar we get tw � t | t′w. Now, with tc � tw | tx and Lemma 4.13 we obtain some t′c with tc � t | t′c and t′c � t′w | tx. These splitting properties enable us to derive P ;A, c : tc e c : t � A′, c : t′c. Finally, t′c � t′w | t′x follows with tx = t′x.

The following lemma, whose variable naming we already use, states the join property for effect application. Basically, we want to join tx and tw and apply the effect application (the fourth precondition) to the result to obtain t′c := tc ↓ tj t′j with the corresponding splitting. Unfortunately, we need some more preconditions, which we state as the first and second precondition of the lemma. The later use of this lemma justifies this choice. In fact, the second precondition is implied by the other ones, as we may obtain it via splitting associativity; we keep it here to clarify the proof.

Lemma 4.34 (Join for Effect Application in TCall). Let

tx � tj | t′′
tc � tj | t′
tc � tx | tw
t′x := tx ↓ tj t′j

Then t′c := tc ↓ tj t′j with t′c � t′x | tw.

Proof. First we build t′c as stated in the lemma with the effect application, which is well-defined (Section 4.1.4). It remains to show that t′c � t′x | tw holds, which we do by coinduction. We define

M = {(t′c, t′x, tw) | (tx, tj, t′′) ∈ νSP ∧ (tc, tj, t′) ∈ νSP ∧ (tc, tx, tw) ∈ νSP ∧ (t′x, tx, tj, t′j) ∈ νEA ∧ (t′c, tc, tj, t′j) ∈ νEA}

and prove M ⊆ SP(M). By the definition of M, SP, and EA, for an arbitrary (t′c, t′x, tw) ∈ M we get (in the same order as in the definition of M)

(∀i)  aaxi � aaji | aa′′i   (4.95)
      (txi, tji, t′′i) ∈ νSP
      aaci � aaji | aa′i   (4.96)
      (tci, tji, t′i) ∈ νSP
      aaci � aaxi | aawi   (4.97)
      (tci, txi, twi) ∈ νSP
      aa′xi := aaxi ↓ aaji aa′ji   (4.98)
      (t′xi, txi, tji, t′ji) ∈ νEA
      aa′ci := aaci ↓ aaji aa′ji   (4.99)
      (t′ci, tci, tji, t′ji) ∈ νEA

If we combine every second (unlabeled) line, we get by the definition of M that

(∀i)  (t′ci, t′xi, twi) ∈ M   (4.100)

To obtain aa′ci � aa′xi | aawi we make a case distinction over aaji for arbitrary i, as this determines which rule is used for the effect applications (4.98) and (4.99).

Case distinction over aaji.

• Case aaji = M(vaji): Then, by (4.95) aaxi = M(vaji) and by (4.96) aaci = M(vaji), which yields with (4.97) aawi = O. In this case, all effect applications use EAActive; therefore aa′ji = O or aa′ji = M(va′ji). In either case, as EAActive yields aa′xi = aa′ji = aa′ci, splitting aa′ci � aa′xi | aawi with aawi = O is possible and correct.

• Case aaji = O or aaji = ♦: Then we have via EANonactive that aa′xi = aaxi and aa′ci = aaci, which yields with (4.97) directly that aa′ci � aa′xi | aawi holds.

End case distinction over aaji.

That is, we have

(∀i)  aa′ci � aa′xi | aawi

which yields with (4.100) and the definition of SP that (t′c, t′x, tw) ∈ SP(M). That is, we have M ⊆ SP(M), which concludes the proof.
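The case distinction over the effect's pre annotation can also be checked by enumeration. The Python sketch below is an illustration under stated assumptions: the rules are encoded as they are used in this chapter, value annotations are dropped, and an effect with an active pre annotation never produces a semiactive post annotation (the proof above lists only O and M as results of EAActive). All identifiers are invented for this example.

```python
from itertools import product

# Activity annotations as printed in the text; value annotations are omitted.
ACTIVE, SEMI, INACTIVE = "M", "♦", "O"
ANNOTATIONS = (ACTIVE, SEMI, INACTIVE)

# Splitting aa -> aa' | aa'' as triples (aa, aa', aa'').
SPLIT = {
    (ACTIVE, ACTIVE, INACTIVE),      # SplitActFst
    (ACTIVE, INACTIVE, ACTIVE),      # SplitActSnd
    (SEMI, SEMI, SEMI),              # SplitSemiact
    (INACTIVE, INACTIVE, INACTIVE),  # SplitInact
}


def apply_effect(current, pre, post):
    """EAActive returns post; EANonactive leaves current unchanged."""
    return post if pre == ACTIVE else current


def splits_off(whole, part):
    """True if `whole` can be split into `part` and some remainder."""
    return any((whole, part, rest) in SPLIT for rest in ANNOTATIONS)


def check_lemma_4_34():
    """The joined annotation after the effect still splits into x' | w."""
    for x, j, c, w, post in product(ANNOTATIONS, repeat=5):
        if not (splits_off(x, j) and splits_off(c, j) and (c, x, w) in SPLIT):
            continue                 # splitting preconditions (4.95)-(4.97)
        if j == ACTIVE and post == SEMI:
            continue                 # assumed excluded for EAActive
        x_new = apply_effect(x, j, post)   # (4.98)
        c_new = apply_effect(c, j, post)   # (4.99)
        if (c_new, x_new, w) not in SPLIT:
            return False
    return True
```

Without the exclusion of semiactive results for active effects the check would fail for (♦, ♦, O), which is not a splitting; this is the same restriction the proof relies on.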

The next lemma is similar to the above one. Unfortunately, we need different preconditions, as we are not able to provide the above ones for TLetVar.


Lemma 4.35 (Join for Effect Application in TLetVar). Let

t^2_c � t^2_a | t′b
t′a := t^2_a ↓ t1 t2
tb � t′b
tc � ta | tb
ta � t1 | t^1_a

Then t′c := t^2_c ↓ t1 t2 with t′c � t′a | t′b.

Proof. As effect application is well-defined (Section 4.1.4), we first construct the corresponding t′c. It remains to prove that t′c � t′a | t′b holds. We define

M = {(t′c, t′a, t′b) | (tc, ta, tb) ∈ νSP ∧ (t^2_c, t^2_a, t′b) ∈ νSP ∧ (ta, t1, t^1_a) ∈ νSP ∧ (tb, t′b) ∈ νSA ∧ (t′a, t^2_a, t1, t2) ∈ νEA ∧ (t′c, t^2_c, t1, t2) ∈ νEA}

and prove, similar to the other lemmas, M ⊆ SP(M). By the definition of M, SP, SA, and EA we get for arbitrary (t′c, t′a, t′b) ∈ M

(∀i)  aaci � aaai | aabi   (4.101)
      (tci, tai, tbi) ∈ νSP
      aa^2_ci � aa^2_ai | aa′bi   (4.102)
      (t^2_ci, t^2_ai, t′bi) ∈ νSP
      aaai � aa1i | aa^1_ai   (4.103)
      (tai, t1i, t^1_ai) ∈ νSP
      aabi � aa′bi   (4.104)
      (tbi, t′bi) ∈ νSA
      aa′ai := aa^2_ai ↓ aa1i aa2i   (4.105)
      (t′ai, t^2_ai, t1i, t2i) ∈ νEA
      aa′ci := aa^2_ci ↓ aa1i aa2i   (4.106)
      (t′ci, t^2_ci, t1i, t2i) ∈ νEA

The unlabeled lines yield with the definition of M

(∀i)  (t′ci, t′ai, t′bi) ∈ M   (4.107)

To prove aa′ci � aa′ai | aa′bi we make a case distinction for arbitrary i over aa1i, which decides which rule is used for the effect application.

Case distinction over aa1i.

• Case aa1i = M(va1i): By EAActive we obtain aa2i = aa′ai = aa′ci ≠ ♦. In addition, we have via (4.103) aaai = M(va1i), with (4.101) aabi = O, and therefore by (4.104) aa′bi = O. Then aa′ci � aa′ai | aa′bi holds, as ♦ is excluded.

• Case aa1i = O or aa1i = ♦: By the effect applications (4.105) and (4.106) we get t′a = t^2_a and t′c = t^2_c. This directly yields aa′ci � aa′ai | aa′bi with (4.101) and (4.102).

End case distinction over aa1i.

That is, we have

(∀i)  aa′ci � aa′ai | aa′bi

which yields with the definition of SP and (4.107) (t′c, t′a, t′b) ∈ SP(M) and therefore M ⊆ SP(M).

Finally, we are able to state and prove the theorem to join variables. The idea behind this theorem is that variables that are split off by a variable access get rejoined after evaluating the expression we split them off for. The type system uses the join-expression to remember that variables get rejoined after the evaluation of an expression. Still, the join itself states this property and is not able to provide it itself. Therefore, we restrict the expression in this theorem to be join-free.

Theorem 4.36 (Join Variables). Let P ;A, a : ta, b : tb e e : t � A′, a : t′a, b : t′b and tc � ta | tb and join-free(e). Then P ;A, c : tc e [c/b][c/a]e : t � A′, c : t′c with t′c � t′a | t′b.

Proof. We prove this theorem by induction over the type judgment.

Case distinction over P ;A, a : ta, b : tb e e : t � A′, a : t′a, b : t′b.

• Case TVar: We have P ;A, a : ta, b : tb e v : t � A′, a : t′a, b : t′b. For b ≠ v ≠ a the claim is trivial, as none of the used variables is touched by the theorem and we have tb = t′b, ta = t′a, and therefore the claim holds with t′c = tc. For v = a (and analogously v = b) the claim follows via ta � t | t′a with Lemma 4.33.

• Case TNull: Trivial.

• Case TNew: By induction hypothesis.

• Case TAcc: For this case we have

P ;A, a : ta, b : tb, y : ty e let x = y.f in e′ : t � A′, a : t′a, b : t′b, y : t′′′y

Whenever b ≠ y ≠ a, the claim holds directly with the induction hypothesis. For y = a or y = b, we have to show that P ;A, c : tc e let x′ = c.f in [c/b][c/a]e′ : t � A′, c : t′c holds. Without loss of generality, we choose y = a; y = b is analogous. By TAcc we get ty = t′y |f tx, which yields with Lemma 4.17 ty � t′y | t+x with t+x = t+ |f tx. Further, with tc � tb | ty and Lemma 4.13 we obtain tc � t′′c | t+x with t′′c � ta | t′y and tc = t′′c |f tx. Then, by the induction hypothesis, we get P ;A, c : t′′c, x : tx e e′ : t � A′, c : t′′′c, x : t′x with t′′′c � t′a | t′′y. Applying the above steps in reverse (which we may do with Lemma 4.17 and Lemma 4.13), we get t′c = t′′′c |f t′x and t′c � t′a | t′′′y. Therefore, the claim holds.

• Case TSet: As field update expects a fully active or fully semiactive type, correct joining of this type with another variable does not add any capabilities. Therefore, the claim holds by induction.

• Case TCall: That is, we have P ;A, a : ta, b : tb e v1.m(vi i∈{2,...n}) : t � A′, a : t′a, b : t′b. To prove the claim for TCall, we need an induction over the number of arguments of the method call. To avoid a tedious induction, we boil it down to the following cases, which describe the core issues of the proof.

By way of example, we distinguish the cases according to which of the used variables coincide.

Case distinction over occurrences of a, b among vi.

– Case (∀i) a ≠ vi ≠ b: Then the newly introduced variable c is not used and the claim holds trivially.

– Case a = vj, b = vk: We first consider the case where vj ≠ vk and all parameters and the caller in TCall are disjoint. Let tj t′j and tk t′k be the corresponding effect applications from mtype for a = vj and b = vk. Then we have with TCall

tb � tk | t′′′ (4.108)

ta � tj | t′′ (4.109)

t′b := tb ↓ tk t′k (4.110)

t′a := ta ↓ tj t′j (4.111)

In addition, the prerequisite of the lemma provides

tc � tb | ta (4.112)

Then, with (4.112) and (4.108) and splitting associativity, Lemma 4.13, we get

tc � tk | t′ (4.113)

Now, with (4.113), (4.112), (4.108), and (4.110) we apply Lemma 4.34 and obtain

t′′c := tc ↓ tk t′k (4.114)

t′′c � t′b | ta (4.115)


We use splitting associativity on (4.115) and (4.109) to derive

t′′c � tj | t (4.116)

Again, this allows us to apply Lemma 4.34 on (4.116), (4.115), (4.109), and (4.111). Therefore, we have

t′c := t′′c ↓ tj t′j (4.117)

t′c � t′a | t′b (4.118)

With the two effect applications (4.114) and (4.117), the two splitting properties (4.113) and (4.116), and TCall we obtain that

P ;A, c : tc e [c/b][c/a]v1.m(vi i∈{2,...n}) : t � A′, c : t′c

with (4.118), t′c � t′a | t′b, holds.

We omit a closer analysis of the case a = vj = vk = b, as this means that, via tc � tb | ta, there are only O and ♦ annotations. Therefore, this case is trivial, as no capabilities are added by the intended join operation.

In addition, whenever we have another (or even several) equivalent variables vi = vj, we simply have to apply the above method several times. For another order in the use of the variables in the effect applications, we just have to adjust the order in the proof.

– Case (∀i) a ≠ vi, b = vj: This case is similar to the above one. By TCall and therefore typing of b we have

tb � tj | t′′

Inversion of the effect application for vj yields for this case

t′b := tb ↓ tj t′j

In addition, by the induction hypothesis we get

tc � tj | t′

Finally, by the definition of tc

tc � tb | ta

This enables us to apply Lemma 4.34, and therefore the claim holds as above. Again, whenever several vi are equivalent among each other and equivalent to b, we obtain the claim by multiple applications of the same approach.


– Case (∀i) b ≠ vi, a = vj: Analogous to the previous case; simply exchange b and a.

End case distinction over occurrences of a, b among vi.

• Case TLetExp: As the let-bound variable has to be fresh, the claim holds by double application of the induction hypothesis.

• Case TLetVar: Whenever the variables are distinct, the claim holds trivially with the induction hypothesis. Again, the interesting case is when the variable used in the let-expression is equal to one of the joined variables. W.l.o.g., let this be a (b is again analogous). Therefore, we consider the case where we have P ;A e let x = a in e′ : t � A′. By TLetVar we get

P ;A e a : t1 � A1   (4.119)
P ;A1, x : t1 e e′ : t � A2, x : t2   (4.120)
A A′ := A2 ↓ a : t1 t2   (4.121)

We name the types and variables as follows: A(a) = ta, A1(a) = t^1_a, A2(a) = t^2_a, and A′(a) = t′a, analogously for x. In addition, we use c as the combined variable as stated in the theorem. By (4.119) we have

ta � t1 | t^1_a   (4.122)

Let tc � ta | tb be the joined type. Then we get by the induction hypothesis applied to (4.119) (as b is not touched)

tc � t1 | t^1_c
t^1_c � t^1_a | tb   (4.123)

With (4.123) and (4.120), as A1(b) = tb, too, we again get by the induction hypothesis joined typing for (4.120) and t^2_c with

t^2_c � t^2_a | t′b   (4.124)

Now, with inversion of (4.121), we have t′a := t^2_a ↓ t1 t2. In addition, with Theorem 4.32 applied to (4.120) we obtain tb � t′b, which yields with (4.124), tc � ta | tb, (4.122), and Lemma 4.35

t′c := t^2_c ↓ t1 t2
t′c � t′a | t′b

This allows us to type

P ;A, c : tc e [c/a][c/b]let x = a in e′ : t � A′, c : t′c

Therefore, the claim holds.


• Case TCond: The claim follows by the induction hypothesis.

• Case TAccNull: By the induction hypothesis.

• Case TJoin and TJoinNull: As e is supposed to be join-free, these cases are excluded.

• Case TSetNull: By the induction hypothesis.

End case distinction over P ;A, a : ta, b : tb e e : t � A′, a : t′a, b : t′b.

This concludes the proof of this theorem.

4.10 Additional Lemmas

In the following we provide some more lemmas that we need for the preservation proof. These are mainly properties of the interaction of different relations.

Lemma 4.37 (Split Droppable). Let t1 � t2 | t3. Then ρ(t2) ∧ ρ(t3) ⇔ ρ(t1).

Proof. We first prove ρ(t2) ∧ ρ(t3) ⇒ ρ(t1) by coinduction. Therefore we define

M = {t1 | t2 ∈ νDP ∧ t3 ∈ νDP ∧ (t1, t2, t3) ∈ νSP}

and prove M ⊆ DP(M). If t1 ∈ M, then we have by the definition of M, DP, and SP

(∀i)  ρ(aa2i)   (4.125)
      t2i ∈ νDP   (4.126)
      ρ(aa3i)   (4.127)
      t3i ∈ νDP   (4.128)
      aa1i � aa2i | aa3i   (4.129)
      (t1i, t2i, t3i) ∈ νSP   (4.130)

To obtain ρ(aa1i) we make a case distinction for arbitrary i over aa2i and aa3i.

Case distinction over aa2i and aa3i.

• Case aa2i = aa3i = O: Then rule SplitInact has to be used to split (4.129). Therefore, we have aa1i = O, which is droppable.

• Case aa2i = aa3i = ♦: Similar to above, by inversion we use rule SplitSemiact and obtain aa1i = ♦, which again is droppable.

• Case aa2i = M(va2i), aa3i = O: Then, to satisfy (4.129), we have to use rule SplitActFst and hence aa1i = aa2i, which is droppable by (4.125).

• Case aa2i = O, aa3i = M(va3i): Symmetric to the above case with rule SplitActSnd.

• Case other assignment: Yields a contradiction to (4.129) as no splitting rule exists for these cases.

End case distinction over aa2i and aa3i. This yields

ρ(aa1i) (4.131) (∀i)

In addition, with (4.126), (4.128), (4.130) and the definition of M we obtain

t1i ∈ M (4.132) (∀i)

Finally, by (4.131), (4.132) and the definition of DP we get t1 ∈ DP(M) and hence M ⊆ DP(M).

The proof for ρ(t2) ∧ ρ(t3) ⇐ ρ(t1) is completely analogous.

Lemma 4.38 (Splitting Nonactive). Let t1 � t2 | t3. Then t1 = t2 ⇔ O/♦ t t3.

Proof. By coinduction. We first show t1 = t2 ⇒ O/♦ t t3 and define

M = {t3 | (t1, t2, t3) ∈ νSP ∧ t1 = t2}

and prove that M is NA-consistent, that is M ⊆ NA(M). Let t3 ∈ M, then by definition of M and splitting

aa1i � aa2i | aa3i (4.133) (∀i)
(t1i, t2i, t3i) ∈ νSP (4.134)
aa1i = aa2i (4.135)
t1i = t2i (4.136)

Case distinction over aa1i (= aa2i).

• Case O: Then we have to use rule SplitInact to satisfy (4.133). That is, we have aa3i = O.

• Case ♦: Similar to above we obtain aa3i = ♦.

• Case M(va1i): To satisfy (4.133), rule SplitActFst has to be used, which yields aa3i = O.

End case distinction over aa1i (= aa2i). Therefore, we have

aa3i ∈ {O,♦} (4.137) (∀i)

With (4.134), (4.136), and the definition of M we get

t3i ∈ M (∀i)

which yields with (4.137) and the definition of NA that t3 ∈ NA(M), therefore M ⊆ NA(M). The proof for t1 = t2 ⇐ O/♦ t t3 is analogous.
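The annotation-level case distinctions in the proofs of Lemma 4.37 and Lemma 4.38 can be replayed mechanically. The following Python sketch is an illustration only and not part of the formal development. It encodes just the four splitting rules named in the proofs (SplitInact, SplitSemiact, SplitActFst, SplitActSnd) on single activity annotations and ignores the coinductive descent into field types. The tuple encoding, the function names, and the set DROPPABLE_VALUES (a stand-in for the user-defined droppability of value annotations) are assumptions made for this sketch.

```python
from itertools import product

# Activity annotations: ("O",) inactive, ("S",) semiactive (the diamond),
# ("M", va) active with value annotation va.
INACTIVE = ("O",)
SEMIACTIVE = ("S",)


def active(va):
    return ("M", va)


# Assumption: droppability of value annotations is user-defined in Java(X);
# this set is a hypothetical stand-in.
DROPPABLE_VALUES = {"closed"}


def droppable(aa):
    """rho on activity annotations: O and the diamond are always droppable."""
    if aa in (INACTIVE, SEMIACTIVE):
        return True
    return aa[1] in DROPPABLE_VALUES


def nonactive(aa):
    return aa in (INACTIVE, SEMIACTIVE)


def split(aa2, aa3):
    """Return aa1 such that aa1 splits into aa2 | aa3, or None if no rule applies."""
    if aa2 == INACTIVE and aa3 == INACTIVE:      # SplitInact
        return INACTIVE
    if aa2 == SEMIACTIVE and aa3 == SEMIACTIVE:  # SplitSemiact
        return SEMIACTIVE
    if aa2[0] == "M" and aa3 == INACTIVE:        # SplitActFst
        return aa2
    if aa2 == INACTIVE and aa3[0] == "M":        # SplitActSnd
        return aa3
    return None                                  # "other assignment": no rule


def check_lemmas(values=("open", "closed")):
    """Check both lemmas for every derivable split; return the number of splits."""
    universe = [INACTIVE, SEMIACTIVE] + [active(v) for v in values]
    checked = 0
    for aa2, aa3 in product(universe, repeat=2):
        aa1 = split(aa2, aa3)
        if aa1 is None:
            continue
        # Lemma 4.37 on annotations: both parts droppable iff the whole is.
        assert (droppable(aa2) and droppable(aa3)) == droppable(aa1)
        # Lemma 4.38 on annotations: aa1 = aa2 iff aa3 is nonactive.
        assert (aa1 == aa2) == nonactive(aa3)
        checked += 1
    return checked
```

Calling check_lemmas() enumerates all pairs over the small universe and raises an AssertionError if either equivalence fails for a derivable split. The full lemmas additionally require the same property for all nested field types, which the sketch does not cover.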

Lemma 4.39 (Non-Active Implies Droppable). Let O/♦ t t. Then ρ(t).

Proof. Straightforward by coinduction. Nonactive only allows O and ♦ activity annotations, which are both droppable by definition.

Lemma 4.40 (Extend Environment with Droppable). Let P ;A e e : t � A′ and A′′ ρ� A. Then P ;A′′ e e : t � A′′′ with A′′′ ρ� A′.

Proof. By multiple application of Lemma 4.20 we know for all variables v ∈ dom(A′′) that there exists a type tvρ for which A′′(v) � A(v) | tvρ and ρ(tvρ) holds. Using weakening (Lemma 4.18) we add a fresh variable vρ for every v to A and obtain Aρ such that for all variables Aρ(vρ) = tvρ and Aρ(v) = A(v) and

P ;Aρ e e : t � A′ρ

In addition, weakening yields that A′ρ(vρ) = Aρ(vρ) = tvρ and A′ρ(v) = A′(v) = t′v. Now we get by multiple application of Theorem 4.36 for every v and vρ with subsequent renaming to v that A′′′(v) � t′v | tvρ. The claim then holds by application of Lemma 4.21.

Lemma 4.41 (Subdroppability is Subactive). Let t ρ� t′. Then t � t′.

Proof. Straightforward by coinduction. We define

M = {(t, t′) | (t, t′) ∈ νSD}

and prove M ⊆ SA(M). Let (t, t′) ∈ M, then by definition of M and SD

aai ρ� aa′i (∀i)
(ti, t′i) ∈ νSD (4.138)

Case distinction over Subdroppable for activity annotations directly yields (∀i) aai � aa′i. With (4.138) and the definition of M we get (ti, t′i) ∈ M and finally by the definition of SA we have (t, t′) ∈ SA(M), therefore M ⊆ SA(M) holds.

The following lemma points out the connection between splitting, or rather joining of variables, and the effect application.

Lemma 4.42 (Join Corresponds to Effect Application). Let

tw � t1 | t′w
t∗w � t′′1 | t′′w
t1 � t′′1
t′′′w := t′′w ↓ t1 t′′1

Then t∗w = t′′′w.

Proof. By coinduction. We define

M = {(t∗w, t′′′w) | (tw, t1, t′w) ∈ νSP ∧ (t∗w, t′′1, t′′w) ∈ νSP ∧ (t1, t′′1) ∈ νSA ∧ (t′′′w, t′′w, t1, t′′1) ∈ νEA}

and prove M ⊆ EQ(M). Let arbitrary (t∗w, t′′′w ) ∈M , then

aawi � aa1i | aa′wi (4.139) (∀i)
(tw, t1, t′w) ∈ νSP (4.140)
aa∗wi � aa′′1i | aa′′wi (4.141)
(t∗w, t′′1, t′′w) ∈ νSP (4.142)
aa1i � aa′′1i (4.143)
(t1, t′′1) ∈ νSA (4.144)
aa′′′wi := aa′′wi ↓ aa1i aa′′1i (4.145)
(t′′′w, t′′w, t1, t′′1) ∈ νEA (4.146)

We make a case distinction for the inversion of (4.145).

Case distinction over derivation of aa′′′wi := aa′′wi ↓ aa1i aa′′1i.

• Case EAActive: Then we have aa′′′wi = aa′′1i and aa1i = M(va1i). The latter yields with (4.139) and SplitActFst aa′wi = O, further, with (4.143) we get t′′wi = O. Then, no matter which rule is used for (4.141), aa∗wi = aa′′1i, which yields with aa′′′wi = aa′′1i from the beginning aa∗wi = aa′′′wi.

• Case EANonactive: Then we have aa′′′wi = aa′′wi and aa1i ∈ {O,♦}, which yields, no matter what rule is used for (4.143), aa′′1i. This yields, like above with (4.141), aa′′wi = aa∗wi and therefore again aa∗wi = aa′′′wi.

End case distinction over derivation of aa′′′wi := aa′′wi ↓ aa1i aa′′1i.

This yields

aa∗wi = aa′′′wi (4.147) (∀i)

With (4.140), (4.142), (4.144), (4.146), and the definition of M we obtain

(t∗wi, t′′′wi) ∈ M (4.148) (∀i)

Now we combine (4.147) and (4.148) with the definition of EQ and get (t∗w, t′′′w) ∈ EQ(M). Therefore, M ⊆ EQ(M) holds.

Lemma 4.43 (Serialize Effect Application). Let A′ := A ↓ vi : ti t′i. Then for

A1 = A

A2 := A1 ↓ v1 : t1 t′1...

An+1 := An ↓ vn : tn t′n

with An+1 = A′.

Proof. This lemma is a direct consequence of the effect application definition in Figure 3.11.

Effect application only affects one variable at a time, therefore, by EAVar combined with EANull, the lemma trivially holds for disjoint variables. In addition, for the same reason, we may rearrange disjoint variables. Switching the order of two effect applications for the same variable is not allowed. That is, we may rearrange the variables, such that for the case where multiple effect applications are applied to one variable, we get for the lemma v1 = v2. The following simplified derivation tree shows that the lemma also holds for multiple applications for one variable. If we separate the derivation tree we would have A′(v) = t1 in the first step, and then A′(v) = t2 all together. This is exactly the separation proposed by the lemma.

A′(v) = t2 t t2 := t1 ↓ t2 t′2

A′ := A, v : t1 ↓ v : t2 t′2t t

1 := t0 ↓ t1 t′1

A′ := A, v : t0 ↓ v : t1 t′1, v : t2 t′2

The same holds for further multiple effect applications to the same variable, too.
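The serialization can be pictured as a left-to-right fold over the list of effects. The following Python sketch is a toy model under stated assumptions: it works on single activity annotations instead of full annotated types, and it reads the behaviour of effect application off the two cases used in the proof of Lemma 4.42 (EAActive and EANonactive). The environment encoding and all function names are hypothetical and do not reproduce Figure 3.11.

```python
def effect_annotation(current, before, after):
    """Annotation-level effect application (toy reading of EAActive/EANonactive).

    If the passed annotation `before` was active, the variable receives the
    returned annotation `after`; otherwise it keeps its current annotation.
    """
    if before[0] == "M":
        return after
    return current


def apply_effect(env, effect):
    """Apply one effect (v, before, after); all other variables stay untouched."""
    v, before, after = effect
    updated = dict(env)  # copy, so earlier environments remain available
    updated[v] = effect_annotation(env[v], before, after)
    return updated


def apply_serialized(env, effects):
    """Return [A_1, ..., A_{n+1}] with A_1 = env and A_{k+1} := A_k after effect k."""
    steps = [env]
    for effect in effects:
        steps.append(apply_effect(steps[-1], effect))
    return steps
```

For example, with env = {"x": ("O",), "y": ("S",)} and the effects ("x", ("M", "open"), ("M", "closed")), ("y", ("S",), ("S",)), ("x", ("M", "closed"), ("O",)), swapping the first two effects (disjoint variables) leaves the final environment unchanged, whereas swapping the two effects on x changes it. This mirrors the restriction on reordering stated in the proof.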

4.11 Preservation

To state the preservation theorem, we need some more prerequisites. First, we introduce access paths, then we provide an Urtype assumption to show that the activity annotations are used correctly throughout Java(X). Finally, we extend typing to a configuration with store to provide the main invariant for our preservation theorem.

The store steadiness under environment A ensures that any change in the store does not illegally change references. That is, an updated field has to be fully active or fully semiactive in the environment and the updated field must be droppable in this environment.

Definition 4.44 (Store Steadiness).

∀l ∈ S1 : S1(l).fi ≠ S2(l).fi ⇒ (ρ(A(l).fi) ∧ (M s A(l)↓sfi ∨ ♦ s A(l)↓sfi))

S1 �A S2

Where t↓sfj denotes the projection of sj out of the annotated type t = 〈va, c{fi : si}〉.

Definition 4.45 (Access Path). We denote access paths with ⇀fi. That is, x.⇀fi ≡ x.f1.f2 . . . fn. We use the notation f ⊕ ⇀f′i to attach f to the front of a path ⇀f′i. In addition, we provide a notation for access paths on types and stores:

t.ε = t

t = 〈va, c{fi : si}〉    sj = 〈aa, tj〉
t.(fj ⊕ fji) = tj.fji

S(w).ε = w

S(l) = 〈c, ℓ, fi ↦ wi〉
S(l).(fj ⊕ fji) = S(wj).fji
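The two path notations can be read as simple recursive lookups. The Python sketch below illustrates this reading under assumptions of our own: annotated types are finite nested dictionaries with a "fields" entry mapping each field name to a pair (activity annotation, annotated type), stores map locations to dictionaries with a "fields" entry mapping field names to values, and null is modelled as None. None of these encodings or names come from the formalization.

```python
def type_at(t, path):
    """t.eps = t and t.(f + rest) = t_f.rest, where field f of t has type (aa, t_f)."""
    for f in path:
        _aa, t = t["fields"][f]  # drop the activity annotation, keep the field's type
    return t


def store_at(store, w, path):
    """S(w).eps = w and S(l).(f + rest) = S(w_f).rest, where w_f is stored in l.f."""
    for f in path:
        if w is None:
            raise KeyError(f"cannot follow field {f!r} through null")
        w = store[w]["fields"][f]
    return w
```

Both functions consume the path front to back, which matches attaching a field to the front of a path with ⊕. The empty path returns the type or value itself.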

Next, we introduce an Urtype assumption

A ::= ∅ | A, l : t

This assumption assigns to each location/object its annotated type before it is used in the program. Every annotated type of a use of a particular location in the program must be split off from the Urtype for the location. The purpose of this Urtype assumption is to ensure that activity annotations are only used in the intended way, that is, there exists at most one active reference for a location and there is no active reference whenever there exists a semiactive one for this location, too. The Urtype assumption changes during evaluation to reflect changes of the field values of an object with active fields. We additionally use the Urtype assumption to show that no active references appear from nowhere. The only possibility to introduce an active reference is via the new-expression. From then on we may pass the activity annotation but never duplicate an active activity annotation. That is, the most important point about the Urtype assumption is that it guarantees consistent use and distribution of the activity annotations throughout the uses of the locations in the program.

The following aliases-ok predicate, which uses the Urtype assumption, states the important property of the uniqueness of active references throughout the environment:

aliases-ok(l,A, A,S)⇔ A(l),S(l) - {|A(li).fi | S(li).fi = l|}

It relates all type assumptions about a single location l with an Urtype assumption A. Every active annotation in the typing must be sanctioned by an active annotation in the Urtype assumption. The Urtype assumption for a location is responsible (1) for the local activity annotation of fields that refer to defined locations and (2) for the full type of fields that contain null. The definition collects relevant types in a multiset because each occurrence of a type contributes to the activity. Thus, the aliases-ok predicate ensures that there is at most one active annotation in all type assumptions about l.

It remains to define the sanctions relation between an entry in an Urtype assumption (an annotated type), an entry in a store, and a multiset of annotated types. Its first stage projects out, for each field, the corresponding field type, the stored value, and the multiset of field types.

(∀i) si, wi - {|sιi | ι ∈ J |}

〈va, c{fi : si}〉, 〈c, ℓ, fi ↦ wi〉 - {|〈vaι, c{fi : sιi}〉 | ι ∈ J |}

Its second stage states that the annotation from the Urtype assumption splits into the multiset of the activity annotations with the obvious semantics for this multiple splitting. For each null value, the multiset of types is also split off from the type in the Urtype assumption.

aa � {|aaι | ι ∈ J |}    w = null ⇒ t � {|tι | ι ∈ J |}

〈aa, t〉, w - {|〈aaι, tι〉 | ι ∈ J |}
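One way to read the multiple splitting of an activity annotation is as a fold of the binary splitting rules over the multiset. The Python sketch below follows this reading. It is an assumption-laden illustration: it uses only the four binary rules cited in Section 4.10 (SplitInact, SplitSemiact, SplitActFst, SplitActSnd), treats annotations as tuples, requires a non-empty multiset, and checks only the annotation part of the second stage, not the types split off for null values. The names join_all and sanctions_annotation are hypothetical.

```python
from functools import reduce

INACTIVE = ("O",)
SEMIACTIVE = ("S",)  # the diamond


def split(aa2, aa3):
    """Return aa1 such that aa1 splits into aa2 | aa3, or None if no rule applies."""
    if aa2 == INACTIVE and aa3 == INACTIVE:      # SplitInact
        return INACTIVE
    if aa2 == SEMIACTIVE and aa3 == SEMIACTIVE:  # SplitSemiact
        return SEMIACTIVE
    if aa2[0] == "M" and aa3 == INACTIVE:        # SplitActFst
        return aa2
    if aa2 == INACTIVE and aa3[0] == "M":        # SplitActSnd
        return aa3
    return None


def join_all(annotations):
    """Fold the binary rules over a non-empty multiset (given as a list)."""
    if not annotations:
        raise ValueError("the empty multiset is not handled by this sketch")

    def step(acc, aa):
        # Once no rule applies, the whole multiset has no common source annotation.
        return None if acc is None else split(acc, aa)

    return reduce(step, annotations[1:], annotations[0])


def sanctions_annotation(urtype_aa, annotations):
    """Does the Urtype annotation split into exactly this multiset of annotations?"""
    return join_all(annotations) == urtype_aa
```

Under this reading, a multiset with two active annotations, or with an active and a semiactive one, is never sanctioned, which is the uniqueness property that aliases-ok is meant to express.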

It is important to realize that this relation only checks the first level of a type and does not recursively descend. This suffices as all values are checked separately, i.e., aliases-ok is applied to all locations in dom(S).

The typing judgment P ;A;A c 〈e;L;S〉 : t � A′ for configurations 〈e;S〉 in context L (a multiset of locations) formalizes the main invariant of the preservation lemma.

The context L contains the references that are cleaned up in the context surrounding the expression. Therefore this inner expression may drop these references even if they are not droppable, as the context is responsible for ensuring that this reference is not completely lost and therefore switches inside the context from an undroppable state to a droppable one.

The extended typing judgment holds if

• the store is consistently typed,

• the expression is well-typed,

• the program is well-formed,

• the locations occurring in the expression are all defined in the store,

• every location which is not reachable from L and the locations in the expression must have a droppable type, and

• the locations in L are all typed and use enough capabilities of the final assumptions A′ so that all types in the remaining assumption A′′ are droppable.

P ;A S S : A    P ;A e e : t � A′    ⊢ P    A A � A′
fv(e) ⊆ dom(S)    L ⊆ dom(A′)    P ;A′ e L : t � A′′    ρ(A′′)
P ;A;A c 〈e;L;S〉 : t � A′

dom(S) ⊆ dom(A)    dom(A) = dom(A)    (∀l ∈ dom(S)) P ;A;A;S l l : A(l)
P ;A S S : A

S(l) = 〈c, ℓ,F〉    Rnew ℓ,c (va)    ran(F) ⊆ dom(S) ∪ {null}
(∀i) F(fi) = null ⇒ P null si    (∀i) P s wf (si)    aliases-ok(l,A, A,S)
P ;A;A;S l l : 〈va, c{fi : si}〉

The judgment S, which states the consistency of the assumptions about the store, has a standard inductive reading despite the potential presence of cyclic structures in the store. All potentially cyclic references are broken by the explicit use of the type environment A.

The preservation lemma uses an extension relation w for Urtype assumptions, which holds between successive configurations. It basically states that active capabilities cannot be created from nothing.

domA1 ⊆ domA2    (∀l ∈ domA1) A1(l) w A2(l)
A1 w A2

(∀i) si w s′i
〈va, c{fi : si}〉 w 〈va′, c{fi : s′i}〉

(aa = M(va) ∨ aa = aa′)    t w t′
〈aa, t〉 w 〈aa′, t′〉
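Read operationally, the three rules say that an activity annotation may only change between successive Urtype assumptions if it was active before. The following Python sketch checks this on finite types. It is an illustration under assumptions of our own: annotated types are finite nested dictionaries (the formal relation ranges over possibly infinite types), field types are pairs (activity annotation, annotated type), and the function names are hypothetical.

```python
def extends_field(s1, s2):
    """<aa, t> extends to <aa', t'>: aa is active or unchanged, and t extends to t'."""
    (aa1, t1), (aa2, t2) = s1, s2
    return (aa1[0] == "M" or aa1 == aa2) and extends_type(t1, t2)


def extends_type(t1, t2):
    """Same class and fields; value annotations may differ; fields extend pointwise."""
    if t1["cls"] != t2["cls"] or t1["fields"].keys() != t2["fields"].keys():
        return False
    return all(extends_field(t1["fields"][f], t2["fields"][f]) for f in t1["fields"])


def extends_urtype(u1, u2):
    """dom(u1) is contained in dom(u2) and every old entry extends to the new one."""
    return all(l in u2 and extends_type(u1[l], u2[l]) for l in u1)
```

In this sketch a field that was inactive can never become active in a later Urtype assumption, while an active field may change its value annotation or lose its activity. New locations may be added freely, which corresponds to the new-expression being the only source of active references.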

The type preservation lemma states that reducing an expression does not change its type. In addition, we ensure via w that no activities are randomly generated during execution. The aliases-ok predicate guarantees the correct distribution of activities. Further, we ensure that (a) the environment does not lose any variables and, more importantly, (b) a reduction step does not change any droppable reference in the outgoing environment to undroppable.

Preservation also shows that Java(X) does not violate the stated droppability properties. To this end, we use several constraints in the invariant. The above introduced context L, together with P ;A′ e L : t � A′′ and ρ(A′′) ensures that we are not left with undroppable references, when we handle the according expression. In addition we need to ensure that during the reduction, we do not overwrite

references that are not droppable; the steadiness of the store takes care of this property.

We omit JF in the invariant, as we already proved Lemma 4.6, which states that JF in fact is an invariant under the reduction relation.

Theorem 4.46 (Preservation). Suppose P ;A1;A1 c 〈e1;L;S1〉 : t � A′1 and P ⊢ 〈e1;S1〉 ↪→ 〈e2;S2〉 and JF(e1). Then there exist A2, A2, and A′2 with

A1 w A2

dom(A′1) ⊆ dom(A′2)

A A′2�dom(A′1) ρ� A′1

S1 �A1 S2

such that P ;A2;A2 c 〈e2;L;S2〉 : t � A′2.

Proof. We prove preservation by induction on the definition of ↪→.

For this proof we use some conventions. We use the inversion lemma 4.5 throughout the proof without further explicit mentioning. In addition, to avoid clashes for the variable names we consider e to be completely alpha-renamed. To increase readability and avoid cluttering, P, A1, A2, A1, A′1, A′2, A2, S1, S2, and t always denote the program, Urtype assumption, environment, store, and type as specified in the preservation lemma.

Case distinction over ↪→.

• Case RNew:

e1 ≡ new ℓ c(wi)
e2 ≡ l
S2 = S1, l ↦ 〈c, ℓ, fi ↦ wi〉
fieldsP (c) = ci fi

By the typing rule TNew for e1 we have

t = 〈va, c{fi : 〈aai, ti〉}〉
Rnew ℓ,c (va)
P ;A1 e wi : ti � A′1 (4.149)

We set A2 = A′1, l : t and A′2 = A′1, l : t′ such that t � t | t′ holds. Then we have by TVar P ;A2 e l : t � A′2. In addition, dom(A′1) ⊆ dom(A′2) and A A′2�dom(A′1) = A′1 hold.

Droppability for A′2 via P ;A′2 e L : t′′ � A′′2 holds, as by t � t | t′ with Lemma 4.38 and Lemma 4.39 we get ρ(t′). With unchanged L and A′2�dom(A′1) = A′1 all other variables are fine, too. The store steadiness holds, too, as we only add a location and change none of the existing ones.

It remains to show that the Urtype assumption and aliases-ok fulfill all needed conditions. We set A2 = A1, l ↦ t. Then A1 w A2 holds, as we only add one new location. First, aliases-ok(l,A2, A2,S2) holds, as this location is new and therefore there may not exist any other references to this location. Now we take a closer look at aliases-ok(wi,A2, A2,S2), for arbitrary i. We have A2(l).fi = ti and A2(wi) = A′1(wi), which yields via (4.149) and Lemma 4.31 A1(wi) � ti | A2(wi). In addition, we know that aliases-ok(wi,A1, A1,S1) holds.

First, we consider the cases where the different wi are completely disjoint and contain no reference to each other. Let {|A1(wi), tι | ι ∈ J |} be the multiset of types for references to location wi for aliases-ok(wi,A1, A1,S1). Thereby tι represents the types of potential other access paths. We only add one reference for wi: the newly introduced location l contains this reference. Therefore, the corresponding multiset for aliases-ok(wi,A2, A2,S2) is {|A2(l).fi, A2(wi), tι | ι ∈ J |} with unchanged types tι. Now, with A1(wi) � ti | A2(wi) we obtain that the sum of activities did not change and aliases-ok(wi,A2, A2,S2) holds.

Whenever the different variables inside the new expression have references to each other, the multiset further changes. Still, the same arguments as above hold. As an example, we consider the case where (arbitrarily) w1.f = w2; several such cross-references, longer access paths or even recursive references do not change the argumentation. For every cross-reference, the above multiset changes. In this case, w2 has an additional, known, access path, and we have for aliases-ok(w2,A1, A1,S1) the multiset {|A1(w2), A1(w1).f, tι | ι ∈ J ′|}. Basically, A1(w1).f concretizes one of the untouched types tι. Now, the multiset for aliases-ok(w2,A2, A2,S2) is {|A2(l).f2, A2(w2), A2(w1).f, tι | ι ∈ J ′|}. Still, via splitting A1(w2) � t2 | A2(w2), additionally A1(w1) � t1 | A2(w1) and therefore A1(w1).f � t1.f | A2(w1).f, we obtain that aliases-ok(w2,A2, A2,S2) holds.

This suffices for P ;A2;A2 c 〈e2;L;S2〉 : t � A′2. Therefore, the complete preservation invariant holds.

• Case RAcc: We have

e1 ≡ let x = l.fj in e
e2 ≡ join w = l.fj from [w/x]e
S1(l) = 〈c, ℓ,F〉
F(fj) = w
S2 = S1

The configuration ensures via P ;A1 S S1 : A1 that dom(S) ⊆ dom(A). Therefore, we know from S1(l).fj = w and the consistent store property that w ∈ dom(A1).

Let A1(w) = tw, then we have with TAcc

A1 = A, l : ty, w : tw (4.150)
A′1 = A′, l : t′′′y, w : t′′′w
ty = t′y |fj tx (4.151)
t′′′y = t′′y |fj t′x (4.152)
P ;A, l : t′y, x : tx, w : tw e e : t � A′, l : t′′y, x : t′x, w : t′′′w (4.153)

By TypeAccess and (4.151) we derive that ty.fj � t′y.fj | tx. Therefore, by Lemma 4.24, ty.fj � tx. With aliases-ok(w,A1, A1,S1) ⇔ {|ty.fj, tw, tι | ι ∈ J |} and the definition of aliases-ok we get some t0 such that

t0 � ty.fj | tx

Now, as ty.fj � tx, there exists a t′w, such that

t′w � tw | tx (4.154)

As join-expressions are only intermediate expressions we provided an additional invariant, JF(e1), for the preservation lemma. Therefore, we are guaranteed that join-free(e) holds and may apply, with (4.153) and (4.154), Theorem 4.36 to join w and x. We get, once again with renaming of the new variable to w

P ;A,w : t′w, l : t′y e [w/x]e : t � A′, w : t′′w, l : t′′y (4.155)

with

t′′w � t′′′w | t′x (4.156)

By double application of Theorem 4.32 with (4.153) we get

tx � t′x (4.157)

tw � t′′′w (4.158)

Then by (4.151), (4.152), (4.154)-(4.158), and TJoin we derive

P ;A, l : ty, w : tw e join w = l.fj from [w/x]e : t � A′, l : t′′′y, w : t′′′w

With A2 = A1, A2 = A1, A′2 = A′1, and S2 = S1, all other properties hold trivially.

• Case RJoinLoc:

e1 ≡ join w = l.f from w′
e2 ≡ w′
S2 = S1

By inversion of e1 with TJoin we have

A1 = A, l : ty, w : tw (4.159)
A′1 = A′, l : t′′′y, w : t′′′w
ty = t′y |fj tx (4.160)
t′′′y = t′′y |fj t′x (4.161)
t′w � tw | tx (4.162)
t′′w � t′′′w | t′x (4.163)
tx � t′x
tw � t′′′w
P ;A,w : t′w, l : t′y e w′ : t � A′, w : t′′w, l : t′′y (4.164)

To understand this part of the proof it is important to realize that tx hands over some capabilities from ty, where it is split off, to t′w. Similarly, t′x reverses this hand-over of capabilities after typing the inside expression, here w′. Here, we just need to take care of choosing the right tx to satisfy (4.160) through (4.164).

To prove preservation for RJoinLoc we perform a case distinction over the equality of w′, w and l.

Case distinction.

- Case w′ ≠ l ≠ w ≠ w′: By Lemma 4.16 and (4.164) we get t′y = t′′y and t′w = t′′w. As l ≠ w′ ≠ w the type judgment (4.164) types for any well-formed t′y and t′w. Therefore, we may select tx (t′x) to fulfill O/♦ t tx (t′x). Then, with (4.160)-(4.163) and Lemma 4.38 and Lemma 4.17, we get ty = t′y, tw = t′w, t′′y = t′′′y, and t′′w = t′′′w. Then we have with (4.164)

P ;A,w : tw, l : ty e w′ : t � A′, w : t′′′w, l : t′′′y

We may set A2 = A1, A2 = A1, and A′2 = A′1, which then trivially fulfills the complete preservation invariant.

- Case w′ = l ≠ w: By Lemma 4.16 we get t′w = t′′w and by Lemma 4.31 t′y � t | t′′y. As tx may only remove capabilities from ty we may select, as above, tx (t′x) to have no active capabilities and therefore O/♦ t tx (t′x). This leads, similar to the previous case, to tw = t′w, t′′w = t′′′w, ty = t′y, and t′′y = t′′′y. Typing, Urtype assumption and environment follow, including the desired properties, as above.

- Case w′ = w ≠ l: In this case, we have by Lemma 4.31 t′w � t | t′′w. Again, similar to above, we get t′y = t′′y.

As the let-expressions are the only ones to replace a variable, we know that w′ was either replaced by w while processing such a let or it has already been w in the user

code. The tx represents the type of such a potentially replaced x by substitution [w′/x].

Case distinction.

∗ Case w′ has already been w: Then tx = t′x and therefore ty = t′′′y. As the original variable x is not used we choose tx such that O/♦ t tx. Then we have ty = t′y = t′′y = t′′′y and tw = t′w and t′′w = t′′′w. Preservation with typing of w′ again follows with A2 = A1, A2 = A1, and A′2 = A′1.

∗ Case some x was replaced by w′/w: Then we have t′y = t′′y. We choose tx = t, and therefore have O/♦ t t′x by Lemma 4.38. This step represents that we only pass over the needed capabilities; all further capabilities would get returned through t′x anyway and are not needed here. It follows t′′′y = t′′y and t′′w = t′′′w. We also have tw = t′′′w, as all capabilities for t are drawn from tx. In addition, we know from A1 � A′1 and Lemma 4.16 that ty � t′′′y = t′′′y = t′′y = t′y and tw � t′′′w = t′′w and with (4.164) t′w � t′′w. We set A2 = A,w : t′w, l : t′′y [= t′y], and A′2 = A′1. The Urtype remains unchanged, A2 = A1. The aliases-ok relation still holds, as for w we only move capabilities from ty.fj to tw, so the sum of the capabilities stays the same; details are similar to the RNew case of this proof. For l, whose type really changed, we have to keep in mind that the Urtype assumption only checks the activities for the top-level fields, which stay unchanged as only capabilities from ty.fj are passed over to t′w, which, as we just checked, is correct.

End case distinction. Both cases satisfy the conditions we need for preservation.

- Case w = l: Then we have t′w = t′y, t′′w = t′′y, t′′′w = t′′′y, and tw = ty. Therefore, (4.160) is equivalent to

tw = t′w |fj tx ⇒ tw � t′w (4.165)

From (4.162) we derive t′w � tw, which yields with (4.165) tw = t′w and therefore O/♦ t tx. Similarly we derive O/♦ t t′x. Now we have, as for the first case, tw = t′w = t′′w = t′′′w and ty = t′y = t′′y = t′′′y. We set A2 = A1, A2 = A1, and A′2 = A′1, the rest is trivial. This holds whether w′ = w = l or w′ ≠ w.

End case distinction. This case distinction covers all possibilities for w′, w and l. As all of them satisfy the complete invariant, RJoinLoc enjoys preservation.

• Case RSet:

e1 ≡ set l.fj = w in e
e2 ≡ e
S1 = S, l ↦ 〈c, ℓ,F〉
S2 = S, l ↦ 〈c, ℓ,F [fj ↦ w]〉

First, by e1 we know that l and w are both in the domain of the environment (or w = null, which allows any type for w) and therefore, by the consistent store property, in the store and Urtype assumption, too. Let

A1(l) = 〈va, c{fi : s0i }〉

With TSet for e1 we know

P ;A1 e w : t′ � A′, l : 〈va, c{fi : si}〉 (4.166)

c{fi : si} ` fj ← t′ � c{fi : si; fj : s′j} (4.167)

P ;A′, l : 〈va, c{fi : si; fj : s′j}〉 e e : t � A′1 (4.168)

With (4.167) and the definition of type updates we directly get S1 �A1 S2. Therefore, no active reference that is not droppable is lost nor was any non-active reference changed to an active one. With

A2 = A′, l : 〈va, c{fi : si; fj : s′j}〉
A′2 = A′1

we obtain dom(A′1) ⊆ dom(A′2), and A A′2�dom(A′1) = A′1, and together with (4.168) P ;A2 e e2 : t � A′2. Now, RSet is one of the few occasions where the system may really drop capabilities. An active field that is overwritten is naturally no longer accessible via this reference. Therefore, the capabilities of this field are lost. That is the reason why we first ensure that the field is droppable. This also affects the Urtype assumption: We correctly took care of the initially generated activity annotations, still, from now on, they are not accessible in the system anymore. The remaining part of the preservation proof for RSet takes care of this change in the reachable activities.

Whenever the update occurs for a semiactive field, TypeUpdateSemiact ensures that all types are unchanged; in that case we may safely define A1 = A2 and are done. Now, let's take a look at the second and interesting case, where an active field is updated. First, the field fj inside is updated and even if TypeUpdateAct ensures that this field stays active, the value annotation inside this activity annotation changes. Therefore, we have to update the Urtype assumption for l. As the other fields stay untouched, we first extract their type from A1. Let

A1(l) = 〈va, c{fi : sAi }〉

then we define

A2(l) = 〈va, c{fi : sAi ; fj : s′j}〉

This is safe, as M s sj ensures that all activity annotations are active and via aliases-ok(l,A1, A1,S1) we know that there is at most one active reference, which therefore is the one for sj. This also directly yields that aliases-ok(l,A2, A2,S2) holds.

Now, we still have to check the fields inside sj. Via M s sj we know that any field inside sj is active, too. But after this set we do not have this access path anymore, that is, we drop the active annotation. Let

S1(l).fj = w0

Then, again via M s sj and aliases-ok(w0,A1, A1,S1), we know that A1(w0) carries only active capabilities, too. Still, from now on, these activities do not exist anymore, as there is no more active reference. Therefore, assuming w0 ≠ null, we have to remove all capabilities from the Urtype assumption for the location w0, which we will do using splitting and Lemma 4.38

A1(w0) � A1(w0) | tO

A2(w0) = tO

With this definition, we again directly get aliases-ok(w0,A2, A2,S2). Finally, we similarly have to adjust the Urtype assumption recursively for all subsequent fields of w0 and so on.

With this definition of A2, A2, and A′2 the store consistency, P ;A2 S S2 : A2, holds. As A′2 = A′1 we also have P ;A′2 e L : t � A′′2 with ρ(A′′2). That is, all properties for preservation hold.

• Case RCall:

e1 ≡ l.m(wi)  i ∈ {2..n}
e2 ≡ let this, xi = l, wi in e
S1(l) = 〈c, ℓ,F〉
mbodyP (c,m) = xi{e}
S2 = S1

To simplify the notation, we first denote l = w1 and this = x1. Then, we get with TCall and A1 = A0

mtypeP (t1,m) = ti t′i i∈{1..n} → t
P ;A(i−1) e wi : ti � Ai  i∈{1..n} (Ti)
A′1 := A1 ↓ wi : ti t′i i∈{1..n} (4.169)

The typing of the method yields

P ;xi : ti e e : t � xi : t′i (4.170)

With Lemma 4.43 and (4.169) we get for A1 = A′′1 and A′1 = A′′n+1

A′′2 := A′′1 ↓ v1 : t1 t′1 (EA1)
...
A′′n+1 := A′′n ↓ vn : tn t′n (EAn)

We leave the environment and Urtype assumption unchanged, that is A2 = A1, A2 = A1, and A′2 = A′1. To type e2 we want to show that

P ;A1 e let xi = wi in e : t � A′1 (4.171)

holds. We have to apply rule TLetVar and have to show

P ;A0 e w1 : t1 � A1
P ;A1, x1 : t1 e let xi = wi in i∈{2..n} e : t � A′′1, x0 : t′1 (LET1)
A′′2 := A′′1 ↓ w1 : t1 t′1

The unlabeled equations correspond to (T1) and (EA1). By further unrolling of (LET1) we have to show generally for 1 ≤ k ≤ n, i ∈ {1..n}

P ;Ak−1 e wk : tk � Ak
P ;Ak, xi : ti i≤k e let xi = wi in i>k e : t � A′′k, xi : t′i i≤k (LETk)
A′′k+1 := A′′k ↓ wk : tk t′k

The unlabeled equations always correspond to (Tk) and (EAk). For (LETk) we take a closer look at (LETn). Here, we have no more let expressions:

P ;An, xi : ti i≤n e e : t � A′′n, xi : t′i i≤n (LETn)

We obtain (LETn) by (4.170) and Lemma 4.18. Now, with (LETn), we obtain (LETn−1) and so on. Therefore, (4.171) holds. With A2 = A1, A2 = A1, and A′2 = A′1 the complete preservation invariant is valid.

• Case RLetLoc:

e1 ≡ let x = w in e
e2 ≡ [w/x]e
S1 = S2

By Lemma 4.5 the expression e1 is either typed by TLetExp or TLetVar.

Case distinction typing of e1.

- Case TLetExp: By this typing rule we get

A1 = A,w : tw
P ;A,w : tw e w : t1 � A,w : t′w ⇒ tw � t1 | t′w (4.172)
P ;A,w : t′w, x : t1 e e : t � A′, w : t′′w, x : t′′1 (4.173)
ρ(t′′1) (4.174)
A′1 = A′, w : t′′w

With A1(w) = tw, (4.172), (4.173), and the invariant JF(e1) we apply Theorem 4.36 to join w and x. With basic renaming, we get

P ;A1 e [w/x]e : t � A′, w : txw
txw � t′′w | t′′1 (4.175)

That is, we may type e2 with the correct type t. With

A2 = A1
A2 = A1
A′2 = A′, w : txw

we get A1 w A2, dom(A′1) ⊆ dom(A′2) and P ;A2 S S2 : A2 including aliases-ok. As ρ(t′′1), (4.174), holds, we only add droppable parts to txw, therefore by Lemma 4.21 we get that A A′2�dom(A′1) ρ� A′1 holds, too. It remains to show that the droppability invariant, P ;A′2 e L : t � A′′2 with ρ(A′′2), holds. It suffices to show this property for A′2(w) = txw as no other variable is changed in the environment. We distinguish the cases where w ∈ L or w ∉ L. If w ∈ L we realize by (4.174), (4.175), and A′1(w) = t′′w that there are no undroppable parts added to txw. The invariant still holds by Lemma 4.21 and Lemma 4.14. If w ∉ L we know from P ;A′1 e L : t � A′′ and ρ(A′′) that t′′w is droppable. If ρ(t′′w) we know by (4.174) and txw � t′′w | t′′1 with Lemma 4.37 that txw is droppable, too.

- Case TLetVar: By this typing rule we get the same assertions as above, except for the last two lines, where we have the effect application and in consequence another

type for w in the resulting environment.

A1 = A,w : tw
P ;A,w : tw e w : t1 � A,w : t′w ⇒ tw � t1 | t′w (4.176)
P ;A,w : t′w, x : t1 e e : t � A′, w : t′′w, x : t′′1 (4.177)
A A′1 := A′, w : t′′w ↓ w : t1 t′′1 (4.178)
A′1 = A′, w : t′′′w

First, we get analogous to the above case with Theorem 4.36

P ;A1 e [w/x]e : t � A′, w : txw
txw � t′′w | t′′1 (4.179)

and define in the same way

A2 = A1
A2 = A1
A′2 = A′, w : txw

which yields, as above, most of the invariant. Again we take a further look at the droppability. That is, we relate txw and t′′′w. With Theorem 4.32 and (4.177) we get t′w � t′′w and t1 � t′′1. This enables us, with (4.179), (4.178), and (4.176), to apply Lemma 4.42. Therefore, we have txw = t′′′w. At first sight this may seem unintuitive; basically it shows that the definition of the effect application is correct. That is, the effect application neither loses nor adds any capabilities. In consequence we get A′2 = A′1, therefore the droppability constraints trivially hold.

End case distinction typing of e1. Both cases prove the complete invariant, therefore RLetLoc enjoys preservation.

• Case RCondLoc:

e1 ≡ if l then e′1 else e′2
e2 ≡ e′1
S2 = S1

Let A1 = A, l : tl, then we know by TCond

P ;A1 e l : t′′l � A, l : t′′′l
P ;A1 e e′1 : t � A′, l : t′l

As A′1 = A′, l : t′l, the claim holds with A2 = A1, A2 = A1, and A′2 = A′1.

• Case RCondNull: Analogous to RCondLoc.

• Case RLetExp: e1 ≡ letx = e in e′′

e2 ≡ letx = e′ in e′′

By the con�guration judgment we obtain

P ;A′1 e L : t � A′′1 (4.180)

ρ(A′′1) (4.181)

As we here have e 6= v, inversion yields TLetExp as single possibility. That is, weget

P ;A1 e e : t1 � Ax1

P; Ax1, x : t1 e e′′ : t � A′1, x : t′1 (4.182)

ρ(t′1) (4.183)

By (4.180)-(4.183) we obtain that every l ∉ L that is not droppable in Ax1, which means ¬ρ(Ax1(l)), is cleaned up properly in e′′. Let L1 = L ∪ Lx with Lx := {l | ¬ρ(Ax1(l)) ∧ l ∉ L}; then trivially

P ;Ax1 e L : tx � Ax

ρ(Ax)

Then we obtain, as all other parts of the invariant hold by the precondition

P ;A1;A1 c 〈e;L1;S1〉 : t1 � Ax1

The property JF(e) propagates from JF(e1), and the precondition for RLetExp yields that there exists a reduction P ⊢ 〈e; S1〉 ↪ 〈e′; S2〉. Therefore, we may apply the induction hypothesis and get

A1 ⊒ A2 (4.184)

dom(Ax1) ⊆ dom(Ax2)

A Ax2�dom(Ax1) ρ� Ax1 (4.185)

S1 �A2 S2 (4.186)

P ;A2 S S2 : A2 (4.187)

and

P ;A2;A2 c 〈e′;L1;S2〉 : t1 � Ax2 (4.188)


from which we obtain

P ;Ax2 e L1 : t′ � A′′2 (4.189)

ρ(A′′2)

P; A2 e e′ : t1 � Ax2 (4.190)

With the above equations we have already defined A2, A2, and S2 for the preservation of RLetExp.

As we only consider expressions with unique variable names, with (4.185), (4.182), Lemma 4.18, and Lemma 4.40, we get

P; Ax2, x : t1 e e′′ : t � A′2, x : t′1 (4.191)

with

dom(A′1) ⊆ dom(A′2) (4.192)

A A′2�dom(A′1) ρ� A′1 (4.193)

That is, we have defined the remaining environment A′2 as we need it for the preservation. It remains to show that these definitions of A2, A2, S2, and A′2 satisfy all conditions we need for preservation.

With (4.183), (4.190), and (4.191) we may apply typing rule TLetExp and obtain

P ;A2 e e2 : t � A′2

It remains to show the droppability to complete the configuration. By (4.193) and (4.180) we get P; A′2�dom(A′1) e L : t′′′ � A′′′ with ρ(A′′′) and, by the definition of Lx, there exists no l such that l ∈ Lx ∧ l ∈ dom(A′2) \ dom(A′1), which yields together with (4.189) that all new references are directly cleaned up and therefore P; A′2 e L : t2 � A′′′2 with ρ(A′′′2) holds. Now, with (4.184), (4.192)-(4.193), and (4.186)-(4.187) we have all constraints for preservation.

• Case RJoinExp: e1 ≡ join w = l.f from e

e2 ≡ join w = l.f from e′


By inversion of e1 we get

P; A, w : t′w, l : t′l e e : t � A′, w : t′′w, l : t′′l

A1 = A, w : tw, l : tl

tl = t′l |f tx

t′w � tw | tx

A′1 = A′, w : t′′′w, l : t′′′l

t′′′l = t′′l |f t′x (4.194)

t′′w � t′′′w | t′x (4.195)

tw � t′′′w

tx � t′x

For reasons similar to those used for the RJoinLoc base case, transferring the capabilities with tx from tl to tw does not tear apart aliases-ok for these locations; we apply Lemma 4.17 and then Lemma 4.19 to show this claim. Therefore, we may state

P ;A1;A,w : t′w, l : t′l c 〈e;L ∪ {w, l};S1〉 : t � A′, w : t′′w, l : t′′l

Droppability for the resulting environment with the given context obviously holds, as we have L ∪ {w, l} and the droppability invariant holds for A′ by the configuration for e1 and A′1 in context L. Then, by induction

P; A2; A∗2, w : t∗′w, l : t∗′l c 〈e′; L ∪ {w, l}; S2〉 : t � A∗′2, w : t∗′′w, l : t∗′′l

with

A1 ⊒ A2

A A∗′2, w : t∗′w, l : t∗′l �dom(A′) ρ� A′, w : t′′′w, l : t′′′l

A, w : t′w, l : t′l � A∗2, w : t∗′w, l : t∗′l

By Theorem 4.32 and Lemma 4.41 we get t∗′w � t∗′′w � t′′w and t∗′l � t∗′′l � t′′l. Now,


Figure 4.4 Initial type relations (diagram over the types tw, t′w, t′′w, t′′′w, tx, t′x, tl, t′l, t′′l, and t′′′l)

with

A′2 = A∗′2, w : t2′′′w, l : t2′′′l

A2 = A∗2, w : t2w, l : t2l

t2l = t∗′l |f t2x (4.196)

t∗′w � t2w | t2x (4.197)

t2′′′l = t∗′′l |f t2′x (4.198)

t∗′′w � t2′′′w | t2′x (4.199)

t2w � t2′′′w (4.200)

t2x � t2′x (4.201)

we have P ;A2 e e2 : t � A′2.

It remains to show that there exist t2x, t2′x, t2l, t2′′′l, t2w, and t2′′′w such that (4.196)-(4.201) and the other invariant conditions for preservation all hold. Among them, we have to prove for A′2�dom(A′1) ρ� A′1 that

t2′′′w ρ� t′′′w

t2′′′l ρ� t′′′l

The development and presentation of this part of the proof profit from a graphical representation of the given splitting and subactive relations. Figure 4.4 shows the initial relation of types provided by TJoin, with the corresponding variable naming from this lemma. Solid arrows represent splitting or type access, respectively. Dotted arrows represent subactive types. In the figure, we ignore the difference between type access and splitting, which is supported by Lemma 4.17. During the proof we use the correct relations.

Figure 4.5 Type relations with new variables (the diagram of Figure 4.4 extended by t2x, t∗′w, t∗′′w, t2w, t2′′′w, t∗′l, t∗′′l, t2l, and t2′′′l)

As we only have to prove the existence of such variables, we add some constraints that help us to solve the puzzle. First, as this reduction step potentially uses some of the capabilities we transfer from tl to tw via tx, we now need to transfer fewer capabilities, but in principle the same ones, as we still need them. Therefore, we decide to set t2x such that tx � t2x. It is obvious that there exists such a t2x. Next, whether typing e or e′, we have to transfer the same capabilities backwards via t′x, respectively t2′x, and therefore decide to set t2′x = t′x. Again, this obviously yields a well-formed type for t2′x.

Adding these two decisions and all the above constraints and new variables to our splitting and subactive graph, we obtain Figure 4.5. Here, gray arrows denote the relations we have to prove and black arrows the given relations. At first sight, Figure 4.4 is symmetric and looks quite organized, while Figure 4.5 loses much of this intuitive organization. The graph does not distinguish between subdroppability and subactivity; for the purposes of the proof they are similar, and we take the details into account throughout the proof.

Throughout the following proof we use transitivity, Lemma 4.9, and Lemma 4.24. The graph directly supports this: we simply follow the directed arrows where needed.

We first take a look at the left-hand side of the graph. As t′′w � t′′′w | t′x and t∗′′w ρ� t′′w hold, we get via Lemma 4.22 a t2′′′w with t∗′′w � t2′′′w | t′x and t2′′′w ρ� t′′′w.


Next, we show the existence of t2w. As t∗′w � t∗′′w and t∗′′w � t2′′′w | t′x, there exists a splitting t∗′w � t2w | t2x such that t2x � t′x and t2w � t2′′′w. As t2w is free besides having more capabilities than t2′′′w, the resulting t2x can have any capabilities that satisfy tx � t2x � t′x.

On the right-hand side of the graph we have t′′′l = t′′l |f t′x and t∗′′l ρ� t′′l, which yields with t2′′′l = t∗′′l |f t′x a t2′′′l with t2′′′l ρ� t′′′l. To prove that such a t2′′′l exists, that is, that t′x and t∗′′l can be combined, we take a look at t2l. As soon as we prove the existence of t2l with t2l = t∗′l |f t2x and t2l � t2′′′l, we trivially have through t∗′l � t∗′′l and t2x � t′x that t2′′′l exists and is well-formed.

Now we take a look at t2l, which (among other properties) has to satisfy t2l � t2′′′l. In addition, we want to keep t2x as flexible as possible and only use the property that it satisfies tx � t2x � t′x, which is one of our earlier definitions. First, we have to prove that there exists a t2l that combines t∗′l and t2x without combining illegal activities. As t2x is split off t∗′w (see above), it suffices to take a look at the aliases-ok relation. By induction, aliases-ok(l, A2, A∗2, S2) and aliases-ok(w, A2, A∗2, S2) both hold. In addition, as the user code does not contain any join expression, e1 had to be introduced by a reduction step, namely RAcc, as this is the only reduction step introducing a join. This means that we initially had S(l).f = S(w) for the corresponding store. We have two possibilities: either we still have S2(l).f = S2(w), or S2(l).f ≠ S2(w). The first case, S2(l).f = S2(w), yields in combination with aliases-ok and Lemma 4.17 that ∃t2l : t2l = t∗′l |f t2x. Now, whenever S2(l).f ≠ S2(w), we know that this change occurred by a set, as this is the only possibility to change the store. In fact, only set l.f = y in ... is able to uncouple l.f from w. As we have shown in the RSet case, this means that the corresponding type for w may not be active, as all capabilities lie in the access path of l.f, i.e. O/♦ t t∗′w, which includes the case for field update with semiactive. In that case we have O/♦ t t2x, too, and finally ∃t2l : t2l = t∗′l |f t2x. Both cases yield a suitable t2l while t2x stays as flexible as we need it for t∗′w.

This already provides A A′2�dom(A′1) ρ� A′1. It remains to show that the aliases-ok relation and droppability hold.

Droppability holds trivially, as we have t2′′′l ρ� t′′′l (same for w) and droppability for A∗′2 is valid by the induction hypothesis.

The argument for aliases-ok is similar to the one for RJoinLoc, just in reverse, as we use the relations that we obtain by the induction hypothesis.

• Case RAccNull, RSetNull, RNullCall: These reduction rules do not reduce to an expression and therefore do not meet the preconditions for preservation.

End case distinction over ↪.

This concludes the preservation proof.


4.12 Progress

The next step is to provide progress. That is, we show that any well-typed expression can be reduced, as long as it is not already a value.

Theorem 4.47 (Progress). Suppose P; A; A c 〈e; L; S〉 : t � A′ and null-free-join(e). Then either e is a value, or there exists 〈e′; S′〉 such that P ⊢ 〈e; S〉 ↪ 〈e′; S′〉, or P ⊢ 〈e; S〉 ↪ 〈error: dereferenced null, S〉.

Proof. By structural induction on e.

Case distinction over the structure of e.

• Case e ≡ v: As e is already a value, we need no further reduction.

• Case e ≡ newℓ c(v): Then we apply RNew. The premise holds trivially for well-typed programs, which is part of the configuration judgment c.

• Case e ≡ v.m(v): Depending on v, either RNullCall or RCall may be applied. The premise of RCall holds for every well-typed program by the configuration and the fact that v ≠ null.

• Case e ≡ let x = v.f in e1: As above, we apply either RAccNull or RAcc. The premise again holds trivially.

• Case e ≡ set v.f = v in e1: Similarly, we apply RSetNull or RSet.

• Case e ≡ let x = e1 in e2: Depending on e1 we use either RLetLoc or RLetExp. The premise of RLetExp holds by the induction hypothesis.

• Case e ≡ if v then e1 else e2: Either RCondLoc or RCondNull is applicable.

• Case e ≡ join v = v′.f from e1: As null-free-join(e) holds, either RJoinLoc or RJoinExp is applicable, depending on e1. Again, the premise of RJoinExp holds by the induction hypothesis.

End case distinction over the structure of e.

4.13 Soundness

Finally, we combine progress and preservation for type soundness in the usual way. Now we also need the two invariants (JF and null-free-join) that we extracted from progress and preservation. To initially establish these invariants, we formulate type soundness directly for the main expression of a program.

Definition 4.48 (Reduction Sequence). Let P ⊢ 〈e0; S0〉 ↪ 〈e1; S1〉, P ⊢ 〈e1; S1〉 ↪ 〈e2; S2〉, . . . , P ⊢ 〈en; Sn〉 ↪ 〈en+1; Sn+1〉, . . . be a, possibly infinite, reduction sequence. Then we denote

P ⊢ 〈e0; S0〉 ↪∗

for an infinite reduction sequence and

P ⊢ 〈e0; S0〉 ↪∗ 〈em; Sm〉

for a finite reduction sequence from e0 to em with the corresponding stores S0 and Sm, and

P ⊢ 〈e0; S0〉 ↪∗ 〈error: dereferenced null〉

for a reduction sequence that ends with an error.

Theorem 4.49 (Soundness). Let P = defn e and ⊢ P. Then P; ∅; ∅ c 〈e; ∅; ∅〉 : t � A′ and there exists a reduction sequence such that either

• P ⊢ 〈e; ∅〉 ↪∗ 〈v; S〉, that is, the reduction of e ends with a value,

• P ⊢ 〈e; ∅〉 ↪∗ 〈error: dereferenced null〉, that is, there exists a reduction sequence that reduces e to an error, or

• P ⊢ 〈e; ∅〉 ↪∗, that is, the reduction sequence of e does not terminate.

Proof. First, we show that given P = defn e and ⊢ P we have P; ∅; ∅ c 〈e; ∅; ∅〉 : t � A′. From ⊢ P we get, for some type t, P; ∅ e e : t � ∅, which yields fv(e) = ∅. All other premises for the configuration follow from the emptiness of the initial store, environment, and Urtype assumption. Typing of the program ⊢ P also yields join-free(e); therefore JF(e) and null-free-join(e) hold trivially. With Lemma 4.6 and Lemma 4.7 we know that these properties still hold after any reduction step in the reduction sequence starting with e that does not end with an error. Therefore, we know by Theorem 4.47 that there exists a reduction rule to reduce e. In addition, by Theorem 4.46 we know that, whenever this reduction does not yield an error, there exists a store, environment, and Urtype assumption such that the configuration judgment holds for the reduced expression. By repeated application of progress and preservation we get that the reduction sequence either ends with a value, ends with an error, or is infinite.
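The shape of this argument can be illustrated with a small driver loop. The following Python sketch is not part of Java(X): the toy reduction relation, the names `run`, `toy_step`, `toy_is_value`, and `NullDereference`, and the fuel bound that stands in for a non-terminating reduction sequence are all assumptions made purely for illustration.

```python
class NullDereference(Exception):
    """Hypothetical stand-in for a reduction to 'error: dereferenced null'."""


def run(step, is_value, expr, store, fuel=1000):
    """Iterate a small-step reduction and classify the outcome.

    The three results mirror the three cases of the soundness theorem:
    a value, a null-dereference error, or no value within the fuel bound
    (our approximation of a reduction sequence that does not terminate).
    """
    for _ in range(fuel):
        if is_value(expr):
            return ("value", expr, store)
        try:
            # Progress: a well-typed non-value configuration can take a step.
            expr, store = step(expr, store)
        except NullDereference:
            return ("error", expr, store)
    if is_value(expr):
        return ("value", expr, store)
    return ("no value within fuel", expr, store)


def toy_step(expr, store):
    """Toy reduction relation (illustration only, not Core-Java(X))."""
    kind = expr[0]
    if kind == "count":        # ("count", n) counts down to the value 0
        n = expr[1]
        return (("count", n - 1) if n > 1 else ("val", 0)), store
    if kind == "deref_null":   # plays the role of RAccNull / RSetNull / RNullCall
        raise NullDereference()
    if kind == "loop":         # reduces to itself forever
        return expr, store
    raise ValueError("stuck expression: %r" % (expr,))


def toy_is_value(expr):
    return expr[0] == "val"
```

A stuck expression (the `ValueError` branch) is exactly what progress rules out for well-typed configurations; the toy relation has no type system, so it can only signal that case at run time.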

In addition to the above soundness theorem, Java(X) enjoys some more properties. By the presented preservation theorem we also have that during any reduction sequence:

1. No reference that still holds a nondroppable activity annotation is dropped.

2. There exists at most one active reference on any location at any time during the reduction sequence.

3. No active references are generated besides the initial new expression of this object.

4. No activities are passed from one reference to another besides the specified capability lending.

All of these properties are part of the invariant of the preservation Theorem 4.46.


5 Modular Type Checker

This chapter presents the constraint system for a modular type checker for Core-Java(X). Besides the increase in safety, Java(X) should also be easy to use. Consequently, we want to avoid requiring the user to provide more annotations than absolutely needed. In particular, the splitting of references and capability lending may obscure the program code. Therefore, we provide a modular type inference that is, in particular, capable of inferring the splitting and lending of capabilities. Of course, the inference cannot infer the method signatures, especially the user-defined restrictions. Therefore, we present a modular system that infers the optimal splitting of references, that is, the one that allows the most programs to type correctly, and additionally checks the correct types of methods. The inference checks each method body separately, using the user-defined method signatures. That is, the only place where a refinement designer has to provide annotations inside the program code is the method signatures. Of course, this implies that the designer also has to provide, for each class that uses the annotations, the partially ordered value annotation set (Xc, ≤), the droppable annotations ρc, and the persistent annotations for new objects and null, Rnewℓ,c and Rnullc.

A programmer who uses a library that imposes restrictions via method signatures does not need to use the annotations at all. Still, the programmer gets useful error messages whenever he violates a stated safety constraint or whenever he illegally uses aliases, for instance by modifying the same field through two different references.

Due to a new, purely technical activity annotation that we introduce for the constraint system, the prototype implementation [1] of the modular type inference is capable of pinpointing the location where splitting or capability lending is not possible, even if this occurs long before, for instance, a method call that finally needs the capabilities. The implementation uses a syntax that is closer to the Java syntax and omits some features, for example the summary value annotation. It merely serves as a proof of concept for the presented system.

[1] http://proglang.informatik.uni-freiburg.de/projects/access-control/

Figure 5.1 Extended type syntax.

field type            s  ::= σ | 〈aa, t〉
annotated type        t  ::= τ | 〈va, u〉 | µτ.t
simple type           u  ::= ν | c{f : s}
activity annotation   aa ::= α | M(va) | ♦ | O | ◦(va)
value annotation      va ::= β | a
environment           A  ::= ∅ | A, x : τ

a ∈ Xc     σ, τ, ν, α, β ∈ TypeVar

Chapter Outline We first extend the type syntax with both variables and a possibility for users to provide recursive types. Next, with the extended type syntax, we provide a constraint system for a modular type inference. Several constraint rewriting rules solve a given constraint set and provide an algorithm for our modular type checker. Finally, we check whether all types generated during the constraint solving are well-formed.

5.1 Constraint System

We avoid introducing new syntax shortcuts for types and annotations and therefore extend the definitions from Chapter 3. Figure 5.1 introduces the type syntax with variables. We also introduce variables for the annotations, which yields a minor change: from now on, a denotes a user-defined value annotation, as we also need a variable for value annotations and we want to keep va for value annotations in general. Any value annotation a is drawn from the user-defined annotation set as before. For the constraint system, the environment maps variables to type variables, no longer to explicit types. The most important change to the type syntax for the user is the new syntax construct for annotated types, µτ.t. As the user has to provide the method signatures and therefore provides types himself, we need a meaningful way to provide recursive types. For that reason we introduce the µτ.t notation (analogous to Pierce's textbook [36, Chapter 21.8]). With µτ.t the user himself introduces a type variable τ for t and may now use this variable inside the definition of t to generate a recursive type. As all types have to be described by the user in this way, we directly obtain that all types we use are regular.
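To make the µτ.t construct concrete, the following Python sketch models the annotated, simple, and field types of the extended syntax as immutable records and unfolds a recursive type once. It is only an illustration: the class names (`Var`, `Ann`, `Cls`, `Field`, `Mu`), the string encoding of annotations, and the example class `Node` are assumptions, and the substitution assumes that the unfolded type has no free variables other than its own binder.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Var:              # a type variable such as τ
    name: str


@dataclass(frozen=True)
class Field:            # field type 〈aa, t〉
    activity: str       # activity annotation, kept abstract as a string here
    typ: "AnnType"


@dataclass(frozen=True)
class Cls:              # simple type c{f : s}
    name: str
    fields: tuple       # tuple of (field name, Field) pairs


@dataclass(frozen=True)
class Ann:              # annotated type 〈va, u〉
    value_ann: str
    simple: Cls


@dataclass(frozen=True)
class Mu:               # recursive type µτ.t
    var: str
    body: "AnnType"


AnnType = Union[Var, Ann, Mu]


def substitute(t, var, replacement):
    """Replace free occurrences of the type variable `var` in t."""
    if isinstance(t, Var):
        return replacement if t.name == var else t
    if isinstance(t, Mu):
        if t.var == var:                 # an inner binder shadows `var`
            return t
        return Mu(t.var, substitute(t.body, var, replacement))
    fields = tuple(
        (f, Field(s.activity, substitute(s.typ, var, replacement)))
        for f, s in t.simple.fields
    )
    return Ann(t.value_ann, Cls(t.simple.name, fields))


def unfold(t):
    """One unfolding step: µτ.t becomes t with τ replaced by µτ.t."""
    return substitute(t.body, t.var, t)


# Hypothetical example: µτ.〈a, Node{next : 〈inactive, τ〉}〉
node = Mu("τ", Ann("a", Cls("Node", (("next", Field("inactive", Var("τ"))),))))
```

Unfolding `node` yields a `Node` type whose `next` field again has the type `node`, which is the regular (finitely representable) recursive structure the text refers to.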


Figure 5.2 Constraint syntax.

C ::= C, C
    | ∅                              True
    | false                          False
    | σ l s                          Equal
    | τ l t
    | ν l u
    | β l va
    | α l α
    | aa ≺ α                         Restrict aa
    | ρsσ                            Droppable
    | ρtτ
    | χ � χ |? χ    χ ∈ {σ, τ, α}    Splitting
    | null ⊢ τ                       Null
    | ν Cw f ← τ � ν                 Write field
    | τ =? τ |f τ                    Access field
    | M ⊢ τ                          Fully active type
    | ♦ ⊢ τ                          Fully semiactive type
    | τ := τ ↓C τ τ                  Apply Effect
    | σ := σ ↓C σ σ
    | α := α ↓C α α
    | β <: β                         Value annotation subtyping

The last change is the introduction of a new activity annotation: ◦(va). This unknown active activity annotation carries, like M, a value annotation. We need this new activity annotation for the constraint solving algorithm and will describe it in more detail as soon as we use it.

All type variables are drawn from TypeVar. Still, similar to the previous chapters, we strictly separate the type variables for simple types (ν), annotated types (τ), field types (σ), value annotations (β), and activity annotations (α).

Next, Figure 5.2 introduces the syntax for the constraints that are used by the inference system to obtain and verify types for expressions and variables. A set of constraints C is a sequence of several constraints, where ∅ models true and false models a constraint set that has no solution. The constraint set is always unsorted; we may rearrange the constraints as we need them. Equality constraints always have a type variable on the left-hand side. In addition, activity annotations are only restricted via ≺ and not set equal to a concrete activity annotation. The reason for the special treatment of activity annotations will be revealed before long.

The constraints for droppability, splitting, null typing, writing and accessing fields, full activity, and effect application all directly model the corresponding properties from the formal type system and therefore have a syntax similar to that of the formal system. The user-defined value annotation subtyping is also modeled with a constraint.

Before we provide the constraint generating type system, we introduce some auxiliary relations, similar to the formal system. Some of these relations already generate constraints.

Notation Whenever we use C inside an inference rule that generates constraints, C denotes the sequence C1, C2, ..., Cn of all potential constraints in the premise of this inference rule.

As a convention, the right-hand side of any equivalence constraint never contains a concrete type, but always type variables. Only class names and field types may occur on the right-hand side of a simple type equivalence constraint. That is, we always have equivalence constraints like ν l c{f : σ} and so on. In addition, for every constraint aa ≺ α we have a concrete activity annotation and no variable on the left-hand side. We keep these syntax restrictions throughout the constraint system. This significantly simplifies the constraint solving later on.

Figure 5.3 introduces the encoding relation. τ " t � C encodes type t into type variable τ and yields the constraint set C that contains the corresponding constraints to ensure the equivalence of t and τ. According to the above convention for equivalence constraints, we use this relation to encode user-defined method signatures such that we obtain equivalence constraints that have variables on the right-hand side wherever possible. Every concrete type is assigned a new type variable with the corresponding equivalence constraints. The relation introduces new variables for every level, except for user-defined recursive types via µτ.t, which directly use this variable.
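The encoding can be sketched as a recursive traversal that allocates one fresh variable per level and records the matching constraint. The Python code below is a simplified illustration, not the prototype implementation: the tuple representation of types, the `Encoder` class, its method names, and the naming scheme for fresh variables are assumptions, and the activity annotation O is written as the tuple `("O",)`.

```python
class Encoder:
    """Encodes a user-written type into type variables plus constraints."""

    def __init__(self):
        self.counter = 0
        self.constraints = []

    def fresh(self, prefix):
        # Fresh variables are numbered globally, e.g. β1, ν2, σ3, α4.
        self.counter += 1
        return "%s%d" % (prefix, self.counter)

    def emit(self, *constraint):
        self.constraints.append(constraint)

    def annotated(self, t, target=None):
        """Encode an annotated type; return the variable standing for it."""
        if t[0] == "var":                # a type variable encodes to itself
            return t[1]
        if t[0] == "mu":                 # µτ.t reuses the user's variable τ
            return self.annotated(t[2], target=t[1])
        _, va, u = t                     # 〈va, u〉
        tau = target or self.fresh("τ")
        beta = self.fresh("β")
        self.emit("eq", beta, va)                    # β l va
        nu = self.simple(u)
        self.emit("eq", tau, ("ann", beta, nu))      # τ l 〈β, ν〉
        return tau

    def simple(self, u):
        _, cname, fields = u             # c{fi : si}
        nu = self.fresh("ν")
        encoded = tuple((f, self.field(s)) for f, s in fields)
        self.emit("eq", nu, ("cls", cname, encoded)) # ν l c{fi : σi}
        return nu

    def field(self, s):
        _, aa, t = s                     # 〈aa, t〉
        sigma, alpha = self.fresh("σ"), self.fresh("α")
        tau = self.annotated(t)
        self.emit("eq", sigma, ("field", alpha, tau))  # σ l 〈α, τ〉
        if aa[0] == "M":                 # M(a): restrict α and bind β l a
            beta = self.fresh("β")
            self.emit("restrict", ("M", beta), alpha)
            self.emit("eq", beta, aa[1])
        else:                            # O or ♦ restrict α directly
            self.emit("restrict", aa, alpha)
        return sigma


# Hypothetical signature type: µτ.〈a, Node{next : 〈O, τ〉}〉
node = ("mu", "τ", ("ann", "a",
        ("cls", "Node", [("next", ("field", ("O",), ("var", "τ")))])))
enc = Encoder()
root = enc.annotated(node)
```

After the call, `root` is the user's own variable τ, and every right-hand side in `enc.constraints` mentions only variables, class and field names, or user-defined value annotations, in line with the convention stated above.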

The method lookup uses this encoding to generate type variables for the method types with the corresponding constraints for all new variables. To simplify things, we assume that all method names throughout a given program are distinct. That way we no longer need the class name of the caller for the method lookup.

Figure 5.4 provides effect application for the constraint system. The effect application simply adds a constraint for every single effect application on types. Solving these constraints is part of the constraint rewriting rules that we provide later. In addition, this figure provides a possibility to state that two environments have to be equal, which we need for the conditional statement.

That is all we need for the constraint generating type system as we provide it in Figure 5.5. The judgment P; A Ce e : τ � C; A′ types expression e under program P and environment A. Instead of a concrete type, we generate a type variable that is constrained with the constraint set C. As we want a unique constraint set, we ensure that these typing rules are syntax directed. The only place where the formal system is not already deterministic, according to Lemma 4.5, is the standard let-expression. To this end, we added the premise that e1 must not be a value in CSLetExp. This does not restrict the constraint system, as the soundness proof of the previous chapter already includes that, whenever we may type P; A t let x = v in e � A′ with typing rule TLetExp, there exists a typing with TLetVar for the same expression, too.

Figure 5.3 Encoding and method lookup.

Encoding (all variables have to be sufficiently fresh):

  C1 ≡ β l va    C2 ≡ τ l 〈β, ν〉    ν " u � C3
  --------------------------------------------
  τ " 〈va, u〉 � C

  C1 ≡ ν l c{fi : σi}    σi " si � C2i
  --------------------------------------------
  ν " c{fi : si} � C

  C1 ≡ σ l 〈α, τ〉    τ " t � C2    α " aa � C3
  --------------------------------------------
  σ " 〈aa, t〉 � C

  C1 ≡ M(β) ≺ α    C2 ≡ β l a
  --------------------------------------------
  α " M(a) � C

  C ≡ O ≺ α
  --------------------------------------------
  α " O � C

  C ≡ ♦ ≺ α
  --------------------------------------------
  α " ♦ � C

  τ " t � C
  --------------------------------------------
  τ " µτ.t � C

  τ " τ � C

mtype:

  τ " t � C1    τi " ti � C2i    τ′i " t′i � C3i
  t [t1 t′1] m(ti t′i xi, i ∈ {2,...,n}) { e } ∈ meth
  --------------------------------------------
  mtypeCSP(m) = τi τ′i → τ � C

Figure 5.4 Effect application and environment equivalence.

Effect Application:

  A A := A ↓ null : τi τ′i � ∅

  vj = x    C1 ≡ τ := τ′ ↓C τj τ′j    A A := A′, x : τ ↓ vi : τi τ′i (i ≠ j) � C2
  --------------------------------------------
  A A := A′, x : τ′ ↓ vi : τi τ′i � C

Environment equivalence:

  ⊢ ∅ l ∅ � ∅

  C1 ≡ τ′ l τ′′    ⊢ A1 l A2 � C2
  --------------------------------------------
  ⊢ A1, x : τ′ l A2, x : τ′′ � C

Figure 5.5 Constraint Generating Expression Typing.

CSMultiVal
  P; A0 Ce v1 : τ′1 � C1; A1   ...   P; An−1 Ce vn : τ′n � Cn; An
  --------------------------------------------
  P; A0 Ce vi : τi � C; An

CSVar
  C ≡ τ � τ1 |? τ2
  --------------------------------------------
  P; A, x : τ Ce x : τ1 � C; A, x : τ2

CSNull
  C ≡ null ⊢ τ
  --------------------------------------------
  P; A Ce null : τ � C; A

CSNew
  Rnewℓ,c(va)    P; A Ce v : τ � C1, A
  C2 ≡ τ l 〈β, ν〉    C3 ≡ β l va    C4 ≡ ν l c{fi : σi}    C5i ≡ σi l 〈αi, τi〉
  --------------------------------------------
  P; A Ce new c(v) : τ � C; A′

CSAcc
  C1 ≡ τy =? τ′y |f τx
  P; A, y : τ′y, x : τx Ce e : τ � C2; A′, y : τ′′y, x : τ′x
  C3 ≡ τ′′′y =? τ′′y |f τ′x
  --------------------------------------------
  P; A, y : τy Ce let x = y.f in e : τ � C; A′, y : τ′′′y

CSSet
  C1 ≡ ν Cw f ← τv � ν′    P; A Ce v : τv � C2; A′, x : τ′x
  C3 ≡ τ′x l 〈β, ν〉    P; A′, x : τ′′x Ce e : τ � C4; A′′    C5 ≡ τ′′x l 〈β, ν′〉
  --------------------------------------------
  P; A Ce set x.f = v in e : τ � C; A1, v : τ′′

CSCall
  P; A Ce vi : τi � C1; A′
  A A′′ := A ↓ vi : τi τ′i � C2    mtypeCSP(m) = τi τ′i → τ � C3
  --------------------------------------------
  P; A Ce v1.m(vi, i ∈ {2,..,n}) : τ � C; A′′

CSLetExp
  (∀v) e1 ≠ v    C3 ≡ ρt(τ′1)
  P; A Ce e1 : τ1 � C1; A1    P; A′1, x : τ1 Ce e2 : τ2 � C2; A2, x : τ′1
  --------------------------------------------
  P; A Ce let x = e1 in e2 : τ2 � C; A2

CSLetVar
  P; A Ce v : τ1 � C1; A1
  P; A1, x : τ1 Ce e : τ � C2; A2, x : τ2    A A3 := A2 ↓ v : τ1 τ2 � C3
  --------------------------------------------
  P; A Ce let x = v in e : τ � C; A3

CSCond
  P; A Ce v : τ1 � C1; A1
  P; A Ce e1 : τ2 � C2; A2    P; A Ce e2 : τ3 � C3; A3    A′ = A2 ⊎ A3
  --------------------------------------------
  P; A Ce if v then e1 else e2 : τ′ � C; A′

Whenever there exists a solution for the constraints in C, we obtain a correct type for τ and therefore e is type correct under P and A, yielding a new environment A′. All constraint generating typing rules are closely related to their counterparts in the formal system. Instead of restricting the rules with premises, we simply collect all these restrictions in the corresponding constraints.

We omit the intermediate typing judgments, as they may not appear in user code and we are only interested in the typing of user expressions. In addition, the constraint generating typing omits well-formedness, which we check separately for every type variable once we have solved all constraints in the constraint set.

5.2 An Informal Account on Constraint Solving

The main challenge for our constraint solving arises from the splitting relation, no matter whether it is used for capability lending or variable access. Whenever we access a variable, we split its capabilities: one part is used directly, the other one is left in the environment for later use. Unfortunately, during the splitting, we do not yet know which part needs which capabilities to succeed. That is, during splitting of an active reference with M(va) we have no way to decide which splitting rule, SplitActFst or SplitActSnd, is best used. Therefore, the naive approach would be to randomly split the capabilities and backtrack to the splitting point whenever the constraint solving algorithm fails to assign correct types to the variables. This would result in an exponential run-time for the splitting constraints, which is highly inefficient, as every variable access triggers a splitting of this variable.

To avoid such backtracking, we improve the constraint solver. Instead of splitting an active reference directly at the access into one active and one inactive reference, we defer this decision until one of them really needs or disallows active fields. For that reason, we introduce a new activity annotation that only serves the purpose of avoiding the aforementioned backtracking: the unknown active activity annotation ◦(va). Whenever we have to split a reference with an active capability, instead of directly applying SplitActFst or SplitActSnd, we split it into two unknown active references ◦(va). To do so, it is crucial that we may rely on the fact that splitting does not change the inner value annotation. Now, whenever an unknown active activity annotation is used later on, we check whether the use imposes a restriction in either direction, that is, whether the use of the variable is only possible for active or for inactive references. The latter may happen, for instance, if we want to drop a reference or hand it to a method call that expects an inactive reference. As soon as we are able to further restrict an activity annotation from unknown active to active or inactive, we also have the corresponding annotation for the other unknown active that was introduced earlier. While the implementation keeps explicit track of splitting partners, for the formal system it suffices to keep the original splitting constraint. As the decision whether a reference may be active or not depends on the value annotation that is part of the active activity annotation, we take the value annotation along in both unknown active references.
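The deferred decision can be pictured with explicit splitting partners, which is how the text describes the implementation. The Python sketch below is a hypothetical illustration only: the class `SplitTracker`, its method names, and the string states are assumptions, and it covers just the split of one active reference into two halves of which exactly one ends up active.

```python
class SplitTracker:
    """Tracks pairs of unknown-active annotations created by one split."""

    def __init__(self):
        self.state = {}       # variable -> "unknown" | "active" | "inactive"
        self.partner = {}     # variable -> the other half of the same split
        self.value_ann = {}   # variable -> value annotation carried along

    def split_active(self, first, second, va):
        """Split an active reference M(va) without choosing a side yet."""
        for var in (first, second):
            self.state[var] = "unknown"   # both halves start as ◦(va)
            self.value_ann[var] = va      # splitting keeps the value annotation
        self.partner[first], self.partner[second] = second, first

    def require(self, var, wanted):
        """A later use forces `var` to be 'active' or 'inactive'."""
        current = self.state[var]
        if current == wanted:
            return
        if current != "unknown":
            raise TypeError("%s is %s but must be %s" % (var, current, wanted))
        # Resolving one half determines the other half of the split.
        self.state[var] = wanted
        other = self.partner[var]
        self.state[other] = "inactive" if wanted == "active" else "active"


tracker = SplitTracker()
tracker.split_active("α1", "α2", va="a")
tracker.require("α2", "inactive")   # e.g. α2 is handed to a method expecting inactive
```

No choice is made at the split itself, so there is nothing to undo later; a conflicting second requirement is reported as an error instead of triggering a search over alternative splittings.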

We discuss the mechanism to substantiate an unknown active reference towards active or inactive in more detail with the corresponding constraint rewrite rules. For this system to work, it is crucial that new values are not assigned to existing variables. Basically, the algorithm assigns a value to any variable only once. Whenever we have two assignments to one variable, we expect all of the assigned parts to be equivalent. For instance, if we have the two constraints τ l 〈β, ν〉 and τ l 〈β′, ν′〉 in one constraint set, we unify them and expect that β l β′ and ν l ν′ hold. The only change in the assignment to a variable is the change from unknown active to active or inactive. That is why we choose here the constraint aa ≺ α instead of · l · for activity annotations.
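The treatment of two assignments to one variable can be sketched as a single pass over the equality constraints. This Python fragment is a simplified illustration under assumptions: constraints are plain pairs, a right-hand side is either a variable name or a tuple of variable names, and the function name `merge_assignments` is hypothetical. It only produces the component equalities and does not solve them.

```python
def merge_assignments(constraints):
    """Keep one assignment per variable; turn duplicates into equalities.

    Each constraint is a pair (lhs, rhs). A tuple rhs such as ("β", "ν")
    stands for 〈β, ν〉; a string rhs is a plain variable equality.
    """
    first = {}      # variable -> structured right-hand side seen first
    result = []
    for lhs, rhs in constraints:
        if isinstance(rhs, str):          # variable equality, keep as is
            result.append((lhs, rhs))
        elif lhs not in first:            # first assignment to this variable
            first[lhs] = rhs
            result.append((lhs, rhs))
        else:                             # second assignment: unify the parts
            old = first[lhs]
            if len(old) != len(rhs):
                raise TypeError("incompatible assignments to %s" % lhs)
            for a, b in zip(old, rhs):
                if a != b:
                    result.append((a, b))
    return result


merged = merge_assignments([("τ", ("β", "ν")), ("τ", ("β'", "ν'"))])
```

For the example from the text, `merged` keeps τ l 〈β, ν〉 and replaces the second assignment by β l β′ and ν l ν′.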

To obtain a partially ordered set with least element, we further add ⊥ to the possible assignments for an activity annotation variable. The partial order is defined by the following graph:

OM

◦ ♦

The rewrite rules are only allowed to exchange an activity restriction aa ≺ α upwards in this graph, and ⊥ ≺ α states that there are no restrictions yet. Basically, every activity variable α without further restriction may get a constraint ⊥ ≺ α.


5.3 Constraint Solving

As we do not need backtracking to solve the constraint set that we collect with the above typing relation, we are able to solve the constraints by providing reasonable rewrite rules. These rewrite rules always take two constraint sets and return two modified sets: D; C ↝ D′; C′. The first set (D and D′) represents all constraints that we have already taken care of, the second one (C and C′) the constraints that we still need to solve. For instance, the rewrite rule

D; C, C1 ↝ D, C1; C, C′1

is applicable if C1 is part of the second input constraint set. Then, we rewrite C1 to C′1 and add this constraint to the initial constraint set. We only add C′1 if it is neither already part of C nor of D. Here, we add C1 to D to state that this constraint has already been taken care of. That way we prevent the system from running into circular rewrite rule sequences, solving constraints and adding them back in later on. A rule that does not add the new constraint to the set on the right-hand side is still applicable; we just omit adding the new constraint. Some rewrite rules imply additional premises. We state these premises on the left-hand side of a rewrite rule without comma separation.
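The bookkeeping with the two sets D and C corresponds to a worklist loop. The following Python sketch shows only this bookkeeping, under assumptions: constraints are hashable tuples, `rewrite` is a hypothetical function mapping one constraint to the constraints it produces, and the deliberately circular `symmetry` rule is a toy example rather than one of the rewrite rules of the thesis.

```python
def solve(pending, rewrite):
    """Apply rules of the shape  D; C, C1 ↝ D, C1; C, C1'  until C is empty."""
    done = []                  # D: constraints already taken care of
    pending = list(pending)    # C: constraints that still need to be solved
    while pending:
        c1 = pending.pop(0)
        done.append(c1)        # record C1 in D before adding its results
        for new in rewrite(c1):
            # Only add C1' if it is neither part of C nor of D.
            if new not in pending and new not in done:
                pending.append(new)
    return done


def symmetry(constraint):
    """Toy rule that would loop forever without the check against D."""
    kind, left, right = constraint
    return [(kind, right, left)]


closed = solve([("sym", "τ1", "τ2")], symmetry)
```

The second application of `symmetry` reproduces the original constraint, which is already in `done`, so nothing is added and the loop terminates.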

Trivial constraints, for instance equality constraints, are not added to the solved constraints set, as these constraints are part of the solution themselves. The goal of the algorithm is to eliminate all constraints except one equivalence constraint for every type variable on the left-hand side, and one restriction constraint aa ≺ α for every activity annotation variable. Still, due to null and the corresponding constraint null ⊢ τ, we have to take care of some remaining constraints involving a null type when we have finished the rewriting for all other constraints. In addition, the unknown active annotation we introduced in the previous section may delay some constraints, which we will take care of in a special rewrite rule at the very end.

Before we get to the rewrite rules, we introduce some more auxiliary relations.

Figure 5.6 introduces a relation to access known equivalences in a given constraint set. Whenever there is no such equivalence constraint in the set, fresh variables have to be used for a new equivalence constraint. The judgment C ⊢ ν ≏ c{fi : σi} has to be used with caution: whenever no such constraint is part of the constraint set yet, we would guess a class name and field names, which is obviously not desired. Therefore, the constraint rewriting algorithm only uses this judgment when the class name and the corresponding field names are known from the context, i.e. whenever we have ν ≏ ν′ and ν ≏ c{fi : σi}, we may safely set C ⊢ ν′ ≏ c{fi : σ′i} for fresh σ′i. The last two rules provide a similar access for activity annotations that are restricted by aa ≺ α. Whenever there is no such constraint in the set, we may safely introduce one that does not restrict the annotation (⊥ ≺ α).

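Figure 5.6 Variable access or fresh variable generation.

\[
\frac{\sigma \bumpeq \langle\alpha,\tau\rangle \in C}
     {C \vdash \sigma \bumpeq \langle\alpha,\tau\rangle}
\qquad
\frac{\forall \alpha',\tau' :\ \sigma \bumpeq \langle\alpha',\tau'\rangle \notin C \quad \mathit{fresh}_C(\alpha) \quad \mathit{fresh}_C(\tau)}
     {C \vdash \sigma \bumpeq \langle\alpha,\tau\rangle}
\]

\[
\frac{\tau \bumpeq \langle\beta,\nu\rangle \in C}
     {C \vdash \tau \bumpeq \langle\beta,\nu\rangle}
\qquad
\frac{\forall \beta',\nu' :\ \tau \bumpeq \langle\beta',\nu'\rangle \notin C \quad \mathit{fresh}_C(\beta) \quad \mathit{fresh}_C(\nu)}
     {C \vdash \tau \bumpeq \langle\beta,\nu\rangle}
\]

\[
\frac{\nu \bumpeq c\{f_i : \sigma_i\} \in C}
     {C \vdash \nu \bumpeq c\{f_i : \sigma_i\}}
\qquad
\frac{\forall c',\sigma'_i,f'_i :\ \nu \bumpeq c'\{f'_i : \sigma'_i\} \notin C \quad \mathit{fresh}_C(\sigma_i)}
     {C \vdash \nu \bumpeq c\{f_i : \sigma_i\}}
\]

\[
\frac{aa \prec \alpha \in C}
     {C \vdash aa \prec \alpha}
\qquad
\frac{\forall aa :\ aa \prec \alpha \notin C}
     {C \vdash \bot \prec \alpha}
\]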

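Figure 5.7 Incompatible types.

\[
\frac{\begin{array}{c}
\tau \bumpeq \langle\beta,\nu\rangle \in C \quad \nu \bumpeq c\{f_i : \sigma_i\} \in C \\
\tau' \bumpeq \langle\beta',\nu'\rangle \in C \quad \nu' \bumpeq c'\{f'_i : \sigma'_i\} \in C \\
\exists i :\ f_i \neq f'_i \ \lor\ c \neq c' \ \lor\ C \vdash \sigma_i \nleftrightarrow \sigma'_i
\end{array}}
     {C \vdash \tau \nleftrightarrow \tau'}
\qquad
\frac{\begin{array}{c}
\sigma \bumpeq \langle\alpha,\tau\rangle \in C \\
\sigma' \bumpeq \langle\alpha',\tau'\rangle \in C \\
C \vdash \tau \nleftrightarrow \tau'
\end{array}}
     {C \vdash \sigma \nleftrightarrow \sigma'}
\]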

These variable access rules are the key to avoiding infinite variable construction. As soon as a variable is already assigned in an equivalence constraint, this assignment is used and no further variables are generated. As circular structures eventually point to an existing variable, these cycles are broken up by these rules, too.
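The "look up, or else create fresh variables" behavior of the access relation can be sketched as follows for field type variables. This is an illustrative sketch only: variables are represented as strings, and the class `FieldTypeAccess` with its `lookup` method is a hypothetical name, not part of the Java(X) implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the access relation for field type variables sigma.
final class FieldTypeAccess {
    // Right-hand side <alpha, tau> of an equivalence constraint.
    static final class Pair {
        final String alpha;
        final String tau;
        Pair(String alpha, String tau) { this.alpha = alpha; this.tau = tau; }
    }

    private final Map<String, Pair> equivalences = new HashMap<>();
    private int counter = 0;

    // Returns the known structure of sigma, or records one over fresh variables.
    Pair lookup(String sigma) {
        Pair known = equivalences.get(sigma);
        if (known == null) {
            known = new Pair("alpha" + counter, "tau" + counter);
            counter++;
            equivalences.put(sigma, known);
        }
        return known;
    }
}
```

Repeated lookups of the same variable return the same pair, so unrolling a cyclic structure does not create new variables indefinitely.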

Next, the relation in Figure 5.7 detects incompatible types, as the programmer may incorrectly enter them in method signatures. Whenever two type variables are used in a context that expects them to have the same underlying Java type, this relation detects violations by checking whether the class names and field names are equal.

Droppability of value annotations is a user-defined property. The first relation in Figure 5.8 specifies droppability for value annotation variables. It is important to notice that we never generate a variable β without generating a corresponding constraint β ≏ a. Therefore, droppability is always decidable whenever we want to check it for a particular value annotation variable.

Similarly, the same figure defines the check whether a value annotation satisfies the user-defined predicate for null references. As above, this relation is always decidable because we never introduce value annotation variables without a corresponding equivalence constraint.

Figure 5.8 User-defined droppability and null.

\[
\frac{\beta \bumpeq a \in C \quad a \in \rho}
     {C \vdash \rho_{va}(\beta)}
\qquad
\frac{\beta \bumpeq a \in C \quad R^{null}_c(a)}
     {C \vdash R^{null}_c(\beta)}
\]

These are all auxiliary relations we need. Now we introduce the constraint rewriting rules themselves.

Figure 5.9 introduces the rewrite rules for droppability constraints. The first rule is applicable if the following three constraints are part of the second input constraint set: ρ_t τ, τ ≏ ⟨β, ν⟩, and ν ≏ c{fi : σi}, in any order. When this rule is applied, we add the droppability constraint to D, remove it from the input C, keep the two equivalence constraints, and add ρ_s σi for every field type variable to the constraint set C. That is, this first rule simply unrolls the type and adds constraints to check droppability of all fields. The constraint ρ_t τ is added to the set of already solved constraints; whenever all fields are droppable, the type itself is droppable, too. In order to remove the initial droppability constraint, it is of course crucial that the corresponding equivalence constraint is not arbitrarily changed by other rewriting rules.

We add a new droppability constraint to C only if it is not yet part of C or D. Whenever a constraint on the right-hand side of a rewrite rule is already part of C or D, we still apply the rule but omit adding that constraint. Therefore, every constraint is unique in C, D.

The second and third rules in this figure are trivial: whenever a field type carries an inactive or semiactive activity annotation, it is droppable. The next rule, for active references, is the first one that may reveal a contradiction. As stated above, droppability of value annotations (C ⊢ ρ_va(β)) can be decided at all times. If the annotation is not droppable, we directly return false and stop the algorithm.
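The case distinction on the activity annotation made by the rules of Figure 5.9, including the unknown active case discussed next, can be sketched as follows. This is a simplified illustration: the recursive check of the inner type is left out, the boolean `valueDroppable` stands for the user-defined check of the value annotation, and all names (`FieldDroppability`, `Activity`, `Result`, `rewrite`) are hypothetical.

```java
// Hypothetical sketch; the enum and method names are not taken from the thesis.
final class FieldDroppability {
    enum Activity { ACTIVE, INACTIVE, SEMIACTIVE, UNKNOWN_ACTIVE }

    // Outcome of rewriting a droppability constraint on a field type.
    static final class Result {
        final boolean satisfiable;   // false models the contradiction "false"
        final Activity activity;     // possibly tightened restriction of alpha
        Result(boolean satisfiable, Activity activity) {
            this.satisfiable = satisfiable;
            this.activity = activity;
        }
    }

    // valueDroppable stands for the user-defined droppability of the value annotation.
    static Result rewrite(Activity activity, boolean valueDroppable) {
        switch (activity) {
            case ACTIVE:
                // Active references are droppable only if their value annotation is.
                return new Result(valueDroppable, Activity.ACTIVE);
            case UNKNOWN_ACTIVE:
                // If the value annotation is not droppable, tighten to inactive.
                return new Result(true,
                        valueDroppable ? Activity.UNKNOWN_ACTIVE : Activity.INACTIVE);
            default:
                // Inactive and semiactive field types need no check of the value annotation.
                return new Result(true, activity);
        }
    }
}
```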

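The last rule in Figure 5.9 is the first one to work with the unknown active annotation. Whenever the corresponding β is droppable, we keep the unknown active assignment and check the inner type variable for droppability. But if β is not droppable, we know that this unknown active annotation has to be inactive in order to satisfy this droppability constraint. Therefore, we tighten the restriction of α and exchange ◦(β) ≺ α with ▽ ≺ α.

We only apply any of these rewriting rules for droppability constraints when the corresponding equivalence constraints are already part of the constraint set. The fact that this does not result in a stuck rewrite state is discussed at the end of this chapter.

Figure 5.9 Constraint rewriting for droppability constraints.

\[
D;\ C,\ \rho_t\tau,\ \tau \bumpeq \langle\beta,\nu\rangle,\ \nu \bumpeq c\{f_i : \sigma_i\}
\ \leadsto\
D,\ \rho_t\tau;\ C,\ \rho_s\sigma_i,\ \tau \bumpeq \langle\beta,\nu\rangle,\ \nu \bumpeq c\{f_i : \sigma_i\}
\]

\[
D;\ C,\ \rho_s\sigma,\ \sigma \bumpeq \langle\alpha,\tau\rangle,\ \blacklozenge \prec \alpha
\ \leadsto\
D,\ \rho_s\sigma;\ C,\ \rho_t\tau,\ \sigma \bumpeq \langle\alpha,\tau\rangle,\ \blacklozenge \prec \alpha
\]

\[
D;\ C,\ \rho_s\sigma,\ \sigma \bumpeq \langle\alpha,\tau\rangle,\ \triangledown \prec \alpha
\ \leadsto\
D,\ \rho_s\sigma;\ C,\ \rho_t\tau,\ \sigma \bumpeq \langle\alpha,\tau\rangle,\ \triangledown \prec \alpha
\]

\[
D;\ C,\ \rho_s\sigma,\ \sigma \bumpeq \langle\alpha,\tau\rangle,\ \vartriangle(\beta) \prec \alpha
\ \leadsto\
\begin{cases}
D,\ \rho_s\sigma;\ C,\ \rho_t\tau,\ \sigma \bumpeq \langle\alpha,\tau\rangle,\ \vartriangle(\beta) \prec \alpha & \text{if } C \vdash \rho_{va}(\beta) \\
\mathit{false} & \text{else}
\end{cases}
\]

\[
D;\ C,\ \rho_s\sigma,\ \sigma \bumpeq \langle\alpha,\tau\rangle,\ \circ(\beta) \prec \alpha
\ \leadsto\
\begin{cases}
D,\ \rho_s\sigma;\ C,\ \rho_t\tau,\ \sigma \bumpeq \langle\alpha,\tau\rangle,\ \circ(\beta) \prec \alpha & \text{if } C \vdash \rho_{va}(\beta) \\
D,\ \rho_s\sigma;\ C,\ \rho_t\tau,\ \sigma \bumpeq \langle\alpha,\tau\rangle,\ \triangledown \prec \alpha & \text{else}
\end{cases}
\]

Next, Figure 5.10 contains the rewrite rules for effect application constraints.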

The first rule checks that all involved types are compatible. If not, we have a direct contradiction and abort the constraint solving. Any relation on the left-hand side of a rewriting rule has to hold in order to apply this rule. That is, C ⊢ τj ↮ τk is part of the precondition to apply the rule even though it is not a constraint itself. The existential quantifier is a shortcut that avoids multiple similar rules for every assignment of the quantified variables.

The second rule first gets the structure of one of the involved type variables out of the constraint set. The following access relations check whether the other variables already have corresponding equivalence constraints in C; if not, they are generated with fresh variables as stated in Figure 5.6. The structure, that is, class name and field names, is drawn from the first equivalence constraint, which is already known and therefore yields a correct type. Given all equivalence constraints, we rewrite the effect application constraint and replace it with the corresponding effect applications on the field types. In addition, as the value annotation is a consistent property, all value annotation variables for the annotated types have to be equivalent.

The rule for effect application of field types simply decomposes the field type and adds effect application constraints for the activity annotation and annotated type.

With the last three rules for effect application of activity annotations, we implement the constraints as specified in EAActive and EANonactive. Whenever we have a constraint α := α′ ↓C α′′ α′′′ and know, via △(β) ≺ α′′, that EAActive applies, we ensure that the premises of this inference rule hold and therefore add α′′′ ≏ α to the constraint set. Similarly, we implement the rewrite rule for EANonactive. Whenever we get an unknown active annotation ◦(va), we have to wait until this activity annotation gets further specified.

Figure 5.11 defines the rewriting rules for field update constraints. Depending on the activity annotation, we expect the type we overwrite to be fully active or fully semiactive. If it is inactive, we have a direct contradiction. The constraints follow directly from TypeUpdateAct and TypeUpdateSemiact. The last rule for unknown active annotations changes this unknown active to active, otherwise the field update is not possible. Afterwards, we may apply the previous rule for active fields.

The rule for the update of active fields implicitly states that τ and τj have a compatible Java type by linking them together via the equivalence constraints. If the types are not compatible, these variables fail the check for well-formedness in the end.

Figure 5.12 provides the rewrite rule for the field access constraint. As we never introduce a field access constraint except with the corresponding typing rule itself, we omit the field access constraint in the set of solved constraints. If the first two type variables are incompatible, we have a contradiction and return false, else we

Figure 5.10 Constraint rewriting for e�ect application constraints.

∃ j, k ∈ {0, 1, 2, 3} :D;C, τ0 := τ1 ↓C τ2 τ3

C ` τj 6↔ τk

; false

∃ disjoint j, k, l,m,∈ {0, 1, 2, 3} :D;C, τ0 := τ1 ↓C τ2 τ3,

τj l 〈βj , νj〉,νj l c{fi : σji}C ` τk l 〈βk, νk〉C ` τl l 〈βl, νl〉C ` τm l 〈βm, νm〉C ` νk l c{fi : σki

}C ` νl l c{fi : σli}C ` νm l c{fi : σmi}

;

D, τ0 := τ1 ↓C τ2 τ3;C, σ0i := σ1i ↓C σ2i σ3i ,β0 l β1, ..., β2 l β3,τj l 〈βj , νj〉,τk l 〈βk, νk〉τl l 〈βl, νl〉τm l 〈βm, νm〉νj l c{fi : σji}νk l c{fi : σki

}νl l c{fi : σli}νm l c{fi : σmi}

D;C, σ0 := σ1 ↓C σ2 σ3

C ` σ0 l 〈α0, τ0〉,C ` σ1 l 〈α1, τ1〉C ` σ2 l 〈α2, τ2〉C ` σ3 l 〈α3, τ3〉

;

D,σ0 := σ1 ↓C σ2 σ3;C,α0 := α1 ↓C α2 α3,τ0 := τ1 ↓C τ2 τ3,σ0 l 〈α0, τ0〉,σ1 l 〈α1, τ1〉σ2 l 〈α2, τ2〉σ3 l 〈α3, τ3〉

D;C,α := α′ ↓C α′′ α′′′,M(β) ≺ α′′ ;

D,α := α′ ↓C α′′ α′′′;C,α′′′ l α,M(β) ≺ α′′

D;C,α := α′ ↓C α′′ α′′′,O ≺ α′′ ;

D,α := α′ ↓C α′′ α′′′;C,α′ l α,O ≺ α′′

D;C,α := α′ ↓C α′′ α′′′,♦ ≺ α′′ ;

D,α := α′ ↓C α′′ α′′′;C,α′ l α,♦ ≺ α′′

Figure 5.11 Constraint rewriting for �eld update constraints.

D;C, ν Cw fj ← τ � ν ′,

ν l c{fi : σi},σj l 〈αj , τj〉,O ≺ αj

; false

D;C, ν Cw fj ← τ � ν ′,

ν l c{fi : σi},σj l 〈αj , τj〉,♦ ≺ αj

;

D, ν Cw fj ← τ � ν ′;C,♦ ` τj ,τ l τj ,ν ′ l νν l c{fi : σi},σj l 〈αj , τj〉,♦ ≺ αj

D;C, ν Cw fj ← τ � ν ′,

ν l c{fi : σi},σj l 〈αj , τj〉,M(β) ≺ αj ,τ l 〈β∗, ν∗〉

;

D, ν Cw fj ← τ � ν ′;C,M ` τj ,ρtτj ,σ′j l 〈α′j , τ〉, with freshC(σ′j), freshC(α′j)

ν ′ l c{fj : σ′j ; fi : σii6=j},

M(β∗) ≺ α′j ,ν l c{fi : σi},σj l 〈αj , τj〉,M(β) ≺ αj ,τ l 〈β∗, ν∗〉

D;C, ν Cw fj ← τ � ν ′,

ν l c{fi : σi},σj l 〈αj , τj〉,◦(β) ≺ αj

;

D;C, ν Cw fj ← τ � ν ′,

ν l c{fi : σi},σj l 〈αj , τj〉,M(β) ≺ αj

Figure 5.12 Constraint rewriting for �eld access.

D;C, τ0 =? τ1 |fjτ

C ` τ0 6↔ τ1; false

∃ disjoint k, l ∈ {0, 1} :D;C, τ0 =? τ1 |fj

τ,

τk l 〈βk, νk〉,νk l c{fi : σki

},σkjl 〈αkj

, τkj〉,

C ` τl l 〈βl, νl〉,C ` νl l c{fi : σli},C ` σlj l 〈αlj , τlj 〉

;

D;C, τ0j � τ1j |? τ,β0 l β1,

σ0i l σ1i

i6=j ,α0j l α1j ,τ0 l 〈β0, ν0〉,ν0 l c{fi : σ0i},σ0j l 〈α0j , τ0j 〉,τ1 l 〈β1, ν1〉,ν1 l c{fi : σ1i},σ1j l 〈α1j , τ1j 〉

rewrite the constraint according to TypeAccess.

The rules in Figure 5.13 simply unroll the type variables and check the fully active/semiactive property recursively like TypeFullAct and TypeFullSemiact.

Next, Figure 5.14 unrolls the splitting constraints down to splitting of activity annotations. The first rule checks that all participating types are compatible. The second rule is only applied if we already know the underlying structure of at least one type, that is, the class name and field names for C ⊢ νl ≏ c{fi : σli} (same for k) are known and not arbitrary.

To solve the splitting constraints for activity annotation variables, we introduce a function f. This function takes three concrete activity annotations, that is, no variables, and, depending on this input, returns either three new activity annotations, a check mark (✓), a fail symbol ( ), or states that no change is applicable (�). The input represents the three activity annotations of a splitting constraint α � α′ |? α′′. Whenever we may further concretize the annotations, f returns these annotations, for example the introduction of unknown active as stated in Section 5.2:

f(△(va), ⊥, ⊥) = (△(va), ◦(va), ◦(va))

If the input already models one of the splitting rules SplitActFst, SplitActSnd, SplitInact, or SplitSemiact, the function returns the check mark, for instance with SplitActSnd:

f(△(va), ▽, △(va)) = ✓

For failure, that is, when we have a direct contradiction and cannot reach a correct splitting rule anymore, the function returns the failure symbol, for instance ♦ may

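Figure 5.13 Constraint rewriting for full active type constraints.

\[
\begin{array}{l}
D;\ C,\ \blacklozenge \vdash \tau,\ \tau \bumpeq \langle\beta,\nu\rangle,\ \nu \bumpeq c\{f_i : \sigma_i\} \\
\qquad C \vdash \sigma_i \bumpeq \langle\alpha_i,\tau_i\rangle \\
\leadsto\ D,\ \blacklozenge \vdash \tau;\ C,\ \blacklozenge \vdash \tau_i,\ \blacklozenge \prec \alpha_i,\ \tau \bumpeq \langle\beta,\nu\rangle,\ \nu \bumpeq c\{f_i : \sigma_i\},\ \sigma_i \bumpeq \langle\alpha_i,\tau_i\rangle
\end{array}
\]

\[
\begin{array}{l}
D;\ C,\ \vartriangle \vdash \tau,\ \tau \bumpeq \langle\beta,\nu\rangle,\ \nu \bumpeq c\{f_i : \sigma_i\} \\
\qquad C \vdash \sigma_i \bumpeq \langle\alpha_i,\tau_i\rangle \\
\leadsto\ D,\ \vartriangle \vdash \tau;\ C,\ \vartriangle \vdash \tau_i,\ \vartriangle(\beta_i) \prec \alpha_i,\ \tau \bumpeq \langle\beta,\nu\rangle,\ \nu \bumpeq c\{f_i : \sigma_i\},\ \sigma_i \bumpeq \langle\alpha_i,\tau_i\rangle
\end{array}
\]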

Figure 5.14 Constraint rewriting for splitting constraints.

∃k, l ∈ {0, 1, 2} :D;C, τ0 � τ1 |? τ2

C ` τk 6↔ τl

; false

∃j, k, l ∈ {0, 1, 2}j 6= k 6= l 6= j :D;C, τ0 � τ1 |? τ2,

τj l 〈βj , νj〉,νj l c{fi : σji}C ` τk l 〈βk, νk〉,C ` τl l 〈βl, νl〉,C ` νk l c{fi : σki

},C ` νl l c{fi : σli}

;

D, τ0 � τ1 |? τ2;C, σ0i � σ1i |? σ2i ,βj l βk l βl,τj l 〈βj , νj〉,τk l 〈βk, νk〉,τl l 〈βl, νl〉,νj l c{fi : σji},νk l c{fi : σki

},νl l c{fi : σli}

D;C, σ0 � σ1 |? σ2

C ` σ0 l 〈α0, τ0〉C ` σ1 l 〈α1, τ1〉C ` σ2 l 〈α2, τ2〉

;

D,σ0 � σ1 |? σ2;C,α0 � α1 |? α2,τ0 � τ1 |? τ2,σ0 l 〈α0, τ0〉,σ1 l 〈α1, τ1〉,σ2 l 〈α2, τ2〉

never be combined with △(va)

f(♦, ♦, △(va)) =

And finally, some splitting constraints cannot be concretized further without additional information,

f(◦(va), ◦(va), ◦(va)) = �

Figure 5.15 provides the complete definition of f(aa1, aa2, aa3). We omit the value annotation in this table. Whenever we introduce an unknown active or an active annotation, we already have one of them in the input and simply propagate the value annotation it contains. If the input already contains two annotations with an inner value annotation and these do not match, the function also returns a fail symbol, for example

f(△(va), ◦(va′), ▽) =

The four check marks result directly from the definition of the splitting relation in Figure 3.7. The rightmost column, the lowermost part, and the lowermost row in every part of the table model the behavior for any participating ♦. As SplitSemiact is the only splitting rule involving semiactive, we may directly return (♦, ♦, ♦) or a failure whenever there exists an annotation that is not ⊥ or ♦. The initial or default value ⊥ is never returned by this function: we either replace all occurrences, or we are missing too much information and keep the original assignments (and return �). The main purpose of this function is to propagate or concretize unknown active annotations. As stated in the example above, splitting an active reference yields two unknown active annotations. Whenever splitting involves an active annotation on the right-hand side, we directly obtain final annotations. That is, on the left-hand side we need active, too, and the second annotation on the right-hand side is in this case always inactive.
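The cases of f that are spelled out in the text can be sketched as below. This is a deliberately partial illustration, not the table of Figure 5.15: it only covers the value annotation mismatch, the semiactive cases, the split of an active annotation, and the all-unknown case, and it throws for every other input. The names `SplitHelper`, `Ann`, `Outcome`, and `Status` are hypothetical.

```java
// Hypothetical, deliberately partial sketch of the helper function f.
final class SplitHelper {
    enum Kind { BOTTOM, UNKNOWN_ACTIVE, ACTIVE, INACTIVE, SEMIACTIVE }
    enum Status { CHECK, FAIL, NO_CHANGE, REFINED }

    // Activity annotation; va is null unless the kind carries a value annotation.
    static final class Ann {
        final Kind kind;
        final String va;
        Ann(Kind kind, String va) { this.kind = kind; this.va = va; }
    }

    static final class Outcome {
        final Status status;
        final Ann[] refined;   // non-empty only for Status.REFINED
        Outcome(Status status, Ann... refined) { this.status = status; this.refined = refined; }
    }

    static Outcome f(Ann a1, Ann a2, Ann a3) {
        String va = null;
        boolean anySemi = false;
        boolean allSemi = true;
        boolean onlyBottomOrSemi = true;
        for (Ann a : new Ann[] {a1, a2, a3}) {
            if (a.va != null) {
                // Two inner value annotations that do not match are a contradiction.
                if (va != null && !va.equals(a.va)) return new Outcome(Status.FAIL);
                va = a.va;
            }
            anySemi |= a.kind == Kind.SEMIACTIVE;
            allSemi &= a.kind == Kind.SEMIACTIVE;
            onlyBottomOrSemi &= a.kind == Kind.SEMIACTIVE || a.kind == Kind.BOTTOM;
        }
        if (anySemi) {
            // Semiactive may only be combined with bottom or semiactive.
            if (!onlyBottomOrSemi) return new Outcome(Status.FAIL);
            if (allSemi) return new Outcome(Status.CHECK);
            Ann semi = new Ann(Kind.SEMIACTIVE, null);
            return new Outcome(Status.REFINED, semi, semi, semi);
        }
        if (a1.kind == Kind.ACTIVE && a2.kind == Kind.BOTTOM && a3.kind == Kind.BOTTOM) {
            // Splitting an active reference yields two unknown active annotations.
            Ann unknown = new Ann(Kind.UNKNOWN_ACTIVE, a1.va);
            return new Outcome(Status.REFINED, a1, unknown, unknown);
        }
        if (a1.kind == Kind.ACTIVE
                && ((a2.kind == Kind.ACTIVE && a3.kind == Kind.INACTIVE)
                 || (a2.kind == Kind.INACTIVE && a3.kind == Kind.ACTIVE))) {
            return new Outcome(Status.CHECK);
        }
        if (a1.kind == Kind.UNKNOWN_ACTIVE && a2.kind == Kind.UNKNOWN_ACTIVE
                && a3.kind == Kind.UNKNOWN_ACTIVE) {
            return new Outcome(Status.NO_CHANGE);
        }
        throw new UnsupportedOperationException("case not covered by this sketch");
    }
}
```

Throwing for uncovered inputs is a design choice of the sketch: it avoids guessing table entries that the text does not state.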

It is important to notice that we do not generate any ◦(va) with f without knowing the va.

Now, Figure 5.16 uses the function introduced above to rewrite the splitting constraint for activity annotations. As this function needs concrete annotations and no variables, we first get the annotations via C ⊢ aa ≺ α, which returns the annotation or initializes it with ⊥ if there is no restriction of α in C yet. Any failure of f(aa0, aa1, aa2) represents unsolvable constraints; therefore, we return false for the constraint set. Whenever f(aa0, aa1, aa2) returns a check mark, we have solved the corresponding splitting constraint and may safely remove it. The rewriting, especially with unknown active annotations, is modeled with the last rule. We keep the splitting constraint until it is fully solved.

Figure 5.17 eliminates the null constraint as soon as we know the class of the corresponding type. NullType does not include any further restrictions, besides the

Figure 5.15 Helper function to rewrite activity annotations (for consistent value annotations only).

HH

HH

HHaa2 aa3

aa1 ⊥ ◦ M O ♦

⊥ ⊥ � ◦ ◦ ◦ M ◦ ◦ O O O ♦ ♦ ♦⊥ ◦ ◦ ◦ ◦ ◦ ◦ ◦ M ◦ ◦ O O O ⊥ M M O M M O M M O M ⊥ O � ◦ ◦ O M M O O O O ⊥ ♦ ♦ ♦ ♦ ♦ ♦ ♦◦ ⊥ ◦ ◦ ◦ ◦ ◦ ◦ M ◦ ◦ O O O ◦ ◦ ◦ ◦ ◦ � � O O O ◦ M M O M M O M M O M ◦ O ◦ ◦ O � M M O O O O ◦ ♦ M ⊥ M M O M M O M M O M ◦ M M O M M O M M O M M M O M M O M M O X M ♦ O ⊥ � ◦ O ◦ M O M O O O O ◦ ◦ O ◦ � M O M O O O O M M O M M O M X O O O O O O O O X O ♦ ♦ ⊥ ♦ ♦ ♦ ♦ ♦ ♦♦ ◦ ♦ M ♦ O ♦ ♦ ♦ ♦ ♦ X

Figure 5.16 Constraint rewriting for activity splitting constraints.

D;C,α � α′ |? α′′C ` aa0 ≺ αC ` aa1 ≺ α′C ` aa2 ≺ α′′f(aa0, aa1, aa2) =

; false

D;C,α � α′ |? α′′C ` aa0 ≺ αC ` aa1 ≺ α′C ` aa2 ≺ α′′f(aa0, aa1, aa2) = X

;

D,α � α′ |? α′′;C, aa0 ≺ α,

aa1 ≺ α′,aa2 ≺ α′′

D;C,α � α′ |? α′′C ` aa0 ≺ αC ` aa1 ≺ α′C ` aa2 ≺ α′′f(aa0, aa1, aa2) = (aa, aa ′, aa ′′)

;

D;C,α � α′ |? α′′,aa ≺ α,aa ′ ≺ α′,aa ′′ ≺ α′′

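Figure 5.17 Constraint rewriting for null constraints.

\[
D;\ C,\ \mathit{null} \vdash \tau,\ \tau \bumpeq \langle\beta,\nu\rangle,\ \nu \bumpeq c\{f_i : \sigma_i\}
\ \leadsto\
\begin{cases}
D;\ C,\ \tau \bumpeq \langle\beta,\nu\rangle,\ \nu \bumpeq c\{f_i : \sigma_i\} & \text{if } C \vdash R^{null}_c(\beta) \\
D;\ \mathit{false} & \text{else}
\end{cases}
\]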

well-formedness, which we will check at the very end of the constraint solving algorithm.

Figure 5.18 introduces the last rules for the basic rewriting algorithm. These rules eliminate duplicated equivalence constraints. The first rule renames any occurrence of a variable that has to be identical to another variable. The next three rules ensure that whenever there are two equivalence constraints for one variable, the right-hand sides are equivalent, too. The following two rules detect incompatible types. Finally, the last rule eliminates two constraints that restrict the same activity annotation variable α. Value annotations inside active or unknown active annotations have to be identical, as we introduce new variables whenever a type change occurs. Whenever an activity annotation is restricted by unknown active and either inactive or active, we may eliminate the unknown active in favor of the more precise activity annotation.

To avoid infinite unrolling of types, we treat these rewrite rules for equivalence constraints with a higher priority and apply any of these rules whenever possible. Thus, we avoid generating what are in principle the same constraints, only with fresh variable names.

Our goal is to rewrite the constraint set until only equivalence constraints and, for activity annotations, the corresponding restriction constraints are left. Unfortunately, due to null and unknown active, several other constraints may be left even if none of the above rewrite rules can be applied any more.

Due to null ⊢ τ, any constraint involving annotated type variables may remain. Still, as null may get any well-formed type, we first take care of the remaining activity annotations. The restriction constraint ◦(β) ≺ α may prevent rewriting of splitting or effect application for activity annotations. But whenever unknown active does not get concretized, there is no restriction that expects either an inactive or active annotation for this type. Therefore, we may safely choose any such activity annotation and set it to either △(β) or ▽. For this reason we apply the following rewrite rule once:

D; C, ◦(β) ≺ α ⇝ D; C, ▽ ≺ α

The choice towards inactive is arbitrary. We apply all other rewrite rules again until no rule is applicable any more. Then, we take the next unknown active annotation, if any is left, and repeat this procedure until all activity annotations have a definite value, that is either ▽, ♦, or △(va). This also means that no ⊥ remains.

Now, for any unsolved constraint we have a constraint null ⊢ τ for at least one of the participating types. null may represent any well-formed type. Obviously none of the remaining constraints imposes any further restrictions, else this restriction would trigger a rewrite rule and introduce an equivalence constraint that satisfies these restrictions. Therefore, we are finished and the constraint set is solvable.
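The final phase described above, which repeatedly fixes one remaining unknown active annotation to inactive and then re-runs all other rules, can be sketched as a loop. This is a simplified illustration: activity restrictions are kept in a map from variable names to an enum, and the `saturate` callback is a hypothetical stand-in for "apply all other rewrite rules until none is applicable".

```java
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch; activity annotations are simplified to an enum.
final class UnknownActiveResolution {
    enum Activity { UNKNOWN_ACTIVE, ACTIVE, INACTIVE, SEMIACTIVE }

    // saturate stands for applying all other rewrite rules until none applies;
    // it may concretize further entries of the map.
    static void resolve(Map<String, Activity> restrictions,
                        Consumer<Map<String, Activity>> saturate) {
        saturate.accept(restrictions);
        while (true) {
            String pending = null;
            for (Map.Entry<String, Activity> e : restrictions.entrySet()) {
                if (e.getValue() == Activity.UNKNOWN_ACTIVE) {
                    pending = e.getKey();
                    break;
                }
            }
            if (pending == null) {
                return;                                    // all annotations are definite
            }
            restrictions.put(pending, Activity.INACTIVE);  // the arbitrary choice
            saturate.accept(restrictions);                 // re-run the other rules
        }
    }
}
```

Only one annotation is fixed per round, so consequences of that choice are propagated before the next unknown active annotation is touched.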

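Figure 5.18 Constraint solving for equivalence constraints.

\[
D;\ C,\ \chi \bumpeq \chi' \ \leadsto\ D[\chi' \mapsto \chi];\ C[\chi' \mapsto \chi]
\qquad \chi \in \{\sigma, \tau, \nu, \beta, \alpha\}
\]

\[
D;\ C,\ \sigma \bumpeq \langle\alpha,\tau\rangle,\ \sigma \bumpeq \langle\alpha',\tau'\rangle
\ \leadsto\
D;\ C,\ \sigma \bumpeq \langle\alpha,\tau\rangle,\ \alpha \bumpeq \alpha',\ \tau \bumpeq \tau'
\]

\[
D;\ C,\ \tau \bumpeq \langle\beta,\nu\rangle,\ \tau \bumpeq \langle\beta',\nu'\rangle
\ \leadsto\
D;\ C,\ \tau \bumpeq \langle\beta,\nu\rangle,\ \beta \bumpeq \beta',\ \nu \bumpeq \nu'
\]

\[
D;\ C,\ \nu \bumpeq c\{f_i : \sigma_i\},\ \nu \bumpeq c\{f_i : \sigma'_i\}
\ \leadsto\
D;\ C,\ \nu \bumpeq c\{f_i : \sigma_i\},\ \sigma_i \bumpeq \sigma'_i
\]

\[
\begin{array}{l}
c \neq c' \ \text{or}\ \exists i : f_i \neq f'_i \\
D;\ C,\ \nu \bumpeq c\{f_i : \sigma_i\},\ \nu \bumpeq c'\{f'_i : \sigma'_i\} \ \leadsto\ D;\ \mathit{false}
\end{array}
\]

\[
\begin{array}{l}
a \neq a' \\
D;\ C,\ \beta \bumpeq a,\ \beta \bumpeq a' \ \leadsto\ D;\ \mathit{false}
\end{array}
\]

\[
D;\ C,\ aa \prec \alpha,\ aa' \prec \alpha
\ \leadsto\
\begin{cases}
D;\ C,\ \vartriangle(\beta) \prec \alpha,\ \beta \bumpeq \beta' & \text{if } aa = \vartriangle(\beta) \text{ and } (aa' = \vartriangle(\beta') \text{ or } aa' = \circ(\beta')) \\
D;\ C,\ \circ(\beta) \prec \alpha,\ \beta \bumpeq \beta' & \text{if } aa = \circ(\beta) \text{ and } aa' = \circ(\beta') \\
D;\ C,\ \triangledown \prec \alpha & \text{if } aa = \triangledown \text{ and } aa' = \circ(\beta') \\
D;\ \mathit{false} & \text{else}
\end{cases}
\]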
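The renaming rule and the contradiction check for value annotations in Figure 5.18 can be sketched with a small substitution table. This is an illustrative sketch under simplifying assumptions: variables and annotations are strings, only variable-to-variable equivalences and assignments of a value annotation are modeled, and the class `Equivalences` with its methods is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: variables and value annotations are plain strings.
final class Equivalences {
    private final Map<String, String> renamed = new HashMap<>();   // chi' -> chi
    private final Map<String, String> valueOf = new HashMap<>();   // beta -> a

    // Follows renamings until the representative variable is reached.
    String find(String var) {
        String next = renamed.get(var);
        while (next != null) {
            var = next;
            next = renamed.get(var);
        }
        return var;
    }

    // Models an equivalence of two variables: chiPrime is renamed to chi.
    // Returns false if the merged variables carry different annotations.
    boolean unify(String chi, String chiPrime) {
        String keep = find(chi);
        String drop = find(chiPrime);
        if (keep.equals(drop)) {
            return true;
        }
        renamed.put(drop, keep);
        String moved = valueOf.remove(drop);
        return moved == null || assign(keep, moved);
    }

    // Models the assignment of annotation a to beta; two different
    // annotations for the same variable yield false.
    boolean assign(String beta, String a) {
        String previous = valueOf.putIfAbsent(find(beta), a);
        return previous == null || previous.equals(a);
    }
}
```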

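Figure 5.19 Well-Formedness for the Constraint System

\[
\frac{(\forall \beta,\nu)\ \ \tau \bumpeq \langle\beta,\nu\rangle \notin C}
     {P; C \vdash \mathit{wf}(\tau)}
\qquad
\frac{\begin{array}{c}
\tau \bumpeq \langle\beta,\nu\rangle \in C \quad C \vdash \nu \bumpeq c\{f_i : \sigma_i\} \\
\mathit{fields}_P(c) = c_i\, f_i \quad \beta \bumpeq a \in C \quad a \in X_c
\end{array}}
     {P; C \vdash \mathit{wf}(\tau)}
\]

\[
\frac{\begin{array}{c}
\sigma \bumpeq \langle\alpha,\tau\rangle \in C \quad \vartriangle(\beta) \prec \alpha \in C \\
\tau \bumpeq \langle\beta',\nu'\rangle \in C \quad \beta \bumpeq a \in C \quad \beta' \bumpeq a' \in C \quad a \leq a'
\end{array}}
     {P; C \vdash \mathit{wf}(\sigma)}
\]

\[
\frac{\begin{array}{c}
\sigma \bumpeq \langle\alpha,\tau\rangle \in C \quad \vartriangle(\beta) \prec \alpha \in C \\
(\forall \beta',\nu')\ \ \tau \bumpeq \langle\beta',\nu'\rangle \notin C \quad \beta \bumpeq a \in C \quad \exists a' : a \leq a'
\end{array}}
     {P; C \vdash \mathit{wf}(\sigma)}
\]

\[
\frac{\sigma \bumpeq \langle\alpha,\tau\rangle \in C \quad \blacklozenge \prec \alpha \in C}
     {P; C \vdash \mathit{wf}(\sigma)}
\qquad
\frac{\sigma \bumpeq \langle\alpha,\tau\rangle \in C \quad \triangledown \prec \alpha \in C}
     {P; C \vdash \mathit{wf}(\sigma)}
\]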

5.4 Well-Formedness

It remains to check whether all types that arise during the constraint solving algorithm are well-formed. We check well-formedness for every type variable of annotated types and field types. As we check all variables, we omit the recursive check in Figure 5.19. Unfortunately, due to the possible existence of null, some uninitialized variables may remain. Still, as these variables do not include any restrictions, we only have to check whether there exists a correct type for these variables.

Well-formedness is also responsible for checking that fields with a semiactive annotation are fully semiactive. Therefore, we first apply the following rewrite rule once for every semiactive restriction.

D; C, σ ≏ ⟨α, τ⟩, ♦ ≺ α ⇝ D; C, σ ≏ ⟨α, τ⟩, ♦ ≺ α, ♦ ⊢ τ

Next, we solve the potentially new fully semiactive constraints with the corresponding rewrite rule of Figure 5.13.

Finally, we check the well-formedness of every type variable in the constraint set with the inference rules of Figure 5.19. We do not check simple type variables, as they themselves do not restrict the type. The first rule handles the aforementioned uninitialized types for null. The next rule for annotated types, P; C ⊢ wf(τ), implements all properties that are defined in WFType. Here we use the access relation of Figure 5.6 for simple types, where the class of the type may not be known. Still, if this variable is not initialized, it is part of a null reference. Therefore, we may safely choose any class type that is type correct.

The next two rules handle field types with an active activity annotation; again, there is one rule for the case of an uninitialized type. The well-formedness check is completed with the two corresponding rules for semiactive and inactive field types. As we already ensured that any semiactive type is fully semiactive, well-formedness holds as soon as we have checked all variables of annotated and field types with this set of rules.

6 Extensions

The formalization of CoreJava(X) covers the core expression language of Java 1.4 and imperative field update. This section discusses the extensions necessary for the full system Java(X) with inheritance, subtyping, and with constrained parametric polymorphism over annotations in the style of HM(X) [34].

These extensions are already presented in our paper [14]. We have refrained from directly adding these extensions to the formal system as this would add technical complication and further complicate the presented soundness proof.

Chapter Outline First, we present structural subtyping on annotations. Next, we provide the needed extensions for polymorphism and revisit some of the examples of Chapter 2. The last extension we present introduces inheritance and interfaces.

6.1 Annotation Subtyping

Our paper includes structural subtyping. The subtyping is derived from the user-defined annotation ordering.

The paper includes another annotation type, variance annotations, that indicate for every field whether it is only read, only written, or both. This thesis assumes that most of the time fields are both read and written, and therefore omits this additional annotation. This decision has a great impact on subtyping: whenever a field and therefore its annotations are read and written, we have to treat it invariantly [36, Chapter 15.5]. This in turn leads to a very restricted subtyping, which is why we have omitted it in this thesis so far. The reason why we include this restricted subtyping here is that we need it for the upcoming section on inheritance.

Figure 6.1 presents the standard subsumption rule and the invariant subtyping, which only allows raising the summary value annotation.

Figure 6.1 Subsumption rule for Java(X).

TSub

P ;A e e : t1 � A1

t t1 ≤ t2 A A1 ≤ A2

P ;A e e : t2 � A2

va ≤ va ′

t 〈va, c{fi : si}〉 ≤ 〈va ′, c{fi : si}〉
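The invariant subtyping rule of Figure 6.1 can be read as a simple check: class name and field types must coincide, and only the summary value annotation may be raised. The following sketch illustrates this under simplifying assumptions; field types are kept abstract as strings, the ordering on value annotations is passed in as a predicate, and the names `InvariantSubtyping`, `AnnotatedType`, and `isSubtype` are hypothetical.

```java
import java.util.Map;
import java.util.function.BiPredicate;

// Hypothetical sketch of the invariant subtyping on annotated types.
final class InvariantSubtyping {
    static final class AnnotatedType {
        final String va;                    // summary value annotation
        final String className;             // class name c
        final Map<String, String> fields;   // field name -> field type, kept abstract
        AnnotatedType(String va, String className, Map<String, String> fields) {
            this.va = va;
            this.className = className;
            this.fields = fields;
        }
    }

    // leq is the user-defined ordering on value annotations.
    static boolean isSubtype(AnnotatedType t1, AnnotatedType t2,
                             BiPredicate<String, String> leq) {
        return t1.className.equals(t2.className)
                && t1.fields.equals(t2.fields)   // fields are treated invariantly
                && leq.test(t1.va, t2.va);       // only the summary annotation may be raised
    }
}
```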

6.2 Polymorphism

The extension to polymorphism essentially adds annotation variables to the type language and allows constrained abstraction over them. The splitting, droppability, and subtyping relations become constraints, which can be abstracted over. In fact, the addition of polymorphism to a monomorphic type-based program analysis is a schematic, but tedious effort. This extension is modeled according to the HM(X) framework [34], which provides a parametrized extension of Hindley-Milner typing (including type inference) by suitable constraint theories and subtyping.

The resulting constrained polymorphism adds technical complication, but it greatly increases the expressiveness. As an example, we revisit the typing of the detach() method of the JDOM API. In Section 2.2, we had to decide on one particular usage pattern for detach(). Either the typing made the method return the active reference or it modified the active receiver object. With annotation polymorphism, the system can postpone the decision by abstracting over the annotations and making the required splitting into a constraint. Here is the resulting type abstracting over the activity annotation variables ψ′ and ψ′′:

∀ψ′ψ′′. M(N) � ψ′ | ψ′′ ⇒Attribute{p : 〈ψ′′, Element〉}

[Attribute{p : 〈M(ND) ψ′, Element〉}] detach()

The splitting constraint M(N) � ψ′ | ψ′′ fixes the relationship between ψ′ and ψ′′. The two type signatures for detach() suggested in Section 2.2 are the only instances of the above parametrized type.

6.3 Inheritance

Inheritance and interfaces can be treated with a minor (but important) extension as in RAJA [27]. In CoreJava(X), the type of an object includes only the descriptions of the fields belonging to the object's class. For Java(X), with subtyping and a cast operation, the type of an object includes descriptions of all fields of all classes, and a cast changes the class type but leaves the field environment untouched.

Figure 6.2 Type cast rule for Java(X).

P ;A e e : 〈va, c′{f : s}〉 � A′

P ;A e (c)e : 〈va, c{f : s}〉 � A′

Figure 6.2 contains the rule for a cast; the subtyping rule in Figure 6.1 changes in such a way that it can also raise the class type. Interface types can be treated in the same way. Their addition just affects Java's subtyping relation.

The expanded class type is required for type checking cast operations in a meaningful way. Suppose that class A is a subclass of class B:

    class B {}

    class A extends B {
        Object mystate;
        public A (Object state) {...}
    }

and the following use of an A object:

    B b = new A (init);
    A a = (A) b;

Suppose the newly created A object has type A{mystate : ⟨△(init), . . . ⟩}. If each class type only had the fields of its own class, then the subsumption to B would strip away the information about the mystate field. This information would be lost forever and the subsequent downcast back to A would have to invent some information about mystate.

With our choice, a cast or subsumption never changes the field map but only changes the static class name associated with it. Thus, information is neither lost nor reinvented.

Another issue is method consistency. If a subclass overrides a method of a superclass, then the annotated type in the superclass must subsume the one in the subclass as in method specialization [31].
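The treatment of casts described above, where only the static class name changes and the field map is kept, can be sketched as follows. This is an illustration only: field descriptions are strings, no subclass check is performed, and the names `CastSketch`, `ObjectType`, and `cast` are hypothetical.

```java
import java.util.Map;

// Hypothetical sketch; ObjectType and cast are illustrative names.
final class CastSketch {
    static final class ObjectType {
        final String className;             // static class name
        final Map<String, String> fields;   // descriptions of all fields of all classes
        ObjectType(String className, Map<String, String> fields) {
            this.className = className;
            this.fields = fields;
        }
    }

    // A cast (or subsumption) swaps the class name and keeps the field map.
    static ObjectType cast(String targetClass, ObjectType type) {
        return new ObjectType(targetClass, type.fields);
    }
}
```

Casting the type of an A object to B and back to A then still yields the original description of mystate.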

7 Related Work

This chapter compares related work with the features of Java(X). There exists a considerable amount of literature related to Java(X) and different aspects of type systems. The present chapter only provides a brief overview of the closest work concerning typestate, refinement types, and ownership types. We focus on the most relevant and recent work.

One important source of inspiration for the present work comes from annotated type systems [42] and their connection with the HM(X) typing framework [34]. HM(X) was built as a generic framework for extending the successful Hindley-Milner typing framework, including polymorphic types as we touch on them in Section 6.2. Similar to Java(X), HM(X) has a generic soundness proof [40] with extension points for user-defined properties.

The first ideas related to refinement types were published by Freeman and Pfenning [23] as an extension to ML with union and intersection types. Their approach attaches a property lattice to each type as we do, but they do not provide a special handling for undroppable resources. Their ideas have been further refined in various directions. For example, indexed types, a weak version of dependent types, can express and preserve invariants of data types [50]. This direction has led to an interesting generalization of recursive data types [49], which is widely explored by the community in theoretical and practical terms. Typestate checking [44, 43] is a precursor of refinement typing using similar techniques but for a more restricted first-order imperative language.

Another direction is the development of a logical system to model properties on top of the type system, as in the work of Mandelbaum et al. [30]. They assemble a fragment of intuitionistic linear logic on top of the ML type system adapted for use with the monadic metalanguage. While this approach is highly expressive, it requires a lot of program modifications. Our work encodes the logical properties in annotations. In addition, the automatic analysis by the activity annotations of Java(X) takes much of the burden off the user. Our encoding approach appears to be more lightweight and easier to automate.

A research direction that is also closely related deals with type qualifications. A
type qualifier adds extra information to the type it applies to. Hence it acts like a type annotation in our sense. A representative work on type qualifications is the paper by Foster et al. [22] which enables the flow-sensitive checking of atomic properties that refine standard types. They present an efficient inference algorithm for their system. The goal of their work is similar to ours, however, our work combines flow-sensitive and flow-insensitive aspects, and it is geared towards Java.

Semantic type qualifiers [12] share some parallels with our work. They allow
the specification of a type qualifier together with a logical formula defining its meaning in terms of the program state. They automatically discharge the resulting proof obligation and thus obtain a correct system automatically. However, their properties only correspond to our value annotations and they do not support the notion of strong update [11].

A number of approaches solve specific problems with ad-hoc constructed type
systems and may be viewed as specializations of one of the above frameworks, in particular exploiting flow-sensitivity. Examples are the work on atomicity and race detection [17, 19], the work on Vault [15], and many others.

JavaCOP [1] is a tool for implementing certain annotated type systems for Java.
It provides a language for defining predicates on typed abstract syntax trees for Java. JavaCOP is integrated with a Java compiler that checks the defined predicates before generating code. While JavaCOP provides a flexible and convenient framework for implementing such systems, it is a purely syntactic tool: it neither provides any soundness guarantees nor does it have a notion of flow dependency which would be necessary to track state changes. Java(X) provides both. Both JavaCOP and Java(X) make the important distinction between analysis designers (who define predicates and design the annotation set) and programmers (who work in terms of annotations).

The Fugue system [16] is the first system that implements typestate assertions
and checking for C#. It structures the state of an object in different frames corresponding to the nesting of subclasses. For each frame, the programmer can state formulae in first-order predicate logic. In comparison to Java(X), Fugue has a different approach for handling aliases. Fugue introduces extra program constructs to expose the typestate, it requires the programmer to write formulae instead of predefined abstract values, and there is no soundness proof. The objects in Fugue are annotated with not aliased or maybe aliased. Only objects with not aliased
annotations can change their state; as soon as they change to maybe aliased, the current typestate is fixed.

Another related system is the Hob system [28]. It basically allows the specification
of pre- and postconditions using an abstract specification language based on sets. However, the underlying interpretation of this language is configurable to different logical systems and there is an aspect-oriented mechanism to simplify authoring of specifications. Java(X) manages abstraction the other way round. An analysis designer carves out domain-specific abstract values from predicates, thus hiding some complexity from the programmer.

More specific to Java is the work on ESC/Java [18], which aims at providing
guarantees for Java programs through the use of verification tools. ESC/Java can specify program properties using logical assertions and relies on a theorem prover. Of course, ESC/Java has more powerful and general ways of specifying program properties, but we believe there is also a need for tools that perform lightweight and automatic checking of properties.

Effect systems have been developed for Java with the goal of abstracting over
trace effects [41, 39]. These systems share with ours some similarity in the form of the method types and in the use of polymorphism, but otherwise the intent is quite different. Our system expresses effects implicitly by changing the annotation at a method call site and attaches information to types in the form of flow-sensitive and flow-insensitive annotations.

Schultz's type-based binding-time analysis for a Java core language [38] can also
be expressed in Java(X). It is based on the value annotation set TA = P({s}) where the presence of s indicates a static value. However, the soundness of a binding-time analysis is proved with respect to a non-standard semantics, so our generic soundness result is not directly applicable.

The goal of an ownership type system is to improve modularity by partitioning
the state of a system in a hierarchical manner. Such a system restricts inter-object accesses to those that are compatible with the hierarchy. Although Java(X) was not conceived with ownership in mind, it turns out that notions like unique and borrowed references are closely related to our notion of active and inactive references.

There is a lot of work on ownership types and related notions [13, 33, 5, 51] but
we focus only on the most closely related work by Boyland and others. In a series of articles culminating in 2005 [9], Boyland and others have established a notion of permissions which are attached to an object type along with an effect system to abstract the state dependencies of a method call. The permissions govern whether a reference is readable or writable. In earlier work, Boyland [6] has proposed splitting of permissions into fractions where only the full permission "1" allows full read/write access and proper fractions only allow read access. In subsequent work [8], these fractional permissions are further discussed and extended. This kind of permission is related to our notions of active, inactive, and semi-active. Effects are also present in our system, albeit in the form of explicit state transitions on the argument and receiver types of a method.

It is also instructive to compare the notions of active, semiactive, and inactive with similar notions in the realm of ownership type systems as categorized by Boyland [7]. Their categorization includes the permissions R (read), W (write), R̄ (exclusive read), and W̄ (exclusive write: no other alias may write)¹. Active corresponds to RWW̄ (read, write, and exclusive write permission), semiactive to RW, and inactive to R (transposed to a per-field setting; their original work categorizes variables).
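The correspondence between activities and capabilities can be written down as a small sketch. The following Java fragment is an illustration only: the enum names `Capability` and `Activity` and the helpers `has` and `subsumes` are hypothetical and appear neither in Java(X) nor in the categorization of [7]; only the three capability sets are taken from the comparison above.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical names for four of the capabilities categorized in [7].
enum Capability { READ, WRITE, EXCLUSIVE_READ, EXCLUSIVE_WRITE }

// Hypothetical encoding of the three activity annotations as capability sets.
enum Activity {
    ACTIVE(EnumSet.of(Capability.READ, Capability.WRITE, Capability.EXCLUSIVE_WRITE)),
    SEMIACTIVE(EnumSet.of(Capability.READ, Capability.WRITE)),
    INACTIVE(EnumSet.of(Capability.READ));

    private final Set<Capability> capabilities;

    Activity(Set<Capability> capabilities) {
        this.capabilities = capabilities;
    }

    // True if this activity grants the given capability.
    boolean has(Capability c) {
        return capabilities.contains(c);
    }

    // True if this activity grants at least the capabilities of the other one.
    boolean subsumes(Activity other) {
        return capabilities.containsAll(other.capabilities);
    }
}
```

Set inclusion in this encoding ranks active above semiactive above inactive. This only mirrors the capability comparison and is not meant as a definition of the formal subactive relation.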

In previous work [46], Thiemann has proposed an annotated type system for a Java subset without inheritance that provides improved types for the DOM interface. This work has inspired the example in Section 2.2 but it is limited in several aspects: It is tied to one particular amalgamation of annotation and activity, it can only keep track of one state of a resource (either the resource is in this state or nothing is known about it) and there is no notion of droppability. It does not support typestate change, and it does not treat inheritance.

In a series of papers, Hofmann et al. [26, 25, 24] use an automatic type-based amortized analysis to statically determine upper bounds on the resource consumption of first-order functional programs. The latest version can check polynomial resource bounds and is no longer limited to linear ones. The analysis additionally works with non-terminating programs and can be instantiated to check heap space, stack space, or clock cycle usage.

More closely related to Java(X), Hofmann and Jost [27] have also defined a type-based analysis to predict the consumption of heap space by Java methods. Their system RAJA is inspired by amortized complexity analysis. The underlying design ideas of their type system are similar to ours; however, the details are different and our work has been developed independently. For example, splitting works very differently and Java(X)'s annotations of arguments may change through method calls whereas RAJA's annotations are simply used up because they denote a potential passed to a method invocation through the parameters.

Bierhoff and Aldrich proposed access permissions [3] to verify the absence of protocol violations in a Java-like language. They require a developer to annotate the code to state the intended protocol, too. But, while Java(X) is capable of inferring the alias tracking, their system requires the developer to include the access permissions, which are closely related to the above described fractions [6], in the annotations. Our annotations appear to be more lightweight and require less knowledge from the user about the inner details of the system; in particular, in Java(X) the developer does not need to specify the splitting himself. Their access permission full corresponds to △ in Java(X), pure to ▽, and share to ♦. Additional permissions like immutable and unique, with the obvious semantics, can easily be added to Java(X), too. In subsequent work [2, 4] they extend the system to work for multi-threaded programs and provide Plural, a prototype implementation, including the analysis of several examples. Still, they do not provide any means to state droppability and therefore cannot prevent the loss of precious references.

¹ Their remaining permissions O, I, and Ī are not important here.
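To illustrate the fractions [6] to which access permissions are related, the following sketch models a permission as a rational number in the interval (0, 1]. It is a simplified illustration under stated assumptions, not Boyland's formal system and not part of Java(X): the class `FractionalPermission` and its methods `full`, `split`, `join`, `canRead`, and `canWrite` are hypothetical names, and splitting always halves the fraction.

```java
// Hypothetical model of a fractional permission num/den with 0 < num <= den.
final class FractionalPermission {
    private final long num;
    private final long den;

    private FractionalPermission(long num, long den) {
        long g = gcd(num, den);   // keep the fraction in lowest terms
        this.num = num / g;
        this.den = den / g;
    }

    // The full permission "1".
    static FractionalPermission full() {
        return new FractionalPermission(1, 1);
    }

    // Splits this permission into two equal halves (an assumption of this sketch).
    FractionalPermission[] split() {
        FractionalPermission half = new FractionalPermission(num, den * 2);
        return new FractionalPermission[] { half, half };
    }

    // Recombines two fractions; the sum must not exceed the full permission.
    FractionalPermission join(FractionalPermission other) {
        long n = num * other.den + other.num * den;
        long d = den * other.den;
        if (n > d) {
            throw new IllegalStateException("joined permission would exceed 1");
        }
        return new FractionalPermission(n, d);
    }

    // Any positive fraction allows reading.
    boolean canRead() {
        return num > 0;
    }

    // Only the full permission allows writing.
    boolean canWrite() {
        return num == den;
    }

    private static long gcd(long a, long b) {
        return b == 0 ? a : gcd(b, a % b);
    }
}
```

With such permissions the developer places every `split` and `join` by hand, whereas Java(X) infers the corresponding splitting points.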


8 Conclusion

Java(X) is a framework for type refinement that attaches to the type system of Java 1.4. It adds annotations to the standard Java types to enable a flow-sensitive tracking of resources. The flexible framework may be instantiated with a user-defined annotation set X to track different resources and to perform protocol checking. Java(X) also adds activity annotations which may grant exclusive write access to resources and thus facilitate tracking of typestate changes.

Chapter Overview. This last chapter summarizes the thesis and provides a short overview of the presented features and possibilities of Java(X). Finally, we conclude with some directions for future work and potential extensions.

8.1 Summary

The introduction of this thesis provides a general overview and the goals of Java(X). It also states the contributions made by this work. The mathematical background focuses, next to the introduction of some standard notations, on the principles of inductive and coinductive proofs. While inductive proofs are widely used and therefore well-known, their counterpart, coinductive proofs, is far less familiar. Still, by exactly specializing the coinductive proof principle for every property, we managed to present the coinductive proofs in an easy and accessible way. The key to the simplification is to specialize the property one wants to prove directly to the coinductive structure it is used for. This leads to more definitions, as, for example, there is a specialized version of transitivity for every relation for which we want to prove transitivity. Still, the reward for this effort is a set of simple and intuitive coinductive proofs.

We further motivate and introduce the features of Java(X) and its use with a simple file handler example. The step-by-step rollout of the features provides an overview of how the different annotations work together. While Java(X) is capable of performing different and complex analyses, we focus on simple examples to introduce the design.

The formal system of Java(X), as we present it here, covers a core of Java. CoreJava(X) omits some details of Java to keep the formalization manageable and to avoid cluttering the proofs. The formalization provides the syntax as well as the static and dynamic semantics. The next step is to prove that the presented system is sound. We first showed several basic properties of the system and the defined relations, such as commutativity and associativity of the splitting relation, transitivity of the subactive relation, invariants for join-free expressions, and stability of the domain and environment. While these and further properties already provide some confidence, we put them together in a complete soundness proof to show that CoreJava(X) enjoys all claimed properties and provides the safety needed for reliable software.

Next, the thesis presents a constraint system for a type checker and modular inference of the optimal splitting of capabilities. The constraint system needs the method signatures provided by the user as it cannot infer the intended protocol for the objects. Still, to relieve the user of the burden of annotating every splitting point of variables, the constraint system infers the optimal splitting to type the program. Despite the two possibilities to split an active reference, the algorithm is able to infer the splitting without backtracking. We achieve this efficient inference by introducing an additional activity annotation which defers the decision on which splitting rule is used until one of the two new references requires an active or inactive annotation. The constraint solving algorithm then tracks the splitting partner and assigns it the remaining capability.
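The deferral idea can be pictured with a small sketch. It is not the constraint solver of this thesis: the names `DeferredActivity`, `SplitRef`, `splitActive`, and `require` are hypothetical, and the sketch assumes that the two ways of splitting an active reference yield (active, inactive) or (inactive, active). It also assumes, as a simplification, that a reference resolved to active may still be used where only inactive access is demanded.

```java
// Hypothetical activity values; UNDECIDED plays the role of the additional
// annotation that postpones the choice of the splitting rule.
enum DeferredActivity { ACTIVE, INACTIVE, UNDECIDED }

final class SplitRef {
    private DeferredActivity activity = DeferredActivity.UNDECIDED;
    private SplitRef partner;

    private SplitRef() { }

    // Splits an active reference into two references whose activities
    // are not fixed yet; each one remembers its splitting partner.
    static SplitRef[] splitActive() {
        SplitRef first = new SplitRef();
        SplitRef second = new SplitRef();
        first.partner = second;
        second.partner = first;
        return new SplitRef[] { first, second };
    }

    // Records that this reference is needed with the given activity.
    // The first demand on either side decides the split; the partner
    // receives the remaining capability, so no backtracking is needed.
    void require(DeferredActivity needed) {
        if (needed == DeferredActivity.UNDECIDED) {
            throw new IllegalArgumentException("demand must be ACTIVE or INACTIVE");
        }
        if (activity == DeferredActivity.UNDECIDED) {
            activity = needed;
            partner.activity = (needed == DeferredActivity.ACTIVE)
                    ? DeferredActivity.INACTIVE
                    : DeferredActivity.ACTIVE;
        } else if (activity == DeferredActivity.INACTIVE
                && needed == DeferredActivity.ACTIVE) {
            throw new IllegalStateException("active capability already given to the partner");
        }
    }

    DeferredActivity activity() {
        return activity;
    }
}
```

In this sketch a call such as `pair[1].require(DeferredActivity.ACTIVE)` fixes both halves at once, and a later demand for active access on the other half is reported as a conflict instead of triggering a search over alternative splittings.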

The extension of CoreJava(X) with polymorphism and inheritance takes a step towards the full language Java 1.4. Finally, the thesis discusses some work that is closely related to the design and ideas of Java(X).

8.2 Future Work

There are some directions open for future work and extensions of Java(X). This section discusses some suggestions for further investigation.

• One important task is to provide a complete implementation for Java(X). The examples in Chapter 2 already suggest several syntax simplifications. Some of the syntax can be made optional. For example, the summary value may be useful in several settings, but many programs and properties can be modeled without it. The same holds for the location that identifies the object creation site and is attached to every new-expression.

The design of Java(X) already avoids annotations inside method bodies and confines all annotations to the method signatures. In addition, the framework is purely static: the dynamic behavior of programs is unchanged by the annotations. This raises the question whether the framework can be implemented as an extension or plug-in to existing programming environments, like Eclipse¹. That way, Java(X) can provide all features presented in this thesis without touching the Java run-time system.

As soon as such an implementation is available, the next step is to incrementally annotate real-world programs to analyze different properties.

• Chapter 7 already mentioned that there may be more activity annotations. A possible extension of Java(X) may adopt some additional activity annotations, for example immutable to provide a fixed, but exact state. Such an annotation may be used to state whether a resource is fully initialized and ready for use. Splitting would then gain the possibility to split an active reference into two immutable ones, provided that the droppability does not interfere.

Other variants of activity annotations are conceivable, too. One example is some kind of super active annotation that acts only locally and facilitates circular data structures inside of a restricted area, with some means to perform an update on all participating elements.

• So far, annotations in Java(X) are purely syntactic. Another direction for future work is to provide the annotations with some semantics to automatically enforce properties.

• Finally, Java has developed further in recent years. To keep up with the new features of Java, it is desirable to analyze which of the additional features of Java 1.6 can be built into Java(X).

¹ www.eclipse.org


Bibliography

[1] Chris Andreae, James Noble, Shane Markstrum, and Todd Millstein. A framework for implementing pluggable type systems. In Proc. 21st ACM Conf. OOPSLA, pages 57–74, Portland, OR, USA, 2006. ACM Press, New York.

[2] Nels E. Beckman, Kevin Bierhoff, and Jonathan Aldrich. Verifying correct usage of atomic blocks and typestate. In Proc. 23rd ACM Conf. OOPSLA, pages 227–244, Nashville, TN, USA, 2008. ACM Press, New York.

[3] Kevin Bierhoff and Jonathan Aldrich. Modular typestate checking of aliased objects. In Proc. 22nd ACM Conf. OOPSLA, pages 301–320, Montreal, QC, Canada, 2007. ACM Press, New York.

[4] Kevin Bierhoff, Nels E. Beckman, and Jonathan Aldrich. Practical API protocol checking with access permissions. In Proc. 23rd ECOOP, volume 5653 of LNCS, pages 195–219, Genova, Italy, July 2009. Springer.

[5] Chandrasekhar Boyapati, Barbara Liskov, and Liuba Shrira. Ownership types for object encapsulation. In Proc. 30th ACM Symp. POPL, pages 213–223, New Orleans, LA, USA, January 2003. ACM Press.

[6] John Boyland. Checking interference with fractional permissions. In Proc. Intl. Static Analysis Symposium, SAS'03, volume 2694 of LNCS, pages 55–72, San Diego, CA, USA, June 2003. Springer.

[7] John Boyland, James Noble, and William Retert. Capabilities for sharing: A generalisation of uniqueness and read-only. In Proc. 15th ECOOP, volume 2072 of LNCS, pages 2–27, Budapest, Hungary, June 2001. Springer.

[8] John Boyland, William Retert, and Yang Zhao. Comprehending annotations on object-oriented programs using fractional permissions. In International Workshop on Aliasing, Confinement and Ownership in Object-Oriented Programming, pages 4:1–4:11, New York, NY, USA, 2009. ACM.


[9] John Tang Boyland and William Retert. Connecting effects and uniqueness with adoption. In Proc. 32nd ACM Symp. POPL, pages 283–295, Long Beach, CA, USA, January 2005. ACM Press.

[10] Gilad Bracha. Pluggable Type Systems. OOPSLA Workshop on Revival of Dynamic Languages, October 2004.

[11] David R. Chase, Mark Wegman, and F. Kenneth Zadeck. Analysis of pointers and structures. In Proc. 1990 ACM Conf. PLDI, pages 296–310, White Plains, NY, USA, June 1990. ACM.

[12] Brian Chin, Shane Markstrum, and Todd Millstein. Semantic type qualifiers. In Proc. 2005 ACM Conf. PLDI, pages 85–95, Chicago, IL, USA, June 2005. ACM Press.

[13] David G. Clarke, John M. Potter, and James Noble. Ownership types for flexible alias protection. In Proc. 13th ACM Conf. OOPSLA, pages 48–64, Vancouver, BC, Canada, October 1998. ACM Press, New York.

[14] Markus Degen, Peter Thiemann, and Stefan Wehr. Tracking linear and affine resources with Java(X). In Proc. 21st ECOOP, volume 4609 of LNCS, pages 550–574, Berlin, Germany, July 2007. Springer.

[15] Robert DeLine and Manuel Fähndrich. Enforcing high-level protocols in low-level software. In Proc. 2001 ACM Conf. PLDI, pages 59–69, Snowbird, UT, USA, June 2001. ACM Press, New York, USA.

[16] Robert DeLine and Manuel Fähndrich. Typestates for objects. In Proc. 18th ECOOP, volume 3086 of LNCS, pages 465–490, Oslo, Norway, June 2004. Springer.

[17] Cormac Flanagan and Stephen N. Freund. Type-based race detection for Java. In Proc. 2000 ACM Conf. PLDI, pages 219–232, Vancouver, BC, Canada, June 2000. ACM Press. Volume 35(5) of SIGPLAN Notices.

[18] Cormac Flanagan, K. Rustan M. Leino, Mark Lillibridge, Greg Nelson, James B. Saxe, and Raymie Stata. Extended static checking for Java. In Proc. 2002 ACM Conf. PLDI, pages 234–245, Berlin, Germany, June 2002. ACM Press.

[19] Cormac Flanagan and Shaz Qadeer. A type and effect system for atomicity. In Proc. 2003 ACM Conf. PLDI, pages 338–349, San Diego, California, USA, May 2003. ACM Press.


[20] Cormac Flanagan, Amr Sabry, Bruce F. Duba, and Matthias Felleisen. The essence of compiling with continuations. In Proc. 1993 ACM Conf. PLDI, pages 237–247, Albuquerque, NM, USA, June 1993.

[21] Matthew Flatt, Shriram Krishnamurthi, and Matthias Felleisen. A programmer's reduction semantics for classes and mixins. In Formal Syntax and Semantics of Java, number 1523 in LNCS, pages 241–269. Springer, 1999.

[22] Jeffrey S. Foster, Tachio Terauchi, and Alex Aiken. Flow-sensitive type qualifiers. In Proc. 2002 ACM Conf. PLDI, pages 1–12, Berlin, Germany, June 2002. ACM Press.

[23] Tim Freeman and Frank Pfenning. Refinement types for ML. In Proc. 1991 ACM Conf. PLDI, pages 268–277, Toronto, Canada, June 1991. ACM.

[24] Jan Hoffmann and Martin Hofmann. Amortized Resource Analysis with Polymorphic Recursion and Partial Big-Step Operational Semantics. In 8th Asian Symp. on Prog. Langs. (APLAS'10), volume 6461 of Lecture Notes in Computer Science, pages 172–187. Springer, 2010.

[25] Jan Hoffmann and Martin Hofmann. Amortized Resource Analysis with Polynomial Potential - A Static Inference of Polynomial Bounds for Functional Programs. In Proc. of the 19th ESOP, volume 6012 of LNCS, pages 287–306. Springer, 2010.

[26] Martin Hofmann and Steffen Jost. Static prediction of heap space usage for first-order functional programs. In Proc. 30th ACM Symp. POPL, pages 185–197, New Orleans, LA, USA, January 2003. ACM Press.

[27] Martin Hofmann and Steffen Jost. Type-based amortised heap-space analysis. In Proc. 15th ESOP, volume 3924 of LNCS, Vienna, Austria, March 2006. Springer.

[28] Patrick Lam, Viktor Kuncak, and Martin Rinard. Crosscutting techniques in program specification and analysis. In Proc. 4th AOSD, pages 169–180, Chicago, Illinois, March 2005. ACM Press, New York.

[29] Xavier Leroy and Hervé Grall. Coinductive big-step operational semantics. Information and Computation, 207(2):284–304, 2009.

[30] Yitzhak Mandelbaum, David Walker, and Robert Harper. An effective theory of type refinements. In Proc. ICFP 2003, pages 213–225, Uppsala, Sweden, August 2003. ACM Press, New York.


[31] John C. Mitchell. Toward a typed foundation for method specialization and inheritance. In Proc. 17th ACM Symp. POPL, pages 109–124, San Francisco, CA, USA, January 1990. ACM Press.

[32] Flemming Nielson, Hanne Riis Nielson, and Chris Hankin. Principles of Program Analysis. Springer Verlag, 1999.

[33] James Noble, Jan Vitek, and John Potter. Flexible alias protection. In ECOOP, volume 1445 of LNCS, pages 158–185, Brussels, Belgium, July 1998. Springer.

[34] Martin Odersky, Martin Sulzmann, and Martin Wehr. Type inference with constrained types. Theory and Practice of Object Systems, 5(1):35–55, 1999.

[35] Jens Palsberg. Type-based analysis and applications. In ACM, editor, ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering: PASTE'01, pages 20–27, New York, NY, June 09 2001. ACM Press.

[36] Benjamin C. Pierce. Types and Programming Languages. MIT Press, 2002.

[37] Michael Schrefl and Markus Stumptner. Behavior-consistent specialization of object life cycles. ACM Trans. Software Eng. and Meth., 11(1):92–148, 2002.

[38] Ulrik P. Schultz. Partial evaluation for class-based object-oriented languages. In Programs as Data Objects (PADO'01), Aarhus, Denmark, number 2053 in LNCS, pages 173–197, 2001.

[39] Christian Skalka. Trace effects and object orientation. In Proceedings of the ACM Conference on Principles and Practice of Declarative Programming, Lisbon, Portugal, July 2005.

[40] Christian Skalka and François Pottier. Syntactic type soundness for HM(X). In Proceedings of the 2002 Workshop on Types in Programming (TIP'02), volume 75 of Electronic Notes in Theoretical Computer Science, Dagstuhl, Germany, July 2002.

[41] Christian Skalka, Scott Smith, and David Van Horn. A type and effect system for flexible abstract interpretation of Java. In Proceedings of the ACM Workshop on Abstract Interpretation of Object Oriented Languages, Electronic Notes in Theoretical Computer Science, Paris, France, January 2005.

[42] Kirsten Lackner Solberg. Annotated Type Systems for Program Analysis. PhD thesis, Odense University, Denmark, July 1995. Also technical report DAIMI PB-498, Comp. Sci. Dept. Aarhus University.


[43] Robert E. Strom and Daniel M. Yellin. Extending typestate checking using conditional liveness analysis. IEEE Trans. Softw. Eng., 19(5):478–485, 1993.

[44] Robert E. Strom and Shaula Yemini. Typestate: A programming language concept for enhancing software reliability. IEEE Trans. Softw. Eng., 12(1):157–171, 1986.

[45] Alfred Tarski. A lattice-theoretical fixpoint theorem and its applications. Pacific Journal of Mathematics, 5(2):285–309, 1955.

[46] Peter Thiemann. A type safe DOM API. In Proc. 10th International Symposium on Database Programming Languages (DBPL'05), volume 3774 of LNCS, pages 169–183, Trondheim, Norway, August 2005. Springer.

[47] David Walker. Substructural type systems. In Benjamin C. Pierce, editor, Advanced Topics in Types and Programming Languages, chapter 1. MIT Press, 2005.

[48] Andrew Wright and Matthias Felleisen. A syntactic approach to type soundness. Information and Computation, 115(1):38–94, 1994.

[49] Hongwei Xi, Chiyan Chen, and Gang Chen. Guarded recursive datatype constructors. In Proc. 30th ACM Symp. POPL, pages 224–235, New Orleans, LA, USA, January 2003. ACM Press.

[50] Hongwei Xi and Frank Pfenning. Dependent types in practical programming. In Proc. 26th ACM Symp. POPL, pages 214–227, San Antonio, Texas, USA, January 1999. ACM Press.

[51] Tian Zhao, Jens Palsberg, and Jan Vitek. Lightweight confinement for Featherweight Java. In Proc. 18th ACM Conf. OOPSLA, pages 135–148, Anaheim, CA, USA, 2003. ACM Press, New York.

