CONTEXT-SENSITIVE POINTER ANALYSIS
USING BINARY DECISION DIAGRAMS
A DISSERTATION
SUBMITTED TO THE DEPARTMENT OF COMPUTER SCIENCE
AND THE COMMITTEE ON GRADUATE STUDIES
OF STANFORD UNIVERSITY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
John Whaley
March 2007
© Copyright by John Whaley 2007
All Rights Reserved
I certify that I have read this dissertation and that, in my opinion, it
is fully adequate in scope and quality as a dissertation for the degree
of Doctor of Philosophy.
(Monica S. Lam) Principal Adviser
I certify that I have read this dissertation and that, in my opinion, it
is fully adequate in scope and quality as a dissertation for the degree
of Doctor of Philosophy.
(David L. Dill)
I certify that I have read this dissertation and that, in my opinion, it
is fully adequate in scope and quality as a dissertation for the degree
of Doctor of Philosophy.
(Dawson Engler)
Approved for the University Committee on Graduate Studies.
Abstract
This thesis shows that whole-program context-sensitive inclusion-based pointer anal-
ysis, a previously intractable problem, can be efficiently solved using binary decision
diagrams. In addition, we show that it is possible to automatically translate from a
high-level analysis specification written in Datalog into an efficient implementation
using binary decision diagrams.
We present the first scalable context-sensitive, inclusion-based pointer analysis for
Java programs. Our approach to context sensitivity is to create a clone of a method for
every context of interest, and run a context-insensitive algorithm over the expanded
call graph to get context-sensitive results. For precision, we generate a clone for every
acyclic path through a program’s call graph, treating methods in a strongly connected
component as a single node. Normally, this formulation is hopelessly intractable as
a call graph often has 10^14 acyclic paths or more. We show that these exponential
relations can be computed efficiently using BDDs.
We also describe bddbddb, a BDD-Based Deductive DataBase, which implements
the declarative language Datalog with stratified negation, totally-ordered finite do-
mains and comparison operators. bddbddb uses binary decision diagrams (BDDs) to
efficiently represent large relations. BDD operations take time proportional to the
size of the data structure, not the number of tuples in a relation, which leads to
fast execution times. bddbddb is an effective tool for implementing a large class of
program analyses. We show that a context-insensitive pointer analysis implemented
with bddbddb is about twice as fast as a carefully hand-tuned version.
Preface
When I’m working on a problem, I never think about beauty. I think only
how to solve the problem. But when I have finished, if the solution is not
beautiful, I know it is wrong.
R. Buckminster Fuller (1895–1983)
Shortly after arriving at Stanford, my adviser Monica Lam challenged me to
“solve” pointer analysis. At that time, the only pointer analysis that could success-
fully scale to large programs was an imprecise equivalence-based pointer analysis. I
played around with various techniques to improve performance and scalability, achieving some moderate success, but a breakthrough remained elusive. The best I was able
to achieve was selective context sensitivity, where a client analysis could selectively
invoke context sensitivity to improve precision on parts of the program where it would
make a difference. Interesting, but not entirely satisfying, and the algorithm was so difficult that few people could understand it.
In 2003, I read a paper by Berndl et al. about implementing a context-insensitive
pointer analysis using binary decision diagrams. What made the paper interesting was
not the experimental results — the BDD implementation was actually slower than a
more traditional implementation — but the fact that the algorithm formulation and
implementation were greatly simplified by using BDDs.
I implemented Berndl's algorithm and began to experiment with it. I soon realized that the key to its performance was its efficient data representation. At that point, I
realized I had been on the wrong path. I had been trying to come up with more
and more complicated algorithms to solve the pointer analysis problem. Instead, I
should have focused on how to represent the points-to data efficiently — an efficient
algorithm would follow from the data representation.
As my experience with BDDs grew, I began to think they could be used to imple-
ment context sensitivity efficiently. The common substructure, efficient computation
on large sets, and memoization features of BDDs all seemed to be good matches
for context-sensitive analysis. My first attempts were unsuccessful, but I persevered
because I believed it could be made to work. And after many months of slow and
methodical progress, I finally had a fully context-sensitive inclusion-based pointer
analysis that could analyze very large Java programs in tens of minutes.
Unfortunately, after all of the optimizations and hacks that were required for good
performance, the implementation bore little resemblance to the high-level algorithm.
The simple elegance of a BDD solution was lost. Furthermore, I had worked on that
one algorithm for nearly a year. There were still many other analyses I wanted to
work on, and I did not relish the idea of spending a year for each one.
At that time, Monica Lam was working on a new edition of the venerated Dragon
book and was struggling with the question of how to present difficult material in an
easy and understandable way. Her desire to make things simpler and more general
for the Dragon book greatly influenced my direction and fundamentally changed my
view of research. I realized that for research to have a significant impact, it should
be easy to understand and applicable to a wide variety of problems.
I started writing a tool that would automatically translate from a high-level algo-
rithm specification into an efficient implementation. Interactions with Thomas Reps
and Jeff Ullman led me to Datalog, which had a rich body of research behind it and
a semantics that closely matched BDDs. I incorporated all of the tricks I had learned
from my year of experience using BDDs for program analysis. My intention
was to abstract away the BDD back-end as much as possible, and allow the user to
simply specify a set of inference rules and have it work efficiently.
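To give a concrete flavor of such inference rules, here is a toy Python sketch of naive Datalog-style evaluation. The rule shown, vP(v1, h) :- assign(v1, v2), vP(v2, h), is the classic assignment rule of an inclusion-based pointer analysis; the relation names here are merely illustrative, and bddbddb itself evaluates such rules as BDD operations over whole relations rather than looping over explicit tuples as this sketch does.

```python
# Toy illustration of naive Datalog-style evaluation (not bddbddb syntax).
# Rule: vP(v1, h) :- assign(v1, v2), vP(v2, h).
# If v1 is assigned from v2 and v2 may point to heap object h,
# then v1 may also point to h. Iterate to a least fixpoint.

def solve(assign, vP):
    """assign: set of (v1, v2) tuples; vP: initial points-to tuples (v, h)."""
    vP = set(vP)
    changed = True
    while changed:
        changed = False
        for v1, v2 in assign:
            for v, h in list(vP):
                if v == v2 and (v1, h) not in vP:
                    vP.add((v1, h))
                    changed = True
    return vP

# x = new A(); y = x; z = y;
points_to = solve({("y", "x"), ("z", "y")}, {("x", "hA")})
print(sorted(points_to))  # [('x', 'hA'), ('y', 'hA'), ('z', 'hA')]
```

The naive iteration recomputes everything each round; a serious implementation uses semi-naive evaluation and, in bddbddb's case, represents each relation as a BDD so that the inner loops become single relational-product operations.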
One of my main goals was for the system to be usable by others. Many
people were intrigued by the possibility of writing program analyses in just a few
lines of code and being able to automatically translate a specification into an efficient
implementation. Early on, I worked closely with other Stanford students, especially
Chris Unkel, Dzintars Avots, Michael Martin, Ben Livshits, Michael Carbin, Jim
Zhuang, and Mayur Naik, who were using the system to implement their own program
analyses. Their feedback led to great improvements in usability.
The tool’s name, bddbddb, came out of an email discussion with Monica Lam and
Chris Unkel. The name was both silly and memorable, and would certainly show up
first on a Google search. In other words, an excellent name. I still get chuckles from
the audience about the name when I do presentations about bddbddb.
My intention with this dissertation is for it to be the most complete and authorita-
tive reference on the bddbddb system and program analysis using Datalog and BDDs.
As with any system, bddbddb is constantly evolving, and any documentation will soon be hopelessly out of date. However, this dissertation will attempt to describe the core
principles and techniques without getting caught up in the particulars. After reading
this dissertation, the reader should have enough knowledge to be able to formulate
their own program analyses in Datalog and build their own BDD-based solver, if they
were so inclined.
I also intended this dissertation to be readable by someone who has some knowl-
edge of compilers and program analysis. Thus, I have tried to include enough back-
ground for someone with the motivation and perseverance to be able to pick up this
dissertation and read it without getting too lost. I don’t fancy myself a writer, but I
have made an effort to make the text and concepts as clear as possible.
Most of the work in this dissertation is my own, with the notable exceptions of
the BDD variable learning algorithm (Section 4.3), which was primarily the work
of Michael Carbin, the reflection algorithm (Section 5.2), which was joint work with
Benjamin Livshits, and the static race detection algorithm (Section 5.3), which was
designed by Mayur Naik. I included these for completeness and as a demonstration
that others have been able to build useful analyses using bddbddb.
I have organized this dissertation into eight chapters. Chapter 1 gives an intro-
duction and describes why I believe program analysis, high-level specification, and
context sensitivity are important areas of research. Chapter 2 gives some background
on the research areas relevant to the rest of the dissertation — namely, program anal-
ysis (2.1), pointer analysis (2.2), Datalog (2.3), binary decision diagrams (2.4), and
context sensitivity (2.5). Chapter 3 describes my approach to pointer analysis and
context sensitivity using BDDs. Chapter 4 describes the bddbddb system in detail.
Chapter 5 describes various applications of the bddbddb system by myself and others.
Chapter 6 presents some experimental results from the system. Chapter 7 compares
my work with related work, and Chapter 8 concludes.
Someone once said, “A thesis is never finished, merely abandoned.” I now realize that the point at which one comes to fully understand and appreciate that statement is the point of enlightenment that allows one to graduate. There are many areas I wanted to pursue and investigate more thoroughly; however, after time and reflection I realized it was better to move on and pursue other directions.
When I consider this dissertation as a whole, I realize it represents a frightening amount of work. Even more frightening, perhaps, is the realization that a large portion of what I worked on during my Masters and Ph.D. — in the areas of profiling [26, 244], dynamic compilation [10, 46, 243, 245], pointer and escape analysis [203, 252, 255], checkpointing [246], component interfaces [254], virtual machines [247, 249], and speculative parallelization [251] — does not even warrant a mention in this dissertation.
But I’ve realized that a Ph.D. is not about the end result, it is about the journey.
John Whaley
March 2007
Acknowledgments
• First off, I must thank my wife Miho, who gives meaning to everything I do.
• I want to thank my adviser Monica Lam for teaching me so much and pushing
me to excellence. Much of the work in this thesis is directly or indirectly due to
her. And never in my wildest dreams would I have imagined that my research
would end up as Chapter 12 in the venerable Dragon book.
• I would also like to thank Martin Rinard, who was the adviser for my Masters
thesis at MIT. He taught me many valuable lessons about research that I still
use today.
• My Ph.D. orals committee — Juan Alonso, David Dill, Dawson Engler, Monica
Lam, and Jeff Ullman — did a stellar job. I was extremely lucky to have such
a superstar committee and I’ll never forget the experience. I am especially
indebted to my dissertation readers Monica, David, and Dawson, who gave me
very fast turnarounds and great comments.
• I would also like to thank all of my coworkers from IBM Watson and IBM Tokyo
who introduced me to compiler research and gave me many opportunities early
in my career.
• I want to give thanks to all of the Stanford students I’ve worked with over
the years: Dzintars Avots, Godmar Back, David Bloom, Michael Carbin, Brian
Carlstrom, Ramesh Chandra, Benjamin Chelf, Andy Chou, Jim Chow, Ben
D’Angelo, Michael Dalton, Seth Hallem, Sudheendra Hangal, David Heine, Ted
Kremenek, Shih-Wei Liao, Amy Lim, Ben Livshits, Michael Martin, Brian Murphy, Mayur Naik, Jim Norris, Jeff Oplinger, Shankar Ponnekanti, Will Robinson, Olatunji Ruwase, Joel Sandin, Costa Sapuntzakis,
Patrick Sathyanathan, Brad Schumitsch, Garrett Smith, Paul Twohey, Chris
Unkel, Yichen Xie, Junfeng Yang, Nickolai Zeldovich, Jim Zhuang, and the
countless others whom I have forgotten to mention.
• I especially want to give thanks to my paper collaborators Michael Martin, Ben
Livshits, Dzintars Avots, Michael Carbin, Chris Unkel, Christos Kozyrakis,
Mayur Naik, and Alex Aiken.
• I want to thank my parents for buying that TI-99/4 back in 1980, which started a lifelong passion for computers and programming.
• I’d also like to thank NSF and Intel for providing financial support for my time
at Stanford.
• Finally, I would like to thank you, the reader. Most Ph.D. theses are never
read after they are signed, and so the fact that you are sitting down and reading this
means you probably think there is some value in it. Although I cannot promise
anything, I will do my best not to disappoint you.
Contents
Abstract iv
Preface v
Acknowledgments ix
1 Introduction 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Challenges . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2.1 Pointer Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2.2 Context-Sensitive Program Analysis . . . . . . . . . . . . . . . 5
1.3 Our Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3.1 Binary Decision Diagrams . . . . . . . . . . . . . . . . . . . . 6
1.3.2 Cloning to Achieve Context Sensitivity . . . . . . . . . . . . . 6
1.3.3 High-level Specification to Efficient Implementation . . . . . . 7
1.4 List of Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.5 Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
2 Background 14
2.1 Program Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
2.1.1 Applications of Program Analysis . . . . . . . . . . . . . . . . 15
2.1.2 Styles of Program Analysis . . . . . . . . . . . . . . . . . . . . 15
2.2 Pointer Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
2.2.1 Types of Pointer Analysis . . . . . . . . . . . . . . . . . . . . 17
2.2.2 Call Graph Discovery . . . . . . . . . . . . . . . . . . . . . . . 21
2.3 Datalog . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
2.3.1 Introduction to Datalog . . . . . . . . . . . . . . . . . . . . . 23
2.3.2 Datalog Semantics . . . . . . . . . . . . . . . . . . . . . . . . 25
2.3.3 Evaluation Strategy . . . . . . . . . . . . . . . . . . . . . . . . 27
2.3.4 Magic Sets Transformation . . . . . . . . . . . . . . . . . . . . 28
2.3.5 Other Datalog variants . . . . . . . . . . . . . . . . . . . . . . 28
2.4 Binary Decision Diagrams . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4.1 BDD Variable Ordering . . . . . . . . . . . . . . . . . . . . . 30
2.4.2 Other Variants . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.5 Context Sensitivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.5.1 Definitions of Context . . . . . . . . . . . . . . . . . . . . . . 33
2.5.2 Techniques for Solving Context-sensitive Analyses . . . . . . . 34
3 Pointer Analysis 35
3.1 Context-Insensitive Pointer Analysis . . . . . . . . . . . . . . . . . . 35
3.1.1 Improving Pointer Analysis with Types . . . . . . . . . . . . . 38
3.2 Call Graph Discovery . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.3 Context-Sensitive Pointer Analysis . . . . . . . . . . . . . . . . . . . 43
3.3.1 Numbering Call Paths . . . . . . . . . . . . . . . . . . . . . . 44
3.3.2 Context-Sensitive Pointer Analysis with a Precomputed Call
Graph . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.4 Object-Sensitive Pointer Analysis . . . . . . . . . . . . . . . . . . . . 50
4 bddbddb: BDD-Based Deductive Database 53
4.1 From Datalog to BDD Operations . . . . . . . . . . . . . . . . . . . . 54
4.1.1 Relational Algebra . . . . . . . . . . . . . . . . . . . . . . . . 54
4.1.2 Boolean Functions . . . . . . . . . . . . . . . . . . . . . . . . 55
4.1.3 BDD Operations . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.2 Translating and Optimizing Datalog Programs . . . . . . . . . . . . . 57
4.2.1 Datalog Source Transformations . . . . . . . . . . . . . . . . . 57
4.2.2 Datalog Rule Optimization . . . . . . . . . . . . . . . . . . . . 58
4.2.3 Intermediate Representation . . . . . . . . . . . . . . . . . . . 61
4.2.4 IR Optimizations . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.2.5 BDD Decision Variable Assignment . . . . . . . . . . . . . . . 63
4.2.6 Additional Optimizations . . . . . . . . . . . . . . . . . . . . 64
4.2.7 Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.2.8 Other Considerations . . . . . . . . . . . . . . . . . . . . . . . 65
4.3 Learning BDD Variable Orderings . . . . . . . . . . . . . . . . . . . . 66
4.3.1 The Problem Space . . . . . . . . . . . . . . . . . . . . . . . . 67
4.3.2 Problem Characteristics . . . . . . . . . . . . . . . . . . . . . 68
4.3.3 Integrating Learning into bddbddb . . . . . . . . . . . . . . . . 69
4.3.4 Algorithm Overview . . . . . . . . . . . . . . . . . . . . . . . 71
4.3.5 Data Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.3.6 Order Constructor . . . . . . . . . . . . . . . . . . . . . . . . 73
4.3.7 Uncertainty Sampler . . . . . . . . . . . . . . . . . . . . . . . 76
4.3.8 Generating a New Order . . . . . . . . . . . . . . . . . . . . . 77
4.4 Querying bddbddb . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.4.1 The Come-from Query . . . . . . . . . . . . . . . . . . . . . . 79
5 Applications for Program Analysis 83
5.1 Queries and Other Analyses . . . . . . . . . . . . . . . . . . . . . . . 84
5.1.1 Debugging a Memory Leak . . . . . . . . . . . . . . . . . . . . 84
5.1.2 Heap Connectivity . . . . . . . . . . . . . . . . . . . . . . . . 84
5.1.3 Finding a Security Vulnerability . . . . . . . . . . . . . . . . . 85
5.1.4 Aliased Parameters . . . . . . . . . . . . . . . . . . . . . . . . 86
5.1.5 Type Refinement . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.1.6 Interprocedural Data Flow . . . . . . . . . . . . . . . . . . . . 87
5.1.7 Context-Sensitive Mod-Ref Analysis . . . . . . . . . . . . . . . 88
5.1.8 Context-Sensitive Type Analysis . . . . . . . . . . . . . . . . . 89
5.1.9 Thread Escape Analysis . . . . . . . . . . . . . . . . . . . . . 90
5.2 Pointer Analysis for Reflection . . . . . . . . . . . . . . . . . . . . . . 93
5.2.1 Reflection Resolution Using Casts . . . . . . . . . . . . . . . . 98
5.3 Static Race Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.3.1 Race Detection Algorithm Overview . . . . . . . . . . . . . . 102
5.3.2 Original-Pairs Computation . . . . . . . . . . . . . . . . . . . 102
5.3.3 Reachable-Pairs Computation . . . . . . . . . . . . . . . . . . 104
5.3.4 Aliasing-Pairs Computation . . . . . . . . . . . . . . . . . . . 106
5.3.5 Escaping-Pairs Computation . . . . . . . . . . . . . . . . . . . 106
5.3.6 Unlocked-Pairs Computation . . . . . . . . . . . . . . . . . . . 109
5.3.7 Summary of Static Race Detection Results . . . . . . . . . . . 112
5.4 Program Analyses using bddbddb . . . . . . . . . . . . . . . . . . . . 112
6 Experimental Results 117
6.1 Pointer Analysis Results . . . . . . . . . . . . . . . . . . . . . . . . . 117
6.1.1 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
6.1.2 Analysis Times . . . . . . . . . . . . . . . . . . . . . . . . . . 120
6.1.3 Evaluation of Results . . . . . . . . . . . . . . . . . . . . . . . 122
6.1.4 Experience . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
6.2 Effectiveness of bddbddb Compilation . . . . . . . . . . . . . . . . . . 126
6.2.1 Comparing Lines of Code . . . . . . . . . . . . . . . . . . . . 127
6.2.2 Comparing Analysis Times . . . . . . . . . . . . . . . . . . . . 128
6.2.3 External Lock and SQL Injection Analyses . . . . . . . . . . . 131
6.3 Reflection Resolution Results . . . . . . . . . . . . . . . . . . . . . . 131
6.3.1 Experimental Setup . . . . . . . . . . . . . . . . . . . . . . . . 131
6.3.2 Evaluation Approach . . . . . . . . . . . . . . . . . . . . . . . 132
6.3.3 Local Analysis for Reflection Resolution (Local) . . . . . . . 134
6.3.4 Points-to Information for Reflection Resolution (Points-to) . 134
6.3.5 Casts for Reflection Resolution (Casts) . . . . . . . . . . . . 136
6.3.6 Achieving a Sound Call Graph Approximation (Sound) . . . 138
6.3.7 Effect of Reflection Resolution on Call Graph Size . . . . . . . 138
6.4 Results from Machine Learning BDD Variable Orders . . . . . . . . . 140
6.4.1 Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
6.4.2 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
7 Related Work 144
7.1 Pointer Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
7.1.1 Scalable Pointer Analysis . . . . . . . . . . . . . . . . . . . . . 145
7.1.2 Context-sensitive Pointer Analysis . . . . . . . . . . . . . . . . 146
7.1.3 BDD-based Pointer Analysis . . . . . . . . . . . . . . . . . . . 147
7.2 BDDs for Program Analysis . . . . . . . . . . . . . . . . . . . . . . . 147
7.2.1 BDDs for Model Checking . . . . . . . . . . . . . . . . . . . . 148
7.2.2 BDDs for Predicate Abstraction . . . . . . . . . . . . . . . . . 148
7.2.3 BDDs as a Data Representation . . . . . . . . . . . . . . . . . 148
7.3 High-level Languages for Program Analysis . . . . . . . . . . . . . . . 149
7.3.1 Constraint Languages . . . . . . . . . . . . . . . . . . . . . . . 149
7.3.2 User-specified Program Queries . . . . . . . . . . . . . . . . . 150
7.3.3 Program Analysis with Databases . . . . . . . . . . . . . . . . 150
7.4 Optimizing Datalog . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
7.4.1 Datalog Implementations . . . . . . . . . . . . . . . . . . . . . 152
7.4.2 Logic programming with BDDs . . . . . . . . . . . . . . . . . 153
7.4.3 Datalog Evaluation Strategies . . . . . . . . . . . . . . . . . . 153
7.5 BDD Optimizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
7.5.1 BDD Libraries . . . . . . . . . . . . . . . . . . . . . . . . . . 154
7.5.2 BDD Variable Ordering . . . . . . . . . . . . . . . . . . . . . 154
7.6 Applications of Program Analysis . . . . . . . . . . . . . . . . . . . . 156
7.6.1 Call Graph Discovery . . . . . . . . . . . . . . . . . . . . . . . 156
7.6.2 Analyzing Reflection . . . . . . . . . . . . . . . . . . . . . . . 158
7.6.3 Finding Program Errors . . . . . . . . . . . . . . . . . . . . . 159
8 Conclusions 161
8.1 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162
8.1.1 Other Data Representations . . . . . . . . . . . . . . . . . . . 162
8.1.2 Beyond Datalog . . . . . . . . . . . . . . . . . . . . . . . . . . 163
8.1.3 Integration in the Software Development Process . . . . . . . 163
Bibliography 164
List of Tables
4.1 The grouping of the data entries in our training set. Each episode is
tied to a particular rule, rule application, and operation within the
rule. Within each episode the order along with its run time is stored. 72
6.1 Descriptions of the benchmarks we used to evaluate our pointer analyses. 117
6.2 Information about the benchmarks we used to test our pointer analyses. 118
6.3 Analysis times and peak memory usages for each of the benchmarks
and analyses. Time is in seconds and memory is in megabytes. . . . . 121
6.4 Results of escape analysis. . . . . . . . . . . . . . . . . . . . . . . . . 124
6.5 Results of the type refinement query. Numbers are percentages.
Columns labeled multi and refine refer to multi-type variables and
refinable-type variables, respectively. . . . . . . . . . . . . . . . . . . 125
6.6 LOC for hand-coded analyses versus lines of Datalog using bddbddb . 127
6.7 Comparison of context-insensitive Java pointer analysis runtimes.
Times are in seconds. . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
6.8 Comparison of context-sensitive Java pointer analysis runtimes. Times
are in seconds. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
6.9 Comparison of C pointer analysis runtimes. Times are in seconds. . . 129
6.10 External lock analysis runtimes. Times are in seconds. . . . . . . . . 130
6.11 SQL injection query results. Times are in seconds. ∞ indicates that
the analysis did not finish. . . . . . . . . . . . . . . . . . . . . . . . . 130
6.12 Summary of information about our benchmarks. Applications are
sorted by the number of lines of code in column 3. . . . . . . . . . . . 132
6.13 Results of resolving Class.forName calls for different analysis versions. 133
6.14 Number of classes and methods in the call graph for different analysis
versions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
6.15 Information about the analyses that we used to evaluate our BDD order
finding algorithm. The four columns of numbers are the number of
rules, relations, and domains in the input Datalog file, and the number
of possible domain orders. . . . . . . . . . . . . . . . . . . . . . . . . 140
6.16 The results of our learning algorithm. The first four columns of num-
bers compare the speed of a random order, an order generated with a
sifting algorithm, our best hand-tuned order, and the order output by
the algorithm. ∞ means that the analysis did not complete because
it ran out of memory. The next four columns give statistics on the
performance of the learning algorithm. . . . . . . . . . . . . . . . . . 142
6.17 A comparison of the run times of our hand-tuned and generated or-
ders for the j pa, j pacs and j paos analyses on several different input
programs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
List of Figures
2.1 An example Datalog program with two minimal solutions. . . . . . . 26
2.2 (a) Binary encoding of a relation. (b) and (c) are BDD encodings
of the relation given by (a) with decision variable orders b1, b2, b3 and
b2, b1, b3, respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
3.1 Example of path numbering. The graph on the left is the original
graph. Nodes M2 and M3 are in a cycle and therefore are placed in
one equivalence class. Each edge is marked with path numbers at the
source and target of the edge. The graph on the right is the graph with
all of the paths expanded. . . . . . . . . . . . . . . . . . . . . . . . . 45
3.2 The six contexts of function M6 in Example 1 . . . . . . . . . . . . . 45
4.1 (a) Predicate dependency graph for Algorithm 1. (b) Breaking the
PDG into SCCs and finding cycles. . . . . . . . . . . . . . . . . . . . 58
4.2 The learning algorithm embedded in Datalog resolution. . . . . . . . 70
4.3 Steps of a learning episode. This corresponds to the contents of the
Active Learner box in Figure 4.2. . . . . . . . . . . . . . . . . . . . . 71
4.4 An example decision tree induced from the training set data in Ta-
ble 4.1 on rule 2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.5 Datalog program generated by bddbddb to compute come-from query
“vP(100, 200) : – ?” on Algorithm 1. The second, third, fourth,
and fifth chunks of rules come from Rules (3.2), (3.4), (3.1), and (3.3)
respectively. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.1 Typical use of reflection to create new objects. . . . . . . . . . . . . . 94
5.2 A fragment of a specification file accepted by our system. A string
identifying a call site to Class.forName is mapped to a class name that
that call may resolve to. . . . . . . . . . . . . . . . . . . . . . . . . . 97
5.3 A case in freetts where our analysis is unable to determine the type
of objects instantiated on line 5 using casts. . . . . . . . . . . . . . . 101
6.1 Reflection resolution using points-to results in
javax.xml.transform.FactoryFinder in the JDK. . . . . . . . . . . . . . . 135
Chapter 1
Introduction
1.1 Motivation
Program analysis is the process of analyzing the behavior of a computer program.
Program analysis has many important applications. It has traditionally been used
primarily for optimizing programs so that they run faster. More recently, it has increasingly been used in tools that aid software development.
Program analysis has the potential to be extremely useful in finding bugs and
security vulnerabilities in software. Research has shown that static analysis can reduce defects by up to a factor of six [136], and that 60% of software faults found in released software products could have been detected with static analysis tools [34]. Forty percent of the faults that can be found through static analysis will eventually become defects in production software [119].
Software bugs are a serious problem. They cost the U.S. economy an estimated
$59.5 billion per year [178]. While not all errors can be removed, more than a third of the cost — an estimated $22.2 billion — could be eliminated by an improved infrastructure that includes static analysis tools [178].
Static analysis brings numerous benefits to the software development process.
Static analysis is attractive as it can catch bugs early in development, before the
software is released. A common rule of thumb in software development is that a
bug that costs $1 to fix before release will cost $100 to fix after release. Thus, it is
very desirable to catch bugs as early as possible. Static analysis can dramatically
improve test coverage, handling difficult corner cases that are tricky to reach with traditional software testing. A sound static analysis can even prove the absence
of particular bugs. Using static analysis during the development process can speed
development and lead to a higher quality product with fewer defects. It allows bugs
to be caught much sooner than with software testing, and also allows easy checking
after source code modifications.
Static analysis is a powerful tool because it helps automate aspects of development and testing that in the past were performed by humans. Society’s
demand for software has exceeded the supply of skilled developers. The time spent
tracking down bugs is expensive, both in monetary cost and in opportunity cost.
Ideally, we would like to reuse code that has already been written; however, this is
difficult because for successful reuse, code must not only be solid and free of bugs,
but also well-documented. Program analysis can help document the behavior of com-
ponents even if the specifications or code comments are out of date. Static analysis
provides us a way to spend computing cycles to make humans more efficient, which
is an especially excellent trade-off as computers become faster.
1.2 Challenges
Despite its promise, program analysis has not substantially improved software reli-
ability in the real world. Software is still buggy and insecure, with dozens of new
serious vulnerabilities being found daily [55]. The reasons for this are numerous, in-
cluding legacy code, inertia, lack of programmer education, market pressures, lack of
practical tools, imprecision leading to high false positive rates, and the relative youth
of program analysis for software reliability as a research area.
One of the biggest issues that has limited the number and impact of program
analysis tools is the fact that program analysis is difficult. The algorithms required for an accurate program analysis are quite intricate. Most problems in
program analysis are undecidable, which means we need to use approximate algo-
rithms that will work well on real programs. Because practical and accurate static
CHAPTER 1. INTRODUCTION 3
analysis is such a difficult problem, you have to be an expert to build a static analysis
tool that achieves good results. Many of the experts are in academia and have only
built research prototypes, not practical tools that are used by the general public.1
The inscrutability of modern compilers is also to blame. Users have little or no
visibility into the internals of the compiler. To most users, the compiler is a magic box
where you put in source code and out comes binaries. The compiler has a remarkable
amount of information about the program it is compiling, but little of that information
is available to the user.
Building a practical program analysis is also made difficult by the fact that most
popular programming languages have a complicated specification, or in some cases no
formal specification at all. In the case of C and C++, the specification that exists does
not match the language as it is actually used in real life; many real programs rely on
unspecified behaviors, and commercial compilers do not even completely implement
the standard. Implementing a program analysis that handles all of the features of C
and C++ is a monumental undertaking.
Another reason static analysis tools have not been as successful as they could
have been is that many of them check only generic errors common to many
applications or to the implementation language, such as null pointer exceptions or
buffer overruns, and do not allow the user to specify their own rules to check[48].
Programs contain many non-generic, application-specific errors, and although they
are often designed with specific rules and invariants in mind, the programmer does
not have an easy way to specify those rules and check them using static analysis.
Static analysis for finding bugs is still a relatively new area of research and has
made a lot of progress over the last decade or so. Nonetheless, the pace of research
could be improved. Considering the amount of infrastructure necessary for building
program analyses, collaboration between researchers is not as common as it should
be. There are many compiler frameworks and everyone has his or her own favorite,
which unfortunately limits the impact of an analysis implementation and makes it
difficult to build upon. Creating an analysis that handles real-world conditions is very
1 There are some notable exceptions such as FindBugs[67], which came from academia and is one of the most practical and widely-used static analysis tools.
difficult and time-consuming, and therefore many researchers just build prototypes
that are not robust, complete, or well-documented enough to release to others. The
algorithms involved are tricky and hard to understand, leading to research factions
that use different terminology and a balkanization of the research space. Finally,
research papers are often missing the implementation details necessary to replicate
the work.
1.2.1 Pointer Analysis
Arguably one of the major hurdles for widespread adoption of program analysis in
software development is analysis imprecision. Imprecision in a static analysis can
lead to false positives, where the tool reports a bug where one does not actually exist.
Programmers tolerate only a small number of false positives from a static
analysis tool before they stop using it. Many practical tools forgo soundness
so as not to inundate users
with too many false warnings[48, 87, 220].
One major source of analysis imprecision comes from the use of pointers. Point-
ers are a feature of almost every major programming language in existence, and so
pointer analysis is essential to the understanding of almost any program. Despite
its importance, pointer analysis is still one of the most vexing problems in program
analysis. The problem of determining where a pointer can point to is undecidable in
general, so the answer is necessarily imprecise.
Pointer analysis algorithms that can scale to large programs are typically too im-
precise to be useful for bug finding. Scalable pointer analyses developed to date are
imprecise because they are either context-insensitive[29, 120, 151, 252] or unification-
based[88, 91]. A context-insensitive analysis does not distinguish between different
calling contexts of a method and allows information from one caller to propagate erro-
neously to another caller of the same method. In unification-based approaches, point-
ers are assumed to be either unaliased or are pointing to the same set of locations[227].
In contrast, inclusion-based approaches are more precise but also more expensive, as
they allow two aliased pointers to point to overlapping but different sets of locations.
Precise context-sensitive inclusion-based pointer analysis algorithms have not been
able to scale to large programs to date.
1.2.2 Context-Sensitive Program Analysis
Context sensitivity is one of the reasons precise pointer analysis is so hard. A context-
sensitive analysis analyzes methods differently depending on their calling context. It is
important for pointer analysis because what a pointer points to in a particular method
invocation depends greatly on the callers currently on the stack. Of course, there
can be an unbounded number of calling contexts in any program, due to recursion.
A reasonable approximation is to keep track of all calls not involved in recursive
cycles. Even then, it is not unusual for real Java applications to have over
10^14 contexts. Coping with this explosion of contexts while still producing a
context-sensitive pointer analysis result is necessary for solving a large
number of interesting problems.
Most context-sensitive program analyses are summary-based[88, 255, 259]. Pa-
rameterized summaries are created for each method and used in creating the sum-
maries of its callers. It is not necessary to represent the results for the exponentially
many contexts explicitly with this approach, because the result of a context can be
computed independently using the summaries. Summary-based algorithms are usu-
ally very complicated, and it is often difficult to concisely summarize the effects of
a method. In any case, to answer queries as simple as “which variables point to a
certain object” can require all the results to be computed, which can be intractable
in a summary-based analysis.
1.3 Our Solution
Clearly, making it easier to specify, develop, and use advanced program analyses
would increase the usage and impact of program analysis on the software develop-
ment process. This dissertation presents bddbddb, which stands for Binary Decision
Diagram-Based Deductive DataBase. In bddbddb, we represent the entire program
and all analysis results as relations in a relational database. Program analyses are
specified in Datalog, a declarative programming language for deductive databases.
Using bddbddb, we were able to solve the important problem of scalable context-
sensitive inclusion-based pointer analysis, a problem for which no other efficient so-
lution is known.
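To give a flavor of this approach, the core of a context-insensitive points-to
analysis can be expressed in a handful of Datalog rules. The fragment below is
illustrative only: the relation names (vP for variable points-to, hP for heap
points-to, and the input relations vP0, assign, store, and load) are chosen for
the example, and rules of this general form are developed in full later in this
dissertation.

```
vP(v, h)      :- vP0(v, h).                                  % v allocates object h
vP(v1, h)     :- assign(v1, v2), vP(v2, h).                  % v1 = v2
hP(h1, f, h2) :- store(v1, f, v2), vP(v1, h1), vP(v2, h2).   % v1.f = v2
vP(v2, h2)    :- load(v1, f, v2), vP(v1, h1), hP(h1, f, h2). % v2 = v1.f
```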
1.3.1 Binary Decision Diagrams
bddbddb is unique in that it uses binary decision diagrams (BDDs) to represent
database relations. BDDs are a data structure traditionally used for hardware
verification and model checking that has since spread to other areas.
The original paper on BDDs is one of the most cited papers in computer science[41].
The strength of BDDs is that they can represent data with high levels of redun-
dancy in a compact manner, as long as the data is encoded in such a way that the
redundancy is exposed. The data generated by program analysis, especially context-
sensitive program analysis, is very often quite regular and so is a good match for
BDDs. BDDs have already been used in a number of program analyses[22, 161, 264],
and more recently for pointer analysis[29, 253, 270, 271].
bddbddb translates each Datalog rule into a series of BDD operations, and then
finds the fix-point solution by applying the operations for each rule until the program
converges on a final set of relations. By using BDDs to represent relations, bddbddb
can operate on entire relations at once, instead of iterating over individual tuples.
Because Datalog operates on entire relations at a time, Datalog rule resolution maps
well onto BDD operations.
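To make the iteration concrete, here is a minimal sketch in Python, with
ordinary sets of tuples standing in for BDD-backed relations and invented
relation names, of evaluating a single transitive-flow rule,
vP(v1, h) :- assign(v1, v2), vP(v2, h), to a fix point:

```python
def solve_points_to(vP0, assign):
    """Apply the rule vP(v1,h) :- assign(v1,v2), vP(v2,h) until no new
    tuples are derived. bddbddb performs the same kind of iteration, but
    each rule application is a single relational (BDD) operation over a
    whole relation rather than a loop over individual tuples."""
    vP = set(vP0)  # vP(v,h) :- vP0(v,h): initial allocation facts
    changed = True
    while changed:
        changed = False
        # join assign(v1,v2) with vP(v2,h), then project onto (v1,h)
        derived = {(v1, h) for (v1, v2) in assign
                   for (v, h) in vP if v == v2}
        if not derived <= vP:
            vP |= derived
            changed = True
    return vP
```

A BDD-based evaluator replaces the set comprehension with a relational product
and projection, and the subset test with a BDD equality check.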
1.3.2 Cloning to Achieve Context Sensitivity
Furthermore, using BDDs to represent program analysis relations allows us to use an
easy formulation of context sensitivity. We formulate context sensitivity as cloning,
whereby results for all calling contexts are computed and explicitly represented[253].
In a cloning-based analysis, every distinct calling context invokes a different instance,
thus preventing information from one context from flowing to another. Note that cloning
does not clone the code per se; it simply produces a separate answer for each clone.
Normally, cloning is intractable because the number of reduced call paths in a program
grows exponentially with the number of methods, and a cloning-based approach must
compute the result of every one of these contexts. Earlier attempts at cloning for
context sensitivity have only shown results on small programs[86]. Realistic programs
have many contexts; for example, the megamek application has over 10^14 contexts (see
Section 6.1.1). However, by using the right encoding, BDDs can efficiently represent
analysis results for very large numbers of contexts.
Cloning makes generating context-sensitive results algorithmically trivial: We can
simply apply a context-insensitive algorithm to the cloned program to obtain context-
sensitive results. Cloning also makes it easy to extract information from the analysis.
By just writing a simple Datalog query, users can extract information for a partic-
ular context, or even ask under which contexts an analysis fact is true. Cloning
and deductive databases provide an easy and convenient abstraction for specifying
context-sensitive analyses and for using context-sensitive analysis information. In
fact, using cloning and BDDs we are able to solve context-sensitive problems for
which no other efficient implementation is known.
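The number of clones involved can be computed directly from the call graph:
once strongly connected components are collapsed, the number of contexts of a
method is the sum of the context counts of its callers. The short Python
sketch below (an invented helper for illustration) also shows why the count
grows exponentially when call-graph "diamonds" are chained:

```python
def count_contexts(callers, topo_order):
    """Number of clones per method when cloning once per acyclic call
    path. `callers` maps each method to its callers in a call graph
    whose strongly connected components have been collapsed;
    `topo_order` lists methods with callers before callees."""
    n = {}
    for m in topo_order:
        preds = callers.get(m, [])
        # a root method has one context; otherwise, one clone per
        # context of each caller
        n[m] = sum(n[c] for c in preds) if preds else 1
    return n
```

Each diamond (two callers sharing a callee) doubles the count of everything
below it, which is how realistic programs reach 10^14 contexts.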
1.3.3 High-level Specification to Efficient Implementation
One very compelling aspect of bddbddb is that it gives the user an automatic way to
translate from a high-level analysis specification written in Datalog into an efficient
implementation. This has numerous benefits, as listed below:
Analysis implementation is greatly simplified.
Analyses expressed in a few lines of Datalog can take hundreds to thousands of lines of
code in a traditional language. By automatically deriving the implementation from a
Datalog specification, we introduce fewer errors. It also makes it much easier to make
changes to an analysis, as you only need to change the inference rules and not worry
about the implementation details. We have found that Datalog algorithms are much
easier to understand and maintain partly because the whole algorithm can easily fit
on a single page, so you can look at the whole thing at once.
Datalog with bddbddb has made the process of developing a program analysis a
lot more enjoyable. Rather than spending weeks implementing and reimplementing
a program analysis, we can just write a few Datalog rules and experiment with it
immediately. This has led us to try many more ideas and variations than if we had
implemented the analysis in a traditional way.
Using an easy high-level language like Datalog also makes it feasible for non-
experts to write their own analyses. The concept of storing the program and analysis
results in a relational database is a natural one and easily understood, even by people
who have little experience in program analysis. Thus, programmers can potentially
write their own rules to check the properties of their particular programs, without
having to deal with the complexities of implementing their own static analysis or fine-
tuning a BDD analysis implementation. Domain experts can write Datalog rules to
find bugs in programs in that domain. For example, an expert in Java web application
security wrote a query to find SQL injection vulnerabilities that takes advantage of
very precise pointer analysis information[160].
The specification is the implementation.
When using bddbddb, the Datalog analysis specification is what is actually being ex-
ecuted. This means that it is much easier to generate a correct, bug-free analysis by
using bddbddb than by implementing the analysis in a traditional way. It also quickly
exposes bugs and holes in the analysis specification, as they become readily apparent
once you experiment with an actual implementation of the inference rules. In quite a
few cases, after implementing an algorithm in bddbddb from the formal specification,
we discovered the formal specification was incomplete or just plain wrong. In other
cases, we discovered hidden assumptions that were not stated. Making the
specification the same as the implementation makes it much easier to find
specification bugs and hidden assumptions.
Makes it easier to collaborate with others.
Specifying analyses in Datalog and executing them on a common platform like
bddbddb makes them much easier for others to use and extend. Datalog
specifications are easier to
understand and build upon than traditional implementations. Implementing program
analyses in Datalog is much more portable than implementing them in a traditional
language. An analysis is no longer tied to a particular compiler infrastructure; any
front-end will do as long as it can translate the input program into the input relations
required for the analysis. In fact, many analyses written in Datalog can be adapted to
new source languages very easily. A precise, executable specification in a language like
Datalog also makes the algorithm much clearer, because it clarifies the assumptions
of the analysis. An executable specification contains many more specifics of how the
analysis is implemented and the details of how it handles corner cases.
Furthermore, expressing analysis results as relations in a relational database
provides a uniform representation that is independent of the analysis
specifics. It makes it easy to use the results of other program analyses, even
if you do not understand their implementation.
Allows high-level transforms of program analyses.
By specifying a program analysis in a uniform high-level language like Datalog, we
can easily transform the analysis to change its behavior or to add new features. One
example of this is to add context numbers to an existing context-insensitive analysis
to make a context-sensitive version. Another example is the magic-sets
transformation, which can automatically make an analysis demand-driven[25, 201].
We present a third example in Section 4.4, the Datalog come-from query, which
automatically transforms an analysis specification into one that tells exactly
why a given analysis fact is true.
bddbddb generates efficient analyses.
The concept of formulating data-flow analysis in compilers as a database query is
not new. It was first proposed by Ullman[238]. In 1994, Reps investigated the use
of deductive database queries as a means of defining demand-driven interprocedural
data-flow analysis[200, 201]. However, implementations using logic programming sys-
tems are often slower than traditional implementations and can have difficulty scaling
to large programs. Reps reported an experiment using Coral, a general-purpose logic
programming system, to implement on-demand interprocedural reaching definitions
analysis[201]. It was found that the logic programming approach was six times slower
than a native C implementation. Dawson et al. used Prolog to perform groundness
analysis on logic programs and strictness analysis on functional programs[81]. Using
the XSB system, which has better efficiency than Coral[213], they were able to analyze
a number of programs efficiently. However, the programs they analyzed were small
— under 600 lines of code. Other recent work by Liu and Stoller on efficient Datalog
appears promising, but they do not present any performance results[156].
bddbddb differs in that it has been shown to generate highly efficient analysis
implementations that beat their hand-coded counterparts. We have several years
of experience in developing BDD-based program analyses and we have encoded our
knowledge and experience in the design of the tool. bddbddb uses a large number of
optimizations drawn from the database, compiler, BDD, and machine learning fields.
We have found the implementations generated by bddbddb to be very efficient, even
beating hand-coded analyses that have been carefully tuned for months[250]. This is
because bddbddb takes advantage of optimization opportunities that are too difficult
or tedious to do by hand.
bddbddb owes its efficiency to the power of the BDD data structure and the pres-
ence of highly-optimized BDD libraries. Berndl et al. showed that BDDs can be
used to implement context-insensitive inclusion-based pointer analysis efficiently[29].
Their work showed that a BDD-based implementation could be competitive in per-
formance with a traditional implementation. Zhu also investigated using BDDs for
pointer analysis[270, 271]. Beyond solving existing program analyses, by using BDDs
we can solve new, harder program analysis problems for which there are no other
known efficient algorithms.
1.4 List of Contributions
This dissertation makes the following contributions:
1. A context-sensitive inclusion-based pointer analysis that scales to
large programs.
This is the major contribution of this dissertation. We present the first context-
sensitive inclusion-based pointer analysis that scales to large programs. The
algorithm we have developed is remarkably simple. We first create a cloned
call graph where a clone is created for every distinct calling context. We then
run a simple context-insensitive algorithm over the cloned call graph to get
context-sensitive results. We handle the large number of contexts by represent-
ing them in BDDs and using an encoding scheme that allows commonalities
among similar contexts to be exploited. We show that it is possible to compute
context-sensitive points-to results for over 10^14 contexts.
2. Demonstration that program analysis written in Datalog can be effi-
ciently performed using BDDs.
We show how program analysis written in Datalog can be efficiently performed
using BDDs. Whereas previous work shows that there is a penalty in writing
program analysis as database operations, we show that a BDD implementation
of Datalog for program analysis can be very efficient. We give examples of
some sophisticated program analyses, such as context-sensitive pointer analy-
sis, and show that the Datalog implementation runs faster than a hand-coded,
hand-tuned implementation.
3. Establishment of Datalog as an effective high-level language for BDD-
based program analyses.
We present numerous examples of program analyses implemented in Datalog.
We show that the Datalog programs are orders-of-magnitude shorter and easier
to write. They are so succinct and easy to understand that we use them to
explain all our algorithms here directly. All the experimental results reported in
this dissertation are obtained by running the BDD programs automatically generated
by bddbddb.
4. A complete description of the bddbddb system.
This dissertation contains a complete description of the bddbddb system. We
describe in detail how bddbddb translates from Datalog into efficient, optimized
BDD operations, and report on the performance gains due to various optimiza-
tions. We describe some of the advanced query features such as the come-from
query. We also describe an active learning algorithm developed with Michael
Carbin to automatically find effective BDD variable orders and give performance
results showing its effectiveness.
5. Insights into using BDDs for program analysis.
Before building bddbddb, we had amassed considerable experience in developing
BDD-based program analyses. Much of that knowledge went into the design of
the tool and our algorithms. This dissertation shares many of those insights,
which should be of interest to anyone who uses BDDs for program analysis.
6. Complete Datalog implementations of many program analyses.
Besides the context-sensitive pointer analysis, we give examples of many other
analyses implemented in Datalog, including algorithms that utilize context-
sensitive pointer analysis results, other types of context-sensitive algorithms
such as context-sensitive type analysis and object-sensitive analysis, and so-
phisticated and practical algorithms such as an algorithm for static race detec-
tion. Some of the algorithms were designed by the author; others are based
on the work of other researchers. All of the analyses are specified in just
a few Datalog rules and are reprinted in their entirety; the Datalog programs
are more or less identical to those supplied to bddbddb, modulo some minor
syntactic differences.
7. Experimental results on a variety of program analyses.
We present experimental results on a variety of program analyses over multiple
input programs that show that bddbddb is effective in generating BDD analyses
from Datalog specifications. In particular, we compare bddbddb to some hand-
coded, hand-optimized BDD program analyses and show that bddbddb is twice
as fast in some cases, while also being far easier to write and debug.
We also present the analysis time and memory usage of some of the analyses
presented in this dissertation across 21 of the most popular Java applications
on Sourceforge. Our context-sensitive pointer analysis can analyze even the
largest of the programs in under 19 minutes. We also compare the precision
of context-insensitive pointer analysis, context-sensitive pointer analysis and
context-sensitive type analysis, and show the effects of merging versus cloning
contexts.
1.5 Organization
The remainder of this dissertation is organized as follows. Chapter 2 gives some
background on the research areas relevant to the rest of the dissertation — namely,
program analysis (2.1), pointer analysis (2.2), Datalog (2.3), binary decision dia-
grams (2.4), and context sensitivity (2.5). Readers with prior knowledge may want to
skim this chapter. Chapter 3 describes my approach to pointer analysis and context
sensitivity using BDDs. Chapter 4 describes the bddbddb system in detail. Chapter 5
describes various applications of the bddbddb system by myself and others. Chapter 6
presents some experimental results from the system. Chapter 7 compares my work
with related work, and Chapter 8 concludes.
Chapter 2
Background
This chapter provides the background necessary for the understanding of the remain-
der of this thesis. We define the terminology and conventions that we use. We give
an overview and discuss the research areas that are related to the contents of this the-
sis. We also refer the reader to appropriate source material that provides additional
background information.
This chapter is split up into five sections, giving background on the areas of
program analysis (2.1), pointer analysis (2.2), Datalog (2.3), binary decision dia-
grams (2.4), and context sensitivity (2.5). Readers with prior knowledge of these
areas may want to skim this chapter and refer back to it later if they encounter
something they do not understand.
2.1 Program Analysis
Program analysis is the process of analyzing the behavior of a computer program.
This thesis focuses on static program analysis, which is the process of analyzing the
source code of a program to deduce its behavior without actually running it. This
technique contrasts with dynamic program analysis, which analyzes a program as it
executes.
Most static program analyses are “sound”, which means they give a correct, but
perhaps conservative, answer to the problem.1 An “unsound” analysis, on the other
hand, can give incorrect results. Dynamic analyses are necessarily unsound, as
they can only gain information from the particular executions they witness, which
will not be exhaustive for any non-trivial program.
2.1.1 Applications of Program Analysis
Program analysis has many applications. Traditionally, program analysis was used
for optimizing programs. Most program analysis research prior to 2000 was focused
on analyzing programs to make them run faster. More recently, program analysis
has increasingly been applied to software quality; examples include using program
analysis for program understanding, debugging, testing, and reverse engineering.
2.1.2 Styles of Program Analysis
There are a few common styles of program analysis. The first and one of the oldest
is dataflow analysis[139]. Dataflow analysis attempts to compute the possible set
of values at all program points in a control flow graph. It operates by setting up
dataflow equations at every node in the control flow graph and iterating until the
analysis reaches a fix-point solution. To guarantee termination, the transfer
functions must be monotone and the lattice of values must have finite
descending chains[5]. Dataflow analysis computes
the maximum fix-point (MFP) solution, which is less precise than the meet-over-paths
(MOP) solution if the transfer functions are not distributive.
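As a concrete, if simplified, illustration, the following Python sketch runs a
forward "may" dataflow (reaching definitions with gen/kill transfer functions
and set union as the meet) to its fix point. The CFG encoding and names are
invented for the example:

```python
def dataflow_fixpoint(nodes, succ, gen, kill):
    """Iterate out[n] = gen[n] | (in[n] - kill[n]) over a control flow
    graph until nothing changes. Termination is guaranteed because the
    transfer functions are monotone and the powerset lattice of
    definitions has finite chains."""
    in_ = {n: set() for n in nodes}
    out = {n: set() for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            # meet: union of out-sets of all predecessors
            new_in = set().union(*(out[p] for p in nodes
                                   if n in succ.get(p, ())))
            new_out = gen[n] | (new_in - kill[n])
            if new_in != in_[n] or new_out != out[n]:
                in_[n], out[n], changed = new_in, new_out, True
    return in_, out
```

This round-robin iteration computes the MFP solution described above; with
distributive transfer functions such as gen/kill, it coincides with the MOP
solution.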
Abstract interpretation is another common style of program analysis. Abstract
interpretation attempts to capture information about the program semantics by sym-
bolically executing the program[71]. Abstract interpretation has traditionally been
very accurate but very expensive, and has typically been used only on small program
fragments.
Another common style of program analysis is constraint-based program
analysis[8]. Constraint-based analysis operates by setting up a system of constraints
1 This definition of soundness is unfortunate as it does not match that used in other areas such as mathematics; in other areas, this notion would be called “complete”. I have decided to go with the word “sound” in order to be consistent with the rest of the program analysis community.
and then using a constraint solver to generate a solution. The most common con-
straints are equality constraints, which can be solved with a near-linear-time union-
find data structure, and inclusion constraints, which require O(n^3) time in the worst
case. Constraint-based analyses are most typically used in flow-insensitive analyses,
as they do not need to take into account the control flow of the program, and can
typically handle an entire program at a time.
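For example, a system of equality constraints can be solved with the classic
union-find data structure: one union per constraint and one find per query.
The minimal Python sketch below uses path compression (and, for brevity, omits
union-by-rank), which in practice gives near-linear behavior:

```python
class UnionFind:
    """Union-find over arbitrary hashable keys with path compression.
    Solving 'a = b' equality constraints is union(a, b); asking whether
    two terms are in the same equivalence class compares their finds."""
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            # path compression: point x at its grandparent as we walk up
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)
```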
Model checking attempts to check whether a given model (the program to be
analyzed) satisfies some logical formula (the program property to check)[45]. The
analysis starts in some initial state, and a transfer function encodes the possible
transitions between states. Model checking works by exploring the states to check
if there is a state where the formula is satisfied. Because model checkers typically
cannot explore the entire state space, they are usually unsound.
One big difference between these program analysis techniques is the scope of the
analysis. Program analyses can be intraprocedural, which means they analyze the
program using only information that is local to a single procedure, or interproce-
dural, which means they analyze multiple procedures at once. Dataflow and ab-
stract interpretation are typically only done at an intraprocedural level, whereas the
constraint-based and model checking approaches are typically interprocedural.
2.2 Pointer Analysis
Pointer analysis (also called points-to analysis) is a static program analysis used to
determine which pointers can point to which locations. It has been the subject of a
great deal of compiler research over the last two decades, with hundreds of papers published
on the subject[123]. The first pointer analysis was published by Weihl in 1980[241].
Pointer analysis is important to the understanding of any program that uses point-
ers. Without knowing the locations that a pointer can point to, an analysis is missing
a huge amount of information. When encountering a pointer dereference, an analysis
without good pointer information must either be very conservative or unsound.
Unfortunately, the problem is undecidable in general, even when the program
has only a single procedure and no dynamic allocation[56]. When the problem is
restricted to be flow-insensitive, the problem is “only” NP-hard[127]. Thus, pointer
analysis algorithms must use heuristics and imprecise algorithms that work well on
real programs.
Pointer analysis had acquired something of a bad reputation, as it was notoriously
complicated and expensive with little realized benefit in terms of optimization[207]. For
most traditional optimizations, the accuracy of pointer analysis had little effect on
their efficacy. Whether a pointer pointed to five locations or five hundred locations
made little difference to most optimizations. However, accurate pointer analysis is es-
sential when it comes to analyses that require deeper code understanding; for example,
in sophisticated bug finding tools[159], automatic parallelization[208], or specialized
analyses like escape analysis[255].
2.2.1 Types of Pointer Analysis
Different pointer analyses can be categorized by their sensitivity in different
dimensions[123, 211]. We go through each of these dimensions in turn.
Flow Sensitivity
Flow sensitivity refers to whether the analysis information is different at each program
point, or if there is a single copy of the information for all program points. Flow-
insensitive analyses compute one solution for the whole program, whereas a flow-
sensitive analysis computes a solution for each program point. Thus, flow-insensitive
analyses are usually less precise but more efficient than flow-sensitive analyses.
In terms of pointer analysis, flow sensitivity is important because it allows strong
updates[259]. A strong update is when a store instruction overwrites the previous
contents of the destination. A strong update is only possible if the destination of
the store instruction is a single unique location, also known as must-points-to. Flow
sensitivity is required to distinguish between before and after the store instruction.
One can obtain much of the benefit of flow sensitivity by first performing an SSA
transformation on the input program, then performing a flow-insensitive analysis on
the transformed program[118]. An SSA transformation transforms the program so
that each variable has only a single assignment statement[73]. Doing such a trans-
formation gains some of the benefit of flow sensitivity, most specifically for local
variables, without the cost and complexity of a fully flow-sensitive analysis.
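The renaming idea is easy to see on straight-line code, where each assignment
simply defines a fresh version of its variable. The Python sketch below uses
an invented statement encoding; real SSA construction must additionally place
phi-functions at control-flow join points[73]:

```python
def ssa_rename(stmts):
    """Rename straight-line statements (dest, [sources]) into SSA form:
    each assignment defines a fresh version of its destination, and each
    use refers to the most recent version (version 0 denotes the value
    live on entry)."""
    version = {}
    renamed = []
    for dest, srcs in stmts:
        new_srcs = [f"{s}{version.get(s, 0)}" for s in srcs]
        version[dest] = version.get(dest, 0) + 1
        renamed.append((f"{dest}{version[dest]}", new_srcs))
    return renamed
```

After renaming, a flow-insensitive analysis that lumps all assignments to a
variable together automatically distinguishes values before and after each
local redefinition.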
Path Sensitivity
Beyond flow sensitivity is path sensitivity. Path sensitivity gives not only different
results for each program point, but different results for each control flow path to reach
that point[35]. As such, path sensitivity has an exponential (or, in the case of loops,
unbounded) nature and is therefore only practiced in a limited manner, for example
on a single method at a time[260] or to discover a specific property[76].
Context Sensitivity
One of the most important aspects of pointer analysis is context sensitivity. This
topic is so large we have devoted an entire section to it; see Section 2.5. Briefly,
context sensitivity refers to whether an analysis has different results for a method
depending on its calling context. As methods are often called with different pointers,
some form of context sensitivity is essential to the accuracy of a pointer analysis.
Inclusion versus Unification
There are two techniques used in a flow-insensitive pointer analysis to represent con-
straints. The first is to use inclusion constraints, which means two different locations
can point to different but overlapping sets of locations[12]. An assignment statement
“x = y;” would be interpreted as “the points-to set of x includes everything that y
points to”.
A second technique is to use unification constraints, which means that two different
locations that may point to overlapping sets of locations are collapsed into a single
equivalence class and become indistinguishable[227]. Thus, an assignment statement
“x = y;” merges the points-to sets of x and y, along with the points-to sets of *x
and *y, **x and **y, etc. Also, in a unification-based analysis assignment statements
lose their directionality, and so the statement “x = y;” is equivalent to the statement
“y = x;”.
Inclusion constraints are precise because they can accurately represent overlap-
ping sets of locations. However, they can be quite expensive as it can take a large
amount of memory to store the overlapping sets. Inclusion constraints also require
computation of transitive closure, which takes quadratic time. Unification constraints
are much more efficient because they can be solved with a near-linear-time union-find
data structure[232]. However, they are much less precise because a single pointer
imprecision can cause otherwise distinct pointers to become indistinguishable.
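The contrast between the two kinds of constraints can be sketched as follows (a toy Python model, not code from this thesis; the statements x = a; x = b; y = a; and the object names are invented for illustration):

```python
def andersen(copies, allocs):
    """Inclusion: 'dst = src' means pts(dst) includes pts(src); fixpoint."""
    pts = {v: set(objs) for v, objs in allocs.items()}
    changed = True
    while changed:
        changed = False
        for dst, src in copies:
            add = pts.get(src, set()) - pts.setdefault(dst, set())
            if add:
                pts[dst] |= add
                changed = True
    return pts

class UnionFind:
    def __init__(self):
        self.parent = {}
    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x
    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def steensgaard(copies, allocs):
    """Unification: 'dst = src' merges the classes of dst and src."""
    uf = UnionFind()
    for dst, src in copies:
        uf.union(dst, src)
    pts = {}
    for v, objs in allocs.items():
        pts.setdefault(uf.find(v), set()).update(objs)
    return lambda v: pts.get(uf.find(v), set())

allocs = {"a": {"o1"}, "b": {"o2"}}
copies = [("x", "a"), ("x", "b"), ("y", "a")]   # x = a; x = b; y = a;
print(andersen(copies, allocs)["y"])     # {'o1'}: y stays distinct from x
print(steensgaard(copies, allocs)("y"))  # {'o1', 'o2'}: y merged with x
```

The unification solver is near-linear thanks to union-find, but merging x with both a and b drags y into the same equivalence class, illustrating how one imprecision makes otherwise distinct pointers indistinguishable.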
Some analyses use a hybrid technique of both inclusion and unification constraints.
The one-level-flow algorithm uses inclusion constraints for assignments, but unifica-
tion constraints everywhere else[75]. It is able to achieve some of the benefit of
inclusion-based analysis with an analysis cost only slightly higher than that of a
completely unification-based analysis. More object-oriented languages such as Java
may require multiple levels of inclusion constraints in order to achieve satisfactory
precision[77].
Naming Objects
Another important dimension of pointer analysis precision is how the analysis names
objects. The resolution of object names can have a huge effect on analysis precision.
The typical technique for naming an object is to name it by the call site that
allocates it. However, this has some major disadvantages. If the program creates
objects in a factory method, all objects allocated there will have the same creation
site and therefore will be indistinguishable to the analysis. Likewise, many object-oriented programs
are encapsulated, where an outer object creates and maintains ownership of an inner
object. In such a case, inner objects that belong to different outer objects will be
indistinguishable. Allocation sites in loops can also be a source of imprecision.
Some simple techniques can go a long way to improving precision on object names.
Inlining of factory methods, which can often be identified with simple heuristics, and
inlining of constructors, where inner objects are often allocated, solve part of the
problem. Loop unrolling can also help imprecision caused by allocation sites in loops.
Other more sophisticated techniques take into account the calling context of the
object creation site. k-l-CFA is a variant of k-CFA, described in Section 2.5.1. It uses
a constant l levels of call site context[104, 181]. In object-sensitive analysis, objects
can also be named by the object that created them[171].
Changing the meaning of object names can also give interesting analysis variants.
For example, if we choose to name objects by their type rather than by their creation
site, our pointer analysis becomes a type analysis. All of the same dimensions of
sensitivity still apply. This shows the duality between pointer analysis and type
analysis.
Field Sensitivity
Another dimension of pointer analysis sensitivity is the question of how the analysis
deals with aggregate objects. C pointer analyses are typically field-insensitive —
they combine the analysis information for all fields of an object, leading to significant
imprecision[189, 204]. For example, in a field-insensitive analysis, x.f and x.g would
be indistinguishable from x. The reason most C pointer analyses are field-insensitive is
that C supports pointer arithmetic and is not type-safe, so correctly distinguishing
between field accesses in a sound manner is nontrivial[16, 265].
Field-sensitive analyses distinguish between the different fields of an object. Most
pointer analyses in type-safe languages such as Java are field-sensitive. However,
most analyses are not array-element-sensitive — they simply treat all elements of an
array as a single location. More sophisticated analyses keep track of information for
distinguishing array accesses, such as stride-offset[259].
Another technique is field-based analysis, where all identically-named fields are
merged across all objects[120]. For example, in a field-based analysis x.f and y.f
would be considered identical. This technique is rather imprecise as it does not
distinguish between different heap objects of the same type; however, it can be made
exceptionally fast, as the number of distinct locations the analysis must deal with is
drastically reduced.
Of all these techniques, only field-sensitive analyses allow for strong updates when
storing to a field. This is because the other techniques are too imprecise to determine
if a field access is referring to a single unique location.
2.2.2 Call Graph Discovery
Whole-program analyses require a call graph. For an analysis to be sound, it must
have the complete call graph. However, an imprecise call graph can lead to longer
analysis times and less accuracy. Language features such as indirect calls, virtual
calls, and reflection can make constructing an accurate call graph difficult.
There are two techniques for dealing with call graphs — analyses can either re-
quire a call graph to be supplied, for example by an earlier analysis pass, or they can
compute it on-the-fly based on the analysis information computed thus far. Comput-
ing on-the-fly gives better accuracy but adds complexity to the analysis[103]. It can
also require more iterations to converge, leading to longer analysis times.
Precomputing a call graph
There are a few common techniques to use when generating an initial call graph.
The simplest technique is to just include everything; all functions are assumed to
be reachable and indirect calls are assumed to be able to call any compatible target
function. This is obviously very imprecise. Class hierarchy analysis (CHA) simply
analyzes the class hierarchy of the program to determine possible targets to virtual
calls[83]. This is slightly better than including everything.
Rapid type analysis (RTA) finds all instantiated types in the program and only al-
lows calls to those types[18]. This can be thought of as an extremely simplified pointer
analysis where all locations in the program are abstracted by a single node and objects
are named by their type. We can also use a relatively inexpensive pointer analysis,
such as a context-insensitive pointer analysis, to obtain an accurate call graph[29].
Call graph accuracy does not improve appreciably with context sensitivity[253].
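A minimal sketch of the RTA idea, under a hypothetical encoding of the class hierarchy and the set of instantiated classes (not the thesis's data model):

```python
def rta_targets(receiver_type, subtypes, instantiated):
    """Resolve a virtual call on a static receiver type to instantiated
    subclasses only; CHA would instead allow every compatible subclass."""
    compatible = {receiver_type} | subtypes.get(receiver_type, set())
    return sorted(compatible & instantiated)

# Hypothetical hierarchy: Circle, Square, Triangle all extend Shape,
# but the program only executes 'new Circle()' and 'new Square()'.
subtypes = {"Shape": {"Circle", "Square", "Triangle"}}
instantiated = {"Circle", "Square"}

print(rta_targets("Shape", subtypes, instantiated))  # ['Circle', 'Square']
```

CHA would report all three subclasses as possible targets; restricting to instantiated types prunes Triangle, which can never be a receiver at run time.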
Generating a call graph on-the-fly
A more precise technique is to generate the call graph as the analysis runs, using the
current points-to information to resolve indirect and virtual calls[103, 151, 252]. In
this case, the analysis starts with a set of root methods that are the entry points
of the program. Using the current points-to information, it calculates possible call
targets and adds the target methods to the analysis. As the analysis encounters new
target methods and call edges, it adds the constraints to model those target methods
and call edges.
Besides making the analysis more complex, generating the call graph on-the-fly can
also increase the number of iterations necessary for convergence. Some cloning-based
analyses also use a preexisting call graph to calculate an efficient representation[253];
performing call graph discovery on-the-fly can lead to a less efficient representation.
Complexities in call graph generation
Beyond virtual calls, there are numerous complexities that make it difficult to generate
an accurate call graph. One complexity is the feature of reflection. Reflection allows
the programmer to perform operations dynamically by specifying their names[90, 106].
For example, using reflection a programmer can load a class, create a new object,
read or write to a field, or call a method simply by specifying its name. Determining
the target of a reflective call is obviously an undecidable problem in general. However,
many uses of reflection are stylized and therefore we have a hope of analyzing them
accurately[157].
Another complication in call graph generation of dynamic languages is the problem
of downloaded or dynamically generated code. Java and other dynamic languages
allow the user to download code off of the network or generate it dynamically and
execute it. This makes it very difficult or impossible to analyze the whole program
before executing it. Another problem is native code, where some of the code is in a
different language or even just the binary code. Few analyses actually analyze the
native code portion[247]; most simply use a predefined model for common native
method calls, or just ignore them altogether.
One big problem when doing static analysis of libraries is that we may not have
any code that calls the library or the code may be incomplete. In this case, it may be
possible to automatically generate a driver that calls the methods in the interface with
appropriate arguments[177]. Generating a driver that captures all of the possibilities
with respect to the analysis in question is a difficult problem.
Most analyses simply ignore all of these complications, which leads them to be
both incomplete because they miss parts of the call graph and unsound because some
operations are ignored.
2.3 Datalog
In this section, we give a brief introduction to Datalog, the language we use to specify
program analyses.
2.3.1 Introduction to Datalog
Datalog is a logic programming language for deductive databases. It was first
formalized by Ullman[238], but it has its roots much further back, at a 1978 workshop
organized by Gallaire and Minker[97, 174]. A comprehensive review was published by
Ceri[53].
A Datalog program consists of a set of rules, written in a Prolog-style notation,
where a predicate is defined as a conjunction of other predicates. For example, the
Datalog rule
D(w, z) :- A(w, x), B(x, y), C(y, z).
says that “D(w, z) is true if A(w, x), B(x, y), and C(y, z) are all true.” Variables
in the predicates can be replaced with constants, which are surrounded by double-
quotes, or don’t-cares, which are signified by underscores. Predicates on the right
side of the rules can be inverted.
In the example above, the predicate to the left of the :- symbol is called the head
predicate. Predicates to the right of the :- symbol are called subgoal predicates.
Datalog operates on relations. Conceptually, a relation is a two-dimensional table.
The columns are the attributes, each of which has a domain defining the set of possible
attribute values. The rows are the tuples of attributes that share the relation. If tuple
(x, y, z) is in relation A, we say that predicate A(x, y, z) is true.
Datalog has two kinds of relations. The first is called the extensional database, or
EDB. It consists of the relations present in the initial database. The second is called
the intensional database, or IDB. It consists of the relations that are computed by
the Datalog program.
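Under this relational reading, the rule D(w, z) :- A(w, x), B(x, y), C(y, z) shown earlier is a join of A, B, and C on their shared variables, followed by a projection onto (w, z). A minimal sketch, representing relations as Python sets of tuples (the sample tuples are invented):

```python
def rule_D(A, B, C):
    """D(w, z) :- A(w, x), B(x, y), C(y, z), as a join plus projection."""
    return {(w, z)
            for (w, x) in A
            for (x2, y) in B if x2 == x     # join A and B on x
            for (y2, z) in C if y2 == y}    # join with C on y

A = {(1, 2)}
B = {(2, 3)}
C = {(3, 4)}
print(rule_D(A, B, C))  # {(1, 4)}
```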
Comparing Datalog and Prolog
Datalog has its roots in Prolog, a logic programming language commonly used in
artificial intelligence and computational linguistics[65]. The evolution from Prolog to
Datalog is one of transition from a procedural (record-oriented) language to a non-
procedural (set-oriented) language. This evolution mirrored the evolution of database
systems from a hierarchical model to the relational data model.
Prolog and Datalog are very similar. The major differences are as follows:
• Datalog disallows complex terms as arguments of predicates, e.g. A(f1(x), y) is
not allowed.
• Datalog does not specify an evaluation strategy and the order of the rules is
irrelevant. Prolog has a defined top-down evaluation strategy and it evaluates
rules in the order they are given in the program.2
• Datalog has restrictions on negation and recursion that guarantee termination
and a well-defined minimal solution.
Our Dialect of Datalog
In addition, the Datalog we use has the following differences from standard Datalog:
2This leads Prolog to be neither sound nor complete, as it lacks an extended occurs check[219].
• Totally-ordered finite domains. As a byproduct of the way we encode relations,
a variable must be of a given domain and a domain must be of a finite size. The
size can be extremely large, but it must be finite. In pure Datalog, programs
are domain-independent and there is no restriction of finiteness.
• Comparison operators. Because domains are totally ordered, we provide built-in
comparison operators <, >, ≤, ≥, =, and ≠.
• Unbounded variables in the head predicate. Standard Datalog does not allow
unbounded variables in the head predicate, as the answer relies on the domain
of interpretation and can be infinite. Such programs are called unsafe[238]. As
we require a finite domain for every variable, we can safely allow unbounded
variables in the head predicate.
In addition, the disjunction syntax of Prolog (e.g. a :- b; c; d.) is not supported.
Instead, you must explicitly write separate rules (e.g. a :- b. a :- c. a :- d.).
2.3.2 Datalog Semantics
Datalog is more powerful than SQL, which is based on relational calculus, because
Datalog predicates can be recursively defined[238].3 Other query languages based on
first-order logic are also incapable of expressing many simple recursive queries like
finding the transitive closure of an input graph[28].
If none of the predicates in a Datalog program is inverted, then there is a guaran-
teed minimal solution consisting of relations with the least number of tuples. However,
disallowing negation severely restricts the expressive power of the language. While
the exact expressive power of Datalog without negation has not been characterized
completely, it has been shown that Datalog without negation only captures a proper
subset of monotonic polynomial-time queries[2].
However, Datalog programs with inverted predicates may not have a unique mini-
mal solution. For example, consider the Datalog program in Figure 2.1. This Datalog
program has two minimal solutions:
3SQL:1999 incorporated some ideas from Datalog to support recursive queries[108].
A(1, 2).
B(2, 3).
C(x, y) : – A(x, y).
C(x, y) : – C(x, z), C(z, y).
D(x, y) : – B(x, y),¬C(x, y).
Figure 2.1: An example Datalog program with two minimal solutions.
1. C(1, 2), D(2, 3)
2. C(1, 2), C(2, 3), C(1, 3)
Both solutions are minimal; the first is minimal with respect to C, the second
is minimal with respect to D. It is not immediately obvious which is the preferable
solution. However, what if the relations were described as follows:
• A(x, y) = Airline A has a flight from x to y.
• B(x, y) = Airline B has a flight from x to y.
• C(x, y) = You can fly from x to y using only Airline A.
• D(x, y) = Airline B has a monopoly from x to y, i.e. there is no way to get from
x to y using only Airline A.
In that case, the first solution is obviously the one that was intended. Concerns
about the semantics of Datalog programs in the presence of negation lead to the
development of more sophisticated semantic models.
Stratified Negation
In order to avoid the semantic and implementation problems connected with the un-
restricted use of negation, we use stratified negation[57]. Stratified negation restricts
the use of negation to non-recursive rules. Informally, rules in stratifiable programs
can be grouped into strata, each with a unique minimal solution, that can be solved
in sequence. (The Datalog program in Figure 2.1 above is stratifiable; the stratified
solution is the first one shown.) Datalog with stratified negation has been shown to
express a proper subset of fix-point queries[143]. We have found it expressive enough
to solve the program analysis problems we have encountered thus far.
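The stratified evaluation of the Figure 2.1 program can be sketched directly: stratum one solves C to its least fixpoint, and only then does stratum two evaluate the negated subgoal against the finished C relation.

```python
# A(1,2). B(2,3). -- the EDB of Figure 2.1
A = {(1, 2)}
B = {(2, 3)}

# Stratum 1: C(x,y) :- A(x,y).  C(x,y) :- C(x,z), C(z,y).
C = set(A)
while True:
    new = {(x, y) for (x, z) in C for (z2, y) in C if z2 == z} - C
    if not new:
        break
    C |= new

# Stratum 2: D(x,y) :- B(x,y), not C(x,y).  C is final at this point.
D = {(x, y) for (x, y) in B if (x, y) not in C}

print(C, D)  # {(1, 2)} {(2, 3)} -- the first minimal solution shown above
```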
Well-founded model and other semantic models
There exist other semantics for negation in Datalog. One of the most notable is
the well-founded model[239], which expresses exactly fix-point queries[98]. The idea
behind the well-founded model is to solve the Datalog program until an oscillation
is detected; at that point, one can choose either the low point or the high point of
the oscillation as the minimal model. Another popular model is the stable model[99],
which is complicated by the fact that a single program can actually have several
stable models. We can enforce determinism in the stable model by querying a literal
and returning the answer true if the literal is true for some stable models (possible
semantics) or all stable models (certain semantics). Possible semantics and certain
semantics extend Datalog to be able to express all decision problems in the classes
NP and co-NP, respectively[162, 218].
An excellent resource on the semantics and expressibility of different Datalog
models was published by Greco et al.[100].
2.3.3 Evaluation Strategy
There exist two common evaluation strategies for solving Datalog programs: top-
down evaluation and bottom-up evaluation[137]. In top-down evaluation, you start
with a goal and then compute the predicates necessary to prove or disprove that
goal. In essence, in top-down evaluation you construct a proof tree from the top to
the bottom, similar to a resolution-based theorem prover. In bottom-up evaluation,
you start inferring new facts from the set of base facts until all answers to the goal
are found. The origins of bottom-up evaluation can be traced to a paper by Aho
and Ullman on relational algebra with a fix-point operator[6]. Bottom-up typically
reduces the overall cost due to fewer iterations and more set-oriented computation;
even though it may do extra work, it suffers less from constant per-join overheads
and per-iteration overheads[193]. Bottom-up evaluation also easily guarantees termi-
nation and completeness, whereas top-down requires a potentially expensive occurs
check[242].
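The bottom-up strategy is commonly implemented semi-naively: each iteration joins only the tuples derived in the previous round (the "delta") against the full relation, instead of re-deriving everything from the base facts. A sketch for the transitive-closure query mentioned above (toy edges):

```python
def transitive_closure(edges):
    """Semi-naive bottom-up evaluation of
    path(x,y) :- edge(x,y).  path(x,z) :- path(x,y), path(y,z)."""
    closure = set(edges)
    delta = set(edges)
    while delta:
        # Only paths extending a newly found tuple can themselves be new.
        new = ({(x, z) for (x, y) in delta for (y2, z) in closure if y2 == y} |
               {(x, z) for (x, y) in closure for (y2, z) in delta if y2 == y})
        delta = new - closure
        closure |= delta
    return closure

print(sorted(transitive_closure({(1, 2), (2, 3), (3, 4)})))
# [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]
```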
2.3.4 Magic Sets Transformation
By using the magic sets transformation, we can obtain some of the goal-directed ben-
efits of top-down evaluation with the efficiency benefits of bottom-up evaluation[25,
237]. The magic sets transformation adds magic filters to each rule that restrict the
computation to just the set of facts that are relevant to a given goal.
One downside to the magic sets transformation is the fact that applying the trans-
formation to a stratifiable program may result in an unstratifiable program, requiring
a more sophisticated well-founded evaluation strategy[138]. Another downside is that
using magic sets may lead to more iterations and less set-oriented computation as it
discovers it needs to know more and more facts.
2.3.5 Other Datalog variants
There are many other variants of Datalog in existence; we discuss a few of them here.
Disjunctive Datalog is a variant of Datalog where disjunctions may appear in the
rule heads[85]. Allowing disjunctions greatly changes the semantics of the language
and increases expressibility at the cost of making resolution much more complicated.
Datalog with functions adds function symbols to Datalog[1]. Adding functions makes
consistency undecidable in general, but decidable in some restricted cases. Datalog
LITE (LInear Time Datalog with Extensions) is a Datalog variant that combines
some of the semantics of Datalog with linear time model checking from temporal
logic. Probabilistic Datalog is a probabilistic extension of Datalog[93]. It allows
probabilities (“weights”) to be assigned to individual tuples in a relation.
2.4 Binary Decision Diagrams
Binary decision diagrams are a data structure that represents a boolean function.
They are used in a wide variety of areas. They were originally invented for hard-
ware verification to efficiently store a large number of states that share many
commonalities[41]. Since then, BDDs have been used in symbolic model checking,
circuit synthesis, polynomial algebra, finite state machine traversal, integer linear
programming, prime generation, matrix representation, data compression, program
analysis, and many, many other areas. The original paper on BDDs is one of the
most cited papers in computer science[41].
A BDD is a directed acyclic graph (DAG) with a single root node and two terminal
nodes which represent the constants one and zero. This graph represents a boolean
function over a set of input decision variables. Each non-terminal node t in the DAG
is labeled with an input decision variable and has exactly two outgoing edges: a high
edge and a low edge. To evaluate the function for a given set of input values, one
simply traces a path from the root node to one of the terminal nodes, following the
high edge of a node if the corresponding input variable is true, and the low edge if it is
false. The terminal node gives the value of the function for that input. Figure 2.2(b)
shows a BDD representation for function f from Figure 2.2(a). Each non-terminal
node is labeled with the corresponding decision variable, and a solid line indicates a
high edge while a dashed line indicates a low edge.
We specifically use a variant of BDDs called reduced ordered binary decision di-
agrams, or ROBDDs[41]. In an ordered BDD, the sequence of variables evaluated
along any path in the DAG is guaranteed to respect a given total decision variable
order. The choice of the decision variable order can significantly affect the number
of nodes required in a BDD. The BDD in Figure 2.2(b) uses variable order b1, b2, b3,
while the BDD in Figure 2.2(c) represents the same function, only with variable order
b2, b1, b3. Though the change in order only adds one extra node in this example, in
the worst case an exponential number of nodes can be added. In addition, ROBDDs
are maximally reduced, meaning common BDD subgraphs are collapsed into a single
graph, and the nodes are shared. Therefore, the size of the ROBDD depends on
b1 b2 b3 | f
 0  0  0 | 0
 0  0  1 | 0
 0  1  0 | 0
 0  1  1 | 1
 1  0  0 | 1
 1  0  1 | 1
 1  1  0 | 1
 1  1  1 | 1
(a)
[BDD diagrams (b) and (c) are not reproducible in this text extraction.]
Figure 2.2: (a) Binary encoding of a relation. (b) and (c) are BDD encodings of the
relation given by (a) with decision variable orders b1, b2, b3 and b2, b1, b3, respectively.
whether there are common boolean subexpressions in the encoded function, rather
than on the number of tuples that are true.
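From the truth table of Figure 2.2(a), the function is f = b1 ∨ (b2 ∧ b3). The following sketch (a hypothetical node encoding, not a real BDD package) builds a hash-consed ROBDD for it under the order b1, b2, b3 and evaluates it by tracing high and low edges as described:

```python
table = {}   # (var, low, high) -> node id; 0 and 1 are the terminal nodes
nodes = {}   # node id -> (var, low, high)

def node(var, low, high):
    """Make a node with both ROBDD reduction rules applied."""
    if low == high:                # rule 1: drop a redundant test
        return low
    key = (var, low, high)
    if key not in table:           # rule 2: share identical subgraphs
        table[key] = len(table) + 2
        nodes[table[key]] = key
    return table[key]

# Build bottom-up with order b1 < b2 < b3 (b1 tested first).
n3 = node("b3", 0, 1)              # b3
n2 = node("b2", 0, n3)             # b2 AND b3
root = node("b1", n2, 1)           # b1 OR (b2 AND b3)

def evaluate(n, env):
    """Trace high/low edges from the root to a terminal."""
    while n not in (0, 1):
        var, low, high = nodes[n]
        n = high if env[var] else low
    return n

print(evaluate(root, {"b1": 0, "b2": 1, "b3": 1}))  # 1
print(evaluate(root, {"b1": 0, "b2": 1, "b3": 0}))  # 0
print(len(nodes))  # 3 non-terminal nodes, as in Figure 2.2(b)
```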
2.4.1 BDD Variable Ordering
Because the size and efficiency of the data structure varies so greatly depending on
the variable ordering, finding a good variable ordering is of crucial importance. Un-
fortunately, the problem of finding the optimal variable ordering is an NP-complete
problem[37]. The problem is also nonapproximable; it is NP-complete to even com-
pute a variable ordering resulting in a BDD with a size that is at most a constant
c > 1 times larger than an optimal one[223].
There are some functions, such as the multiplication function, for which the graph
size is always exponential independent of variable ordering[42].
Because the problem of BDD variable ordering is so important to the use of BDDs,
there has been an extensive amount of work in this area[115]. Although some progress
has been made on ordering heuristics and dynamic reordering techniques for BDDs
in specific domains, they have not proved effective in general[196].
Dynamic Reordering of BDD Variables
One technique for dealing with BDD variable ordering is to use a dynamic reordering
technique. The optimal variable order may change as the computation progresses,
so by dynamically reordering we can theoretically be more efficient than by simply
using a static variable order.
Unfortunately, dynamic reordering is a very expensive operation, so it rarely im-
proves execution time over a good static ordering. A ten-times slowdown when using
dynamic reordering is not uncommon. However, dynamic reordering can help reduce
memory requirements. Typically, when the BDD package reaches some memory limit,
it triggers a dynamic reordering in an attempt to reduce its memory usage.
Most dynamic variable reordering algorithms work by swapping adjacent variables.
Swapping adjacent variables has a localized effect — we only need to modify nodes
labeled with the two variables being swapped, and we do not need to change any
incoming pointers.
One of the most common dynamic variable ordering algorithms is the sifting
algorithm[205]. The sifting algorithm simply chooses a candidate variable, tries all
positions for that variable by repeatedly swapping with adjacent variables, and then
moves the variable to the best position found. Many dynamic reordering algorithms
are variants of the sifting technique.
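The skeleton of sifting can be sketched as follows; here `size` is a stand-in cost function supplied by the caller, where a real BDD package would count live nodes after each sequence of adjacent-variable swaps:

```python
def sift_variable(order, var, size):
    """Try every position for `var` in the order; keep the cheapest."""
    order = list(order)
    best_order, best_cost = list(order), size(order)
    i = order.index(var)
    for j in range(len(order)):
        if j == i:
            continue
        trial = list(order)
        trial.insert(j, trial.pop(i))   # move var to position j
        cost = size(trial)
        if cost < best_cost:
            best_order, best_cost = trial, cost
    return best_order

# Toy cost model: pretend the BDD is smallest when 'b2' comes first.
size = lambda order: order.index("b2") + 1
print(sift_variable(["b1", "b2", "b3"], "b2", size))  # ['b2', 'b1', 'b3']
```

A full sifting pass would apply this to each variable in turn, and real implementations realize the moves incrementally via adjacent swaps rather than rebuilding the order, but the search structure is the same.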
One technique for deciding which BDD variables to reorder is simulated
annealing[36, 132, 167]. Simulated annealing works by replacing the current solu-
tion with a random nearby solution. Different choices for the solution neighborhood
give algorithms with different characteristics.
Friedman et al. present an algorithm to find the optimal variable ordering for a
given BDD[92]. It uses a dynamic programming approach. Unfortunately it requires
an exponential amount of time and space, and is highly impractical for all but the
smallest BDDs.
Machine Learning for BDD Variable Orders
One promising approach is to use machine learning to automatically find good vari-
able orders. Grumberg et al. studied the problem of using BDDs in hardware
verification[105]. They randomly sampled a number of variable orderings, measured
the size of the resulting BDD, and then used machine learning techniques to derive the
dominant characteristics of good performing orders. From these characteristics they
could then generate orders that were comparable to those found using an analytic
technique.
2.4.2 Other Variants
Other variants of binary decision diagrams have been developed to address some of
the problems with BDDs. Boolean expression diagrams (BEDs) are a generalization
of BDDs that can represent any boolean circuit in linear space[11]. They have been
used to efficiently represent circuits such as multiplication circuits, which cannot be
expressed efficiently using BDDs.
Zero-suppressed binary decision diagrams (ZDDs) are similar to BDDs but use a
different reduction rule — rather than removing nodes that have the same high and
low successor, ZDDs remove nodes where the high successor is zero[173]. This makes
ZDDs generally more efficient for representing sparse sets.
Binary moment diagrams (BMDs) are a generalization of BDDs to linear functions
over domains other than booleans, such as integers or real numbers[43]. Like BDDs,
each function still has a single canonical representation, but by choosing a different
domain BMDs can represent some functions that are inefficient in BDDs, such as the
multiplication function.
2.5 Context Sensitivity
Context sensitivity refers to whether an analysis has different results for a method
depending on its calling context. In type systems, this notion is often referred to as
polyvariance.
Context sensitivity is very important for most interprocedural analyses because it
keeps information from different contexts separate. In a context-insensitive analysis,
information from one calling context can flow to another, leading to imprecision.
As the analysis information from different calling contexts is usually quite different,
context sensitivity can have a major impact on accuracy.
Most of the benefit of a context-sensitive pointer analysis can only be realized
when subsequent analyses use the information in a context-sensitive manner[149, 163,
253]. Performing a context-sensitive analysis and then throwing away the context
information gives little to no improvement over simply using a context-insensitive
analysis in the first place[253].
2.5.1 Definitions of Context
There are numerous ways to specify a calling context. The most common is to name
it based on the chain of call sites that reach that method. This is known as the call
strings approach or k-CFA, where k is the number of call sites to consider in the
context[222]. Using a fixed value of k makes little sense, as the number of call sites
needed to achieve the desired precision varies greatly across a program. For example, adding a
simple wrapper function can cause the analysis precision to drop significantly; mean-
while, the analysis would be suffering from an exponential blowup in another part
of the program where there are no further efficiency gains to be made. One notable
version of k-CFA is ∞-CFA, where the entire call path to the root, sans cycles, is
used as the context[86, 253].
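The call-strings notion of context can be sketched in a few lines; the call-site labels below are hypothetical:

```python
def extend_context(context, call_site, k):
    """A k-CFA context is the newest k call sites on the call stack."""
    return (tuple(context) + (call_site,))[-k:] if k > 0 else ()

ctx = ()
for site in ["main:1", "foo:4", "wrapper:2", "bar:9"]:
    ctx = extend_context(ctx, site, k=2)
print(ctx)  # ('wrapper:2', 'bar:9') -- the earlier call sites are forgotten
```

This also illustrates the wrapper problem noted above: with k = 2, inserting one trivial wrapper call pushes the distinguishing caller out of the context.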
Another technique for distinguishing calling contexts is to use some of the current
analysis information at the point of a call site. One example is object sensitivity,
which distinguishes between contexts by the values of the receiver object[169]. Ob-
ject sensitivity can also be extended to incorporate k levels of objects in k-object
sensitivity[171]. The Cartesian product algorithm (CPA) considers a context to be
the receiver object type and the types of the arguments[3]. It computes each tuple
of the Cartesian product of the types of the arguments as an independent case. As
the number of tuples in the Cartesian product can grow rapidly as the number of
types increases, CPA is usually implemented as bounded CPA, which uses a single
context-insensitive summary to analyze any call site at which the number of tuples
in the Cartesian product exceeds a given constant threshold value[104].
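A sketch of the bounded-CPA idea, with invented type names: each tuple of the Cartesian product of the argument types is analyzed as a separate context, unless the product grows past the bound, in which case the call site falls back to one context-insensitive summary.

```python
from itertools import product

def cpa_contexts(arg_type_sets, bound):
    """One context per argument-type tuple, up to the given bound."""
    tuples = list(product(*arg_type_sets))
    if len(tuples) > bound:
        return ["<context-insensitive summary>"]   # bounded-CPA fallback
    return tuples

small = cpa_contexts([{"Circle", "Square"}, {"Int"}], bound=8)
large = cpa_contexts([{"A", "B", "C"}, {"X", "Y", "Z"}], bound=8)
print(len(small), large)  # 2 ['<context-insensitive summary>']
```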
2.5.2 Techniques for Solving Context-sensitive Analyses
The techniques described above can be considered cloning-based context sensitivity, in
that they conceptually “clone” a method once for each context and compute the anal-
ysis results separately for each clone. Another technique is summary-based context
sensitivity, which calculates a summary of the effects of a method and then applies
that summary at the various call sites. Cloning-based and summary-based are also
known as the call-strings approach and the functional approach, respectively[221].
Compositional context-sensitive algorithms calculate a single summary for a
method, which is then applied at all call sites that invoke that method[58, 255].
This leads to greater efficiency as non-recursive methods only need to be analyzed
once, in a bottom-up pass through the call graph. However, making an analysis
compositional can negatively affect precision as the computed summary must be ap-
plicable under all possible contexts. For example, it is very difficult to compute a
compositional pointer analysis summary that supports strong updates in their full
generality, as strong updates differ depending on the pointer aliasing information
from the context[216].
Another summary-based technique is to use partial transfer functions (PTFs)
for context sensitivity[259]. A partial transfer function summarizes input/output
behavior for only a subset of possible inputs. PTFs can be thought of as a form of
memoization — we compute a new transfer function for each distinct input as it is
encountered, and reuse old computed transfer functions for inputs that are the same.
Partial transfer functions work well for analysis problems whose solutions depend on
the context (such as with strong updates) but for which very few distinct contexts
actually occur in practice[175].
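The memoization view of PTFs can be sketched in a few lines of Python. Everything here is hypothetical illustration — the wrapper, the input patterns, and the toy summary values stand in for whatever abstract input description a real pointer analysis would use:

```python
# Hypothetical sketch of partial transfer functions as memoization:
# one summary per distinct abstract input pattern, reused on repeats.
def make_ptf_analyzer(analyze_body):
    """analyze_body(pattern) computes a method summary for one abstract
    input; patterns must be hashable.  Returns a caching wrapper."""
    cache = {}
    def analyze(pattern):
        if pattern not in cache:      # new input pattern: build a summary
            cache[pattern] = analyze_body(pattern)
        return cache[pattern]         # seen before: reuse the summary
    return analyze

# Toy use: record which (parameter, aliasing) patterns were analyzed.
calls = []
def expensive_body(pattern):
    calls.append(pattern)             # stands in for a costly analysis
    return f"summary:{pattern}"

summarize = make_ptf_analyzer(expensive_body)
summarize(("p", "aliased"))
summarize(("p", "aliased"))           # cache hit: body is not re-run
summarize(("p", "unaliased"))
assert len(calls) == 2                # only two distinct patterns analyzed
```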
Another technique for solving context-sensitive analysis problems is to express the
problem as a context-free language reachability problem[199]. The basic idea is to
define a context-free grammar with matched parenthesis that correspond to the calls
and returns in the program. CFL-reachability is attractive as it can be solved in
O(n³) time.
Chapter 3
Pointer Analysis
Pointer analysis is used to determine which pointers can point to which locations. It
is essential for the understanding of any program that uses pointers. However, pointer
analysis algorithms are notoriously difficult, and accurate context-sensitive algorithms
have typically not been able to successfully scale to large programs.
This chapter begins by showing how to specify a simple context-insensitive pointer
analysis in Datalog. We then show how to make the analysis progressively more so-
phisticated by adding more and more Datalog rules. Next, we present the main
contribution of this chapter, the technique of using BDDs to implement context sen-
sitivity through cloning. This gives us a straightforward way of adding context sen-
sitivity to any analysis. We close the chapter with an example of using the same
cloning technique to implement a different kind of context sensitivity called object
sensitivity.
All of the Datalog examples in this chapter can be supplied to the tool bddbddb,
which translates the Datalog into efficient BDD operations. We defer the discussion
of the inner workings of bddbddb to the next chapter.
3.1 Context-Insensitive Pointer Analysis
We show how Datalog can be used to describe a context-insensitive inclusion-based
pointer analysis, similar to the algorithms of Berndl et al.[29] or Andersen[12]. As
you will see, Datalog is a natural language for specifying constraint-based program
analyses. The analysis consists of only four Datalog rules.
This algorithm assumes that a call graph, computed using simple class hierarchy
analysis[83], is available a priori. Heap objects are named by their allocation sites.
The algorithm finds the objects possibly pointed to by each variable and field of heap
objects in the program. Shown in Algorithm 1 is the exact Datalog program, as fed
to bddbddb, that implements Berndl’s algorithm. To keep the first example simple,
we defer the discussion of using types to improve precision until Section 3.1.1.
Algorithm 1 Context-insensitive pointer analysis with a precomputed call graph.
Domains
V 262144 variable.map
H 65536 heap.map
F 16384 field.map
Relations
vP0 (variable : V, heap : H) input
store (base : V, field : F, source : V) input
load (base : V, field : F, dest : V) input
assign (dest : V, source : V) input
vP (variable : V, heap : H) output
hP (base : H, field : F, target : H) output
Rules
vP(v, h) : – vP0(v, h). (3.1)
vP(v1, h) : – assign(v1, v2), vP(v2, h). (3.2)
hP(h1, f, h2) : – store(v1, f, v2), vP(v1, h1), vP(v2, h2). (3.3)
vP(v2, h2) : – load(v1, f, v2), vP(v1, h1), hP(h1, f, h2). (3.4)
□
A Datalog program has three sections: domains, relations, and rules. A domain
declaration has a name, a size n, and an optional file name that provides a name for
each element in the domain, internally represented as an ordinal number from 0 to
n − 1. The latter allows bddbddb to communicate with users using meaningful
names. A relation declaration has an optional keyword specifying whether it is an
input or output relation, the name of the relation, and the name and domain of every
attribute. A relation declared as neither input nor output is a temporary relation
generated in the analysis but not written out. Finally, the rules follow the standard
Datalog syntax. The rule numbers, introduced here for the sake of exposition, are
not in the actual program.
We can express all information found in the intermediate representation of a pro-
gram as relations. To avoid inundating readers with too many definitions all at once,
we define the relations as they are used. The domains and relations used in Algo-
rithm 1 are:
V is the domain of variables. It represents all the allocation sites, formal param-
eters, return values, thrown exceptions, cast operations, and dereferences in
the program. There is also a special global variable for use in accessing static
variables.
H is the domain of heap objects. Heap objects are named by the invocation sites
of object creation methods. To increase precision, we also statically identify
factory methods and treat them as object creation methods.
F is the domain of field descriptors in the program. Field descriptors are used when
loading from a field (v2 = v1.f;) or storing to a field (v1.f = v2;). There is a
special field descriptor to denote an array access.
vP0: V × H is the initial variable points-to relation extracted from object allocation
statements in the source program. vP0(v, h) means there is an invocation site
h that assigns a newly allocated object to variable v.
store: V × F × V represents store statements. store(v1, f, v2) says that there is a
statement “v1.f = v2;” in the program.
load : V × F × V represents load statements. load(v1, f, v2) says that there is a
statement “v2 = v1.f;” in the program.
assign: V × V is the assignments relation due to passing of arguments and return
values. assign(v1, v2) means that variable v1 includes the points-to set of variable
v2. Although we do not cover exceptions here, they work in an analogous
manner.
vP : V × H is the output variable points-to relation. vP(v, h) means that variable v
can point to heap object h.
hP : H × F × H is the output heap points-to relation. hP(h1, f, h2) means that field
f of heap object h1 can point to heap object h2.
Note that in our implementation, local variables and their assignments are factored
away using a flow-sensitive analysis[252]. The assign relation is derived by using a
precomputed call graph. The sizes of the domains are determined by the number of
variables, heap objects, and field descriptors in the input program.
Rule (3.1) incorporates the initial variable points-to relations into vP . Rule (3.2)
finds the transitive closure over inclusion edges. If v1 includes v2 and variable v2
can point to object h, then v1 can also point to h. Rule (3.3) models the effect of
store instructions on heap objects. Given a statement “v1.f = v2;”, if v1 can point
to h1 and v2 can point to h2, then h1.f can point to h2. Rule (3.4) resolves load
instructions. Given a statement “v2 = v1.f;”, if v1 can point to h1 and h1.f can
point to h2, then v2 can point to h2. Applying these rules until the results converge
finds all the possible context-insensitive points-to relations in the program.
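As a concrete illustration of this fixpoint (not how bddbddb evaluates the rules — it uses BDD relational operations), here is a naive evaluation of Rules (3.1)–(3.4) over ordinary Python sets, on a made-up three-statement program; the variable and heap names are hypothetical:

```python
# A naive-evaluation sketch of Rules (3.1)-(3.4) over plain Python sets.
vP0 = {("v1", "h1")}                 # v1 = new Obj();   (allocation h1)
assign = {("v2", "v1")}              # v2 includes v1's points-to set
store = {("v2", "f", "v1")}          # v2.f = v1;
load = {("v1", "f", "v3")}           # v3 = v1.f;

vP, hP = set(), set()
changed = True
while changed:                       # iterate until the relations converge
    changed = False
    new_vP = set(vP0)                                            # Rule (3.1)
    new_vP |= {(v1, h) for (v1, v2) in assign
                       for (v, h) in vP if v == v2}              # Rule (3.2)
    new_hP = {(h1, f, h2) for (v1, f, v2) in store
                          for (va, h1) in vP if va == v1
                          for (vb, h2) in vP if vb == v2}        # Rule (3.3)
    new_vP |= {(v2, h2) for (v1, f, v2) in load
                        for (va, h1) in vP if va == v1
                        for (hb, fb, h2) in hP
                        if hb == h1 and fb == f}                 # Rule (3.4)
    if new_vP != vP or new_hP != hP:
        vP, hP, changed = new_vP, new_hP, True

assert ("v2", "h1") in vP            # via the assignment
assert ("h1", "f", "h1") in hP       # via the store
assert ("v3", "h1") in vP            # via the load
```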
3.1.1 Improving Pointer Analysis with Types
Because Java is type-safe, variables can only point to objects of assignable types.
Assignability is similar to the subtype relation, with allowances for interfaces, null
values, and arrays[154]. By dropping targets of unassignable types in assignments
and load statements, we can eliminate many impossible points-to relations that result
from the imprecision of the analysis.1
Adding type filtering to Algorithm 1 is simple in Datalog. We add a new domain
1 Dropping targets of unassignable types on store statements is unnecessary because all stores must go through variables of the appropriate type in any case.
to represent types and new relations to represent assignability as well as type dec-
larations of variables and heap objects. We compute the type filter and modify the
rules in Algorithm 1 to filter out unsafe assignments and load operations.
Algorithm 2 Context-insensitive pointer analysis with type filtering.
Domains
Domains from Algorithm 1, plus:
T 4096 type.map
Relations
Relations from Algorithm 1, plus:
vT (variable : V, type : T) input
hT (heap : H, type : T) input
aT (supertype : T, subtype : T) input
vPfilter (variable : V, heap : H)
Rules
vPfilter(v, h) : – vT (v, tv), hT (h, th), aT (tv, th). (3.5)
vP(v, h) : – vP0(v, h). (3.6)
vP(v1, h) : – assign(v1, v2), vP(v2, h), vPfilter(v1, h). (3.7)
hP(h1, f, h2) : – store(v1, f, v2), vP(v1, h1), vP(v2, h2). (3.8)
vP(v2, h2) : – load(v1, f, v2), vP(v1, h1), hP(h1, f, h2), vPfilter(v2, h2). (3.9)
□
T is the domain of type descriptors (i.e. classes) in the program.
vT : V × T represents the declared types of variables. vT (v, t) means that variable
v is declared with type t.
hT : H × T represents the types of objects created at a particular creation site. In
Java, the type created by a new instruction is usually known statically.2 hT (h, t)
means that the object created at h has type t.
2 The type of a created object may not be known precisely if, for example, the object is returned by a native method or reflection is used. Such objects are modeled conservatively as being all possible types.
aT : T × T is the relation of assignable types. aT (t1, t2) means that type t2 is
assignable to type t1.
vPfilter : V × H is the type filter relation. vPfilter(v, h) means that it is type-safe
to assign heap object h to variable v.
Rule (3.5) in Algorithm 2 defines the vPfilter relation: It is type-safe to assign
heap object h of type th to variable v of type tv if tv is assignable from th. Rules
(3.6) and (3.8) are the same as Rules (3.1) and (3.3) in Algorithm 1. Rules (3.7) and
(3.9) are analogous to Rules (3.2) and (3.4), with the additional constraint that only
points-to relations that match the type filter are inserted.
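Rule (3.5) is simply a join of the three input relations. A sketch over Python sets, with a made-up two-class hierarchy (the variables, heap objects, and types below are hypothetical, and aT is listed with the reflexive pairs a real assignability relation would contain):

```python
# Rule (3.5) as a plain join: vPfilter(v, h) holds when the type of h
# is assignable to the declared type of v.  All names are made up.
vT = {("x", "Object"), ("y", "String")}          # declared variable types
hT = {("h1", "String"), ("h2", "Integer")}       # creation-site types
aT = {("Object", "Object"), ("Object", "String"), ("Object", "Integer"),
      ("String", "String"), ("Integer", "Integer")}

vPfilter = {(v, h) for (v, tv) in vT
                   for (h, th) in hT
                   if (tv, th) in aT}            # join on assignability

# x:Object may hold either object; y:String may hold only the String.
assert vPfilter == {("x", "h1"), ("x", "h2"), ("y", "h1")}
```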
3.2 Call Graph Discovery
The call graph generated using class hierarchy analysis can have many spurious call
targets, which can lead to many spurious points-to relations[151]. We can get more
precise results by creating the call graph on the fly using points-to relations. As the
algorithm generates points-to results, they are used to identify the receiver types of
the methods invoked and to bind calls to target methods; and as new methods and
call graph edges are discovered, we use them to find more points-to relations. The
algorithm converges when no new call targets and no new pointer relations are found.
Modifying Algorithm 2 to discover call graphs on the fly is simple. Instead of an
input assign relation computed from a given call graph, we derive it from method in-
vocation statements and points-to relations. We also keep track of the set of reachable
methods and only include statements from reachable methods.
I is the domain of invocation sites in the program. An invocation site is a method
invocation of the form r = p0.m(p1 . . . pk). Note that H ⊆ I.
N is the domain of method names used in invocations. In an invocation r = p0.n(p1
. . . pk), n is the method name.
M is the domain of implemented methods in the program. It does not include
abstract or interface methods.
Algorithm 3 Context-insensitive pointer analysis that computes call graph on the fly.
Domains
Domains from Algorithm 2, plus:
I 32768 invoke.map
N 4096 name.map
M 16384 method.map
Z 256
Relations
Relations vT , hT , aT , vPfilter , assign from Algorithm 2, plus:
Nstmt (method : M, dest : V, heap : H) input
Astmt (method : M, dest : V, source : V) input
Sstmt (method : M, base : V, field : F, source : V) input
Lstmt (method : M, base : V, field : F, dest : V) input
cha (type : T, name : N, target : M) input
actual (invoke : I, param : Z, var : V) input
formal (method : M, param : Z, var : V) input
IE 0 (invoke : I, target : M) input
mI (method : M, invoke : I, name : N) input
reach (method : M)
IE (invoke : I, target : M) output
Rules
Rules 3.5 and 3.7 from Algorithm 2, plus:
reach(m) : – m = “main”. (3.10)
reach(m) : – IE (_, m). (3.11)
vP(v, h) : – reach(m),Nstmt(m, v, h). (3.12)
assign(v1, v2) : – reach(m), Astmt(m, v1, v2). (3.13)
hP(h1, f, h2) : – reach(m),Sstmt(m, v1, f, v2), vP(v1, h1), vP(v2, h2). (3.14)
vP(v2, h2) : – reach(m),Lstmt(m, v1, f, v2), vP(v1, h1), hP(h1, f, h2),vPfilter(v2, h2). (3.15)
IE (i,m2) : – reach(m1),mI (m1, i, 0), IE 0(i,m2). (3.16)
IE (i,m2) : – reach(m1),mI (m1, i, n), actual(i, 0, v), vP(v, h), hT (h, t),cha(t, n,m2). (3.17)
assign(v1, v2) : – IE (i,m), formal(m, z, v1), actual(i, z, v2). (3.18)
□
Z is the domain used for numbering parameters.
Nstmt ,Astmt , Sstmt , and Lstmt are analogous to vP0, assign, store, and load respec-
tively, but include an extra method attribute to signify which method contains
the statement. This is used so that only the statements that are in reachable
methods are considered in the analysis.
cha: T × N × M encodes virtual method dispatch information from the class hier-
archy. cha(t, n,m) means that m is the target of dispatching the method name
n on type t.
actual : I × Z × V encodes the actual parameters for invocation sites. actual(i, z, v)
means that v is passed as parameter number z at invocation site i.
formal : M × Z × V encodes formal parameters for methods. formal(m, z, v) means
that formal parameter z of method m is represented by variable v.
IE 0: I × M are the initial invocation edges. They record the invocation edges
whose targets are statically bound. In Java, some calls are static or non-virtual.
Additionally, local type analysis combined with analysis of the class hierarchy
allows us to determine that some calls have a single target[83]. IE 0(i, m) means
that invocation site i can be analyzed statically to call method m.
mI : M × I × N represents invocation sites. mI (m, i, n) means that method m con-
tains an invocation site i with virtual method name n. Non-virtual invocation
sites are given a special null method name, which does not appear in the cha
relation.
reach: M is the set of reachable methods. It starts as the main method and grows as
more and more of the call graph is discovered. The actual implementation in-
cludes some other root methods, such as class initializers, finalizers, and thread
entry points.
IE : I × M is an output relation encoding all invocation edges. IE (i, m) means that
invocation site i calls method m.
Rules (3.10) and (3.11) are the base case and inductive step, respectively, to com-
pute the set of reachable methods. Rules (3.12), (3.13), (3.14), and (3.15) incorporate
new, assignment, store, and load statements from reachable methods.
Rules (3.16) and (3.17) find the invocation edges, with the former handling stat-
ically bound targets and the latter handling virtual calls. Rule (3.17) matches invo-
cation sites with the type of the “this” pointer and the class hierarchy information
to find the possible target methods. If an invocation site i with method name n is
invoked on variable v, and v can point to h and h has type t, and invoking n on type
t leads to method m, then m is a possible target of invocation i.
Rule (3.18) handles parameter passing.3 If invocation site i has a target method
m, variable v2 is passed as argument number z, and the formal parameter z of method
m is v1, then the points-to set of v1 includes the points-to set of v2. Return values are
handled in a similar manner, except that the inclusion relation is in the opposite direction.
We see that the discovery of more variable points-to relations (vP) can create more
invocation edges (IE ), which in turn can create more reachable methods (reach),
more assignments (assign), and more points-to relations. The algorithm converges when all
the relations stabilize.
3.3 Context-Sensitive Pointer Analysis
A context-insensitive or monomorphic analysis produces just one set of results for
each method, regardless of how many ways the method may be invoked. This leads to
imprecision because information from different calling contexts must be merged, so
information along one calling context can propagate to other calling contexts. A
context-sensitive or polymorphic analysis avoids this imprecision by allowing different
contexts to have different results.
We can make a context-sensitive version of a context-insensitive analysis as follows.
We make a clone of a method for each path through the call graph, linking each call
site to its own unique clone. We then run the original context-insensitive analysis over
3We also match thread objects to their corresponding run() methods, even though the edges donot explicitly appear in the call graph.
the exploded call graph. However, this technique can require an exponential (and in
the presence of cycles, potentially unbounded) number of clones to be created.
It has been observed that different contexts of the same method often have many
similarities. For example, parameters to the same method often have the same types
or similar aliases. This observation led to the concept of partial transfer functions
(PTF), where summaries for each input pattern are created on the fly as they are
discovered[257, 259]. However, PTFs are notoriously difficult to implement and get
correct, as the programmer must explicitly calculate the input patterns and manage
the summaries. Furthermore, the technique has not been shown to scale to very large
programs.
Our approach is to allow the exponential explosion to occur and rely on the un-
derlying BDD representation to find and exploit the commonalities across contexts.
BDDs can express large sets of redundant data in an efficient manner[70]. Contexts
with identical information will automatically be shared at the data structure level.
Furthermore, because BDDs operate down at the bit level, they can even exploit
commonalities between contexts with different information. BDD operations work on
entire relations at a time, rather than one tuple at a time. Thus, the cost of BDD op-
erations depends on the size and shape of the BDD relations, which depends greatly
on the variable ordering, rather than the number of tuples in a relation. Also, due
to caching in BDD packages, identical subproblems only have to be computed once.
Thus, with the right variable ordering, the results for all contexts can be computed
very efficiently.
3.3.1 Numbering Call Paths
A call path is a sequence of invocation edges (i1, m1), (i2, m2), . . . , such that i1 is an
invocation site in an entry method, typically main4, and ik is an invocation site in
method mk−1 for all k > 1.
For programs without recursion, every call path to a method defines a context for
that method. To handle recursive programs, which have an unbounded number of call
4 Other “entry” methods in typical programs are static class initializers, object finalizers, and thread run methods.
paths, we first find the strongly connected components (SCCs) in a call graph. By
eliminating all method invocations whose caller and callee belong to the same SCC
from the call paths, we get a finite set of reduced call paths. Each reduced call path
to an SCC defines a context for the methods in the SCC. Thus, information from
different paths leading to the SCCs is kept separate, but the methods within the
SCC invoked with the same incoming call path are analyzed context-insensitively.
Figure 3.1: Example of path numbering. The graph on the left is the original graph. Nodes M2 and M3 are in a cycle and therefore are placed in one equivalence class. Each edge is marked with path numbers at the source and target of the edge. The graph on the right is the graph with all of the paths expanded.
Call paths reaching M6    Reduced call paths reaching M6
a(cd)∗eh                  aeh
b(dc)∗deh                 beh
a(cd)∗cfh                 afh
b(dc)∗fh                  bfh
a(cd)∗cgi                 agi
b(dc)∗gi                  bgi
Figure 3.2: The six contexts of function M6 in Example 3
Example 3 Figure 3.1(a) shows a small call graph with just six methods and a
set of invocation edges. Each invocation edge has a name, being one of a through i;
its source is labeled by the context number of the caller and its sink by the context
number of the callee. The numbers will be explained in Example 4. Methods M2 and
M3 belong to a strongly connected component, so invocations along edges c and d
are eliminated in the computation of reduced call graphs. While there are infinitely
many call paths reaching method M6, there are only six reduced call paths reaching
M6, as shown in Figure 3.2. Thus M6 has six clones, one for each reduced call path.
□
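The reduced-call-path construction can be sketched directly: find the SCCs, drop intra-SCC edges, and enumerate acyclic paths over the condensation. The edge list below transcribes the call graph of Figure 3.1(a); this is an illustrative sketch, not part of the implementation:

```python
# Sketch: reduced call paths via SCC condensation (graph of Fig. 3.1(a)).
from collections import defaultdict

edges = {  # edge name: (caller, callee)
    "a": ("M1", "M2"), "b": ("M1", "M3"),
    "c": ("M2", "M3"), "d": ("M3", "M2"),   # the M2 <-> M3 cycle
    "e": ("M2", "M4"), "f": ("M3", "M4"), "g": ("M3", "M5"),
    "h": ("M4", "M6"), "i": ("M5", "M6"),
}

def sccs(nodes, succ):
    """Tarjan's SCC algorithm (recursive; fine for a toy graph)."""
    index, low, onstk, stack, out = {}, {}, set(), [], []
    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v); onstk.add(v)
        for w in succ[v]:
            if w not in index:
                visit(w); low[v] = min(low[v], low[w])
            elif w in onstk:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:               # v is the root of an SCC
            comp = set()
            while True:
                w = stack.pop(); onstk.discard(w); comp.add(w)
                if w == v: break
            out.append(frozenset(comp))
    for v in nodes:
        if v not in index:
            visit(v)
    return out

succ = defaultdict(list)
for u, v in edges.values():
    succ[u].append(v)
nodes = {n for uv in edges.values() for n in uv}
comp_of = {n: c for c in sccs(nodes, succ) for n in c}

# Enumerate reduced call paths: walk only edges that leave their SCC.
paths_to = defaultdict(set)
def walk(comp, labels):
    paths_to[comp].add(labels)
    for name, (u, v) in edges.items():
        if comp_of[u] == comp and comp_of[v] != comp:
            walk(comp_of[v], labels + name)
walk(comp_of["M1"], "")

# Exactly the six reduced call paths of Figure 3.2 reach M6.
assert paths_to[comp_of["M6"]] == {"aeh", "beh", "afh", "bfh", "agi", "bgi"}
```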
Under this definition of context sensitivity, large programs can have many con-
texts. For example, pmd from our test programs has 1971 methods and 10²³ contexts!
In the BDD representation, we give each reduced call path reaching a method a
distinct context number. It is important to find a context numbering scheme that al-
lows the BDDs to share commonalities across contexts. Algorithm 4 shows one such
scheme.
Example 4 We now show the results of applying Algorithm 4 to Example 3. M1,
the root node, is given context number 1. We shall visit the invocation edges from
left to right. Nodes M2 and M3, being members of a strongly connected component,
are represented as one node. The strongly connected component is reached by two
edges from M1. Since M1 has only one context, we create two clones, one reached by
each edge. For method M4, the predecessor on each of the two incoming edges has
two contexts, thus M4 has four clones. Method M5 has two clones, one for each clone
that invokes M5. Finally, method M6 has six clones: clones 1-4 of method M4
invoke clones 1-4 of M6, and clones 1-2 of method M5 call clones 5-6. The cloned graph
is shown in Figure 3.1(b). □
The numbering scheme used in Algorithm 4 plays up the strengths of BDDs. Each
method is assigned a contiguous range of contexts, which can be represented efficiently
in BDDs. The contexts of callees can be computed simply by adding a constant to the
contexts of the callers; this operation is also cheap in BDDs. Because the information
for contexts that share common tail sequences are likely to be similar, this numbering
Algorithm 4 Generating context-sensitive invocation edges from a call graph.
Input: A call multi-graph.
Output: Context-sensitive invocation edges IE c: C × I × C × M, where C is the domain of context numbers. IE c(c, i, cm, m) means that invocation site i in context c calls method m in context cm.
Method:
1. A method with n clones will be given numbers 1, . . . , n. Nodes with no predecessors are given a singleton context numbered 1.
2. Find strongly connected components in the input call graph. The ith clone ofa method always calls the ith clone of another method belonging to the samecomponent.
3. Collapse all methods in a strongly connected component to a single node to getan acyclic reduced graph.
4. For each node n in the reduced graph in topological order,
Set the count of contexts created, c, to 0.
For each incoming edge,
If the predecessor of the edge p has k contexts,
create k clones of node n,
add tuple (i, p, i + c, n) to IE c, for 1 ≤ i ≤ k,
c = c + k.
□
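A Python sketch of step 4 on the reduced graph of Figure 3.1 (the SCC {M2, M3} already collapsed into a single node, here called "S" for illustration) reproduces the clone counts of Example 4:

```python
# Sketch of step 4 of Algorithm 4 on the reduced graph of Figure 3.1.
# Each node's clone count is the sum of its predecessors' clone counts.
reduced_edges = [            # (invocation-edge name, caller, callee)
    ("a", "M1", "S"), ("b", "M1", "S"),
    ("e", "S", "M4"), ("f", "S", "M4"), ("g", "S", "M5"),
    ("h", "M4", "M6"), ("i", "M5", "M6"),
]
topo_order = ["M1", "S", "M4", "M5", "M6"]   # topological order

contexts = {"M1": 1}         # the root gets the singleton context 1
IEc = set()                  # tuples (caller ctx, edge, callee ctx, callee)
for n in topo_order[1:]:
    c = 0                    # count of contexts created for n so far
    for name, p, q in reduced_edges:
        if q == n:
            k = contexts[p]  # the predecessor p has k contexts
            for i in range(1, k + 1):
                IEc.add((i, name, c + i, n))
            c += k
    contexts[n] = c

assert contexts == {"M1": 1, "S": 2, "M4": 4, "M5": 2, "M6": 6}
# Clones of S reached along e map to clones 1-2 of M4; along f, to 3-4.
assert (1, "e", 1, "M4") in IEc and (2, "f", 4, "M4") in IEc
```

Note how each method ends up with a contiguous range of context numbers, which is what makes the scheme cheap to represent and manipulate as BDDs.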
allows the BDD data structure to share effectively across common contexts. For
example, the sequentially-numbered clones 1 and 2 of M6 both have a common tail
sequence eh. Because of this, the contexts are likely to be similar and therefore the
BDD can take advantage of the redundancies.
To optimize the creation of the cloned invocation graph, we have defined a new
primitive that creates a BDD representation of contiguous ranges of numbers in O(k)
operations, where k is the number of bits in the domain. In essence, the algorithm
creates one BDD to represent numbers below the upper bound, and one to represent
numbers above the lower bound, and computes the conjunction of these two BDDs.
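The idea can be illustrated with a toy node-based BDD — plain tuples, no hash-consing or operation caching, unlike a real BDD package — building the "≤ upper" and "≥ lower" BDDs with one node per bit and conjoining them:

```python
# Toy illustration of the range primitive.  A BDD node is a plain tuple
# (bit, low_child, high_child), bit 0 the most significant, with
# True/False as terminals.  Real packages hash-cons nodes and cache
# apply() results; both are omitted here for brevity.
def leq(bits, j=0):                  # BDD for "x <= value", MSB first
    if j == len(bits):
        return True                  # all bits equal: x == value
    rest = leq(bits, j + 1)
    # bound bit 1: a 0 here makes x strictly smaller (terminal True);
    # bound bit 0: a 1 here makes x strictly larger (terminal False).
    return (j, True, rest) if bits[j] else (j, rest, False)

def geq(bits, j=0):                  # BDD for "x >= value", MSB first
    if j == len(bits):
        return True
    rest = geq(bits, j + 1)
    return (j, False, rest) if bits[j] else (j, rest, True)

def conj(a, b):                      # apply(AND), no memo cache
    if a is False or b is False: return False
    if a is True: return b
    if b is True: return a
    if a[0] == b[0]: return (a[0], conj(a[1], b[1]), conj(a[2], b[2]))
    if a[0] < b[0]:  return (a[0], conj(a[1], b), conj(a[2], b))
    return (b[0], conj(a, b[1]), conj(a, b[2]))

def count(n, j, k):                  # satisfying k-bit assignments;
    if n is False: return 0          # assumes nodes on consecutive
    if n is True: return 2 ** (k - j)  # levels, true for leq/geq/conj here
    return count(n[1], j + 1, k) + count(n[2], j + 1, k)

k = 5
def bits(x): return [(x >> (k - 1 - i)) & 1 for i in range(k)]

rng = conj(leq(bits(13)), geq(bits(5)))   # BDD for 5 <= x <= 13
assert count(rng, 0, k) == 13 - 5 + 1     # the range holds 9 values
```

Each of `leq` and `geq` builds exactly one node per bit, so the whole construction is O(k) node constructions, matching the primitive described above.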
3.3.2 Context-Sensitive Pointer Analysis with a Precomputed Call Graph
We are now ready to present our context-sensitive pointer analysis. We assume
the presence of a precomputed call graph created, for example, by using a context-
insensitive pointer analysis (Algorithm 3). We apply Algorithm 4 to the call graph
to generate the context-sensitive invocation edges IE c. Once that is created, we can
simply apply a context-insensitive pointer analysis on the exploded call graph to get
context-sensitive results. We keep the results separate for each clone by adding a
context number to methods, variables, invocation sites, points-to relations, etc.
C is the domain of context numbers. Our BDD library uses signed 64-bit integers
to represent domains, so the size is limited to 2⁶³.
IE c: C × I × C × M is the set of context-sensitive invocation edges. IE c(c, i, cm, m)
means that invocation site i in context c calls method m in context cm. This
relation is computed using Algorithm 4.
assignc: C × V × C × V is the context-sensitive version of the assign relation.
assignc(c1, v1, c2, v2) means variable v1 in context c1 includes the points-to set
of variable v2 in context c2 due to parameter passing. Again, return values are
handled analogously.
Algorithm 5 Context-sensitive pointer analysis with a precomputed call graph.
Domains
Domains from Algorithm 2, plus:
C 9223372036854775808
Relations
Relations from Algorithm 2, plus:
IE c (caller : C, invoke : I, callee : C, tgt : M) input
assignc (destc : C, dest : V, srcc : C, src : V)
vPc (context : C, variable : V, heap : H) output
Rules
vPfilter(v, h) : – vT (v, tv), hT (h, th), aT (tv, th). (3.19)
vPc(c, v, h) : – vP0(v, h), IE c(c, h, _, _). (3.20)
vPc(c1, v1, h) : – assignc(c1, v1, c2, v2), vPc(c2, v2, h), vPfilter(v1, h). (3.21)
hP(h1, f, h2) : – store(v1, f, v2), vPc(c, v1, h1), vPc(c, v2, h2). (3.22)
vPc(c, v2, h2) : – load(v1, f, v2), vPc(c, v1, h1), hP(h1, f, h2),vPfilter(v2, h2). (3.23)
assignc(c1, v1, c2, v2) : – IE c(c2, i, c1, m), formal(m, z, v1), actual(i, z, v2). (3.24)
□
vPc: C × V × H is the context-sensitive version of the variable points-to relation
(vP). vPc(c, v, h) means variable v in context c can point to heap object h.
Rule (3.24) interprets the context-sensitive invocation edges to find the bindings
between actual and formal parameters. The rest of the rules are the context-sensitive
counterparts to those found in Algorithm 2.
Algorithm 5 takes advantage of a precomputed call graph to create an efficient
context numbering scheme for the contexts. We can compute the call graph on the
fly while enjoying the benefit of the numbering scheme by numbering all the possible
contexts with a conservative call graph, and generating the invocation edges
only when warranted by the points-to results. We can reduce the iterations neces-
sary by exploiting the fact that many of the invocation sites of a call graph created
by a context-insensitive pointer analysis have single targets. Such an algorithm has
an execution time similar to Algorithm 5, but is of primarily academic interest as the
call graph rarely improves due to the extra precision from context-sensitive points-to
information.
3.4 Object-Sensitive Pointer Analysis
The notion of cloning is a general mechanism for achieving context sensitivity. We
can use the same idea to implement other types of context sensitivity. One example is
object sensitivity[169, 171], in which the context is the value of the receiver object (the
this pointer). Object sensitivity can be quite accurate for object-oriented languages
such as Java because the receiver object can be a better indicator of context than the
call site. It also has the added benefit that the number of contexts is much smaller
than with call-path-based context sensitivity.
Algorithm 6 gives an object-sensitive analysis with call graph discovery. It com-
bines ideas from the call graph discovery Algorithm 3 and the context-sensitive Al-
gorithm 5. As in Algorithm 3, we keep a set of reachable methods and only include
relations from methods that are reachable. As in Algorithm 5, we keep separate re-
sults for each clone by adding context numbers to the appropriate relations; however,
in this case the context numbers are given by the receiver object, rather than the call
path number.
Algorithm 6 contains another notable difference. Instead of naming objects by
their creation site, we also add the context of the object that created them. This
allows us to support k-object sensitivity[171], where an object is named not only by
its creation site, but by k levels of enclosing objects.
The object naming is encoded in the objName relation. It is generated a priori
much like the IE c relation. We build a graph with a node for each creation site in
the program, and add an edge between two nodes h1 → h2 if an object of the type of
h1 can create an object of the type of h2. We then count the paths over that graph,
but instead of collapsing cycles to limit numbering, we use a k-CFA approach to limit
the names to the last k−1 edges. In 1-object sensitivity, no edges are considered and
the object name is simply the creation site.
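A sketch of this k-limited naming: each name is a tuple of at most k creation sites, and truncating to the last k sites keeps the name set finite even when the creation graph is cyclic. The creation graph below is hypothetical, and the real objName construction (derived from the program's types and allocation sites) is simplified here to a worklist walk:

```python
# Sketch of k-object-sensitive object naming over a hypothetical
# creation graph: site h1 -> h2 means an object created at h1 can
# itself create an object at h2.  Names keep only the last k sites.
k = 2
creates = {
    "hA": ["hB"],
    "hB": ["hC"],
    "hC": ["hB"],        # a B <-> C creation cycle
}

names, work = set(), [("hA",)]   # the root's name is its creation site
while work:
    name = work.pop()
    if name in names:
        continue
    names.add(name)
    h = name[-1]                          # the site this name ends at
    for h2 in creates.get(h, []):
        work.append((name + (h2,))[-k:])  # truncate to the last k sites

# Despite the cycle, truncation yields a finite set of names.
assert names == {("hA",), ("hA", "hB"), ("hB", "hC"), ("hC", "hB")}
```

With k = 1 the same walk would produce just the three creation sites, matching the statement above that 1-object sensitivity names an object by its creation site alone.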
Algorithm 6 Object-sensitive pointer analysis with call-graph discovery.
Domains
Domains from Algorithm 3, plus:
O 1048576
Relations
Relations from Algorithm 3, plus IE c, assignc from Algorithm 5, plus:
objName (context : O, heap : H, object : O) input
reach (context : O, method : M)
vPofilter (variable : V, object : O)
vPo (context : O, variable : V, object : O) output
hP (base : O, field : F, target : O) output
Rules
vPofilter(v, o) : – vT (v, tv), hT (h, th), aT (tv, th), objName(_, h, o). (3.25)
reach(0, m) : – m = “main”. (3.26)
reach(c, m) : – IE c(_, _, c, m). (3.27)
vPo(c, v, o) : – reach(c, m),Nstmt(m, v, h), objName(c, h, o). (3.28)
assignc(c2, v2, c1, v1) : – reach(c1, m),Astmt(m, v1, v2), c1 = c2. (3.29)
hP(o1, f, o2) : – reach(c, m), Sstmt(m, v1, f, v2), vPo(c, v1, o1), vPo(c, v2, o2). (3.30)
vPo(c, v2, o2) : – reach(c, m),Lstmt(m, v1, f, v2), vPo(c, v1, o1),hP(o1, f, o2), vPofilter(v2, o2). (3.31)
IE c(c1, i, c2, m2) : – reach(c1, m1),mI (m1, i, 0), IE 0(i, m2), c1 = c2. (3.32)
IE c(c1, i1, c2, m2) : – reach(c1, m1), mI (m1, i1, n), actual(i1, 0, v), vPo(c1, v, o), objName(_, h, o), hT (h, t), cha(t, n, m2), c2 = o. (3.33)
assignc(c2, v2, c1, v1) : – actual(i, z, v1), IE c(c1, i, c2, m), formal(m, z, v2). (3.34)
vPo(c1, v1, h) : – assignc(c1, v1, c2, v2), vPo(c2, v2, o), vPofilter(v1, o). (3.35)
□
O is the domain of object names. This is similar to the H domain of creation
sites, but includes the extra precision of k-object sensitivity. In object-sensitive
analysis, object names are also used as the method context; the context values
are the different values of the receiver object.
objName: O × H × O maps from creation sites to object names. This is used
to implement k-object sensitivity. Given the first two attributes of base object
name and creation site, objName returns the object name in the third attribute.
It is generated a priori using the algorithm described above.
reach: O × M is the set of reachable contexts and methods. It differs from the
previous reach in that it adds a context attribute.
vPofilter : V × O is the type filter relation, modified to operate on object names
rather than creation sites.
vPo: O × V × O is the context-sensitive version of the variable points-to relation
(vP), modified to refer to object names rather than creation sites. vPo(c, v, o)
means variable v in context c can point to object o.
hP : O × F × O is the output heap points-to relation, modified to operate on object
names.
The Datalog rules are similar to those found in Algorithms 3 and 5, but modified
to operate on object names instead of creation sites. The main differences are in
the rules to handle matching invocation edges. Rule (3.33) sets the callee context to
match the base object on which the method is invoked. Rule (3.32) handles static
method invocations, where the context of the callee is the same as the context of the
caller.
Owing to the efficient BDD implementation, k-object sensitivity in bddbddb
scales to much larger programs and larger values of k than other
implementations[177]. Our Datalog “implementation” is also far shorter and simpler
than traditional implementations, and far easier to modify and extend.
Chapter 4
bddbddb: BDD-Based Deductive Database
This chapter describes the design and implementation of the bddbddb (BDD-Based
Deductive DataBase) system. bddbddb is an implementation of Datalog that uses
binary decision diagrams (BDDs) to represent relations. As such, it is able to operate
on very large relations efficiently. bddbddb includes many optimizations to make it
an effective Datalog solver. It was originally designed and tuned for large program
analysis problems such as context-sensitive pointer analysis; however, it turns
out that bddbddb also works well for other types of problems.
bddbddb operates in stages, much like a traditional compiler. It first translates
the Datalog into an intermediate representation consisting of primitive relational al-
gebra operations, performs optimizations on that intermediate representation, and
from there translates into BDD operations and assigns physical BDD domains. Sec-
tion 4.1 covers the basics of the translation from Datalog to BDD operations, and
Section 4.2 covers the optimizations bddbddb performs on the intermediate represen-
tation. Section 4.3 gives a machine learning algorithm developed with Michael Carbin
for discovering BDD variable orders, one of the most important problems when us-
ing BDDs. Section 4.4 presents techniques for querying results in bddbddb including
reverse execution using the “come-from” query. The come-from query gives an exact
trace as to why a particular tuple is added to a relation.
CHAPTER 4. BDDBDDB: BDD-BASED DEDUCTIVE DATABASE 54
4.1 From Datalog to BDD Operations
In this section, we explain our rationale for using BDD operations to solve Datalog
programs. We first show how a Datalog program can be translated into relational
algebra operations. We then show how we represent relations as boolean functions
and relational algebra as operations on boolean functions. Finally, we show how
boolean functions can be represented efficiently as binary decision diagrams (BDDs).
4.1.1 Relational Algebra
A Datalog query with finite domains and stratified negation can be solved by ap-
plying sequences of relational algebra operations corresponding to the Datalog rules
iteratively, until a fix-point solution is reached. We shall illustrate this translation
simply by way of an example, since it is relatively well understood.
We use the following set of relational operations: join, union, project, rename,
difference, and select. R1 ⋈ R2 denotes the natural join of relations R1 and R2,
which returns a new relation where tuples in R1 have been merged with tuples in R2
in which corresponding attributes have equal values. R1 ∪ R2 denotes the union of
relations R1 and R2, which returns a new relation that contains the union of the sets
of tuples in R1 and R2. πa1,...,ak(R) denotes the project operation, which forms a new
relation by removing attributes a1, . . . , ak from tuples in R. ρa→a′(R) denotes the
rename operation, which returns a new relation with the attribute a of R renamed to
a′. R1 − R2 denotes the difference of relations R1 and R2, which contains the tuples
that are in R1 but not in R2. The select operation, denoted as σa=c(R), restricts
attribute a to match a constant value c. It is equivalent to performing a natural join
with a unary relation consisting of a single tuple with attribute a holding value c.
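For illustration, these operations can be sketched over relations stored as explicit sets of attribute/value tuples. The helper names here are invented, and bddbddb of course implements these operations on BDDs rather than on tuple sets:

```python
# Minimal set-based sketches of join, project, rename, and select.
# A relation is a frozenset of tuples, each tuple a frozenset of
# (attribute, value) pairs; union and difference are plain set ops.

def rel(*tuples):
    """Build a relation from dicts, e.g. rel({'dest': 'v1', ...})."""
    return frozenset(frozenset(t.items()) for t in tuples)

def join(r1, r2):
    """Natural join: merge tuples whose shared attributes agree."""
    out = set()
    for t1 in r1:
        for t2 in r2:
            d1, d2 = dict(t1), dict(t2)
            if all(d1[a] == d2[a] for a in d1.keys() & d2.keys()):
                out.add(frozenset({**d1, **d2}.items()))
    return frozenset(out)

def project_away(r, *attrs):
    """Project: remove the named attributes (as defined in the text)."""
    return frozenset(frozenset((a, v) for a, v in t if a not in attrs)
                     for t in r)

def rename(r, old, new):
    """Rename attribute old to new."""
    return frozenset(frozenset((new if a == old else a, v) for a, v in t)
                     for t in r)

def select(r, attr, const):
    """Select: keep tuples whose attribute attr equals the constant."""
    return frozenset(t for t in r if dict(t).get(attr) == const)

# One application of the assign rule, as in the sequence shown below:
assign = rel({"dest": "v1", "source": "v2"})
vP = rel({"variable": "v2", "heap": "h1"})
t4 = rename(project_away(join(assign, rename(vP, "variable", "source")),
                         "source"), "dest", "variable")
vP = vP | t4      # vP now also records that v1 points to h1
```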
To illustrate, an application of the rule
vP(v1, h) :– assign(v1, v2), vP(v2, h).
corresponds to this sequence of relational algebra operations:
t1 = ρvariable→source(vP);
t2 = assign ⋈ t1;
t3 = πsource(t2);
t4 = ρdest→variable(t3);
vP = vP ∪ t4;
Note that rename operations are inserted before join, union, or difference op-
erations to ensure that corresponding attributes have the same name, while non-
corresponding attributes have different names.
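As a sketch of how such a sequence drives rule application, the rule above can be iterated to a fix-point over plain tuple sets (an illustration only; bddbddb performs these steps as BDD operations):

```python
# Naive fix-point iteration of vP(v1,h) :- assign(v1,v2), vP(v2,h),
# with vP as a set of (variable, heap) pairs and assign as a set of
# (dest, source) pairs.
def solve(assign, vP0):
    vP = set(vP0)
    while True:
        new = {(v1, h) for (v1, v2) in assign for (v, h) in vP if v == v2}
        if new <= vP:
            return vP          # fix-point: the rule adds nothing new
        vP |= new

# v1 points to h1; v2 := v1; v3 := v2 -- facts flow up the chain.
assign = {("v2", "v1"), ("v3", "v2")}
print(sorted(solve(assign, {("v1", "h1")})))
# → [('v1', 'h1'), ('v2', 'h1'), ('v3', 'h1')]
```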
4.1.2 Boolean Functions
We encode relations as boolean functions over tuples of binary values. Elements in
a domain are assigned consecutive numeric values, starting from 0. A value in a
domain with m elements can be represented in ⌈log2(m)⌉ bits. Suppose each of the
attributes of an n-ary relation R is associated with numeric domains D1, D2, . . . , Dn,
respectively. We can represent R as a boolean function f : D1 × · · · × Dn → {0, 1} such
that (d1, . . . , dn) ∈ R iff f(d1, . . . , dn) = 1, and (d1, . . . , dn) /∈ R iff f(d1, . . . , dn) = 0.
Let relation R be a set of tuples {(1, 1), (2, 0), (2, 1), (3, 0), (3, 1)} over D1 × D2,
where D1 = {0, 1, 2, 3} and D2 = {0, 1}. The binary encoding for R is function f ,
displayed in Figure 2.2(a), where the first attribute of R is represented by bits b1 and
b2 and the second attribute by b3.
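The bit-level encoding of this example can be sketched directly (a small illustration; variable and helper names are invented):

```python
import math

# The example relation R over D1 x D2: the first attribute (4 elements)
# takes ceil(log2 4) = 2 bits (b1, b2), the second (2 elements) 1 bit (b3).
R = {(1, 1), (2, 0), (2, 1), (3, 0), (3, 1)}

def bits(value, width):
    """Most-significant-bit-first binary encoding of a domain element."""
    return tuple((value >> i) & 1 for i in reversed(range(width)))

w1 = math.ceil(math.log2(4))    # bits for D1 = {0, 1, 2, 3}
w2 = math.ceil(math.log2(2))    # bits for D2 = {0, 1}
encoded = {bits(d1, w1) + bits(d2, w2) for (d1, d2) in R}

def f(b1, b2, b3):
    """f = 1 iff the bit pattern encodes a tuple of R."""
    return 1 if (b1, b2, b3) in encoded else 0

print(f(0, 1, 1), f(0, 0, 0))   # → 1 0  ((1,1) is in R, (0,0) is not)
```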
For each relational algebra operation, there is a logical operation that produces
the same effect when applied to the corresponding binary function representation.
Suppose R1 is represented by function f1 : D1 × D2 → {0, 1} and R2 by function
f2 : D2 × D3 → {0, 1}. The relation R1 ⋈ R2 is represented by function f3 :
D1 × D2 × D3 → {0, 1}, where f3(d1, d2, d3) = f1(d1, d2) ∧ f2(d2, d3). Similarly,
the union operation maps to the binary ∨ operator, and l − r ≡ l ∧ ¬r. The project
operation can be represented using existential quantification. For example, πa2(R1)
is represented by f : D1 → {0, 1} where f(d1) = ∃d2.f1(d1, d2).
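These correspondences can be checked on small truth tables (a toy illustration; the relations here are invented):

```python
# Join maps to AND; projecting away an attribute maps to existential
# quantification over that attribute's values.
D = (0, 1)                       # a tiny shared domain

f1 = lambda d1, d2: int((d1, d2) in {(0, 1), (1, 1)})    # represents R1
f2 = lambda d2, d3: int((d2, d3) in {(1, 0)})            # represents R2

# R1 join R2: f3(d1, d2, d3) = f1(d1, d2) AND f2(d2, d3)
f3 = lambda d1, d2, d3: f1(d1, d2) & f2(d2, d3)

# Projecting away d2: f(d1, d3) = exists d2 . f3(d1, d2, d3)
f_proj = lambda d1, d3: int(any(f3(d1, d2, d3) for d2 in D))

assert f3(0, 1, 0) == 1 and f3(0, 1, 1) == 0
assert f_proj(0, 0) == 1 and f_proj(0, 1) == 0
```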
4.1.3 BDD Operations
The boolean function operations discussed in Section 4.1.2 are a standard feature
of BDD libraries[32, 153, 172, 215, 226, 248, 262]. The ∧ (and), ∨ (or), and − (difference) boolean function operations can be applied to two BDDs, producing a
BDD of the resulting function. The BDD existential quantification operation exist
is used to produce a new BDD where nodes corresponding to projected attributes are
removed. This operation combines the low and high successors of each removed node
by applying an ∨ operation.
Rename operations are implemented using the BDD replace operation, which
computes a new BDD where decision variables corresponding to the old attributes
have been replaced with decision variables corresponding to the new attribute names.
Replace operations can be eliminated if the renamed attributes are encoded using
the same decision variables as the original attributes. A replace operation which
does not change the relative order of decision variables is only linear with respect
to the number of nodes in the BDD. If the order is changed, the cost of a replace
can be exponential with respect to the number of decision variables. Care must be
taken when encoding relation attributes to minimize the number of expensive rename
operations.
Natural join operations are frequently followed by project operations to eliminate
unnecessary attributes. The BDD relational product operation, or relprod, effi-
ciently combines this sequence in a single operation. Similarly, the select and project
operations can be combined into a single BDD operation, known as restrict.
BDD operations operate on entire relations at a time, rather than one tuple at a
time. The cost of BDD operations depends on the size and shape of the BDD graphs,
not the number of tuples in a relation. Thus, large relations can be computed quickly
as long as their encoded BDD representations are compact. Also, due to caching in
BDD packages, identical subproblems only have to be computed once. These points
are key to the efficiency of BDD operations, and are the reason why we use this data
structure to represent our relations.
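As a toy illustration of these points, the following sketches a reduced BDD with an interned node table and a memoized apply operation. It is a teaching sketch only, not how a production BDD library is implemented:

```python
from functools import lru_cache

_unique = {}                             # unique table: interned nodes

def node(var, low, high):
    """Make (or reuse) a BDD node; terminals are the integers 0 and 1."""
    if low == high:                      # redundant test: skip the node
        return low
    return _unique.setdefault((var, low, high), (var, low, high))

def var_of(u):
    return u[0] if isinstance(u, tuple) else float("inf")

@lru_cache(maxsize=None)                 # the "operation cache" from the text
def apply(op, u, v):
    """Combine two BDDs with a boolean operator, Shannon-expansion style."""
    if not isinstance(u, tuple) and not isinstance(v, tuple):
        return op(u, v)                  # both terminals
    m = min(var_of(u), var_of(v))
    lo_u, hi_u = (u[1], u[2]) if var_of(u) == m else (u, u)
    lo_v, hi_v = (v[1], v[2]) if var_of(v) == m else (v, v)
    return node(m, apply(op, lo_u, lo_v), apply(op, hi_u, hi_v))

def ev(u, env):
    """Evaluate a BDD under an assignment of decision variables."""
    while isinstance(u, tuple):
        u = u[2] if env[u[0]] else u[1]
    return u

AND = lambda a, b: a & b
OR = lambda a, b: a | b

x0, x1 = node(0, 0, 1), node(1, 0, 1)    # two decision variables
both = apply(AND, x0, x1)
either = apply(OR, x0, x1)
assert both[2] is x1                     # subgraphs are shared, not copied
```

The final assertion shows the node sharing that makes identical subproblems hit the operation cache.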
4.2 Translating and Optimizing Datalog Programs
The bddbddb system applies a large number of optimizations to transform Datalog
programs into efficient BDD operations:
1. Apply Datalog source level transforms and optimizations. (Section 4.2.1)
2. Remove unnecessary rules, stratify the rules, and determine the rule iteration
order. (Section 4.2.2)
3. Translate the stratified query into an intermediate representation (IR) consisting
of relational algebra operations. (Section 4.2.3)
4. Through analysis, optimize the IR and add BDD operations to replace equiva-
lent sequences of relational algebra operations. (Section 4.2.4)
5. Choose BDD decision variables for encoding relation attributes. (Section 4.2.5)
6. Perform more dataflow optimizations after physical domains have been assigned.
(Section 4.2.6)
7. Interpret the resulting program. (Section 4.2.7)
To illustrate this process, we use Algorithm 1 from Section 3.1 as a running ex-
ample.
4.2.1 Datalog Source Transformations
Before compilation, we normalize the forms of the input rules as follows:
• Any variable that appears only once in a rule is changed into an underscore ( )
to indicate an unused attribute.
• If a variable appears multiple times in a single subgoal, we give each additional
use a distinct name, and then add extra equality subgoals to make the new
variables equal to the original variable. For example, a subgoal R(x, x, x) is
transformed into the three subgoals R(x, x′, x′′), x = x′, x = x′′.
Figure 4.1: (a) Predicate dependency graph for Algorithm 1. (b) Breaking the PDG into SCCs and finding cycles.
• Each comparison subgoal with an attribute of domain D is substituted with a
subgoal for a corresponding precomputed relation defined over D × D which
represents that comparison function.
• Subgoals in rules that define temporary relations are inlined into the rules that
use those relations. Temporary relations are non-input, non-output relations
which are in the head of only one rule, and appear as a subgoal in only one
other rule.
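The repeated-variable normalization can be sketched as a small rewrite. The tuple encoding of subgoals and the primed-name scheme here are invented for illustration:

```python
# Normalize a subgoal encoded as (relation_name, [argument, ...]):
# each repeated use of a variable gets a fresh primed name plus an
# equality subgoal tying it back to the original.
def normalize(subgoal):
    """R(x,x,x) -> R(x,x',x'') plus equality subgoals x=x', x=x''."""
    name, args = subgoal
    counts, new_args, equalities = {}, [], []
    for v in args:
        n = counts.get(v, 0)
        counts[v] = n + 1
        if n == 0:
            new_args.append(v)
        else:
            fresh = v + "'" * n           # x -> x', x'' for later uses
            new_args.append(fresh)
            equalities.append(("=", [v, fresh]))
    return [(name, new_args)] + equalities

print(normalize(("R", ["x", "x", "x"])))
# → [('R', ['x', "x'", "x''"]), ('=', ['x', "x'"]), ('=', ['x', "x''"])]
```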
4.2.2 Datalog Rule Optimization
Rule Removal
The solver removes rules and relations that do not indirectly contribute to the out-
put relations. A predicate dependency graph (PDG) is built to record dependencies
between rules and relations. Each node represents a relation, and there is an edge
g → h marked with rule r if rule r has subgoal relation g and head relation h. (If the
subgoal is negated, the edge is marked as a negative edge.) The PDG for our example
is shown in Figure 4.1(a). Necessary rules and relations are found by performing a
backward pass over the PDG, starting from the output relations.
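The backward pass can be sketched as a worklist traversal of the PDG. The edge encoding (subgoal, rule, head) is invented here for illustration:

```python
# Keep only rules and relations that transitively contribute to an
# output relation, by walking PDG edges backward from the outputs.
def necessary(edges, outputs):
    preds = {}
    for g, r, h in edges:                 # edge g -r-> h
        preds.setdefault(h, []).append((g, r))
    needed_rels, needed_rules = set(outputs), set()
    work = list(outputs)
    while work:
        h = work.pop()
        for g, r in preds.get(h, []):
            needed_rules.add(r)
            if g not in needed_rels:
                needed_rels.add(g)
                work.append(g)
    return needed_rels, needed_rules

# Rules 1 and 2 feed the output vP; rule 9 only feeds an unused relation.
edges = {("vP0", 1, "vP"), ("assign", 2, "vP"),
         ("vP", 2, "vP"), ("unused", 9, "dead")}
rels, rules = necessary(edges, {"vP"})
assert rels == {"vP", "vP0", "assign"} and rules == {1, 2}
```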
Stratification
We then use the PDG to stratify the program. Stratification guarantees that the
relation for every negated subgoal can be fully computed before applying rules con-
taining the negation. Each stratum is a distinct subset of program rules that fully
computes relations belonging to that stratum. Rules in a particular stratum may
use the positive forms of relations computed in that stratum, as well as positive or
negated forms of relations calculated in earlier strata and input relations from the
relational database. There are no cyclic dependencies between strata. If the program
cannot be stratified, we warn the user. In our experience designing Datalog programs
for program analysis, we have yet to find a need for non-stratifiable queries.
As our example does not contain any negations, all of the rules and relations are
placed within a single stratum.
Finding cycles
Cycles in the PDG indicate that some rules and relations are recursively defined,
requiring iterative application of rules within the cycles to reach a fixed-point solution.
The PDG for each stratum is split into strongly connected components (SCCs). We
can compute the result for a stratum by evaluating strongly connected components
and non-cyclic relations in the topological order of the PDG.
A single strongly connected component can encompass multiple loops that share
the same header node. We would like to distinguish between the different loops in
a single SCC so we can iterate around them independently. However, the PDG is
typically not reducible, and the classical algorithm for finding loops—Tarjan’s interval
finding algorithm—only works on reducible graphs[231]. Extensions have been made
to deal with irreducible graphs, but they typically have the property that a node can
only be the header for one loop[195]. We solve this by identifying one loop in the
SCC, eliminating its back edge, and then recursively re-applying the SCC algorithm
on the interior nodes to find more inner loops.
The steps of the algorithm on our example are shown in Figure 4.1(b). We first
break the PDG into five SCCs, labeled 1-5, as shown on the left. Then, we remove the
edge for rule 4 from hP to vP, breaking the larger cycle so that it can topologically
sort those nodes and find the smaller self-cycle on vP for rule 2, as shown on the right.
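A simplified sketch of this decomposition follows. It cuts all edges into a chosen header rather than a single back edge, which is a coarser variant of the algorithm described, but it reproduces the nesting found on the example:

```python
from itertools import count

# Tarjan's algorithm: strongly connected components of a digraph.
def sccs(nodes, edges):
    index, low, on_stack, stack, result = {}, {}, set(), [], []
    counter = count()
    def visit(v):
        index[v] = low[v] = next(counter)
        stack.append(v)
        on_stack.add(v)
        for w in edges.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            result.append(comp)
    for v in nodes:
        if v not in index:
            visit(v)
    return result

def loops(nodes, edges):
    """Nested loop structure: [(sorted_scc_nodes, inner_loops), ...]."""
    found = []
    for comp in sccs(nodes, edges):
        if len(comp) == 1 and comp[0] not in edges.get(comp[0], ()):
            continue                      # trivial SCC: no cycle
        header = comp[0]                  # pick a header node
        inner = {v: [w for w in edges.get(v, ()) if w != header]
                 for v in comp}           # cut edges back into the header
        found.append((sorted(comp),
                      loops([v for v in comp if v != header], inner)))
    return found

# The vP/hP cycle from Figure 4.1: rule 4 forms a vP <-> hP loop and
# rule 2 a self-loop on vP; the inner self-cycle is found recursively.
pdg = {"vP": ["vP", "hP"], "hP": ["vP"]}
nested = loops(["vP", "hP"], pdg)
```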
Determining rule application order
The order in which the rules are applied can make a significant difference in the
execution time. When there are multiple cycles in a single SCC, the number of rule
applications that are necessary to reach a fix-point solution can differ based on the
relative order in which the two cycles are iterated. Which application order will yield
the fewest number of rule applications depends not only on the rules but also on the
nature of the relations.
Aspects of the BDD library can also make certain iteration orders more efficient
than others, even if they have more rule applications. For example, the BDD library
uses an operation cache to memoize the results of its recursive descents on BDD
nodes, so it can avoid redundant computations when performing an operation. This
cache can also provide benefits across different operations if the BDDs that are being
operated upon share nodes. To take advantage of operation cache locality across
operations, one should perform related operations in sequence. Another aspect in-
fluencing iteration order choice is the set-oriented nature of BDD operations. When
performing an operation on tuples generated in a loop, it is often faster to apply the
operation after completing all loop iterations, rather than applying it once per loop
iteration.
In the absence of profile information from prior runs or from the user, bddbddb
uses static analysis of the rules to decide upon a rule application order. Cycles that
involve fewer rules are iterated before cycles that involve more rules, and rules that
have fewer subgoals are iterated before rules that have more subgoals. The reasoning
behind this is that smaller, shorter chains of rules and smaller rules are faster to iterate
due to operation cache locality. This static metric works very well in the examples we
have tried because small cycles are usually transitive closure computations, which are
fast and expose more opportunities for set-based computation on the larger cycles.
4.2.3 Intermediate Representation
Once we have determined the iteration order, we translate the rules into an interme-
diate representation based on relational algebra operations as follows:
1. For each subgoal with an underscore, project away its unused attributes.
2. For each subgoal with a constant, use the select and project operators to restrict
the relation to match the constant.
3. Join each subgoal relation with each of the other subgoal relations, projecting
away attributes as they become unnecessary.
4. Rename the attributes in the result to match the head relation.
5. If the head relation contains a constant, use the select operator on the result to
set the value of the constant.
6. Unify the result with the head relation.
4.2.4 IR Optimizations
In repeated applications of a given rule within a loop, it can be more efficient to
make use of the differential between the current value of a subgoal relation and the
previous value from the last time the rule was applied. This is known as
incrementalization or the semi-naïve evaluation strategy[20]. By computing the difference in
subgoal relations as compared to the previous iteration, we can avoid extra work; if
these inputs are the same as the last iteration, we can avoid applying the rule al-
together. The trade-off of incrementalization is that the old value of every subgoal
in every incrementalized rule must be stored. We allow the user to control whether
incrementalization is performed on a per rule basis. Performing incrementalization on
the sequence of relational algebra operations derived from Rule (3.2) (Section 4.1.1)
generates the following IR:
vP′′ = vP − vP′;
vP′ = vP;
assign′′ = assign − assign′;
assign′ = assign;
t1 = ρvariable→source(vP′′);
t2 = assign ⋈ t1;
t3 = ρvariable→source(vP);
t4 = assign′′ ⋈ t3;
t5 = t2 ∪ t4;
t6 = πsource(t5);
t7 = ρdest→variable(t6);
vP = vP ∪ t7;
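The effect of this incremental IR can be mirrored in a set-based sketch (an illustration only; the real evaluation happens as BDD operations):

```python
# Semi-naive evaluation of vP(v1,h) :- assign(v1,v2), vP(v2,h): each pass
# joins assign only against the tuples added since the previous pass,
# mirroring the difference step in the IR above. (assign is loop
# invariant in this sketch, so only vP needs a difference.)
def solve_seminaive(assign, vP0):
    vP, delta = set(vP0), set(vP0)
    while delta:
        new = {(v1, h) for (v1, v2) in assign for (v, h) in delta if v == v2}
        delta = new - vP        # only genuinely new facts feed the next pass
        vP |= delta
    return vP

assign = {("v2", "v1"), ("v3", "v2")}
assert solve_seminaive(assign, {("v1", "h1")}) == {
    ("v1", "h1"), ("v2", "h1"), ("v3", "h1")}
```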
Next, we apply a number of traditional compiler data flow optimizations on the
IR:
• Constant propagation. We propagate empty set, universal set, and constants
to reduce unions, joins, and difference operations.
• Definition-use chains. We calculate the chains of definitions and uses and
use this to optimize the program by eliminating dead code (operations whose
results have no uses), coalescing natural join and project pairs into relprod
operations, and coalescing select and project pairs into restrict operations.
After this stage of optimizations, relational algebra operations are replaced by
BDD operations, using combined relprod operations and restrict operations where
possible. Rule (3.2) becomes:
vP′′ = diff(vP, vP′);
vP′ = copy(vP);
t1 = replace(vP′′, variable → source);
t2 = relprod(t1, assign, source);
t3 = replace(t2, dest → variable);
vP = or(vP, t3);
In the optimized IR, the join-project pair involving assign and vP ′′ has been
collapsed into a single relprod. Also, the operations for computing and using the
difference of assign have been removed because assign is loop invariant.
4.2.5 BDD Decision Variable Assignment
As noted in Section 4.1.3, the use of BDD operations to implement relational op-
erations places constraints on the choice of BDD decision variables used to encode
relation attributes. When performing an operation on two BDDs, the decision vari-
ables for corresponding attributes must match. Likewise, unmatched attributes must
be assigned to different decision variables. A BDD replace operation is used when-
ever different sets of decision variables must be substituted into a BDD as the result
of a relational rename.
It is most important to minimize the cost of replace operations. This depends
on the choice of decision variables used for encoding each attribute. The cost can be
zero, linear, or exponential depending on whether the new decision variables are the
same, have the same relative order, or have a different relative order. Additionally,
we prefer to perform costly replace operations on smaller BDDs (in terms of BDD
nodes) rather than on larger BDDs.
bddbddb uses a priority-based constraint system to assign attributes to BDD de-
cision variables. This system is expressed in terms of both equivalence and non-
equivalence constraints on relation attributes and sequences of decision variables.
We use a specialized union-find data structure augmented with non-equivalence con-
straints to efficiently compute the constraint system. In BDD terminology, a sequence
of binary decision variables used to represent an attribute is often referred to as a
physical domain, which should not be confused with a Datalog domain as defined in
Section 2.3.
We avoid introducing replace operations by constraining any renamed attributes
to use the same physical domain as the original attribute. When an introduced
constraint would cause the constraint system to have no solution, we assign the new
attribute to a different physical domain and add a replace operation at that point
to allow the constraint to be satisfied. By carefully choosing the order of priority in
which constraints are added to the system, we ensure that replace operations are
introduced where they will be most efficient.
For each attribute a in relation R, we create a non-equivalence constraint between
a and other attributes in R. Then, we add constraints for all program operations, in
order of importance. Operations in inner loops have higher importance than opera-
tions in outer loops, under the presumption that these operations will be performed
more often. Within a given loop depth, relprod operations are considered first,
in order of execution, since they are typically the most expensive operations. Af-
ter relprod operations, we consider other operations. For a unary operation such
as copy, we create equivalence constraints between corresponding attributes of the
source and destination relations. For a binary operation, the interacting attributes
for the input relations are constrained to be equal. After considering all operations,
we add constraints for the attributes of the input and output relations. The physical
domains used by these relations are specified by the user, since they must be loaded
from or stored into the relational database.
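The augmented union-find can be sketched as follows. The attribute names and the method names are illustrative only, not bddbddb's actual interface:

```python
# Union-find over relation attributes, extended with non-equivalence
# constraints, in the spirit of the structure described above.
class DomainAssigner:
    def __init__(self):
        self.parent, self.conflicts = {}, {}

    def find(self, a):
        self.parent.setdefault(a, a)
        while self.parent[a] != a:
            self.parent[a] = self.parent[self.parent[a]]   # path halving
            a = self.parent[a]
        return a

    def must_differ(self, a, b):
        """Record that a and b may never share a physical domain."""
        ra, rb = self.find(a), self.find(b)
        self.conflicts.setdefault(ra, set()).add(rb)
        self.conflicts.setdefault(rb, set()).add(ra)

    def try_equate(self, a, b):
        """Merge a and b unless a non-equivalence constraint forbids it.
        Returning False signals that a replace operation is needed."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return True
        if any(self.find(c) == rb for c in self.conflicts.get(ra, ())):
            return False
        self.parent[rb] = ra              # merge the equivalence classes
        self.conflicts[ra] = (self.conflicts.get(ra, set())
                              | self.conflicts.get(rb, set()))
        return True

d = DomainAssigner()
d.must_differ("vP.variable", "vP.heap")    # attributes of one relation
assert d.try_equate("vP.variable", "assign.source")  # relprod constraint
assert not d.try_equate("vP.variable", "vP.heap")    # would need a replace
```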
An application of the physical domain assignment algorithm to our running exam-
ple reveals that variable from vP ′′ and source from assign can be assigned to the same
physical domain for the relprod. Therefore, the replace that occurs immediately
before can be removed:
vP′′ = diff(vP, vP′);
vP′ = copy(vP);
t1 = relprod(vP′′, assign, source);
t2 = replace(t1, dest[V1] → variable[V0]);
vP = or(vP, t2);
4.2.6 Additional Optimizations
After domain assignment, we have the opportunity to apply another set of standard
compiler optimizations:
• Global value numbering. Global value numbering factors the evaluation of
common subexpressions among rules into non-redundant computations. More-
over, it optimizes loops by hoisting invariants.
• Copy propagation. Copy propagation eliminates unnecessary temporary IR
relations that can be generated by our optimizations.
• Liveness analysis. We use a liveness analysis to clean up dead code. We reduce
the memory footprint during IR interpretation by freeing relation allocations as
soon as the lifetime of a relation has ended.
4.2.7 Interpretation
Finally, bddbddb interprets the optimized IR and performs the IR operations in se-
quence by calling the appropriate methods in the BDD library.
4.2.8 Other Considerations
bddbddb automates a number of the issues presented by both Datalog and BDD
programs. However, other issues are either too expansive in nature or too dependent
on the user’s query and inputs. As a result, we have implemented bddbddb with
various aspects of user parameterization to address some issues and we have also
developed separate tools to address others.
Incrementalization
Traditional databases typically perform incrementalization, as the cost of
database operations is proportional to the number of elements in the inputs. However,
this is not always the case with BDDs; due to the nature of BDDs it can sometimes
be more efficient to use the whole set rather than to compute and operate on the
difference.
Again, whether or not it is beneficial to incrementalize intermediate results de-
pends on a variety of factors including the nature of the relations and BDD caching
effects. Therefore we allow the user to optionally specify which rules should and
should not be incrementalized.
Subgoal Ordering
Subgoal ordering, as with incrementalization, depends heavily on the nature of the
relations and the underlying BDDs. Evaluating subgoals in one order may produce
smaller intermediate results. However, these smaller results may have poor represen-
tations as compared to those of the larger intermediate results generated by other
subgoal orderings. We can skirt around this situation by performing relational prod-
ucts with more than two relations at a time. However, we have observed that this
can lead to poor BDD cache performance. Therefore, in the interest of user query
customization, we rely on the user to specify the intended subgoal ordering in the
Datalog query.
4.3 Learning BDD Variable Orderings
While BDDs have proven effective in compacting the commonalities in large sets
of data, the extent to which these commonalities can be exploited depends on the
ordering of the decision variables. In our case, the difference between a good or bad
ordering can mean the termination or non-termination (due to memory exhaustion)
of an analysis. Moreover, the relative orderings are not readily apparent given only a
static analysis, and the space of all orders is extremely large; with both precedence
and interleaving conditions, the number of orders is given by the series for ordered
Bell numbers, which is exponential.
Finding the optimal order that yields the smallest BDD for a given function is an
NP-complete problem[37]. Despite an extensive amount of work in this area, heuristic
and dynamic reordering techniques have not proved effective in general[196].
The use of BDDs in implementing a relational database does provide a few im-
portant rules of thumb on how variables need to be ordered. However, run time
depends greatly on the characteristics of the inputs, the performance of an operation
can vary significantly during the course of a single run and, furthermore, the
ordering that yields the smallest BDD representation does not necessarily have the best
performance. Thus, we often find these rules not sufficient for optimizing an entire
analysis. By exhaustively exploring a subset of the possible orderings for a small pro-
gram, we found that there exists a small number of nearly best variable orders that
can dramatically outperform the rest. Moreover, the best orderings are not intuitive.
When given the best ordering for an analysis that we have worked on for months, we
can only begin to postulate ex post facto why the ordering works.
Michael Carbin and I have developed an algorithm for finding an effective decision
variable ordering[52]. The algorithm, based on active learning, is embedded in the
execution of Datalog programs in the bddbddb system. When bddbddb encounters a
rule application that takes longer than a parameterized amount of time, it initiates
a learning episode to find a better decision variable ordering by measuring the time
taken for alternative variable orderings. Because rule applications can be expensive,
bddbddb maximizes the effectiveness of each trial by actively seeking out those decision
variable orderings whose effects are least known.
In this section we provide a description of the problem space, a general formulation
of our BDD variable order learning algorithm and details on the integration and
implementation of our algorithm in the bddbddb framework.
4.3.1 The Problem Space
A relation in an analysis typically has from 2 to 8 attributes, each of which may be
allocated up to 60 decision variables. It is not uncommon for a BDD in a program
analysis to use about 200 decision variables. Theoretically, there are n! possible
orderings for each relation where n bits are used. Fortunately, we can reduce the
choice of variable orderings by exploiting the higher level semantics of relations.
Because bits within a domain represent the same type of data, they are likely to
be correlated, and therefore we do not consider permutations of bits within a domain.
In any case, elements within a domain can be deliberately numbered such that similar
elements will have similar numberings[253]. Therefore, each attribute in a relation can
be considered as a unit. For each pair of attributes a1 and a2 in a relation, orderings
between them are reduced to three choices: decision variables for a1 precede those
of a2, written a1 < a2, decision variables for a2 precede those of a1, written a1 > a2,
decision variables for a1 are interleaved with those of a2, written a1 ∼ a2. The number
of possible orderings for this reduced model is given by the sequence of ordered Bell
numbers[225], which still grows rapidly. The number of orderings for 1 through 8
variables are 1, 3, 13, 75, 541, 4683, 47293, 545835.
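These counts can be reproduced with the standard recurrence for the ordered Bell (Fubini) numbers: choose which k attributes tie for the first position, then order the remainder:

```python
from math import comb

# Ordered Bell numbers: a(0) = 1, a(n) = sum_{k=1..n} C(n,k) * a(n-k).
def ordered_bell(n):
    a = [1] + [0] * n
    for m in range(1, n + 1):
        a[m] = sum(comb(m, k) * a[m - k] for k in range(1, m + 1))
    return a[n]

print([ordered_bell(n) for n in range(1, 9)])
# → [1, 3, 13, 75, 541, 4683, 47293, 545835]
```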
4.3.2 Problem Characteristics
As we discussed earlier, we have performed exhaustive searches over subsets of the
search space of BDD variable orders. Our observations from this experience, as
outlined below, led us to the conclusion that active machine learning is the appropriate
approach to solving this problem.
Simple heuristics do not work well. We have spent a long time studying
the efficient variable orders discovered by the exhaustive search, but have been unable
to define any characterizing heuristics. We do know a few useful hints; for example, if two variables
are required to be equivalent, then interleaving their bit representation is effective.
The ordering used in input BDDs representing sizable relations can also be used as a
guide. However, these hints do not lead to the efficient variable orders we have found
through exhaustive search.
Features characterize the problem well. A number of key features determine the
execution time of the “nearly best” answers. Some of the features are not independent,
but instead are highly correlated.
Presence of critical rules. The performance of an analysis is often dominated
by a small subset of the rules. The problem can be somewhat decomposed into a rule-
by-rule basis, but there are interactions between different rules. Ideally, a relation
used by two different rules should have the same order so as to avoid potentially
expensive reorderings. Therefore, information from rules that operate on the same
relations/domains should be shared.
Dynamically varying behavior. Different applications of the same rule
typically favor the same variable ordering; however, that is not necessarily the case. In
the iterative process of resolution, the nature of the inputs for a rule may change and
may, thus, favor a different variable ordering.
The search space is not smooth. Because certain features can have a signifi-
cant impact on performance and features are often correlated, the search space is not
smooth. The best variable order may be very similar to one which blows up, but just
differs in one or two key features.
The trials are expensive. It is not unusual for a rule application to take 2 min-
utes to complete even with a relatively good ordering. Good orders can outperform
bad ones by many orders of magnitude.
Feasibility is not guaranteed. There are problems for which there is no order-
ing that yields a compact answer. Sometimes, it is possible to reformulate a Datalog
program to make it feasible by avoiding the generation of very large intermediate sets
of data.
In sum, the BDD variable order problem is not easily conquered by simple heuris-
tics. Those few that we have found do not produce accurate generalizations. In our
experience, profiling has been the only technique that yields results. However, know-
ing what to profile is difficult to discern manually as the space of variable orders is
large. Exhaustive exploration has, until now, proved to be our only reliable resource.
But, as our program analyses have become larger and more complex, this approach
has become prohibitively expensive.
4.3.3 Integrating Learning into bddbddb
Figure 4.2 shows how we integrated machine learning into bddbddb. The system
starts by applying a Datalog rule to infer new relations. If the application of the rule
takes longer than a specified threshold, we invoke the active learning algorithm. The
active learner uses a training set of data that persists across different invocations of
bddbddb. If the learner performs any new trials, they are in turn added back into
the training set. The system continues applying rules until the relations converge.
Figure 4.2: The learning algorithm embedded in Datalog resolution.
At that point, we use the data in the training set to generate a new variable order.
We can then evaluate the new variable order or refine it further by repeating the
entire process until we obtain a satisfactory variable order. Repetition ensures that
all relevant rules are analyzed, as a learned order may run slower on those rules that
were deemed fast enough, and thus skipped, under the initial ordering.
The learning algorithm is triggered when some operation takes longer than a spec-
ified threshold. The algorithm makes use of a training set (described in Section 4.3.5),
which contains information on all trials performed in this and previous runs of bddb-
ddb. The training set is used to construct two tools that are used by the algorithm: an
order constructor (described in Section 4.3.6) and an uncertainty sampler (described
in Section 4.3.7).
[Figure: flowchart of a learning episode. Calculate the best untried order and run it; if the result is good enough, the episode is done. Otherwise, generate a candidate set, apply the uncertainty classifier, perform one trial, add the result to the training set, and repeat.]
Figure 4.3: Steps of a learning episode. This corresponds to the contents of the ActiveLearner box in Figure 4.2.
4.3.4 Algorithm Overview
Figure 4.3 shows the steps of one invocation of the learning algorithm, called a learning
episode. The goal of a learning episode is to find a more efficient order for that
operation and build up the training set to improve order selection.
The learning episode proceeds as follows. First, we use the training data to predict
the best untried order. This order serves as a baseline measurement so that we can
abort the computation of orders that take significantly longer than our baseline. This
is important as slower orders can take orders of magnitude longer to complete. We
then generate a candidate set of orders to consider. We use an uncertainty sampler to
choose the orders that we have the least information about. These orders are the most
useful for the learning algorithm. After trying an order, we incorporate the results
back into the training set and repeat.
4.3.5 Data Sets
Episode  Rule  Update  Operation  Order                   Time     Classification
   1      2      1         1      v1 < v2 < h             5ms             1
                                  v2 < v1 ∼ h             15ms            1
                                  h < v2 < v1             50ms            2
   2      3      1         1      v1 < f < h2 < v2 < h1   ∞               1
                                  v2 < h1 < h2 < f < v1   1000ms          1
   3      2      2         3      v2 ∼ v1 < h             10ms            2
                                  h ∼ v2 < v1             50ms            2
Table 4.1: The grouping of the data entries in our training set. Each episode is tied to a particular rule, rule application, and operation within the rule. Within each episode, the order along with its run time is stored.
The training set consists of the learning episodes we have executed thus far. Ta-
ble 4.1 depicts the association of each learning episode with a particular rule, rule
update, and BDD operation within that rule, along with the set of trials that it en-
compasses. Each trial of the episode is paired with its execution time. If there are no
previous trials at the beginning of a new episode for a particular rule, rule update,
and operation, we bootstrap the training set by trying a set of randomly-generated
orders.
The candidate set is the reduced space of orders that we would like to consider.
The order constructor generates this set by generating partial orders that exhibit
features that are either unseen or known to be good. The order constructor then
populates the candidate set with the enumeration of the total orders that follow from
these partial orders.
The trial set is the subset of orders in the candidate set that we know the least
about. The uncertainty sampler selects the orders in the candidate set whose probabil-
ity of being within the top class is closest to 50%. These orders have high uncertainty,
which yields high information gain[148].
4.3.6 Order Constructor
The order constructor is an ensemble of classifiers. A classifier is a function that
examines the features of an order and returns a prediction of how good that order is.
For our problem, as discussed in Section 4.3.1 we use pairwise ordering constraints
as features: given two elements, one can either occur before (<), after (>), or be
interleaved (∼) with the other.
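As an illustrative sketch (not bddbddb's actual implementation, which is written in Java), the pairwise features of an order such as v1 < v2 ∼ h can be extracted as follows; the list-of-groups order representation and the function name are hypothetical:

```python
from itertools import combinations

def pairwise_features(order):
    """Extract pairwise precedence features ('<', '>', '~') from an order
    given as a list of groups; elements in the same group are interleaved.
    For example, [['v1'], ['v2', 'h']] encodes v1 < v2 ~ h."""
    position = {}
    for rank, group in enumerate(order):
        for elem in group:
            position[elem] = rank
    features = {}
    for a, b in combinations(sorted(position), 2):
        if position[a] < position[b]:
            features[(a, b)] = '<'
        elif position[a] > position[b]:
            features[(a, b)] = '>'
        else:
            features[(a, b)] = '~'
    return features

# v1 < v2 ~ h yields three pairwise features, one per element pair:
print(pairwise_features([['v1'], ['v2', 'h']]))
```

Each order over k elements thus contributes one feature per unordered element pair, which is the representation consumed by the classifiers below.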
Multi-View Decomposition
Although each learning episode focuses on just one rule, information learned from
other rules can also be relevant because they may operate on the same relations or
the same domains. We use the concept of multi-view decomposition to integrate infor-
mation from all of these different sources together. Therefore, the order constructor
is actually composed of three classifiers induced from three views of the training set.
1. For each rule, there is a “variable” classifier to gauge the effectiveness of an order
over the operations of just that particular rule. The features for instances are the
pairwise orderings of the rule’s variables. Thus, for Rule (3.2) in Algorithm 1,
these features are the pairwise orderings of v1, v2, h.
2. The “attribute” classifier incorporates global information about the quality of
orders over the relational attributes used by a rule. For a classifier of this
type, we translate variables to their relation attributes. Thus, for Rule (3.2),
the features used are the pairwise orderings of vPvariable, vPheap, assigndest, and
assignsource. Translations that lead to contradictions are dropped.
3. The “domain” classifier makes reasonable, but possibly imprecise, decisions
about the quality of an order as it relates to all rule applications that have
used a related set of domains. This classifier is best quantified as a binary
decision between truly poor orders and ones that stand a reasonable chance of
termination. A classifier of this type for Rule (3.2) uses the pairwise orderings
of the domains of its variables: (V,V) and (V,H). Like in the attribute classifier,
translations that lead to contradictions are dropped.
Building Classifiers
We build the classifiers as follows. In the training set, we first normalize each trial
against the best observed run time for that rule application. Thus, the best order has
value 1, and other orders have higher values. We then split the normalized values into
a number of bins. We use an equal frequency binning algorithm; in our experience, this
approach is preferable to an equal width or clustering approach as we gain the explicit
interval refinement of clustering with the lower overhead of equal-width binning.
The number of bins we choose depends on the expected precision of the classifier.
The “rule”, “attribute” and “domain” classifiers are progressively less precise as each
incorporates more information that may be tangential or incomplete. Based on our
empirical tests, we chose the number of bins for these three classifiers to be √n,
⁴√n, and 2, respectively, where n is the number of trials seen.
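The normalization and binning step can be sketched as follows; this is an illustrative Python fragment rather than the bddbddb code, and the function name is hypothetical:

```python
import math

def classify_trials(times, num_bins=None):
    """Assign each trial to an equal-frequency performance bin (1 = best).
    Aborted trials may be passed as float('inf')."""
    n = len(times)
    if num_bins is None:
        num_bins = max(1, round(math.sqrt(n)))  # ~sqrt(n) bins, as for the variable classifier
    # Rank trials by their time normalized against the best observed time.
    best = min(times)
    ranked = sorted(range(n), key=lambda i: times[i] / best)
    classes = [0] * n
    for rank, i in enumerate(ranked):
        classes[i] = 1 + rank * num_bins // n  # equal-frequency split
    return classes

# The three trials of episode 1 in Table 4.1, split into two bins:
print(classify_trials([5, 15, 50], num_bins=2))  # -> [1, 1, 2]
```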
Once the trials have been split into discrete bins, we use them to build classifiers.
We experimented with many different types of classifiers, but settled on decision
trees; they seemed to work well for our problem as they were cheap to build and gave
accurate results. Decision trees have a feature at each node in the tree; depending on
the value of that feature, we traverse to one of the children. This continues recursively
until we reach a leaf node, which is the classification for the instance. As shown in
Figure 4.4, a node may have no successor for a particular attribute value. In this case,
a value denoting no classification is returned. We later exploit this fact to encourage
the algorithm to explore unseen features.
We use Quinlan’s simple ID3 algorithm[191] to build the classifiers from the train-
ing set. The ID3 algorithm partitions a dataset on the feature that gives the highest
information gain; that is, the highest estimated reduction in entropy resulting from
the partitioning of the dataset on a particular feature. Thus, by giving priority to
features with a higher information gain we can produce short trees with high degrees
of generalization.
For each node in the tree, the information gain of each feature is computed. The set
of instances is then partitioned by the feature with the highest expected information
gain. The algorithm then recurses on each subset until it is deemed that no further
information can be gained by partitioning.
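The information-gain computation at the heart of ID3 can be sketched as follows; this is an illustrative fragment with hypothetical names, where instances are dictionaries mapping features to values:

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def information_gain(instances, labels, feature):
    """Expected reduction in entropy from partitioning the instances
    on `feature`."""
    total = len(instances)
    base = entropy(labels)
    remainder = 0.0
    for value in set(inst[feature] for inst in instances):
        subset = [lab for inst, lab in zip(instances, labels)
                  if inst[feature] == value]
        remainder += len(subset) / total * entropy(subset)
    return base - remainder

# Two instances whose feature (v1, v2) separates the classes perfectly
# give the maximum gain of one bit:
insts = [{('v1', 'v2'): '<'}, {('v1', 'v2'): '>'}]
print(information_gain(insts, [1, 2], ('v1', 'v2')))  # -> 1.0
```

At each tree node, ID3 evaluates this quantity for every remaining feature and splits on the one with the highest value.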
[Figure: a small decision tree over pairwise-ordering features; the root tests the feature (v1,v2) and an inner node tests (v1,h), with branches labeled <, ∼, and > leading to performance-class leaves.]
Figure 4.4: An example decision tree induced from the training set data in Table 4.1 on rule 2.
Variable Ordering Construction
For each rule, the order constructor is used to generate both the best predicted order
for baseline calibration and the candidate set from which the uncertainty sampler
chooses the trial set. For each episode, we wish to run the fastest order first in order
to establish a baseline execution time for this operation. If subsequent trials begin to
take substantially longer than our baseline, we abort the computation and assign it
a large value for execution time.
To find the best predicted order, we combine the information from the variable,
attribute, and domain classifiers as follows. A performance class is assigned a score
equal to the mean run time of the members of that class. The scores from each of the
three classifiers are weighted by a factor related to the history of its accuracy. Correct
classifications increase that classifier’s weight, incorrect classifications decrease it. We
use an exponential decay function to diminish the penalties for mispredictions that
occurred in the past. This provides a feedback mechanism to automatically adjust the
weights based on the actual problem; depending on the problem, certain classifiers
may be more or less accurate than others.
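The weighting scheme can be sketched as follows; the decay constant and the accuracy encoding are illustrative choices, not values taken from bddbddb:

```python
def classifier_weight(history, decay=0.9):
    """Weight a classifier by its accuracy history, with an exponential
    decay that makes old mispredictions matter less than recent ones.
    `history` lists True/False for correct/incorrect, oldest first."""
    weight = 0.0
    norm = 0.0
    for age, correct in enumerate(reversed(history)):  # age 0 = most recent
        factor = decay ** age
        weight += factor * (1.0 if correct else 0.0)
        norm += factor
    return weight / norm if norm else 1.0

# A recent streak of correct predictions outweighs an old misprediction:
print(classifier_weight([False, True, True, True])
      > classifier_weight([True, True, True, False]))  # -> True
```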
Because there are typically far too many orders to iterate over for an operation,
we extract the features that lead to the best order directly from the classifiers. This
problem is complicated by the fact that we have three separate classifiers and the
progression of scores in the performance classes of the classifiers is non-linear. We
solve this by first computing the scores from every possible combination of classes
and then sorting the combinations by their scores. We use the sorted combinations
of classes to extract the features from each classifier and combine them. We skip
combinations that lead to constraints that cannot be combined.
The features collected so far represent a set of constraints. A simple procedure can
enumerate all the total orderings that satisfy these constraints. From this set of total
orderings we can select the best predicted order or the top n candidates to populate
the candidate set. To include feature combinations that have not been evaluated
before, nodes with no instances in the decision trees are boosted to have the highest
performance class. Thus, the candidate set will contain a mixture of features that are
known to be good and ones that we are uncertain about.
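The enumeration of total orders satisfying a set of pairwise constraints can be sketched as follows; this brute-force version, with hypothetical names and interleaving constraints omitted, is only illustrative, since the real candidate sets are built from partial orders extracted from the classifiers:

```python
from itertools import permutations

def total_orders(elements, constraints):
    """Enumerate all total orders over `elements` consistent with a set
    of pairwise precedence constraints {(a, b), ...}, each meaning that
    a must come before b. (Interleaving is ignored in this sketch.)"""
    result = []
    for perm in permutations(elements):
        pos = {e: i for i, e in enumerate(perm)}
        if all(pos[a] < pos[b] for a, b in constraints):
            result.append(perm)
    return result

# With v1 before v2 and v2 before h, only one of the six orders remains:
print(total_orders(['v1', 'v2', 'h'], {('v1', 'v2'), ('v2', 'h')}))
```

In practice the constraint sets prune the space sharply, so the surviving orders form a manageable candidate set.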
4.3.7 Uncertainty Sampler
The goal of the uncertainty sampler is to maximize information gain by trying or-
derings that carry the greatest uncertainty. While this may seem orthogonal to our
goal of optimization, it has been shown that relevance sampling[214] (selecting the
input classified as the best) produces poor classifiers as compared to uncertainty
methods[148].
For our problem, we are interested in knowing if a given variable ordering falls in
the top performing class. We refer to the top performance class as the positive class,
and the rest as the negative class. An instance is given probability 1 if it definitely
belongs to the positive class, and 0 if it certainly does not. Thus a probability of
0.5 represents maximum uncertainty. We use Bagged Probability Estimate Trees
(BPETs) to create our uncertainty sampler[39]. BPETs have been shown to produce
better estimates for unstable classifiers — classifiers in which small variations in data
produce large variations in prediction — such as ours[27].
We use Laplace correction and BPETs to create three estimators — one each for
rules, attributes, and domains — from the training set. The probability estimate for
a particular variable ordering is computed by combining the probability estimates of
each estimator. To combine the three estimators, we compute a score which is the
distance from a chosen centroid using a root-mean-squared computation:
√(((Pr − R)² + (Pa − A)² + (Pd − D)²) / 3),
where Pr, Pa, and Pd are the class probability estimates returned by our rule, attribute
and domain classifiers, and R, A, and D are the centroids of focus for our search. As
uncertainty sampling dictates, to choose the order that has the most uncertainty we
would set R, A, and D to be 0.5, and find the order with the smallest score. Thus,
a score of 0, given these centroids, indicates that all three of our estimators
predicted a class probability of 0.5.
We have found that by adjusting the centroids of this simple model appropriately,
we can direct the focus of our search: a lower centroid biases the search
toward more exploration, while a higher centroid directs the search to areas of higher
confidence. In our experiments, we used a centroid position of (0.5,0.5,1), biasing the
search toward orders that the domain estimator deems to be relatively good. As a
result, the initial trials for newly encountered rules typically follow the optimal global
scheme.
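The centroid-distance score can be computed directly from the formula above; a minimal sketch (the function name is hypothetical), using the (0.5, 0.5, 1) centroid from our experiments as the default:

```python
import math

def centroid_score(p_rule, p_attr, p_dom, centroid=(0.5, 0.5, 1.0)):
    """Root-mean-squared distance of the three class-probability
    estimates from the chosen centroid; the sampler tries the orders
    with the smallest scores first."""
    r, a, d = centroid
    return math.sqrt(((p_rule - r) ** 2
                      + (p_attr - a) ** 2
                      + (p_dom - d) ** 2) / 3)

# An order whose domain estimator is confident (1.0) while the other two
# are maximally uncertain (0.5) scores 0 under the (0.5, 0.5, 1) centroid:
print(centroid_score(0.5, 0.5, 1.0))  # -> 0.0
```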
4.3.8 Generating a New Order
At the end of the learning process, we combine the information from the trials in the
training set to generate a global set of constraints on the variable order. Global con-
straint generation is complicated by the fact that we have a distributed record of runs;
our trials span across rules of varying run-time and hence importance. Also, domi-
nant performance features for a collection of rules are not necessarily consistent. As
a result, we construct the solution to this problem as a Fuzzy Constraint Satisfaction
Problem (FCSP)[209]. A FCSP is a constraint satisfaction problem where constraints
can have different levels of importance and potentially be contradictory. Several
heuristic methods have been proposed for solving FCSPs[107]; we use a branch-and-
bound approach as it is the most straightforward.
We want to give priority to rules with longer execution times, so we generate the
fuzzy constraints as follows. For each rule, sets of constraints that lead to the high-
performing orders are generated as in Section 4.3.6. Each individual constraint is
given a score computed as its relative performance multiplied by the total execution
time of the rule. The votes for each pair of BDD domains and its different orientations
(before, after, or interleaved) are totaled globally and a fuzzy constraint for the pair
of domains is created.
At the end of the computation, we need to combine the information into a single
global total order of BDD domains. The technique is similar to that described in
Section 4.3.6. Instead of creating an ensemble classifier specific to one rule, we create
an ensemble classifier consisting of all the “rule” and “attribute” classifiers in the
system, and find the single total order that maximizes the score from that ensemble
classifier. Each variable classifier in the ensemble classifier is weighted by the total
execution time of the rule that it is associated with; in this way, constraints from
longer-running rules weigh more heavily in the global decision.
At this point, we have a total order on relation attributes. We then assign each
attribute to a physical BDD domain of the correct size, collapsing domains that
are adjacent or only separated by independent attributes. In this step, we do not
make any attempt to minimize the number of rename operations that are required.
However, a rename operation on a BDD that does not change the relative order of
the variables in the BDD is an inexpensive operation; it is basically a relabeling of
some nodes in the BDD. Rename operations that change the relative order can be
very expensive because the result is not isomorphic to the original BDD. Our method
of construction of the total order allows us to avoid some expensive renames at the
potential cost of some extra inexpensive ones.
4.4 Querying bddbddb
bddbddb includes an interactive mode where the user can type in individual rules,
relations, and queries and the bddbddb engine will solve them as they are entered.
Queries use the standard Datalog “?” syntax.
4.4.1 The Come-from Query
One of the benefits of writing program analyses in a high level language like Datalog
is that we can perform high-level transforms on the analysis specification to make
the analysis do different things. We showed an example of this in Chapter 3 when
we added context sensitivity to an existing pointer analysis by just adding an extra
context attribute and a few extra rules.
This section gives an example of an automated transformation we can do on a
Datalog program to “reverse its execution”. Given a predicate, bddbddb can compute
the set of rules and input relations that cause the given predicate to become true.
We refer to this as a come-from query. The come-from query proved highly useful in
program analysis, as it gives a trace of events that shows exactly why some analysis
fact is true; for example, it can show the exact chain of events that leads to
a security violation. It also made it much easier to understand the results we were
getting from our program analyses and in debugging our Datalog programs and input
relations.
The following is an example of a come-from query:
R(1, 2, 3) : – ?
This query returns the set of rules and input relation values that contribute to the
tuple (1, 2, 3) being added to relation R. The query can contain variables as well as
constants.
Evaluating the Come-from Query
bddbddb evaluates the come-from query by transforming the input Datalog program
into a new program that reverses its execution. Algorithm 7 shows how bddbddb
generates the reversed-execution program.
The goal of Algorithm 7 is to generate a new program with additional relations
that contain the tuples that contribute to the answers. It maps from a relation R in
the original Datalog program into a relation Rq in the new Datalog program. The
relation Rq contains all tuples in R that contribute to the final answer.
Algorithm 7 Implementing a Datalog come-from query.
Input: A Datalog program and a come-from query R(x, y, z) : – ?
Output: A complementary Datalog program that implements the come-from query
Method:
1. Create a mapping M from relations in the original Datalog program to relations
   in the new Datalog program. Initialize the mapping with one entry R ⇒ Rq,
   which maps from the relation R specified in the come-from query to a new
   relation Rq with identical attributes. The new relation Rq will contain the
   tuples in relation R that contribute to the answer.
2. Add an inference rule Rq(x, y, z) : – R(x, y, z) to the new Datalog program.
   This is used to seed the Rq relation with the matching contents of the R relation.
3. Create a worklist of relations to be processed. Insert the original come-from
   query relation R into the worklist.
4. While the worklist of relations is not empty,
   Pull a relation A from the worklist. Find its matching relation Aq in the
   relation mapping M. For each rule T in the original Datalog program where
   A appears as the head relation,
      For each relation B that appears in the body of T,
         If B ⇒ Bq is not in mapping M, add it and add B to the worklist.
      Create a new rule Tq that has the same subgoals as T, with an
      additional subgoal Aq, and a head relation AT that contains all
      variables in rule T.
      For each subgoal C(x, y, z) in rule T,
         Find C's matching relation Cq in mapping M. Create a new rule TCq
         with head relation Cq(x, y, z) and a single subgoal matching the
         head of rule T.
5. Mark all target relations in the mapping as output relations.
□
To determine the contents of Rq, we need to look at every rule in the original
Datalog program that can add tuples to R; these are the rules where R appears in
the head relation. For each of these original rules, we add a new rule that encodes
the variable assignments in the original rule that contribute to the final answer. We
achieve this by including Rq as an extra subgoal in the rule. Finally, for each subgoal
C(x, y, z) in the original rule, we add a new rule that adds the contents of the head
relation of the original rule to Cq(x, y, z). Because we store the contribution sets Rq
as normal relations, the algorithm works correctly in the presence of cycles.
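The structure of the transformation can be sketched in Python, operating on rules represented as (head, body) pairs of (relation, argument-tuple) atoms; this simplified version (hypothetical names, no real unification or variable renaming) only illustrates the shape of Algorithm 7:

```python
def come_from_program(rules, query_rel, query_args):
    """Produce the rules of the reversed-execution program for a
    come-from query on query_rel. Each atom is (relation, args)."""
    def q(rel):
        return rel + 'q'
    new_rules = []
    # Step 2: seed the query relation from the original relation.
    new_rules.append(((q(query_rel), query_args), [(query_rel, query_args)]))
    mapping = {query_rel}
    worklist = [query_rel]
    while worklist:                                   # Step 4
        a = worklist.pop()
        for i, (head, body) in enumerate(rules):
            if head[0] != a:
                continue
            for rel, _args in body:                   # extend the mapping
                if rel not in mapping:
                    mapping.add(rel)
                    worklist.append(rel)
            # Rule Tq: the original subgoals plus Aq; its head collects
            # all variables appearing in rule T.
            all_vars = tuple(sorted({v for _r, args in [head] + body
                                     for v in args}))
            t_head = ('T%dq' % i, all_vars)
            new_rules.append((t_head, body + [(q(a), head[1])]))
            # One projection rule per subgoal C, feeding Cq from Tq.
            for rel, args in body:
                new_rules.append(((q(rel), args), [t_head]))
    return new_rules

# Rules (3.1) and (3.2) of the pointer analysis, in simplified form:
rules = [(('vP', ('v', 'h')), [('vP0', ('v', 'h'))]),
         (('vP', ('v1', 'h')), [('assign', ('v1', 'v2')), ('vP', ('v2', 'h'))])]
program = come_from_program(rules, 'vP', ('v', 'h'))
```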
For example, the Datalog program generated by the come-from query
“vP(100, 200) : – ?” on the context-insensitive pointer analysis Algorithm 1 from
Section 3.1 is shown in Figure 4.5. This query returns the chain of reasoning as to
why variable 100 points to heap object 200. The first rule is added by step 2 of
the algorithm. The next three rules come from Rule (3.2), the following four from
Rule (3.4), the following two from Rule (3.1), and the final three from Rule (3.3).
The come-from query is especially useful for exploring the nature of a program
analysis. For example, a come-from query can give the user a concrete reason for a
particular analysis imprecision. It can also give example traces to show the existence
of e.g. a security vulnerability. It is also useful for debugging a Datalog program or
the input relations, as it tells you exactly why a given tuple is added to a relation.
vPq(100, 200) : – vP(100, 200).
vPq2(v1, h, v2) : – vPq(v1, h), assign(v1, v2), vP(v2, h).
assignq(v1, v2) : – vPq2(v1, h, v2).
vPq(v2, h) : – vPq2(v1, h, v2).
vPq4(v2, h2, v1, f, h1) : – vPq(v2, h2), load(v1, f, v2), vP(v1, h1), hP(h1, f, h2).
loadq(v1, f, v2) : – vPq4(v2, h2, v1, f, h1).
vPq(v1, h1) : – vPq4(v2, h2, v1, f, h1).
hPq(h1, f, h2) : – vPq4(v2, h2, v1, f, h1).
vPq1(v, h) : – vPq(v, h), vP0(v, h).
vP0q(v, h) : – vPq1(v, h).
hPq3(h1, f, h2, v1, v2) : – hPq(h1, f, h2), store(v1, f, v2), vP(v1, h1), vP(v2, h2).
storeq(v1, f, v2) : – hPq3(h1, f, h2, v1, v2).
vPq(v1, h1) : – hPq3(h1, f, h2, v1, v2).
vPq(v2, h2) : – hPq3(h1, f, h2, v1, v2).
Figure 4.5: Datalog program generated by bddbddb to compute come-from query “vP(100, 200) : – ?” on Algorithm 1. The second, third, fourth, and fifth chunks of rules come from Rules (3.2), (3.4), (3.1), and (3.3) respectively.
Chapter 5
Applications for Program Analysis
The context-sensitive pointer analysis, described in Chapter 3, gives highly accurate
pointer information, and bddbddb, described in Chapter 4, gives a powerful yet con-
venient engine for querying that information. bddbddb makes it very easy to design
new analyses based on existing ones, and others and I have used it to implement
interesting and practical analyses with relatively little effort. This chapter describes
some queries and analyses that people have implemented using the pointer analysis
information and bddbddb. We have focused on a few interesting analyses that show
off the system’s power and ease of use.
Section 5.1 describes some analysis queries and simple extensions to the algorithms
in Chapter 3. Section 5.2 describes a sophisticated analysis to analyze the use of
reflection and build a more complete call graph. It takes advantage of many kinds
of analysis information, including pointers, types, and casts. The reflection analysis
is based on joint work with Livshits[157, 158]. Section 5.3 describes a sophisticated
static analysis for finding data races in Java programs. The specification is simple
but the analysis is very powerful and efficient. The static race detection algorithm
was designed by Naik[177] and has found hundreds of real data races in popular Java
applications and libraries. Section 5.4 contains a collection of program analyses that
others have implemented using bddbddb.
5.1 Queries and Other Analyses
The algorithms in Chapter 3 generate vast amounts of results in the form of relations.
Using the same declarative programming interface, we can conveniently query the
results and extract exactly the information we are interested in. This section shows
a variety of queries and analyses that make use of pointer information and context
sensitivity.
5.1.1 Debugging a Memory Leak
Memory leaks can occur in Java when a reference to an object remains even after it
will no longer be used. One common approach to debugging memory leaks is to use a
dynamic tool that locates the allocation sites of memory-consuming objects. Suppose
that, upon reviewing the information, the programmer thinks objects allocated in line
57 in file a.java should have been freed. He may wish to know which objects may
be holding pointers to the leaked objects, and which operations may have stored the
pointers. He can consult the static analysis results by supplying the queries:
whoPointsTo57 (h, f) : – hP(h, f, “a.java:57”).
whoDunnit(c, v1, f, v2) : – store(v1, f, v2), vPc(c, v2, “a.java:57”).
The first query finds the objects and their fields that may point to objects allocated
at “a.java:57”; the second finds the store instructions, and the contexts under which
they are executed, that create the references.
5.1.2 Heap Connectivity
When using a reference-counting garbage collector, it is useful to know whether or
not an object can be involved in a cycle[17]; acyclic objects do not need to be cycle-
collected. To find cyclic objects, we simply compute a transitive closure on the hP
relation and see if an object can transitively point to itself.
reachable(h1, h2) : – hP(h1, _, h2).
reachable(h1, h3) : – reachable(h1, h2), hP(h2, _, h3).
cyclic(h1) : – reachable(h1, h2), h1 = h2.
The heap connectivity relation (reachable) can also be used to find the set of
objects that can be accessed by unknown code, or to guide partitioning strategies for
garbage collectors[116, 117, 124, 126].
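The same closure can be sketched in Python as a semi-naive fixpoint; this is an illustrative fragment with hypothetical names, ignoring the field component of hP just as the Datalog rules do:

```python
def cyclic_objects(hP):
    """Semi-naive transitive closure of the heap points-to edges
    (h1, f, h2), projecting away the field; returns the heap objects
    that can transitively point to themselves."""
    edges = {(h1, h2) for (h1, _f, h2) in hP}
    reachable = set(edges)
    delta = set(edges)
    while delta:
        # Only extend paths discovered in the previous iteration.
        new = {(a, c) for (a, b) in delta for (b2, c) in edges if b == b2}
        delta = new - reachable
        reachable |= delta
    return {a for (a, b) in reachable if a == b}

# h1 -> h2 -> h1 forms a cycle; h3 points into the cycle but is not on it:
print(sorted(cyclic_objects({('h1', 'f', 'h2'),
                             ('h2', 'g', 'h1'),
                             ('h3', 'f', 'h1')})))  # -> ['h1', 'h2']
```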
5.1.3 Finding a Security Vulnerability
The Java Cryptography Extension (JCE) is a library of cryptographic algorithms[228].
Misuse of the JCE API can lead to security vulnerabilities and a false sense of security.
For example, many operations in the JCE use a secret key that must be supplied by
the programmer. It is important that secret keys be cleared after they are used so
they cannot be recovered by attackers with access to memory. Since String objects are
immutable and cannot be cleared, secret keys should not be stored in String objects
but in an array of characters or bytes instead.
To guard against misuse, the function that accepts the secret key, PBEKeySpec.init(),
only allows arrays of characters or bytes as input. However, a programmer not versed
in security issues may have stored the key in a String object and then used a routine
in the String class to convert it to an array of characters. We can write a query to
audit programs for the presence of such idioms. Let Mret(m, v) be an input relation
specifying that variable v is the return value of method m. We define a relation
fromString(h) which indicates if the object h was directly derived from a String.
Specifically, it records the objects that are returned by a call to a method in the
String class. An invocation i to method PBEKeySpec.init() is a vulnerability if the first
argument points to an object derived from a String.
fromString(h) : – cha(“String”, _, m), Mret(m, v), vPc(_, v, h).
vuln(c, i) : – IE (i, “PBEKeySpec.init()”), actual(i, 1, v),
vPc(c, v, h), fromString(h).
Notice that this query finds not only cases where the object derived from a
String is immediately supplied to PBEKeySpec.init(), but also cases
where the object has passed through many variables and heap objects.
5.1.4 Aliased Parameters
We can easily use the points-to information to discover if parameters are aliased
under certain contexts. In many cases code can be optimized if it can be proven
that parameters are not aliased. Because the query is context-sensitive, it can be used to
automatically redirect call sites to call specialized versions of methods.
aliased(c, m, z1, z2) : – formal(m, z1, v1), formal(m, z2, v2),
z1 ≠ z2, vPc(c, v1, h), vPc(c, v2, h).
5.1.5 Type Refinement
Libraries are written to handle the most general types of objects possible, and their
full generality is typically not used in many applications. By analyzing the actual
types of objects used in an application, we can refine the types of the variables and
object fields. Type refinement can be used to reduce overheads in cast operations,
resolve virtual method calls, and gain better understanding of the program.
We say that variable v can be legally declared as t, written varSuperTypes(v, t), if
t is a supertype of the types of all the objects v can point to. The type of a variable is
refinable if the variable can be declared to have a more precise type. To compute the
super types of v, we first find varExactTypes(v, t), the types of objects pointed to by
v. We then intersect the supertypes of all the exact types to get the desired solution;
we do so in Datalog by finding the complement of the union of the complement of
the exact types.
varExactTypes(v, t) : – vPc(_, v, h), hT(h, t).
notVarType(v, t) : – varExactTypes(v, tv), ¬aT(t, tv).
varSuperTypes(v, t) : – ¬notVarType(v, t).
refinable(v, tc) : – vT(v, td), varSuperTypes(v, tc), aT(td, tc), td ≠ tc.
The above shows a context-insensitive type refinement query. We find, for each
variable, the type to which it can be refined regardless of the context. Even if the
end result is context-insensitive, it is more precise to take advantage of the context-
sensitive points-to results available to determine the exact types, as shown in the first
rule. In Section 6.1.3, we compare the accuracy of this context-insensitive query with
a context-sensitive version.
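The complement-of-union-of-complements trick is ordinary set algebra, and can be sketched directly in Python; names are hypothetical, and aT(t, tv), meaning t is a supertype of tv, is given as a set of pairs:

```python
def var_super_types(var_exact_types, aT, all_types):
    """For each variable, compute the types t that are supertypes of
    every exact type the variable may point to. The intersection of
    supertype sets is taken as the complement of a union of
    complements, mirroring the Datalog rules."""
    super_types = {}
    for v, exact in var_exact_types.items():
        # notVarType: t fails to be a supertype of some exact type of v.
        not_var_type = {t for t in all_types
                        for tv in exact if (t, tv) not in aT}
        super_types[v] = all_types - not_var_type
    return super_types

# Tiny hierarchy: Object is a supertype of A and B; every type is its
# own supertype. A variable pointing to both A and B objects can only
# be declared as Object.
all_types = {'Object', 'A', 'B'}
aT = {('Object', 'A'), ('Object', 'B'), ('Object', 'Object'),
      ('A', 'A'), ('B', 'B')}
print(var_super_types({'v': {'A', 'B'}}, aT, all_types))  # -> {'v': {'Object'}}
```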
5.1.6 Interprocedural Data Flow
By combining the points-to relations that we compute with the initial relations from
the program statements, we can construct relations that model the flow of data across
method calls and through pointer indirections.
usedef(cv2, v2, cv1, v1) : – assignc(cv1, v1, cv2, v2).
usedef(cv2, v2, cv1, v1) : – load(vb2, f, v2), vPc(cv2, vb2, hb),
store(vb1, f, v1), vPc(cv1, vb1, hb).
defuse(cv1, v1, cv2, v2) : – usedef(cv2, v2, cv1, v1).
By taking the transitive closure on the usedef or defuse relations, we can obtain
a program slice of the flow of data to or from a point. The slice of the locations
leading up to a variable is usedef*, while the slice of locations following a variable is
defuse*. These slices are useful for taint analysis[220], provenance analysis[145], and
flow analysis[88].
usedef*(cv2, v2, cv1, v1) :- usedef(cv2, v2, cv1, v1).
usedef*(cv3, v3, cv1, v1) :- usedef*(cv3, v3, cv2, v2),
                             usedef(cv2, v2, cv1, v1).
defuse*(cv1, v1, cv2, v2) :- defuse(cv1, v1, cv2, v2).
defuse*(cv1, v1, cv3, v3) :- defuse*(cv1, v1, cv2, v2),
                             defuse(cv2, v2, cv3, v3).
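Treating each (context, variable) pair as a graph node, the usedef* rules amount to an ordinary transitive closure. A naive fixpoint sketch over hypothetical toy edges:

```python
def transitive_closure(edges):
    """Iterate the usedef* rules until no new tuples appear.
    edges is a set of (node, node) pairs, where each node stands for a
    (context, variable) pair."""
    closure = set(edges)
    while True:
        new = {(a, c)
               for (a, b) in closure
               for (b2, c) in edges
               if b2 == b}
        if new <= closure:
            return closure
        closure |= new

# usedef edges: each use points to the definition that reaches it.
usedef = {(("c1", "v2"), ("c1", "v1")), (("c1", "v3"), ("c1", "v2"))}
slice_ = transitive_closure(usedef)
# v3's backward slice now also includes v1:
assert (("c1", "v3"), ("c1", "v1")) in slice_
```

The bddbddb implementation performs the same closure over BDD-encoded relations instead of explicit sets.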
5.1.7 Context-Sensitive Mod-Ref Analysis
Mod-ref analysis is used to determine what fields of what objects may be modified or
referenced by a statement or call site[146]. We can use the context-sensitive points-to
results to solve a context-sensitive version of this query. We define mV (m, v) to mean
that v is a local variable in m. The mV*c relation specifies the set of variables and
contexts of methods that are transitively reachable from a method. mV*c(c1, m, c2, v)
means that calling method m with context c1 can transitively call a method with
local variable v under context c2.
mV*c(c, m, c, v) :- mV(m, v).
mV*c(c1, m1, c3, v3) :- mI(m1, i), IEc(c1, i, c2, m2),
                        mV*c(c2, m2, c3, v3).
The first rule simply says that a method m in context c can reach its local variable.
The second rule says that if method m1 in context c1 calls method m2 in context c2,
then m1 in context c1 can also reach all variables reached by method m2 in context
c2.
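Read operationally, the two rules are a reachability fixpoint. A toy Python sketch (all relation contents hypothetical; the context universe is an explicit input because the base rule ranges over every context):

```python
def mv_star(mV, mI, IEc, contexts):
    """Fixpoint for mV*c(c1, m, c2, v) from the two rules above."""
    # Base rule: a method reaches its own locals in its own context.
    rel = {(c, m, c, v) for (m, v) in mV for c in contexts}
    changed = True
    while changed:
        changed = False
        # Inductive rule: follow context-sensitive call edges.
        for (m1, i) in mI:
            for (c1, i2, c2, m2) in IEc:
                if i2 != i:
                    continue
                for (c2b, m2b, c3, v3) in list(rel):
                    if (c2b, m2b) == (c2, m2) and (c1, m1, c3, v3) not in rel:
                        rel.add((c1, m1, c3, v3))
                        changed = True
    return rel

# Hypothetical program: f calls g at invocation site i1.
mV = {("f", "x"), ("g", "y")}
mI = {("f", "i1")}                      # mI used 2-ary, as in the rule above
IEc = {("c1", "i1", "c2", "g")}
result = mv_star(mV, mI, IEc, contexts={"c1", "c2"})
assert ("c1", "f", "c2", "y") in result  # f in c1 reaches g's local y in c2
```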
We can now define the mod and ref set of a method as follows:
mod(c, m, h, f) :- mV*c(c, m, cv, v),
                   store(v, f, _), vPc(cv, v, h).
ref(c, m, h, f) :- mV*c(c, m, cv, v),
                   load(v, f, _), vPc(cv, v, h).
The first rule says that if method m in context c can reach a variable v in context
cv, and if there is a store through that variable to field f of object h, then m in
context c can modify field f of object h. The second rule for defining the ref relations
is analogous.
5.1.8 Context-Sensitive Type Analysis
Our cloning technique can be applied to add context sensitivity to other context-
insensitive algorithms. The example we show here is the type inference of variables
and fields. By not distinguishing between instances of heap objects, this analysis
does not generate results as precise as those extracted from running the complete
context-sensitive pointer analysis as discussed in Section 5.1.5, but is much faster.
The basic type analysis is similar to 0-CFA[222]. Each variable and field in the
program has a set of concrete types that it can refer to. The sets are propagated
through calls, returns, loads, and stores. By using the path numbering scheme in
Algorithm 4, we can convert this basic analysis into one which is context-sensitive—
in essence, making the analysis into a k-CFA analysis where k is the depth of the call
graph and recursive cycles are collapsed.
Algorithm 8 Context-sensitive type analysis.
Domains
    Domains from Algorithm 5
Relations
    Relations from Algorithm 5, plus:
    vTc      (context : C, variable : V, type : T)    output
    fT       (field : F, target : T)                  output
    vTfilter (variable : V, type : T)
Rules
    vTfilter(v, t) :- vT(v, tv), aT(tv, t).                            (5.1)
    vTc(c, v, t) :- vP0(v, h), IEc(c, h, _, _), hT(h, t).              (5.2)
    vTc(cv1, v1, t) :- assignc(cv1, v1, cv2, v2), vTc(cv2, v2, t),
                       vTfilter(v1, t).                                (5.3)
    fT(f, t) :- store(_, f, v2), vTc(_, v2, t).                        (5.4)
    vTc(_, v, t) :- load(_, f, v), fT(f, t), vTfilter(v, t).
    assignc(c1, v1, c2, v2) :- IEc(c2, i, c1, m), formal(m, z, v1),
                               actual(i, z, v2).                       (5.5)
vTc: C × V × T is the context-sensitive variable type relation. vTc(c, v, t) means
that variable v in context c can refer to an object of type t. This is the analogue
of vPc in the points-to analysis.
fT : F × T is the field type relation. fT (f, t) means that field f can point to an
object of type t.
vTfilter : V × T is the type filter relation. vTfilter(v, t) means that it is type-safe to
assign an object of type t to variable v.
Rule (5.2) initializes the vTc relation based on the initial local points-to infor-
mation contained in vP0, combining it with hT to get the type and IEc to get the
context numbers. Rule (5.3) does transitive closure on the vTc relation, filtering
with vTfilter to enforce type safety. Rule (5.4) and the unnumbered rule that
follows it handle stores and loads, respectively. They differ from their counterparts
in the pointer analysis in that they do not use the base object, only the field. Rule
(5.5) models the effects of parameter passing in a context-sensitive manner.
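In isolation, rule (5.3) amounts to the following filtered propagation loop. The relation contents are hypothetical toy facts:

```python
def propagate_types(initial_vTc, assignc, vTfilter):
    """Rule (5.3): propagate vTc facts along context-sensitive
    assignments, keeping only the type-safe facts per vTfilter."""
    vTc = set(initial_vTc)
    changed = True
    while changed:
        changed = False
        for (cv1, v1, cv2, v2) in assignc:
            for (c, v, t) in list(vTc):
                if (c, v) == (cv2, v2) and (v1, t) in vTfilter \
                        and (cv1, v1, t) not in vTc:
                    vTc.add((cv1, v1, t))
                    changed = True
    return vTc

# v2 (in context c2) may hold a T; it is assigned to v1 (in context c1),
# and the assignment is type-safe, so v1 may hold a T as well.
vTc = propagate_types({("c2", "v2", "T")},
                      assignc={("c1", "v1", "c2", "v2")},
                      vTfilter={("v1", "T")})
assert ("c1", "v1", "T") in vTc
```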
5.1.9 Thread Escape Analysis
Our last example is a thread escape analysis, which determines if objects created
by one thread may be used by another. The results of the analysis can be used for
optimizations such as synchronization elimination and allocating objects in thread-
local heaps, as well as for understanding programs and checking for possible race
conditions due to missing synchronizations[62, 255]. This example illustrates how we
can vary context sensitivity to fit the needs of the analysis.
We say that an object allocated by a thread has escaped if it may be accessed by
another thread. This notion is stronger than most other formulations where an object
is said to escape if it can be reached by another thread[62, 255].
Java threads, being subclasses of java.lang.Thread, are identified by their creation
sites. In the special case where a thread creation can execute only once, a thread
can simply be named by the creation site. The thread that exists at virtual machine
startup is an example of a thread that can only be created once. A creation site
reached via different call paths or embedded in loops or recursive cycles may generate
multiple threads. To distinguish between thread instances created at the same site,
we create two thread contexts to represent two separate thread instances. If an object
created by one instance is not accessed by its clone, then it is not accessed by any
other instances created by the same call site. This scheme creates at most twice as
many contexts as there are thread creation sites.
We clone the thread run() method, once for each thread context, and place these
clones on the list of entry methods to be analyzed. Methods (transitively) invoked by
a context’s run() method all inherit the same context. A clone of a method not only
has its own cloned variables, but also its own cloned object creation sites. In this way,
objects created by separate threads are distinct from each other. We run a points-to
analysis over this slightly expanded call graph; an object created in a thread context
escapes if it is accessed by variables in another thread context.
Ht: C × H encodes the non-thread objects created by a thread. Ht(c, h) means
that a thread with context c may execute non-thread allocation site h; in other
words, there is a call path from the run() method in context c to allocation site
h.
vP0t: C × V × C × H is the set of initial inter-thread points-to relations. This
includes the points-to relations for thread creation sites and for the global object.
vP0t(c1, v, c2, h) means that thread c1 has a thread allocation site h, and v
points to the newly created thread, which is given context c2. (There are usually
two contexts assigned to each allocation site.) All global objects across all contexts are given
the same context.
vPt: C × V × C × H is the thread-sensitive version of the variable points-to relation
vPc. vPt(c1, v, c2, h) means variable v in context c1 can point to heap object h
created under context c2.
hPt: C × H × F × C × H is the thread-sensitive version of the heap points-to
relation hP . hPt(c1, h1, f, c2, h2) means that field f of heap object h1 created
under context c1 can point to heap object h2 created under context c2.
Algorithm 9 Thread-sensitive pointer analysis.
Domains
    Domains from Algorithm 5
Relations
    Relations from Algorithm 2, plus:
    Ht   (c : C, heap : H)                                    input
    vP0t (cv : C, variable : V, ch : C, heap : H)             input
    vPt  (cv : C, variable : V, ch : C, heap : H)             output
    hPt  (cb : C, base : H, field : F, ct : C, target : H)    output
Rules
    vPfilter(v, h) :- vT(v, tv), hT(h, th), aT(tv, th).                (5.6)
    vPt(c1, v, c2, h) :- vP0t(c1, v, c2, h).                           (5.7)
    vPt(c, v, c, h) :- vP0(v, h), Ht(c, h).                            (5.8)
    vPt(c2, v1, ch, h) :- assign(v1, v2), vPt(c2, v2, ch, h),
                          vPfilter(v1, h).                             (5.9)
    hPt(c1, h1, f, c2, h2) :- store(v1, f, v2), vPt(c, v1, c1, h1),
                              vPt(c, v2, c2, h2).                      (5.10)
    vPt(c, v2, c2, h2) :- load(v1, f, v2), vPt(c, v1, c1, h1),
                          hPt(c1, h1, f, c2, h2), vPfilter(v2, h2).    (5.11)
Rule (5.7) incorporates the initial points-to relations for thread creation sites.
Rule (5.8) incorporates the points-to information for non-thread creation sites, which
have the context numbers of threads that can reach the method. The other rules
are analogous to those of the context-sensitive pointer analysis, with an additional
context attribute for the heap objects.
From the analysis results, we can easily determine which objects have escaped.
An object h created by thread context c has escaped, written escaped(c, h), if it is
accessed by a different context cv. Complications involving unknown code, such as
native methods, could also be handled using this technique.
escaped(c, h) :- vPt(cv, _, c, h), cv ≠ c.
Conversely, an object h created by context c is captured, written captured(c, h),
if it has not escaped. Any captured object can be allocated on a thread-local heap.
captured(c, h) :- vPt(c, v, c, h), ¬escaped(c, h).
We can also use escape analysis to eliminate unnecessary synchronizations. We
define a relation syncs(v) indicating if the program contains a synchronization oper-
ation performed on variable v. A synchronization for variable v under context c is
necessary, written neededSyncs(c, v), if syncs(v) and v can point to an escaped object.
neededSyncs(c, v) :- syncs(v), vPt(c, v, ch, h), escaped(ch, h).
Notice that neededSyncs is context-sensitive. Thus, we can distinguish when a
synchronization is necessary only for certain threads, and generate specialized versions
of methods for those threads.
5.2 Pointer Analysis for Reflection
Reflection is a Java language feature that allows the programmer to perform opera-
tions dynamically by specifying their names[90, 106]. For example, using reflection a
programmer can load a class, create a new object, read or write to a field, or call a
method simply by specifying its name. Reflection has long been a thorn in the side of
Java static analysis tools. Without handling reflection, static analysis tools are both
incomplete because they miss parts of the program and unsound because they do not
take into account the effects of reflective calls.
Determining the target of a reflective call is obviously an undecidable problem in
general. However, many uses of reflection are stylized and therefore can potentially
be accurately analyzed[157]. This section describes how we use pointer analysis and
bddbddb to resolve reflective calls and also find specification points, which are external
sources that propagate string values to a reflective call.
Reflection and Points-to Information
The programming idiom that motivated the use of pointer analysis for resolving re-
flection is shown in Figure 5.1. This idiom consists of the following steps:
1. Obtain the name of the class for the object that needs to be created.
2. Create a Class object by calling the static method Class.forName.
3. Create the new object with a call to Class.newInstance.
4. Cast the result of the call to Class.newInstance to the necessary type in order to
use the newly created object.
When interpreting this idiom statically, we would like to “resolve” the call to
Class.newInstance in step 3 as a call to the default constructor T(). However, ana-
lyzing even this relatively simple idiom is nontrivial.
The four steps shown above can be widely separated in the code and reside in
different methods, classes, or jar libraries. The Class object obtained in step 2 may
be passed through several levels of function calls before being used in step 3. Further-
more, the Class object can be deposited in a collection to be later retrieved in step 3.
The same is true for the name of the class created in step 1 and used later in step 2.
To determine how variables className, c, o, and t defined and used in steps 1–4 may
be related, we need points-to information.
Resolution of Class.newInstance and Class.forName calls is not the only thing made
possible with pointer analysis and bddbddb — we also track Method, Field, and
Constructor objects. This allows us to correctly resolve reflective method invocations
and field accesses. Reflection is also commonly used to invoke the class constructor
1. String className = ...;
2. Class c = Class.forName(className);
3. Object o = c.newInstance();
4. T t = (T) o;
Figure 5.1: Typical use of reflection to create new objects.
of a given class via calling Class.forName with the class name as the first argument.
We use pointer information to determine potential targets of Class.forName calls and
add calls to class constructors of the appropriate classes to the call graph.
Reflection Resolution Algorithm
The algorithm for computing targets of reflective calls is naturally expressed in Dat-
alog. Below we define the Datalog rules to resolve targets of Class.newInstance and
Class.forName calls. Handling of constructors, methods, and fields proceeds similarly.
To compute reflective targets of calls to Class.newInstance, we define two Datalog
relations. Relation classObjects contains pairs 〈i, t〉 of invocation sites i ∈ I call-
ing Class.forName and types t ∈ T that may be returned from the call. We define
classObjects using the following Datalog rule:
classObjects(i, t) :- IE(i, “Class.forName”),
                      actual(i, 1, v), vPc(v, s), string2class(s, t).
The Datalog rule for classObjects reads as follows. Invocation site i returns an object
of type t if the call graph relation IE contains an edge from i to “Class.forName”,
parameter 1 of i is v, v points to s, and s is a string that represents the name of type
t.
Relation newInstanceTargets contains pairs 〈i, t〉 of invocation sites i ∈ I calling
Class.newInstance and classes t ∈ T that may be reflectively invoked by the call. The
Datalog rule to compute newInstanceTargets is:
newInstanceTargets(i, t) :- IE(i, “Class.newInstance”),
                            actual(i, 0, v), vPc(v, c),
                            vPc(vc, c), ret(ic, vc), classObjects(ic, t).
The rule reads as follows. Invocation site i returns a new object of type t if the call
graph relation IE contains an edge from i to Class.newInstance, parameter 0 of i is v,
v is aliased to a variable vc that is the return value of invocation site ic, and ic returns
type t. Targets of Class.forName calls are resolved and calls to the appropriate class
constructors are added to the invocation relation IE :
IE(i, m) :- classObjects(i, t), m = t + “.<clinit>”.
(The “+” sign indicates string concatenation.) Similarly, having computed relation
newInstanceTargets(i, t), we add these reflective call targets invoking the appropriate
type constructor to the call graph relation IE with the rule below:
IE(i, m) :- newInstanceTargets(i, t), m = t + “.<init>”.
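Read operationally, these rules are joins over a handful of relations. A toy Python sketch (all facts hypothetical, and the context columns of vPc dropped for brevity):

```python
def resolve_reflection(IE, actual, vPc, ret, string2class):
    """classObjects and newInstanceTargets as joins over toy relations."""
    classObjects = {(i, t)
                    for (i, m) in IE if m == "Class.forName"
                    for (i2, z, v) in actual if i2 == i and z == 1
                    for (v2, s) in vPc if v2 == v
                    for (s2, t) in string2class if s2 == s}
    newInstanceTargets = {(i, t)
                          for (i, m) in IE if m == "Class.newInstance"
                          for (i2, z, v) in actual if i2 == i and z == 0
                          for (v2, c) in vPc if v2 == v
                          for (vc, c2) in vPc if c2 == c       # alias of v
                          for (ic, vc2) in ret if vc2 == vc
                          for (ic2, t) in classObjects if ic2 == ic}
    return classObjects, newInstanceTargets

IE = {("i1", "Class.forName"), ("i2", "Class.newInstance")}
actual = {("i1", 1, "v1"), ("i2", 0, "v2")}
vPc = {("v1", "s1"), ("v2", "c1"), ("v3", "c1")}   # v2 and v3 are aliased
ret = {("i1", "v3")}                               # v3 = Class.forName(...)
string2class = {("s1", "T")}
cObj, nTargets = resolve_reflection(IE, actual, vPc, ret, string2class)
assert cObj == {("i1", "T")}
assert nTargets == {("i2", "T")}
```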
Handling Constructor and Other Objects
Another technique of reflective object creation is to use Class.getConstructor to
obtain a Constructor object and then call newInstance on it. We define a re-
lation constructorTypes that contains pairs 〈i, t〉 of invocation sites i ∈ I calling
Class.getConstructor and the classes t ∈ T whose constructors they return:
constructorTypes(i, t) :- IE(i, “Class.getConstructor”),
                          actual(i, 0, v), vPc(v, h), classObjects(h, t).
Once we have computed constructorTypes , we can compute more newInstanceTargets
as follows:
newInstanceTargets(i, t) :- IE(i, “Class.newInstance”),
                            actual(i, 0, v), vPc(v, c), vPc(vc, c),
                            ret(ic, vc), constructorTypes(ic, t).
This rule says that invocation site i calling “Class.newInstance” returns an object of
type t if parameter 0 of i is v, v is aliased to the return value of invocation ic which
calls “Class.getConstructor”, and the call to ic is on type t.
In a similar manner, we can add support for Class.getConstructors, along with sup-
port for reflective field and method accesses. The specification of these is straight-
forward and we do not describe them here. Our actual implementation completely
loadImpl() @ 43 InetAddress.java:1231 => java.net.Inet4AddressImpl
loadImpl() @ 43 InetAddress.java:1231 => java.net.Inet6AddressImpl
...
lookup() @ 86 AbstractCharsetProvider.java:126 => sun.nio.cs.ISO_8859_15
lookup() @ 86 AbstractCharsetProvider.java:126 => sun.nio.cs.MS1251
...
tryToLoadClass() @ 29 DataFlavor.java:64 => java.io.InputStream
...
Figure 5.2: A fragment of a specification file accepted by our system. A string
identifying a call site to Class.forName is mapped to a class name that the call may
resolve to.
models all methods in the Java Reflection API. We refer the reader to a technical
report for more details[157].
Specification Points and User-Provided Specifications
Using a pointer analysis and bddbddb also allows us to determine the provenance of
any non-constant string passed to Class.forName. The provenance of a string is in
essence a backward data slice showing the flow of data to that string. Provenance al-
lows us to compute specification points—places in the program where external sources
are read by the program from a configuration file, system properties, etc. For each
specification point, the user can provide values that may be passed into the applica-
tion.
We compute the provenance by propagating through the assignment relation
assign, aliased loads and stores, and string operations. To make the specification
points as close to external sources as possible, we perform a simple analysis of strings
to do backward propagation through string concatenation operations. For brevity, we
only list the StringBuffer.append method used by the Java compiler to expand string
concatenation operations here; other string operations work in a similar manner. The
following rules for relation leadsToForName detail provenance propagation:
leadsToForName(v, i) :- IE(i, “Class.forName”), actual(i, 1, v).
leadsToForName(v2, i) :- leadsToForName(v1, i), assign(v1, v2).
leadsToForName(v2, i) :- leadsToForName(v1, i),
                         load(v3, f, v1), vPc(v3, h3), vPc(v4, h3), store(v4, f, v2).
leadsToForName(v2, i) :- leadsToForName(v1, i), ret(i2, v1),
                         IE(i2, “StringBuffer.append”), actual(i2, 0, v2).
leadsToForName(v2, i) :- leadsToForName(v1, i), ret(i2, v1),
                         IE(i2, “StringBuffer.append”), actual(i2, 1, v2).
leadsToForName(v2, i) :- leadsToForName(v1, i), actual(i2, 0, v1),
                         IE(i2, “StringBuffer.append”), actual(i2, 1, v2).
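A minimal sketch of this backward propagation, covering only the assign rule (the load/store and StringBuffer.append rules would add more edges in exactly the same way; all facts are hypothetical):

```python
def leads_to_forname(IE, actual, assign):
    """Backward provenance: which variables can flow into the class-name
    argument of each Class.forName call. Only the assign rule is modeled."""
    rel = {(v, i)
           for (i, m) in IE if m == "Class.forName"
           for (i2, z, v) in actual if i2 == i and z == 1}
    changed = True
    while changed:
        changed = False
        for (v1, i) in list(rel):
            for (a1, a2) in assign:      # assign(v1, v2): v1 gets v2's value
                if a1 == v1 and (a2, i) not in rel:
                    rel.add((a2, i))
                    changed = True
    return rel

IE = {("i1", "Class.forName")}
actual = {("i1", 1, "v1")}
assign = {("v1", "v2"), ("v2", "v3")}    # v1 = v2; v2 = v3
rel = leads_to_forname(IE, actual, assign)
assert rel == {("v1", "i1"), ("v2", "i1"), ("v3", "i1")}
```

The endpoints of such chains that are not class-name constants are exactly the specification points described next.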
To compute the specification points necessary to resolve Class.forName calls, we
find endpoints of the leadsToForName propagation chains that are not string con-
stants that represent class names. These will often terminate in the return result
of a call to System.getProperty in the case of reading from a system property or
BufferedReader.readLine in the case of reading from a file. By specifying the possi-
ble values at that point that are appropriate for the application being analyzed, the
user can construct a complete call graph.
Our implementation accepts specification files that contain a simple textual map
of a specification point to the constant strings it can generate. A specification point
is represented by a method name, bytecode offset, and the relevant line number. An
example of a specification file is shown in Figure 5.2.
5.2.1 Reflection Resolution Using Casts
For some applications, the task of providing reflection specifications may be too heavy
a burden. Fortunately, we can leverage the type cast information present in the pro-
gram to automatically determine a conservative approximation of possible reflective
targets. Consider, for instance, the following typical code snippet:
1. Object o = c.newInstance();
2. String s = (String) o;
The cast in statement 2 post-dominates the call to Class.newInstance in statement 1.
This implies that all execution paths that pass through the call to Class.newInstance
must also go through the cast in statement 2[5]. For statement 2 not to produce a
runtime exception, o must be a subclass of String. Thus, if we make the assumption
that the code is correct and does not throw an exception, only subtypes of String
can be created as a result of the call to newInstance. More generally, if the result
of a newInstance call is always cast to type t, we say that only subtypes of t can be
instantiated at the call to newInstance.
Relying on cast operations can be unsound because a cast may fail, in which case
the code throws a ClassCastException. Thus, in order to work, our cast-based
technique relies on the correctness of cast operations.
Preparing Subtype Information
We assume a “closed-world”; that is, only classes reachable from the class path at
analysis time can be used by the application at runtime. The classes available at
analysis time are generally distributed with the application. However, occasionally,
there are classes that are generated when the application is compiled or deployed,
typically with the help of an Ant script. Therefore, we generate the set of possible
classes after deploying the application.
We preprocess all resulting classes to compute the subtyping relation
subtype(t1, t2) that determines when t1 is a subtype of t2. Preprocessing even the
smallest applications involved looking at many thousands of classes because we con-
sider all the default jars that the Java runtime system has access to. We run this
preprocessing step off-line and store the results for easy access.
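The preprocessing boils down to a reflexive-transitive closure over the direct inheritance edges. A sketch over a hypothetical class hierarchy:

```python
def compute_subtype(extends):
    """subtype(t1, t2): t1 is t2 or inherits from it, directly or not.
    extends holds the direct (subclass, superclass) edges."""
    types = {t for edge in extends for t in edge}
    sub = {(t, t) for t in types} | set(extends)   # reflexive base
    changed = True
    while changed:
        changed = False
        for (a, b) in list(sub):
            for (b2, c) in extends:
                if b2 == b and (a, c) not in sub:
                    sub.add((a, c))
                    changed = True
    return sub

extends = {("B", "A"), ("C", "B")}
sub = compute_subtype(extends)
assert ("C", "A") in sub        # transitivity
assert ("A", "A") in sub        # reflexivity
assert ("A", "C") not in sub
```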
Using Cast Information
We integrate the information about cast operations directly into the system of con-
straints expressed in Datalog. We use a Datalog relation subtype described above, a
relation cast that holds the cast operations, and a relation unresolved that holds the
unresolved calls to Class.forName. The following Datalog rule uses cast operations ap-
plied to the return result vret of a call i to Class.newInstance to constrain the possible
types tc of Class objects c returned from calls sites ic of Class.forName:
classObjects(ic, t) :- IE(i, “Class.newInstance”), actual(i, 0, v), vPc(v, c),
                       ret(i, vret), cast(_, tc, vret), subtype(t, tc),
                       unresolved(ic), vPc(vc, c), ret(ic, vc).
Information propagates both forward and backward—for example, casting the result
of a call to Class.newInstance constrains the Class object it is called upon. If the same
Class object is used in another part of the program, the type constraint derived from
the cast will be obeyed.
Problems with Using Casts
Casts are sometimes inadequate for resolving calls to Class.newInstance for the fol-
lowing reasons. First, the cast-based approach is inherently imprecise because
programs often cast the result of Class.newInstance to a very wide type such as
java.io.Serializable. This produces many potential subclasses, only some of
which are relevant in practice. Second, as our experiments show, not all calls to
Class.newInstance have post-dominating casts, as illustrated by the following exam-
ple.
Example 9 As shown in Figure 5.3, one of our benchmark applications, freetts,
places the object returned by Class.newInstance into a vector voiceDirectories (line 5).
Despite the fact that the objects are subsequently cast to type VoiceDirectory[] on
line 8, intraprocedural post-dominance is not powerful enough to take this cast into
account.
Using cast information significantly reduces the need for user-provided specifica-
tion in practice. While the version of the analysis that does not use cast information
can be made fully sound with user specification as well, we chose to only provide a
1. UniqueVector voiceDirectories = new UniqueVector();
2. for (int i = 0; i < voiceDirectoryNames.size(); i++) {
3.     Class c = Class.forName((String) voiceDirectoryNames.get(i),
4.                             true, classLoader);
5.     voiceDirectories.add(c.newInstance());
6. }
7.
8. return (VoiceDirectory[]) voiceDirectories.toArray(new
9.     VoiceDirectory[voiceDirectories.size()]);
Figure 5.3: A case in freetts where our analysis is unable to determine the type of
objects instantiated on line 5 using casts.
specification for the cast-based version. A comprehensive examination of the results
of experiments with this analysis can be found in Section 6.3.
5.3 Static Race Detection
The next application we describe is a sophisticated static analysis for finding data
races in Java programs, as developed by Naik[177]. A data race occurs when two
threads can access the same memory location without ordering constraints enforced
between them and at least one of the accesses is a write.1 Static race detection is
a difficult yet important problem. Because they are nondeterministic, data races
are notoriously difficult to catch and fix. Because of the complexity of static race
detection, most race checkers are dynamic, which means they cannot find all data
races, nor prove that a program is free of data races.
Race detection for Java is especially difficult because Java’s object-oriented nature
and the fact that locks are associated with objects mean that the analysis must track
objects precisely in order to get acceptably accurate results. We have found that
a precise pointer analysis is absolutely essential for the race detection algorithm to
be practical. The static race checker builds on the pointer analysis with k-object
sensitivity described in Section 3.4. Experimentally, we found k = 3 to be a good
tradeoff of performance versus accuracy.
1. Note that the absence of data races is neither necessary nor sufficient to ensure the absence of
multithreading bugs; some data races are innocuous and a lack of data races does not guarantee
atomicity[89].
The analysis is a five step algorithm, where a candidate set of races is identified
and then refined by successive stages. Using bddbddb made it much easier to develop
such a sophisticated analysis quickly. The problem formulation in Datalog is quite
simple. Each of the stages has a straightforward implementation in Datalog. bddbddb
is able to efficiently represent a large set of potential races and so the implementation
is efficient; it is able to scale up to very large programs with hundreds of thousands
of lines of code and millions of potential races.
The best feature of the algorithm, however, is the fact that it can generate coun-
terexample traces that show the existence of a data race. The counterexample traces
fall out directly from our technique of context sensitivity through cloning of call paths.
As each call path is explicitly and separately represented in the analysis, we can find
all call paths that can exhibit the data race by making a simple query in Datalog.
5.3.1 Race Detection Algorithm Overview
The race detection algorithm consists of the following stages:
1. original-pairs computation,
2. reachable-pairs computation,
3. aliasing-pairs computation,
4. escaping-pairs computation, and
5. unlocked-pairs computation.
The first stage generates a candidate set of pairs of memory accesses potentially
involved in a race, and the subsequent stages successively refine that initial over-
approximation. The refinement stages rely on four different types of information:
call-graph construction, pointer analysis, thread-escape analysis, and lock analysis.
We now go through each analysis stage and show its implementation in Datalog.
5.3.2 Original-Pairs Computation
A data race is defined as the situation when two threads access the same memory
location without ordering constraints and at least one of the accesses is a write. Thus,
one very rough overapproximation of the set of all data races in a Java program is
to consider all pairs of accesses to the same field as potential data races. This is safe
because Java’s strong typing guarantees that the only way to access a field is through
its name. Likewise, we consider all pairs of array accesses as potential data races.
Algorithm 10 Original pairs for static race detection.
Domains
    Domains from Algorithm 5, plus:
    Q 1048576 quad.map
Relations
    StoreQuad     (quad : Q, base : V, field : F, source : V)    input
    LoadQuad      (quad : Q, dest : V, base : V, field : F)      input
    OriginalPairs (quad1 : Q, quad2 : Q)                         output
Rules
    OriginalPairs(q1, q2) :- StoreQuad(q1, _, f, _), LoadQuad(q2, _, _, f).    (5.12)
    OriginalPairs(q1, q2) :- StoreQuad(q1, _, f, _), StoreQuad(q2, _, f, _).   (5.13)
Algorithm 10 shows how to implement this in Datalog. It uses the following
domains and relations:
Q is the domain of all statements in the program. Each statement is given a
unique number. The Q comes from Quad, the basic instruction unit in the
Joeq compiler[247].
StoreQuad : Q × V × F × V contains all of the store instructions in the program.
StoreQuad(q,v1,f,v2) says that statement q is “v1.f = v2;”. This is like the store
relation but includes the program location for reporting purposes.
LoadQuad : Q × V × V × F contains all of the load instructions in the program.
LoadQuad(q,v1,v2,f) says that statement q is “v1 = v2.f;”.
OriginalPairs : Q × Q gives the set of pairs of potential races.
Given these input relations, generating the initial candidate set of races is just
two simple Datalog rules. The first matches cases when one statement is a store and
the other is a load. The second matches cases where both statements are stores.
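The two rules correspond to two joins on the field column. A sketch over hypothetical toy quads (note that the rules, as written, also derive the degenerate store/store self pair, since they impose no q1 ≠ q2 constraint; later stages filter such pairs):

```python
def original_pairs(store_quads, load_quads):
    """Rules (5.12) and (5.13): candidate races are store/load and
    store/store pairs on the same field."""
    pairs = {(q1, q2)
             for (q1, _b1, f, _s1) in store_quads
             for (q2, _d, _b2, f2) in load_quads if f2 == f}
    pairs |= {(q1, q2)
              for (q1, _b1, f, _s1) in store_quads
              for (q2, _b2, f2, _s2) in store_quads if f2 == f}
    return pairs

stores = {("q1", "a", "f", "x")}
loads = {("q2", "y", "b", "f"), ("q3", "z", "b", "g")}
pairs = original_pairs(stores, loads)
assert pairs == {("q1", "q2"), ("q1", "q1")}   # q3 reads a different field
```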
5.3.3 Reachable-Pairs Computation
The set of pairs generated by Algorithm 10 is extremely imprecise. We can refine
it somewhat by using call graph information; statements that are not in the call
graph obviously cannot be involved in data races. Furthermore, we ignore statements in class
initializers, constructors, and finalizers as they typically lack synchronization and
almost never cause data races.
Algorithm 11 gives a technique for refining the set of possible races to only include
the methods that are reachable from the main method without going through a con-
structor. It uses the fact that a pair of accesses may be involved in a race only if each
access is reachable from a thread-spawning call site that is itself reachable from the
main method of the program. This algorithm assumes a context-sensitive call graph
relation as input, such as one generated by Algorithm 6 in Section 3.4.
MethodQuad : M × Q gives a mapping between a method and the quads contained
in that method. MethodQuad(m, q) means that method m contains quad q.
MethodContext : O × M is a computed relation that keeps track of the set of
reachable methods and their contexts. Recall that we use object-sensitive
analysis, so the context has domain O, which is the domain of object names.
MethodContext(c, m) means that method m under context c is reachable from
main without going through an object initializer.
QuadContext : O × Q is like MethodContext , but for quads. QuadContext(c, q)
means that quad q under context c is reachable under the same restrictions as
with MethodContext .
ReachablePairs : O × Q × O × Q is the result of this stage of the analysis.
It gives the set of pairs of potential races with added context information.
Algorithm 11 Reachable pairs for static race detection.
DomainsDomains from Algorithm 10
Relations
OriginalPairs (quad1 : Q, quad2 : Q) inputmI (method : M, invoke : I, name : N) inputIE c (caller : O, invoke : I, callee : O, tgt : M) inputMethodQuad (quad : M, quad : Q) inputMethodContext (context : O,method : M)QuadContext (context : O, quad : Q)ReachablePairs (c1 : O, q1 : Q, c2 : O, q2 : Q) output
Rules
MethodContext(0, "main").                                                (5.14)
MethodContext(c2, m2) :– MethodContext(c1, m1), mI(m1, i, _),
                         IEc(c1, i, c2, m2),
                         m2 ≠ "*.<init>", m2 ≠ "Thread.start".           (5.15)
MethodContext(c, m2) :– IEc(_, _, c, m1), m1 = "Thread.start",
                        m2 = "Thread.run".                               (5.16)
QuadContext(c, q) :– MethodContext(c, m), MethodQuad(m, q).              (5.17)
ReachablePairs(c1, q1, c2, q2) :– OriginalPairs(q1, q2),
                                  QuadContext(c1, q1), QuadContext(c2, q2). (5.18)
ReachablePairs(c1, q1, c2, q2) says that statement q1 under context c1 may be a
data race with statement q2 under context c2.
Rules (5.14) and (5.15) are the base case and inductive step, respectively, of finding
the set of methods and contexts that are reachable from main without going through an
object initializer. Rule (5.16) handles thread spawn sites by adding the thread entry
point to the set of reachable methods. Rule (5.17) translates between a reachable
method and the quads it contains, and Rule (5.18) computes the refined set of pairs
using OriginalPairs and the reachable set.
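To make the evaluation concrete, the fixpoint computed by Rules (5.14)–(5.17) can be sketched as a naive Datalog evaluation over explicit tuple sets. This is only an illustration: bddbddb evaluates the rules over BDD-encoded relations, and the tuple layouts and helper names below are assumptions of the sketch, not the actual implementation.

```python
# Hypothetical sketch: evaluating Rules (5.14)-(5.17) as a naive Datalog
# fixpoint over explicit tuple sets (bddbddb uses BDDs instead).
# mI = {(method, invoke, name)}, IEc = {(caller_ctx, invoke, callee_ctx, tgt)},
# method_quad = {(method, quad)}.

def reachable_contexts(mI, IEc, method_quad):
    # Rule (5.14): main is reachable in the initial context 0.
    method_context = {(0, "main")}
    changed = True
    while changed:
        changed = False
        for (c1, m1) in list(method_context):
            for (m, i, _n) in mI:
                if m != m1:
                    continue
                for (cc1, ii, c2, m2) in IEc:
                    if cc1 != c1 or ii != i:
                        continue
                    # Rule (5.15): skip object initializers and thread spawns.
                    if m2.endswith("<init>") or m2 == "Thread.start":
                        continue
                    if (c2, m2) not in method_context:
                        method_context.add((c2, m2))
                        changed = True
        # Rule (5.16): a Thread.start call makes Thread.run reachable
        # in the callee's context.
        for (_c1, _i, c, m1) in IEc:
            if m1 == "Thread.start" and (c, "Thread.run") not in method_context:
                method_context.add((c, "Thread.run"))
                changed = True
    # Rule (5.17): project reachable methods down to their quads.
    return {(c, q) for (c, m) in method_context
                   for (mm, q) in method_quad if mm == m}
```

A BDD-based solver performs the same joins as whole-relation operations rather than tuple-at-a-time loops, which is what makes the context-sensitive version tractable.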
5.3.4 Aliasing-Pairs Computation
The next stage of the algorithm prunes ReachablePairs using the fact that a pair of
accesses may only be involved in a race if they access the same location. A pair of
instance field accesses x.f and y.f, or a pair of array element accesses x[i] and y[i], can
refer to the same location only if x and y can refer to the same object. Therefore, we
can use points-to information to further refine the set of potential data race pairs.
We use the pointer analysis with k-object sensitivity as described in Section 3.4
to generate the points-to information. Algorithm 12 contains only two rules. The
first rule matches pairs of store and load instructions where the base objects are
potentially aliased. The second rule handles the same situation for pairs of two
store instructions. Note that this algorithm takes advantage of the precision of the
context-sensitive pointer information by also matching the contexts in addition to the
variables.
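As a rough illustration of the joins in Rules (5.19) and (5.20), the aliasing filter can be sketched over explicit tuples. The tuple layouts for StoreQuad, LoadQuad, and vPo below are assumptions of the sketch, not the actual relation encodings.

```python
# Hypothetical sketch of Rules (5.19) and (5.20): keep a reachable pair only
# if the two base variables may point to a common abstract object o under
# their respective contexts. Assumed layouts:
#   store_quad = {(quad, base_var, field, src_var)}
#   load_quad  = {(quad, field, base_var, dst_var)}
#   vPo        = {(context, variable, object)}

def aliasing_pairs(reachable_pairs, store_quad, load_quad, vPo):
    stores = {q: v for (q, v, _f, _src) in store_quad}
    loads = {q: v for (q, _f, v, _dst) in load_quad}
    points_to = {}
    for (c, v, o) in vPo:
        points_to.setdefault((c, v), set()).add(o)
    out = set()
    for (c1, q1, c2, q2) in reachable_pairs:
        v1 = stores.get(q1)                     # first access must be a store
        v2 = loads.get(q2) if q2 in loads else stores.get(q2)
        if v1 is None or v2 is None:
            continue
        # Contexts are matched along with variables, exploiting the
        # precision of the context-sensitive points-to results.
        if points_to.get((c1, v1), set()) & points_to.get((c2, v2), set()):
            out.add((c1, q1, c2, q2))
    return out
```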
5.3.5 Escaping-Pairs Computation
The next stage of our algorithm prunes AliasingPairs using the fact that a pair of
accesses may be involved in a race only if they access thread-shared data. We use
a simple thread-escape analysis to find the set of objects that can be accessed by
multiple threads.
escObjects : O is the set of thread-escaping objects; i.e. objects that are possibly
Algorithm 12 Aliasing pairs for static race detection.
Domains
Domains from Algorithm 10
Relations
Relations from Algorithm 10, plus:
ReachablePairs (c1 : O, q1 : Q, c2 : O, q2 : Q) input
vPo (context : O, variable : V, object : O) input
AliasingPairs (c1 : O, q1 : Q, c2 : O, q2 : Q) output
Rules
AliasingPairs(c1, q1, c2, q2) :– ReachablePairs(c1, q1, c2, q2),
                                 StoreQuad(q1, v1, _, _), LoadQuad(q2, _, v2, _),
                                 vPo(c1, v1, o), vPo(c2, v2, o).         (5.19)
AliasingPairs(c1, q1, c2, q2) :– ReachablePairs(c1, q1, c2, q2),
                                 StoreQuad(q1, v1, _, _), StoreQuad(q2, v2, _, _),
                                 vPo(c1, v1, o), vPo(c2, v2, o).         (5.20)
Algorithm 13 Escaping pairs for static race detection.
Domains
Domains from Algorithm 10
Relations
Relations from Algorithm 10, plus:
AliasingPairs (c1 : O, q1 : Q, c2 : O, q2 : Q) input
hP (base : O, field : F, target : O) input
IEc (caller : O, invoke : I, callee : O, tgt : M) input
actual (invoke : I, param : Z, var : V) input
vPo (context : O, variable : V, object : O) input
escObjects (base : O)
EscapingPairs (c1 : O, q1 : Q, c2 : O, q2 : Q) output
Rules
escObjects(o) :– hP(0, _, o).                                            (5.21)
escObjects(o) :– IEc(c, i, _, "Thread.start"), actual(i, _, v),
                 vPo(c, v, o).                                           (5.22)
escObjects(o2) :– escObjects(o1), hP(o1, _, o2).                         (5.23)
EscapingPairs(c1, q1, c2, q2) :– AliasingPairs(c1, q1, c2, q2),
                                 StoreQuad(q1, v1, _, _), LoadQuad(q2, _, v2, _),
                                 vPo(c1, v1, o), escObjects(o),
                                 vPo(c2, v2, o).                         (5.24)
EscapingPairs(c1, q1, c2, q2) :– AliasingPairs(c1, q1, c2, q2),
                                 StoreQuad(q1, v1, _, _), StoreQuad(q2, v2, _, _),
                                 vPo(c1, v1, o), escObjects(o),
                                 vPo(c2, v2, o).                         (5.25)
accessible by multiple threads. This includes all objects in static fields, all
objects passed into a thread constructor, and all objects transitively reachable
from other thread-escaping objects.
Rule (5.21) says that all objects reachable from the “global” object are escaping.
As all static fields are considered to be fields of the “global” object, this captures
all objects stored in static fields. Rule (5.22) says that all objects passed into the
thread start method are escaping. Rule (5.23) implements transitive closure — if an
object is escaping, then dereferences of that object are also escaping. Rules (5.24)
and (5.25) translate between objects and pairs of accesses, and use these to filter the
aliasing-pairs set.
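The seeding and transitive closure of Rules (5.21)–(5.23) can be sketched as follows; the constant naming the "global" object and the tuple layouts are assumptions of the sketch.

```python
# Hypothetical sketch of Rules (5.21)-(5.23): seed the escaping set with
# objects reachable from the global object and objects passed to
# Thread.start, then close transitively over the heap points-to relation hP.
# Layouts: hP = {(base, field, target)}, IEc = {(caller_ctx, invoke,
# callee_ctx, tgt)}, actual = {(invoke, param, var)}, vPo = {(ctx, var, obj)}.

GLOBAL = 0  # assumed abstract name of the "global" object

def escaping_objects(hP, IEc, actual, vPo):
    esc = set()
    # Rule (5.21): any object stored in a field of the global object escapes
    # (all static fields are fields of the global object).
    for (base, _f, o) in hP:
        if base == GLOBAL:
            esc.add(o)
    # Rule (5.22): objects passed into Thread.start escape.
    for (c, i, _cc, m) in IEc:
        if m != "Thread.start":
            continue
        for (ii, _z, v) in actual:
            if ii == i:
                esc |= {o for (cc, vv, o) in vPo if cc == c and vv == v}
    # Rule (5.23): transitive closure over heap edges.
    changed = True
    while changed:
        changed = False
        for (base, _f, o) in hP:
            if base in esc and o not in esc:
                esc.add(o)
                changed = True
    return esc
```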
5.3.6 Unlocked-Pairs Computation
Notice that none of the previous stages of the algorithm check for the presence of
locks — they simply look for pairs of accesses that may access the same location.
The final stage of our algorithm prunes the escaping-pairs set using the fact that
a pair of accesses may be involved in a race only if they are executed by a pair of
threads without holding a common lock.
The rules calculate the set of abstract objects that are locked by a given thread
at a given quad under a given context. Note that we say abstract objects. Because a
single abstract object may represent multiple concrete objects, we cannot determine
whether the object is unique and therefore the unlocked-pairs computation is unsound.
Fundamentally, the problem is that to determine whether a common lock is held
requires a must-alias analysis whereas we use a may-alias analysis. However, in our
experiments we did not find any case in which locks were held on different concrete
objects that were represented by the same abstract object in our k-object sensitive
analysis. A more complete description of the sources of unsoundness in the race
detection algorithm can be found in another publication[177].
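A tiny hypothetical illustration of this limitation (not from the thesis): when two locks are allocated at the same site, a may-alias analysis gives them the same abstract name, so lock matching can report a common lock that never exists concretely.

```python
# Hypothetical illustration: two concrete lock objects created at one
# allocation site share a single abstract name "h1", so matching abstract
# lock names (a may-alias test) wrongly suggests a common lock is held.

def abstract_name(obj, alloc_site_of):
    # A k-object-sensitive analysis names objects by allocation site.
    return alloc_site_of[id(obj)]

lock_a, lock_b = object(), object()          # created at the same site
sites = {id(lock_a): "h1", id(lock_b): "h1"}

# The analysis sees a matched abstract lock...
analysis_says_common_lock = (
    abstract_name(lock_a, sites) == abstract_name(lock_b, sites))
# ...but concretely the two threads hold different locks, so a race remains.
really_common_lock = lock_a is lock_b
```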
We also filter accesses that appear to come from the same thread, as such accesses
may be to thread-local data. As described in Section 5.1.9, by making two copies
of every thread we can determine which accesses are truly thread-local versus which
accesses can be shared with other threads that are created at the same creation site.
SyncMethod : M is the set of methods that are marked as synchronized. A synchro-
nized method locks the receiver object on method entry and unlocks on method
exit. Note that the actual implementation of the race checker also handles syn-
chronized statements; to make the presentation clearer, we did not include it
here as it requires extra information about statement nesting.
mayLocked : O × O × M × O is a computed relation of the set of abstract objects
that may be locked when a method is executed under some context in some
thread.
mayUnlocked : O × O × M × O is a computed relation of the set of abstract objects
that may be unlocked when a method is executed under some context in some
thread.
mustLocked : O × O × M × O is a computed relation of the set of abstract objects
that must be locked when a method is executed under some context in some
thread.
matched : O × O × M × O × O × M is a computed relation of pairs of methods
(with their contexts and threads) that must hold a lock on a common abstract
object.
Algorithm 14 is the most complicated of the race detection algorithm phases, but
it still can be expressed in only nine Datalog rules. Rules (5.26) and (5.27) compute
the set of methods and contexts that are reachable by each thread. Rules (5.28)
and (5.29) are the base case and inductive step, respectively, of the computation
of mayLocked , which determines what abstract objects may be locked in a method
under a given thread and context. mayLocked is negated in Rule (5.30) to obtain the
abstract objects that must not be locked in a method. Rule (5.31) adds a transitive
closure step to make this into the abstract objects that may not be locked in a method.
Negating this again in Rule (5.32) gives the objects that must be locked. Rule (5.33)
Algorithm 14 Unlocked pairs for static race detection.
Domains
Domains from Algorithm 10
Relations
Relations from Algorithm 11, plus:
EscapingPairs (c1 : O, q1 : Q, c2 : O, q2 : Q) input
SyncMethod (method : M) input
formal (method : M, param : Z, var : V) input
vPo (context : O, variable : V, object : O) input
thread (thread : O, context : O, method : M)
mayLocked (thread : O, context : O, method : M, object : O)
mayUnlocked (thread : O, context : O, method : M, object : O)
mustLocked (thread : O, context : O, method : M, object : O)
matched (t1 : O, c1 : O, m1 : M, t2 : O, c2 : O, m2 : M)
UnlockedPairs (c1 : O, q1 : Q, c2 : O, q2 : Q) output
Rules
thread(t, c, m) :– m = "Thread.run", formal(m, 0, v), vPo(c, v, t).      (5.26)
thread(t, c2, m2) :– thread(t, c1, m1), mI(m1, i, _),
                     IEc(c1, i, c2, m2), m2 ≠ "Thread.start".            (5.27)
mayLocked(t, c, m, o) :– SyncMethod(m), formal(m, 0, v),
                         vPo(c, v, o), thread(t, c, m).                  (5.28)
mayLocked(t, c2, m2, o) :– mayLocked(t, c1, m1, o), mI(m1, i, _),
                           IEc(c1, i, c2, m2), m2 ≠ "Thread.start".      (5.29)
mayUnlocked(t, c, m, o) :– ¬mayLocked(t, c, m, o).                       (5.30)
mayUnlocked(t, c2, m2, o) :– mayUnlocked(t, c1, m1, o), mI(m1, i, _),
                             IEc(c1, i, c2, m2), m2 ≠ "Thread.start".    (5.31)
mustLocked(t, c, m, o) :– mayLocked(t, c, m, o),
                          ¬mayUnlocked(t, c, m, o).                      (5.32)
matched(t1, c1, m1, t2, c2, m2) :– mustLocked(t1, c1, m1, o),
                                   mustLocked(t2, c2, m2, o).            (5.33)
UnlockedPairs(c1, q1, c2, q2) :– EscapingPairs(c1, q1, c2, q2),
                                 MethodQuad(m1, q1), MethodQuad(m2, q2),
                                 ¬matched(t1, c1, m1, t2, c2, m2),
                                 t1 ≠ t2.                                (5.34)
finds pairs of methods that must hold a lock on a common abstract object. Finally,
Rule (5.34) filters the escaping-pairs set from the previous stage, removing pairs
that are in methods that must lock the same abstract object or that come from the
same thread.
The pattern of double negation in Algorithm 14 is a general technique to implement
universal quantification ("for-all") in Datalog. It is an application of De Morgan's
law: negating a relation, performing an existential quantification (which is the
semantics of a Datalog rule), and then negating the result achieves the effect of a
universal quantification.
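The double-negation pattern of Rules (5.30)–(5.32) can be sketched over explicit sets: an object is must-locked in a method iff it is locked on every path, which is computed as "may-locked and not may-unlocked" rather than with an explicit universal quantifier. The set encoding below is an illustration, not the BDD implementation.

```python
# Hedged sketch of the double-negation pattern in Rules (5.30)-(5.32).
# universe: all abstract objects; lock_sets_per_path: for each call path to
# the method, the set of objects locked along that path (must be non-empty).

def must_locked(universe, lock_sets_per_path):
    # "may locked": exists a path that locks o.
    may_locked = set().union(*lock_sets_per_path)
    # "may unlocked": exists a path that does NOT lock o (a negation).
    may_unlocked = set().union(
        *(universe - s for s in lock_sets_per_path))
    # Second negation: locked on SOME path and unlocked on NO path.
    must = may_locked - may_unlocked
    # Equivalent direct universal quantification over paths:
    assert must == set.intersection(*lock_sets_per_path)
    return must
```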
5.3.7 Summary of Static Race Detection Results
In this section we briefly summarize some of the results of the static race detection
algorithm. A comprehensive evaluation can be found in another publication[177].
We tried the static race detection algorithm on twelve open-source multi-threaded
Java programs. The largest benchmark, Apache Derby with 646,447 lines of code,
took 26 minutes to analyze. The other programs only took a few minutes. Our
algorithm found 387 distinct bugs; this is the actual number of distinct fixes that
were needed in the source code to eliminate all harmful races. Most of the bugs have
been confirmed and fixed by the developers. Almost all of the races detected were
actually harmful races; only a handful were benign races or false positives due to
analysis inaccuracy.
5.4 Program Analyses using bddbddb
Since bddbddb was released, quite a few people have picked it up and used it
for their own program analyses or other purposes. That so many have implemented
interesting analyses with it is a tribute to the ease of use and utility of bddbddb
and of specifying program analyses in Datalog.
This section lists some of the uses of bddbddb that we are aware of.
• Livshits et al. used a static analysis with bddbddb to find Java security vul-
nerabilities such as SQL injections, cross-site scripting, and HTTP splitting
attacks[160]. In this application, accurate pointer analysis information is criti-
cal as the vulnerabilities involve the relationship between multiple objects.
• Brumley et al. used Datalog and bddbddb to implement an alias analysis for
assembly language[40]. It is able to disassemble x86 binary code, translate it
into an IR and from there into Datalog relations, which are fed to bddbddb.
• Avots et al. implemented a context-sensitive, inclusion-based, field-sensitive C
pointer analysis using bddbddb and the cloning technique described in Sec-
tion 3.3.1[16]. They propose two models: a conservative analysis that makes
very few assumptions about the input program but gives less accurate results,
and an optimistic analysis that assumes some level of type safety but can give
more accurate results. They use bddbddb to reduce the overhead of a dynamic
string-buffer overflow detector by 30-100% and statically found six format string
vulnerabilities.
• Livshits et al. used bddbddb and pointer analysis to statically approximate the
targets of reflective calls as part of call graph construction[157, 158]. This
algorithm also automatically discovered all the places where user specification
was necessary to fully resolve reflective targets. It was effective in resolving
most reflective calls without user input.
• Martin et al. developed a language called PQL (Program Query Language) that
allows programmers to specify rules of how objects should interact[163]. They
translate PQL into Datalog queries and use bddbddb and the context-sensitive
pointer analysis from Section 3.3.2 to statically find all potential matches to
the query. PQL also includes a dynamic analyzer to find matches dynami-
cally; the static analysis from bddbddb is also useful in reducing the number of
instrumentation points and thereby the overhead of the dynamic analysis.
• bddshell is a front-end to bddbddb and the Joeq compiler to make them eas-
ier to use for software vulnerability detection[141]. It simplifies the interface
and automates the multi-step process of extracting relations from the program,
running a security analysis, and querying the results.
• Naik et al. used bddbddb to implement an algorithm for static race
detection[177]. This algorithm was presented in Section 5.3. bddbddb made the
formulation of the multi-stage analysis very easy and greatly simplified the im-
plementation. Naik et al. later improved the accuracy of the analysis by adding
conditional must not aliasing[176] while using the same general algorithm and
framework.
• Kissinger et al. built a system called Lopol, which uses bddbddb to analyze
security policies for Security-Enhanced Linux[140]. They translate the 100K-
line SELinux policy into a Datalog program that is fed to bddbddb. The use of
Datalog and the query mode of bddbddb made it easy to make Lopol lightweight,
interactive, and iterative.
• Bierhoff used bddbddb to implement a taint analysis for Java strings[33]. His
implementation is only seven Datalog rules.
• Gallagher et al. used bddbddb to construct an abstract interpretation of a logic
program by building a preinterpretation[95]. By using bddbddb and the power
of the BDD data structure, they were able to scale up to much larger programs
than were previously possible.
• Zhang et al. used bddbddb to perform automatic dependence analysis to com-
pute a test plan[268]. They propagate semantic dependencies at a method-level
granularity and get better accuracy than by using the traditional object rela-
tion diagram. They use bddbddb to compute points-to and data reachability
information.
• Zhang et al. used bddbddb again in modeling library callbacks for construct-
ing accurate application call graphs[267]. They designed a data reachability
algorithm and fine-tuned it to perform well in resolving library callbacks accu-
rately. Their technique was able to significantly improve call graph accuracy
when analyzing the SpecJVM benchmarks.
• Prabhu et al. used bddbddb to develop a flow and context-sensitive combined
pointer and escape analysis for Java[190]. It works by extending the Heap
Array SSA form. The analysis is specified in only a few rules of Datalog,
and their implementation runs faster than a flow-insensitive, field-insensitive
analysis while also being 21% to 79% more accurate.
• Tomb et al. experimented with bddbddb in building a constraint-based type in-
ference algorithm based on type checking rules[236]. Their approach partially
evaluates the type checking rules with respect to the target program and gen-
erates a set of constraints in the form of a Datalog program, which can be fed
to bddbddb to solve the type inference.
• Henriksen et al. used bddbddb for type analysis of logic programs[121]. They
built a web-based tool that interfaces with various logic programming systems
including bddbddb.
• Henriksen et al. also used bddbddb in the abstract interpretation of low-level
machine programs[122]. They built an emulator for a small PIC microcontroller
in Prolog and used bddbddb to obtain constraints on and linear relations between
data registers, enabling detection of e.g. overflows.
• Hammer used bddbddb with a parallelization analysis for slicing of Java
threads[114]. By using bddbddb, the analysis was able to be context-sensitive
and scale up to larger programs than with previous techniques. The Datalog
formulation was also quite simple.
• Gallagher used bddbddb to perform program analyses and specializations based
on tree automata[96]. He used a tool to translate abstract logic programs into
Datalog and then used bddbddb to solve them.
• Ishio et al. used bddbddb in a program slicing tool that uses heuristics to stop
visiting vertices in a program dependence graph, so as to not inundate the
user with too much information[131]. They use bddbddb to generate points-to
information for the slicing tool.
• Ryan used bddbddb in a static analysis for efficiently checkpointing a running
Java virtual machine[210]. His technique is implemented in the Joeq virtual
machine[247].
In addition to the list above, a few universities have started using bddbddb in their
courses to help teach program analysis concepts or to expose students to program
analysis tools so they can build their own analyses.
Chapter 6
Experimental Results
6.1 Pointer Analysis Results
Name        Description
freetts     speech synthesis system
nfcchat     scalable, distributed chat client
jetty       HTTP Server and Servlet container
openwfe     Java workflow engine
joone       Java neural net framework
jboss       J2EE application server
jbossdep    J2EE deployer
sshdaemon   SSH daemon
pmd         Java source code analyzer
azureus     Java bittorrent client
freenet     anonymous peer-to-peer file sharing system
sshterm     SSH terminal
jgraph      mathematical graph-theory objects and algorithms
umldot      makes UML class diagrams from Java code
jbidwatch   auction site bidding, sniping, and tracking tool
columba     graphical email client with internationalization
gantt       plan projects using Gantt charts
jxplorer    ldap browser
jedit       programmer's text editor
megamek     networked BattleTech game
gruntspud   graphical CVS client
Table 6.1: Descriptions of the benchmarks we used to evaluate our pointer analyses.
CHAPTER 6. EXPERIMENTAL RESULTS 118
Name        Classes  Methods  Bytecodes   Vars  Allocs  C.S. Paths
freetts         215      723        48K     8K      3K  4 × 10^4
nfcchat         283      993        61K    11K      3K  8 × 10^6
jetty           309     1160        66K    12K      3K  9 × 10^5
openwfe         337     1215        74K    14K      4K  3 × 10^6
joone           375     1531        92K    17K      4K  1 × 10^7
jboss           348     1554       104K    17K      4K  3 × 10^8
jbossdep        431     1924       119K    21K      5K  4 × 10^8
sshdaemon       485     2053       115K    24K      5K  4 × 10^9
pmd             394     1971       140K    19K      4K  5 × 10^23
azureus         498     2714       167K    24K      5K  2 × 10^9
freenet         667     3200       210K    38K      8K  2 × 10^7
sshterm         808     4059       241K    42K      8K  5 × 10^11
jgraph         1041     5753       337K    59K     10K  1 × 10^11
umldot         1189     6505       362K    65K     11K  3 × 10^14
jbidwatch      1474     8262       489K    90K     16K  7 × 10^13
columba        2020    10574       572K   111K     19K  1 × 10^13
gantt          1834    10487       597K   117K     20K  1 × 10^13
jxplorer       1927    10702       645K   133K     22K  2 × 10^9
jedit          1788    10934       667K   124K     20K  6 × 10^7
megamek        1265     8970       668K   123K     21K  4 × 10^14
gruntspud      2277    12846       687K   145K     24K  2 × 10^9
Table 6.2: Information about the benchmarks we used to test our pointer analyses.
In this section, we present some experimental results of using bddbddb on the
pointer analysis algorithms presented in Chapter 3. We describe our testing method-
ology and benchmarks, present the analysis times, evaluate the results of the analyses,
and provide some insight on our experience of developing these analyses and the bd-
dbddb tool.
6.1.1 Methodology
The input to bddbddb is more or less the Datalog programs exactly as they are pre-
sented in this dissertation. (We added a few rules to handle return values and threads,
and added annotations for the physical domain assignments of input relations.) The
input relations were generated with the Joeq compiler infrastructure[247]. The entire
bddbddb implementation is about 10,000 lines of code. bddbddb uses the JavaBDD
library[248], an open-source library based on the BuDDy library[153]. The entire
system is available as open-source[256], and we hope that others will find it useful.
All experiments in this section were performed on a 2.2GHz Pentium 4 with Sun
JDK 1.4.2_04 running on Fedora Linux Core 1. For the context-insensitive and
context-sensitive experiments, respectively: we used initial BDD table sizes of 4M
and 12M; the tables could grow by 1M and 3M after each garbage collection; the
BDD operation cache sizes were 1M and 3M.
To test the scalability and applicability of the algorithm, we applied our technique
to 21 of the most popular Java projects on Sourceforge as of November 2003. We
simply walked down the list of 100% Java projects sorted by activity, selecting the
ones that would compile directly as standalone applications. They are all real appli-
cations with tens of thousands of users each. As far as we know, these are the largest
benchmarks ever reported for any context-sensitive Java pointer analysis. As a point
of comparison, the largest benchmark in the SpecJVM suite, javac, would rank only
13th in our list.
For each application, we chose an applicable main() method as the entry point to
the application. We included all class initializers, thread run methods, and finalizers.
We ignored null constants in the analysis—every points-to set is automatically as-
sumed to include null. Exception objects of the same type were merged. We treated
reflection and native methods as returning unknown objects. Some native methods
and special fields were modeled explicitly.
A short description of each of the benchmarks is included in Table 6.1, along with
their vital statistics in Table 6.2. The number of classes, methods, and bytecodes
were those discovered by the context-insensitive on-the-fly call graph construction
algorithm, so they include only the reachable parts of the program and the class
library.
The number of context-sensitive (C.S.) paths is for the most part correlated to
the number of methods in the program, with the exception of pmd. pmd has an
astounding 5 × 10^23 paths in the call graph, which requires 79 bits to represent. pmd
has different characteristics because it contains code generated by the parser gener-
ator JavaCC. Many machine-generated methods call the same class library routines,
leading to a particularly egregious exponential blowup. The JavaBDD library only
supports physical domains up to 63 bits; contexts numbered beyond 2^63 were merged
into a single context. The large number of paths also caused the algorithm to require
many more rule applications to reach a fix-point solution.
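The C.S. Paths column in Table 6.2 counts acyclic call paths from main, with strongly connected components collapsed so the call graph is a DAG. This count can be sketched as a simple memoized dynamic program; the graph encoding and function names below are illustrative assumptions, not part of the analysis implementation.

```python
# Hypothetical sketch: counting context-sensitive paths (the C.S. Paths
# statistic). After SCC collapsing the call graph is a DAG, so the number
# of distinct call paths from main to each method is a memoized sum over
# callers. callees maps a method to the list of methods it calls.

from functools import lru_cache

def count_paths(callees, root="main"):
    @lru_cache(maxsize=None)
    def paths_to(m):
        # Number of distinct call paths from root to m.
        if m == root:
            return 1
        return sum(paths_to(caller)
                   for caller, cs in callees.items() if m in cs)
    all_methods = set(callees) | {c for cs in callees.values() for c in cs}
    return {m: paths_to(m) for m in all_methods}
```

On a diamond-shaped graph (main calls a and b, both of which call c), c has two paths; on machine-generated code like pmd's, many callers converging on the same library routines multiply these counts into the 10^23 range.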
6.1.2 Analysis Times
We measured the analysis times and memory usage for each of the algorithms pre-
sented in Chapter 3 (Table 6.3). The algorithm with call graph discovery, in each
iteration, computes a call graph based on the points-to relations from the previous
iteration. The number of iterations taken for that algorithm is also included here.
All timings reported are wall-clock times from a cold start, and include the various
overheads for Java garbage collection, BDD garbage collection, growing the node
table, etc. The memory numbers reported are the sizes of the peak number of live
BDD nodes during the course of the algorithm. We measured peak BDD memory
usage by setting the initial table size and maximum table size increase to 1MB, and
only allowed the table to grow if the node table was more than 99% full after a garbage
collection.1
The context-insensitive analyses (Algorithms 1 and 2) are remarkably fast; the
type-filtering version was able to complete in under 45 seconds on all benchmarks. It
is interesting to notice that introducing type filtering actually improved the analysis
time and memory usage. Along with being more accurate, the points-to sets are much
smaller in the type-filtered version, leading to faster analysis times.
For Algorithm 3, the call graph discovery sometimes took over 40 iterations to
complete, but it was very effective in reducing the size of the call graph as compared
to CHA[151]. The complexity of the call graph discovery algorithm seems to vary
with the number of virtual call sites that need resolving—jedit and megamek have
many methods declared as final, but jxplorer has none, leading to more call targets
to resolve and longer analysis times.
The analysis times and memory usages of our context-sensitive pointer analysis
1 To avoid garbage collections, it is recommended to use more memory. Our timing runs use the default setting of 80%.
Table 6.3: Analysis times and peak memory usages for each of the benchmarks and
analyses. Time is in seconds and memory is in megabytes. (Columns: context-
insensitive pointers with no type filter, with type filter, and with call graph
discovery; context-sensitive pointer analysis; context-sensitive type analysis;
thread-sensitive pointer analysis.)
(Algorithm 5) were, on the whole, very reasonable. It can analyze most of the small
and medium size benchmarks in a few minutes, and it successfully finishes analyzing
even the largest benchmarks in under 19 minutes. This is rather remarkable
considering that the context-sensitive formulation is solving up to 10^14 times as many
relations as the context-insensitive version! Our scheme of numbering the contexts
consecutively allows the BDD to efficiently represent the similarities between calling
contexts. The analysis times are most directly correlated to the number of paths
in the call graph. From the experimental data presented here, it appears that the
analysis time of the context-sensitive algorithm scales approximately with O(lg^2 n)
where n is the number of paths in the call graph; more experiments are necessary to
determine if this trend persists across more programs.
The context-sensitive type analysis (Algorithm 8) is, as expected, quite a bit faster
and less memory-intensive than the context-sensitive pointer analysis. Even though
it uses the same number of contexts, it is an order of magnitude faster than the
context-sensitive pointer analysis. This is because in the type analysis the number
of objects that can be pointed to is much smaller, which greatly increases sharing
in the BDD. The thread-sensitive pointer analysis (Algorithm 9) has analysis times
and memory usages that are roughly comparable to those of the context-insensitive
pointer analysis, even though it includes thread context information. This is because
the number of thread creation sites is relatively small, and we use at most two contexts
per thread.
6.1.3 Evaluation of Results
To evaluate the accuracy of the analyses, we show the results of two specific queries:
thread escape analysis (Section 5.1.9) and type refinement (Section 5.1.5).
The results of the escape analysis are shown in Table 6.4. The first two columns
give the number of captured and escaped object creation sites, respectively. The next
two columns give the number of unneeded and needed synchronization operations.
The single-threaded benchmarks have only one escaped object: the global object from
which static variables are accessed. In the multi-threaded benchmarks, the analysis
is effective in finding 30-50% of the allocation sites to be captured, and 15-30% of
the synchronization operations to be unnecessary. These are static numbers; to fully
evaluate the results would require dynamic execution counts, which is outside of the
scope of this dissertation.
The results of the type refinement query are shown in Table 6.5. We tested the
query across six different analysis variations. From left to right, they are context-
insensitive pointer analysis without and with type filtering, context-sensitive pointer
analysis and context-sensitive type analysis with the context projected away, and
context-sensitive pointer and type analysis on the fully cloned graph. Projecting
away the context in a context-sensitive analysis makes the result context-insensitive;
however, it can still be more precise than context-insensitive analysis because of the
extra precision at the intermediate steps of the analysis. We measured the percentages
of variables that can point to multiple types and variables whose types can be refined.
Including the type filtering makes the algorithm strictly more precise. Likewise,
the context-sensitive pointer analysis is strictly more precise than both the context-
insensitive pointer analysis and the context-sensitive type analysis. We can see this
trend in the results. As the precision increases, the percentage of multi-typed variables
drops and the percentage of refinable variables increases. The context-insensitive
pointer analysis and the context-sensitive type analysis are not directly comparable;
in some cases the pointers are more precise, in other cases the context-sensitive types
are more precise.
When we do not project away the context, the context-sensitive results are re-
markably precise—the percentage of multi-typed variables is never greater than 1%
for the pointer analysis and 2% for the type analysis. Projecting away the context
loses much of the benefit of context sensitivity, but is still noticeably more precise
than using a context-insensitive analysis.
6.1.4 Experience
All the experimental results reported here are generated using bddbddb. At the early
stages of our research, we hand-coded every pointer analysis using BDD operations
                 heap objects           sync operations
Name         captured    escaped    ¬needed    needed
freetts          2349          1         43         0
nfcchat          1845       2369         52        46
jetty            2059       2408         47        89
openwfe          3275          1         57         0
joone            1640       1908         34        75
jboss            3455       2836        112       105
jbossdep         1838       2298         32        94
sshdaemon       12822      22669        468      1244
pmd              3428          1         47         0
azureus          8131       9183        226       229
freenet          5078       9737        167       309
sshterm         16118      24483        767      3642
jgraph          25588      48356       1078      5124
umldot          38930      69332       2146      8785
jbidwatch       97234     143384       2243     11438
columba        111578     174329       3334     18223
gantt          106814     156752       2377     11037
jxplorer       188192     376927       4127     18904
jedit          446896     593847       7132     36832
megamek        179221     353096       3846     22326
gruntspud      248426     497971       5902     25568
Table 6.4: Results of escape analysis.
directly and spent a considerable amount of time tuning their performance. Our
context-numbering scheme is the reason why the analysis would finish at all on even
small programs. Every one of the optimizations described in Chapter 4 was first
carried out manually. After considerable effort, megamek still took over three hours
to analyze, and jxplorer did not complete at all. The incrementalization was very
difficult to get correct, and we found a subtle bug months after the implementation
was completed. We did not incrementalize the outermost loops as it would have been
too tedious and error-prone. It was also difficult to experiment with different rule
application orders.
To get even better performance, and more importantly to make it easier to develop
new queries and analyses, we created bddbddb. We automated and extended the
optimizations we have used in our manual implementation, and implemented a few
             Context-insensitive pointers    Projected context-sensitive         Context-sensitive
             no type filter  with type filter  pointer anal.   type anal.     pointer anal.   type anal.
Name          multi  refine   multi  refine    multi  refine   multi  refine   multi  refine   multi  refine
freetts         5.1    41.1     2.3    41.6      2.0    41.9     2.5    41.3     0.1    44.4     0.3    44.0
nfcchat        12.4    36.4     8.6    37.0      8.2    37.4     8.6    36.9     0.1    45.9     0.7    45.3
jetty          12.6    36.2     7.7    37.1      7.3    37.4     7.7    37.1     0.1    45.4     0.6    44.8
openwfe        12.1    36.9     7.0    37.7      6.6    38.0     7.0    37.6     0.1    45.5     0.5    44.8
joone          11.9    37.5     6.8    38.1      6.4    38.4     6.7    38.1     0.1    45.8     0.5    45.0
jboss          13.4    37.8     7.9    38.7      7.4    39.3     7.8    38.7     0.1    47.3     0.7    46.4
jbossdep       10.2    40.3     7.4    39.5      7.0    40.0     7.5    39.4     0.2    47.6     0.8    46.6
sshdaemon      10.7    39.3     6.0    40.3      5.8    40.5     5.9    40.4     0.1    46.8     0.6    46.1
pmd             9.6    42.3     6.2    43.1      5.9    43.4     6.2    43.1     0.1    52.1     0.6    48.1
azureus        10.0    43.6     6.1    44.1      6.0    44.3     6.2    44.1     0.1    50.8     0.9    49.7
freenet        12.1    39.1     6.3    40.0      5.9    40.5     6.3    40.1     0.1    47.0     0.8    46.0
sshterm        14.7    40.8     8.9    42.0      8.5    42.5     9.0    42.1     0.6    51.3     1.6    49.9
jgraph         16.1    43.2     9.6    45.1      9.3    45.4     9.7    45.2     0.7    54.7     1.9    53.2
umldot         15.7    42.3     9.4    43.6      9.0    43.9     9.4    43.6     0.6    53.0     2.0    51.2
jbidwatch      14.9    42.3     8.6    43.4      8.2    43.7     8.6    43.4     0.6    52.0     1.7    50.5
columba        15.7    42.3     9.0    43.7      8.6    44.1     8.9    43.9     0.6    52.4     1.8    51.0
gantt          15.0    43.4     8.2    44.7      7.9    45.0     8.2    44.7     0.5    53.0     1.7    51.4
jxplorer       15.2    43.1     7.9    44.3      7.7    44.6     8.0    44.4     0.5    52.5     1.6    50.8
jedit          15.4    43.6     8.1    44.7      7.9    44.9     8.1    44.7     0.6    53.1     1.6    51.5
megamek        13.3    44.6     7.1    45.1      6.8    45.3     7.2    45.2     0.5    53.3     1.4    51.6
gruntspud      15.4    44.0     7.7    45.5      7.5    45.7     7.8    45.5     0.5    53.6     1.4    52.1

Table 6.5: Results of the type refinement query. Numbers are percentages. Columns
labeled multi and refine refer to multi-type variables and refinable-type variables,
respectively.
new ones to empirically choose the best BDD library parameters. The end result is
that code generated by bddbddb outperforms our manually tuned context-sensitive
pointer analysis by as much as an order of magnitude. Even better, we could use
bddbddb to quickly and painlessly develop new analyses that are highly efficient, such
as the type analysis in Section 5.1.8 and the thread escape analysis in Section 5.1.9.
6.2 Effectiveness of bddbddb Compilation
In this section, we measure the effectiveness of our bddbddb system and compare it
to hand-optimized BDD programs. Prior to developing the bddbddb system, we had
manually implemented and optimized three pointer analyses: a context-insensitive
pointer analysis for Java described by Berndl[29], a context-sensitive pointer analysis
based on the cloning of paths in the call graph[253], and a field-sensitive, context-
insensitive pointer analysis for C[16]. We then wrote Datalog versions of these anal-
yses which we ran using the bddbddb system.
The hand-coded Java analyses are the result of months of effort and are well-
tuned and optimized. The variable ordering and physical domain assignment have
been carefully hand-tuned to achieve the best results. Many of the rules in the hand-
coded algorithms were incrementalized. This proved to be a very tedious and error-
prone process, and we did not incrementalize the whole system as it would have been
too unwieldy. Bugs were still popping up weeks after the incrementalization was
completed. bddbddb, on the other hand, happily decomposed and incrementalized
even the largest and most complex inference rules.
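The incrementalization being automated here is essentially semi-naive evaluation. The sketch below (plain Python sets rather than BDDs, over a made-up edge relation) shows the delta-driven loop for a transitive-closure style rule, reach(x,z) :- reach(x,y), edge(y,z): only facts that are new in the previous iteration are re-joined, rather than the whole relation.

```python
# Semi-naive (incremental) fixed-point evaluation sketch. bddbddb performs
# the same decomposition with BDD operations; here sets of tuples stand in.
def transitive_closure(edges):
    reach = set(edges)
    delta = set(edges)                  # facts newly derived last iteration
    while delta:
        new = set()
        for (x, y) in delta:            # join only the delta, not all of reach
            for (y2, z) in edges:
                if y == y2 and (x, z) not in reach:
                    new.add((x, z))
        reach |= new
        delta = new                     # next round considers only new facts
    return reach

edges = {(1, 2), (2, 3), (3, 4)}
assert (1, 4) in transitive_closure(edges)
```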
Because of the unsafe nature of C, the C pointer analysis is much more compli-
cated, consisting of many more rules. For the hand-coded C pointer analysis, physical
domain assignments, domain variable orderings and the order of inferences were only
optimized to avoid significant execution slowdowns. Specification of low-level BDD
operations was an error-prone, time-consuming process. A good deal of time was
spent modifying physical domain assignments and solving errors due to the incorrect
specification of physical domains in BDD operations. Once the Datalog version of
the analysis was specified, development of the hand-coded version was discontinued,
as it was no longer worth the effort. In the experiment reported here, we compare
the hand-coded version and equivalent Datalog implementation from that time.
We also evaluate the performance of bddbddb on two additional analyses: an
analysis to find external lock objects to aid in finding data races and atomicity bugs,
and an analysis to find SQL injection vulnerabilities in Java web applications[163].
Both of these analyses build on top of the context-sensitive Java pointer analysis, and
both are fairly sophisticated analyses. We do not have hand-coded implementations
of these analyses as they would be too tedious to implement by hand.
6.2.1 Comparing Lines of Code
The first metric for comparison is in the number of lines of code in each algorithm:
Analysis                    Hand-coded   Datalog
context-insensitive Java          1975        30
context-sensitive Java            3451        33
context-insensitive C             1363       308
external lock analysis             n/a        42
SQL injection analysis             n/a        38

Table 6.6: LOC for hand-coded analyses versus lines of Datalog using bddbddb.
Specifying the analysis as Datalog reduced the size of the analysis by 4.4 times in
the case of the C analysis, to over 100 times in the case of the context-sensitive Java
analysis. The disparity between the C and Java implementations is due to the fact
that the C implementation combined many BDD operations on a single line, whereas
the Java implementation put each BDD operation on a separate line of code.
Adding a new analysis with bddbddb takes only a few lines of code versus a rewrite
of thousands of lines for a hand-coded implementation. The external lock analysis
and the SQL injection analysis are examples of this. In another example, we easily
modified the inference rules for a context-insensitive C pointer analysis to create
a context-sensitive analysis by adding an additional context attribute to existing
relations[16]. While this was an extremely simple change to make to the bddbddb
Datalog specification, such a modification would have required rewriting hundreds of
lines of low-level BDD operations in the hand-coded analysis.
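The change can be sketched as follows. The generic fixed-point solver below works over tuple sets rather than BDDs, and the relation names and contexts are invented; the point is that context sensitivity is just one extra attribute on each relation, whereas a hand-coded BDD version would need new physical domains and rewritten operations.

```python
# In Datalog the change is one extra column:
#   vP(v, h)    :- assign(v, u), vP(u, h).           (context-insensitive)
#   vP(c, v, h) :- assign(c, v, u), vP(c, u, h).     (context-sensitive)
def solve(assign, vp):
    """Least fixed point of vP(...) :- assign(...), vP(...)."""
    changed = True
    while changed:
        changed = False
        for a in set(assign):
            *key, v, u = a                       # key = [] or [context]
            for t in set(vp):
                if list(t[:-1]) == key + [u]:
                    fact = tuple(key + [v, t[-1]])
                    if fact not in vp:
                        vp.add(fact)
                        changed = True
    return vp

# Context-insensitive: v = u merges h1 and h2 from both call sites.
vp = solve({("v", "u")}, {("u", "h1"), ("u", "h2")})
assert ("v", "h1") in vp and ("v", "h2") in vp

# Context-sensitive: the same statement cloned into contexts c1 and c2
# keeps the two heap objects apart.
vp_cs = solve({("c1", "v", "u"), ("c2", "v", "u")},
              {("c1", "u", "h1"), ("c2", "u", "h2")})
assert ("c1", "v", "h1") in vp_cs and ("c1", "v", "h2") not in vp_cs
```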
6.2.2 Comparing Analysis Times
For each analysis, we compared the solve time for an incrementalized hand-coded
implementation against a bddbddb-based implementation with varying levels of opti-
mization. Analyses were performed on an AMD Opteron 150 with 4GB RAM running
RHEL 3 and Java JDK 5.0. The three bddbddb-based analyses and the hand-coded
Java pointer analysis used the open-source JavaBDD library[248], which internally
makes use of the BuDDy BDD library[153]. The hand-coded C pointer analysis
makes direct use of the BuDDy library. The Java context-insensitive analysis used an
initial node table size of 5M and an operation cache size of 750K. The Java context-
sensitive analysis and C pointer analyses both used an initial node table size of 10M
and an operation cache size of 1.5M.
Tables 6.7, 6.8 and 6.9 contain the run times of our Java context-insensitive analy-
sis, Java context-sensitive analysis, and C pointer analysis, respectively. The first two
columns give the benchmark name and description. The next column gives the solve
time in seconds for the hand-coded solver. The remaining columns give the solve time
when using bddbddb with various optimizations enabled. Each column adds a new
optimization in addition to those used in columns to the left. Under No Opts we
have all optimizations disabled. Under Incr we add incrementalization, as described
in Section 4.2.3. Under +DU we add optimizations based on definition-use chains.
Under +Dom we optimize physical domain assignments. Under +All we add the re-
maining optimizations described in Section 4.2. For the Java context-insensitive and
C pointer analyses, the +Order column shows the result of bddbddb with all opti-
mizations enabled using a variable order discovered by the learning algorithm referred
to in Section 4.3. For our C programs, we used the order learned from enscript. For
the Java programs we used the order learned from joeq. In the Java context-sensitive
case, the learning algorithm was not able to find a better order, so we omitted this
column. Entries marked with ∞ signify that the test case did not complete because
it ran out of memory.
The time spent by bddbddb to translate Datalog to optimized BDD operations is
negligible compared to the solve times, so the translation times have been omitted.
In all cases, bddbddb spent no more than a few seconds to compile the Datalog into
Name        Hand-coded   No Opts    Incr     +DU    +Dom    +All   +Order
joeq               7.3      10.0     9.4     7.9     4.8     4.5      3.6
jgraph            15.0      25.6    24.1    20.0    11.0    10.4      7.6
jbidwatch         26.3      47.4    45.8    35.4    18.6    16.8     13.0
jedit             67.0     123.5   119.9   100.0    56.4    45.7     35.7
umldot            16.6      29.0    27.4    20.2    11.6    10.9      8.4
megamek           35.8      71.3    67.1    57.0    26.8    23.0     17.4

Table 6.7: Comparison of context-insensitive Java pointer analysis runtimes. Times
are in seconds.
Name        Hand-coded   No Opts     Incr      +DU    +Dom    +All
joeq              85.3     323.3    317.8    274.7   124.7    69.7
jgraph           118.0     428.1    431.1    362.2   116.3    94.9
jbidwatch        421.1    1590.2   1533.3   1324.3   470.6   361.3
jedit            147.0     377.2    363.4    293.7   136.4   109.3
umldot           402.5    1548.3   1619.3   1362.3   456.5   332.8
megamek         1219.2         ∞        ∞   4306.5  1762.9   858.3

Table 6.8: Comparison of context-sensitive Java pointer analysis runtimes. Times are
in seconds.
Name        Hand-coded   No Opts     Incr      +DU   +Dom    +All   +Order
crafty             8.7     547.3    525.9    571.7    9.4     8.1      8.2
enscript          41.0    1175.4   1211.7   1128.4  122.3   112.6     31.5
hypermail        149.4    6263.8   6113.0   5967.1  262.0   231.3     44.2
monkey            16.9     468.9    397.7    398.7   33.1    31.3      9.6

Table 6.9: Comparison of C pointer analysis runtimes. Times are in seconds.
Name        No Opts    Incr     +DU    +Dom    +All
joeq           75.0    60.4    59.3    17.4    15.1
jgraph         64.9    51.0    51.1    13.0    12.5
jbidwatch     231.0   183.6   203.5    52.3    51.7
jedit          20.1    16.3    16.2     5.3     5.1
umldot        199.3   162.2   161.3    45.0    39.2
megamek        13.3    11.5    10.5     5.1     4.3

Table 6.10: External lock analysis runtimes. Times are in seconds.
Name            No Opts    Incr     +DU    +Dom    +All   +Order
personalblog          ∞    73.0    57.8    25.1    23.1     16.7
road2hibernate        ∞    86.4    74.8    49.2    39.7     33.4
snipsnap              ∞   227.8   211.9    98.9    84.5     55.8
roller                ∞   521.0   479.0   253.7   208.4    185.4

Table 6.11: SQL injection query results. Times are in seconds. ∞ indicates that the
analysis did not finish.
BDD operations.
The unoptimized context-insensitive Java analysis was 1.4 to 2 times slower than
the hand-coded version. Incrementalization showed a very small improvement, but by
adding def-use optimizations, we began to see a useful time reduction to 80% of the
original. Optimizing BDD domain assignments reduces the runtime to about 42% of
the original, and enabling all optimizations further reduces the runtime to about 38%
of the original. Improved variable order brought the runtime between 24% and 36%
of the unoptimized runtime. While incrementalization and def-use optimizations were
sufficient to bring the bddbddb analysis close to the hand-coded analysis runtimes, the
remaining optimizations and learned variable order combined to beat the hand-coded
solver runtime by a factor of 2.
Results for the context-sensitive Java analysis were similar to the context-
insensitive results. Unfortunately, our variable order learning algorithm was unable
to learn a better variable order for this analysis, leaving the fully optimized bddbddb
analysis about 20% faster than the hand-coded version.
In the case of the C analysis, the unoptimized bddbddb analysis was 23 to 60 times
slower than the hand-coded version. This is likely due to the relative complexity
of the Datalog in the C analysis case; optimizations were able to make significant
improvements to the execution times. Analysis times with all optimizations enabled
were roughly comparable to our hand-coded solver. As with the Java analyses, the
largest gain was due to optimized physical domain assignment. When applying the
learned variable order, bddbddb analysis runtimes were reduced even further, to fall
between 30% and 95% of the hand-coded implementation.
6.2.3 External Lock and SQL Injection Analyses
We also used bddbddb to build external lock and SQL injection detection[160] analyses
on top of the Java pointer analysis results. The runtimes for the external lock analysis
using different levels of optimization are displayed in Table 6.10. Incrementalization
reduces the analysis time to about 80% of the original time. Optimizing physical
domain assignments further reduces the analysis time to about 23% of the original.
Table 6.11 displays the runtimes of the SQL injection analysis on four web-based
applications. Without any incrementalization, the analysis fails to complete due to
memory exhaustion. However, with further optimization we see performance gains
similar to those of the external lock analysis.
6.3 Reflection Resolution Results
In this section we present an experimental evaluation of the reflection resolution
analysis presented in Section 5.2.
6.3.1 Experimental Setup
We performed our reflection experiments on a suite of six large, widely-used open-
source Java benchmark applications. These applications were selected among the
most popular Java projects available on Sourceforge. We believe that real-life ap-
plications like these are more representative of how programmers use reflection than
Benchmark    Description                  Line count   File count   Jars   Available classes
jgap         genetic algorithms package       32,961          172      9              62,727
freetts      speech synthesis system          42,993          167     19              62,821
gruntspud    graphical CVS client             80,138          378     10              63,847
jedit        graphical text editor           144,496          427      1              62,910
columba      graphical email client          149,044        1,170     35              53,689
jfreechart   chart drawing library           193,396          707      6              62,885
Total                                        643,028        3,021     80             368,879
Table 6.12: Summary of information about our benchmarks. Applications are sorted
by the number of lines of code in column 3.
synthetically created test suites or SpecJVM benchmarks, most of which avoid reflec-
tion altogether.
A summary of information about the applications is provided in Table 6.12. Notice
that the traditional lines of code size metric is somewhat misleading in the case of
applications that rely on large libraries. Many of these benchmarks depend on massive
libraries, so, while the application code may be small, the full size of the application
executed at runtime is quite large. The last column of Table 6.12 lists
the number of classes available by the time each application is deployed, including
those in the JDK.
We ran all of our experiments in this section on an AMD Opteron 150 machine
equipped with 4GB of memory running Linux. JDK version 1.4.2_08 was used. All of
the running times for our preliminary implementation were in tens of minutes, which,
although a little high, is acceptable for programs of this size. Creating subtype
information for use with cast-based analysis took well under a minute.
6.3.2 Evaluation Approach
We have implemented five different variations of our algorithms: None, Local,
Points-to, Casts, and Sound and applied them to the benchmarks described above.
None is the base version that performs no reflection resolution; Local performs a sim-
ple local analysis, as described in Section 6.3.3. Points-to and Casts are described
             None       Local               Points-to                Casts              Sound
Benchmark      T     T   FR  UR      T   FR  PR  UR       T   FR  PR  UR       T   FR  UR
jgap          27    27   19   8     28   20   1   7      28   20   4   4      89   85   4
freetts       30    30   21   9     30   21   0   9      34   25   4   5      81   75   6
gruntspud    139   139  112  27    142  115   5  22     232  191  19  22     220  208  12
jedit        156   156  137  19    161  142   3  16     159  141  11   7     210  197  12
columba      104   105   89  16    105   89   2  14     118  101  10   7     173  167   6
jfreechart   104   104   91  13    104   91   1  12     149  124  10  15     169  165   4
Table 6.13: Results of resolving Class.forName calls for different analysis versions.
in Sections 5.2 and 5.2.1, respectively.
The Sound version is augmented with a user-provided specification to make the answer
conservative. We should point out that only the Sound version provides results that
are fully sound: None essentially assumes that reflective calls have no targets. Local
only handles reflective calls that can be fully resolved within a single method. Points-
to and Casts only provide targets for reflective calls for which either string or cast
information constraining the possible targets is available and unsoundly assumes that
the rest of the calls have no targets.
Table 6.13 summarizes the results of resolving Class.forName using all five analysis
versions. Class.forName calls represent by far the most common kind of reflective
operations and we focus on them in our experimental evaluation. To reiterate the
definitions in Section 5.2, we distinguish between:
• fully resolved calls to Class.forName for which all potential targets are class name
constants,
• partially resolved calls, which have at least one class name string constant prop-
agating to them, and
• unresolved calls, which have no class name string constants propagating to them,
only non-constant external sources requiring a specification.
The columns subdivide the total number of calls (T) into fully resolved calls (FR),
partially resolved (PR), and unresolved (UR) calls. In the case of Local analysis,
there are no partially resolved calls — calls are either fully resolved to constant
strings or unresolved. Similarly, in the case of Sound analysis, all calls are either
fully resolved or unresolved, as further explained in Section 6.3.5.
6.3.3 Local Analysis for Reflection Resolution (Local)
To provide a baseline for comparison, we implemented a local intra-method analysis
that identifies string constants passed to Class.forName. This analysis catches only
those reflective calls that can be resolved completely within a single method. Because
this technique does not use interprocedural points-to results, it cannot be used for
identification of specification points. Furthermore, because for method invocations
and field accesses the names of the method or field are typically not locally defined
constants, we do not perform resolution of method calls and field accesses in Local.
A significant percentage of Class.forName calls can be fully resolved by local anal-
ysis, as demonstrated by the numbers in column 4, Table 6.13. This is partly due to
the fact that it is quite common to call Class.forName with a constant string parameter
for the side-effects of the call, because doing so invokes the class constructor. Another
common idiom contributing to the number of calls resolved by local analysis is T.class,
which is converted to a call to Class.forName and is always statically resolved.
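A minimal sketch of such a local analysis is shown below. It is not the thesis implementation: it pattern-matches Java source text with regular expressions, whereas the real analysis works on compiler IR, and the method body is invented. It recognizes only Class.forName calls whose argument is a string literal in the same method, classifying everything else as unresolved.

```python
# Local (intra-method) Class.forName resolver sketch over source strings.
import re

CALL = re.compile(r'Class\.forName\(([^)]*)\)')
LITERAL = re.compile(r'^"([\w.]+)"$')

def resolve_locally(method_body):
    resolved, unresolved = [], 0
    for arg in CALL.findall(method_body):
        m = LITERAL.match(arg.strip())
        if m:
            resolved.append(m.group(1))   # fully resolved to a constant
        else:
            unresolved += 1               # non-constant: needs global info
    return resolved, unresolved

body = '''
    Class.forName("java.sql.Driver");     // constant: resolvable locally
    Class.forName(className);             // parameter: not resolvable
'''
assert resolve_locally(body) == (["java.sql.Driver"], 1)
```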
6.3.4 Points-to Information for Reflection Resolution
(Points-to)
Points-to information is used to find targets of reflective calls to Class.forName,
Class.newInstance, Method.invoke, etc. As can be seen from Table 6.13, for all of the
benchmarks, Points-to information results in more resolved Class.forName calls com-
pared to Local.
Specification Points
Quite frequently, some sort of specification is required for reflective calls to be fully
resolved. Points-to information allows us to provide the user with a list of specification
 1. private static Object newInstance(String className,
 2.     ClassLoader classLoader) throws ConfigurationError {
 3.   try {
 4.     Class spiClass;
 5.     if (classLoader == null) {
 6.       spiClass = Class.forName(className);
 7.     }
 8.     ...
 9.     return spiClass.newInstance();
10.   } catch (...)
11.   ...
12. }

Figure 6.1: Reflection resolution using points-to results in
javax.xml.transform.FactoryFinder in the JDK.
points where inputs need to be specified for a conservative answer to be obtained.
Among the specification points we have encountered in our experiments, calls to
System.getProperty to retrieve a system variable and calls to BufferedReader.readLine
to read a line from a file are quite common. Below we provide a typical example of
providing a specification and the importance of pointer analysis.
Example 14 This example describes resolving reflective targets of a call to
Class.newInstance in javax.xml.transform.FactoryFinder in the JDK in order to illus-
trate the power and limitation of using points-to information. Class FactoryFinder
has a method newInstance, shown in Figure 6.1. The call to Class.newInstance
occurs on line 9. However, the exact class instantiated at runtime depends on the
className parameter, which is passed into this function. This function is invoked
from a variety of places with the className parameter being read from initialization
properties files, the console, etc. In only one case, when Class.newInstance is called
from another function find located in another file, is the className parameter a string
constant.
This example makes the power of using points-to information apparent — the
Class.newInstance target corresponding to the string constant is often difficult to find
by just looking at the code. The relevant string constant was passed down through
several levels of method calls located in a different file; it took us more than five
minutes of exploration with a powerful code browsing tool to find this case in the
source. Resolving this Class.newInstance call also requires the user to provide input for
four specification points: along with a constant class name, our analysis identifies two
specification points, which correspond to file reads, one access of system properties,
and another read from a hash table. □
In most cases, the majority of calls to Class.forName are fully resolved. However,
a small number of unresolved calls are potentially responsible for a large number of
specification points the user has to provide. For Points-to, the average number of
specification points per invocation site ranges from 3 for freetts to 9 for gruntspud.
However, for jedit, the average number of specification points is 422. Specification
points computed by the pointer analysis-based approach can be thought of as “hints”
to the user as to where to provide specifications.
In most cases, the user is likely to provide specification at program input points
where he knows what the input strings may be. This is because at a reflective call
it may be difficult to tell what all the constant class names that flow to it may be,
as illustrated by Example 14. Generally, however, the user has a choice. For problematic
reflective calls like those in jedit that produce a high number of specification
points, a better strategy for the user may be to provide reflective specifications at the
Class.forName calls themselves instead of laboriously going through all the specification
points.
6.3.5 Casts for Reflection Resolution (Casts)
Type casts often provide a good first static approximation to what objects can be
created at a given reflective creation site. There is a significant increase in the
number of Class.forName calls reported in Table 6.13 in a few cases, including 93 newly
discovered Class.forName calls in gruntspud that appear due to a bigger call graph when
reflective calls are resolved. In all cases, the majority of Class.forName calls have their
targets at least partially resolved. In fact, as many as 95% of calls are resolved in the
case of jedit.
As our experience with the Java reflection APIs would suggest, most
Class.newInstance calls are post-dominated by a cast operation, often located within
only a few lines of code of the Class.newInstance call. However, in our experiments,
we have identified a number of Class.newInstance call sites, which were not dominated
by a cast of any sort and therefore the return result of Class.newInstance could not be
constrained in any way. As it turns out, most of these unconstrained Class.newInstance
call sites are located in the JDK and sun.∗ sources, Apache libraries, etc. Very few
were found in application code.
The high number of unresolved calls in the JDK is due to the fact that reflection
use in libraries tends to be highly generic and it is common to have “Class.newInstance
wrappers” — methods that accept a class name as a string and return an object of
that class, which is later cast to an appropriate type in the caller method. Since
we rely on intraprocedural post-dominance, resolving these calls is beyond our scope.
However, since such “wrapper” methods are typically called from multiple invocation
sites and different sites can resolve to different types, it is unlikely that a precise
approximation of the object type returned by Class.newInstance is possible in these
cases at all.
Precision of Cast Information
Many reflective object creation sites are located in the JDK itself and are present
in all applications we have analyzed. For example, method lookup in package
sun.nio.cs.AbstractCharsetProvider reflectively creates a subclass of Charset and there
are 53 different character sets defined in the system. In this case, the answer is precise
because all of these charsets can conceivably be used depending on the application
execution environment. In many cases, the cast approach is able to uniquely pinpoint
the target of Class.newInstance calls based on cast information. For example, there is
only one subclass of class sun.awt.shell.ShellFolderManager available to gruntspud, so,
in order for the cast to succeed, it must be instantiated.
In general, however, the cast-based approach provides an imprecise upper bound
on the call graph that needs to be analyzed. Because the results of Class.newInstance
are occasionally cast to very wide types, such as java.lang.Cloneable, many potential
subclasses can be instantiated at the Class.newInstance call site. The cast-based ap-
proach is likely to yield more precise results on applications that use Java generics,
because those applications tend to use more narrow types when performing type casts.
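The cast-based bound can be sketched as follows; the class hierarchy here is invented for illustration, loosely echoing the Charset example above. A narrow cast pins the reflective targets down to a few subtypes, while a wide cast admits many spurious ones.

```python
# Sketch: a cast (T) post-dominating Class.newInstance bounds the possible
# targets by the concrete subtypes of T. Hypothetical hierarchy below.
parents = {  # class -> direct supertypes
    "UTF_8": ["Charset"], "ISO_8859_1": ["Charset"],
    "Charset": ["Object"], "ArrayList": ["Object"], "Date": ["Object"],
}

def is_subtype(cls, sup):
    return cls == sup or any(is_subtype(p, sup) for p in parents.get(cls, []))

def targets_for_cast(cast_type, concrete_classes):
    """Concrete classes a newInstance call cast to cast_type may create."""
    return {c for c in concrete_classes if is_subtype(c, cast_type)}

concrete = ["UTF_8", "ISO_8859_1", "ArrayList", "Date"]
# Narrow cast: tight bound on the reflective targets.
assert targets_for_cast("Charset", concrete) == {"UTF_8", "ISO_8859_1"}
# Very wide cast: every concrete class qualifies, an imprecise upper bound.
assert targets_for_cast("Object", concrete) == set(concrete)
```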
6.3.6 Achieving a Sound Call Graph Approximation (Sound)
Providing a specification for unresolved reflective calls allows us to achieve a sound
approximation of the call graph. In order to estimate the amount of effort required
to come up with a specification for unresolved reflective calls, we decided to start
with Points-to and add a reflection specification. Because providing a specification
allows us to discover more of the call graph, two or three rounds of specification were
required in practice as new portions of the program became available. In practice, we
would start without a specification and examine all unresolved calls and specification
points corresponding to them. Then we would come up with a specification and feed
it back to the call graph construction algorithm until the process converges.
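The rounds can be sketched as a small fixed-point loop. The program and the specification below are invented; the point is that applying the specification exposes new code, which may itself contain reflective calls needing specification, until the reachable set stops growing.

```python
# Iterative call-graph construction with a reflection specification.
# Each method lists its direct callees; reflective edges come from a
# user-supplied specification (both maps are hypothetical).
direct_calls = {"main": ["init"], "init": [], "plugin_load": [],
                "pluginA.start": ["plugin_load"], "pluginB.start": []}
reflection_spec = {"init": ["pluginA.start"],          # visible in round 1
                   "plugin_load": ["pluginB.start"]}   # only after plugin_load appears

def build_call_graph(root):
    reachable, rounds = {root}, 0
    while True:
        rounds += 1
        frontier = set()
        for m in reachable:
            frontier.update(direct_calls.get(m, []))
            frontier.update(reflection_spec.get(m, []))  # apply the spec
        if frontier <= reachable:        # converged: no new methods found
            return reachable, rounds
        reachable |= frontier

reachable, rounds = build_call_graph("main")
assert "pluginB.start" in reachable   # found only in a later round
assert rounds >= 2
```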
Coming up with a specification is a difficult and error-prone task that requires
looking at a large amount of source code. It took us about fifteen hours to incremen-
tally devise an appropriate specification and ensure its completeness by rerunning the
call graph construction algorithm. After providing a reflection specification starting
with Points-to, we then estimated how much of the user-provided specification could
be avoided if we were to rely on type casts instead.
6.3.7 Effect of Reflection Resolution on Call Graph Size
Table 6.14 compares the number of classes and methods across different analysis
versions. Local analysis does not have any significant effect on the number of methods
or classes in the call graph, even though most of the calls to Class.forName can be
resolved with local analysis. This is because the vast majority of these
calls use the T.class idiom, which typically refers to classes that
are already within the call graph. While these trivial calls are easy to resolve, it is
the analysis of the other “hard” calls with a lot of potential targets that leads to a
substantial increase in the call graph size.
Classes
Program        None    Local  Points-to    Casts    Sound  Sound/None
jgap            264      264        268      276    1,569        5.94
freetts         309      309        309      351    1,415        4.58
gruntspud     1,258    1,258      1,275    2,442    2,784        2.21
jedit         1,660    1,661      1,726    1,762    2,754        1.66
columba         961      962        966    1,151    2,339        2.43
jfreechart      884      881        886    1,560    2,340        2.65

Methods
Benchmark      None    Local  Points-to    Casts    Sound  Sound/None
jgap          1,013    1,014      1,038    1,075    6,676        6.58
freetts       1,357    1,358      1,358    1,545    5,499        4.05
gruntspud     7,321    7,321      7,448   14,164   14,368        1.96
jedit        11,230   11,231     11,523   11,503   16,003        1.43
columba       5,636    5,642      5,652    6,199   12,001        2.13
jfreechart    5,374    5,374      5,392    8,375   12,111        2.25

Table 6.14: Number of classes and methods in the call graph for different analysis
versions. The last column gives the ratio of Sound to None.
Using Points-to increases the number of classes and methods in the call graph
only moderately. The biggest increase in the number of methods occurs for jedit
(293 methods). Using Casts leads to significantly bigger call graphs, especially for
gruntspud, where the increase in the number of methods compared to None is almost
two-fold.
The most noticeable increase in call graph size is observed in version Sound.
Compared to None, the average increase in the number of classes is 3.2 times the
original and the average increase for the number of methods is 3 times the original.
The biggest increase in the number of methods occurs in gruntspud, with over 7,000
extra methods added to the graph.
Table 6.14 also demonstrates that the lines of code metric is not always indicative
of the size of the final call graph — programs are listed in the increasing order of line
counts, yet, jedit and gruntspud are clearly the biggest benchmarks if we consider the
method count. This can be attributed to the use of large libraries that ship with the
application in binary form as well as considering a much larger portion of the JDK
in version Sound compared to version None.
6.4 Results from Machine Learning BDD Variable
Orders
This section presents some experimental results from using the machine learning
algorithm in Section 4.3 to discover variable orderings for various BDD program
analyses.
6.4.1 Methodology
To evaluate the effectiveness of our technique, we used the active learning algorithm
to discover variable orders for a variety of analyses. We then evaluated the quality of
the resulting orders.
Name        Description                 Rules   Relations   Domains   Number of orders
j pa        Java context-insensitive        8          20        12        2.8 × 10^10
j pacs      Java context-sensitive         12          25        14        2.3 × 10^14
j paos      Java object-sensitive           9          20        15        5.3 × 10^15
sqlinject   SQL injection query            20          19        17        3.9 × 10^18
c pa        C context-insensitive         222          85        13        5.3 × 10^11
c pacs      C context-sensitive           249         112        14        1.0 × 10^13

Table 6.15: Information about the analyses that we used to evaluate our BDD order-
finding algorithm. The four columns of numbers are the number of rules, relations,
and domains in the input Datalog file, and the number of possible domain orders.
Table 6.15 shows the list of analyses that we used to evaluate the effectiveness of
our algorithm. Many analyses written for bddbddb are simple and execute quickly, so
we focused on the ones we have found to be the most time-consuming, requiring the
computation of a fixed-point solution. These analyses are the ones that can benefit
the most from improving the variable order.
Although some of the analyses may appear from their descriptions to compute
similar information, the analyses listed have vastly different performance characteristics.
In the context-sensitive analyses, the extra context information completely
dwarfs the rest of the analysis, giving them a very different performance profile
from their context-insensitive counterparts. Likewise, the C analyses are very
different from the Java analyses. The C analyses are far more complex, having to
deal with internal pointers, pointer arithmetic, pointers to stack variables, etc.; this
extra complexity is evidenced by the fact that the C analyses have 20 times as many
Datalog rules as the Java versions.
The context-sensitive C pointer analysis gives our algorithm the most difficulty
out of any of the analyses. This is because the huge number of rules and relations
leads to an extremely large search space. Good variable orders within the space are
rare, and optimal orders for one rule often conflict with those of other rules.
All experiments were performed on an AMD Opteron 150 with 4 GB of RAM
running RHEL v3. bddbddb uses the JavaBDD library[248], which is based on the
BuDDy BDD library[153]. For all experiments, we used an initial node table size of
10M and a cache size of 2M. We used our hand-tuned orders as the order to generate
the input relations for learning. The initial relations for the Java analyses were
generated by the Joeq compiler infrastructure[247]; the C analyses used the SUIF
compiler[258]. We trained the Java analyses using the input program joeq; for the
C analyses, we trained on enscript.
We used the following parameter values:
• minimum run time for an operation to be considered: 100ms/10ms²
• minimum number of orders to evaluate per operation: 10
• proportion of high performance classes to add to candidate set: 1/2
• number of candidate orders for evaluation: 500
• location of centroid (see Section 4.3.7): (0.5,0.5,1)
• uncertainty score for deciding if a trial should be performed: 0.25
• delay past best time before killing a trial: 10s
²For the C analyses, we used a value of 10ms. For all others, we used 100ms.
These parameter values are not critical to the results of the algorithm. They
follow from intuitive reasoning and we were able to set them easily through minimal
empirical evaluation. For instance, the minimum number of orders to evaluate
per operation dictates that we make predictions only after seeing a sufficient sample
size. Further, we have found the algorithm to be robust to variations in these parameters.
Both of the C analyses have a large number of rules that execute many times very
quickly. We found that with the minimum time threshold at 100ms, too few rules
were being learned, leading to poor performance because the learning algorithm did
not have any information about the missing rules. For these two analyses, we lowered
the threshold to 10ms so that more operations would be evaluated by the learning
algorithm.
6.4.2 Results
Name       Random  Sifting  Hand-tuned  Active Learning   Eps  Trials  Unique Orders  Learning time
j pa         379s     107s         11s              9s    132     636             97            38m
j pacs          ∞        ∞        211s            209s   1006    2554           1294        4h 14m
j paos          ∞        ∞         53s             51s    160    1301            819        1h 57m
sqlinject       ∞        ∞         39s             33s    213    1450           1101        2h 15m
c pa          30s    1570s         19s              4s    238    1901           1714         3h 7m
c pacs          ∞        ∞        220s             65s    516    2784           2601        6h 47m

Table 6.16: The results of our learning algorithm. The first four columns of numbers compare the speed of a random order, an order generated with a sifting algorithm, our best hand-tuned order, and the order output by the algorithm. ∞ means that the analysis did not complete because it ran out of memory. The next four columns give statistics on the performance of the learning algorithm.
Table 6.16 contains the results of our learning algorithm. We compared the exe-
cution times to those of a randomly-generated order, an order generated by a sifting
algorithm, and to our best known, hand-tuned variable order. The random order
was very poor and could only complete in two of the cases. The sifting algorithm, a
traditional dynamic BDD variable order optimization technique[205], also performed
poorly.
                 j pa                  j pacs                j paos
Name       Hand-tuned  Learned  Hand-tuned  Learned  Hand-tuned  Learned
umldot            19s      13s        216s     214s         78s      73s
jgraph            17s      12s        312s     300s         48s      45s
jxplorer          38s      32s        581s     574s         26s      25s
azureus            5s       4s         48s      47s          9s       8s
l2jserver          9s       7s        148s     144s         35s      32s
sshdaemon          5s       4s         41s      39s          9s       8s

Table 6.17: A comparison of the run times of our hand-tuned and generated orders for the j pa, j pacs and j paos analyses on several different input programs.
The hand-tuned variable orders were the best orders we knew about before running
our learning algorithm. The fact that our algorithm was in all cases able to find an
order that was competitive with or soundly beat our “best” hand-tuned variable order
without any kind of seeding or user direction is heartening. This seems to indicate
that our algorithm is not getting stuck in local minima and is effective in finding a
good order.
In the cases of j pa and j pacs, we had actually run a mostly-exhaustive search
of all possible variable orders to obtain our hand-tuned orders. The searches took
many days to complete. When our learning algorithm did not return the order we
had found by exhaustive search, we were initially dismayed. However, after we tried
the orders from the learning algorithm we found that one of them was actually faster
than the supposed “best”! Investigating more closely, we realized that we had run
the exhaustive search on an older machine with a smaller operation cache size and
an older version of the BDD library. The combination of a newer machine, a newer
version of the BDD library, and the increase in operation cache size combined to
change the best order for the analysis.
Table 6.17 shows the generalization of the generated orders over other input pro-
grams. The algorithm proves to generalize well onto other input programs as the
generated orders show performance gains in all instances.
Chapter 7
Related Work
Research (or at least good research) does not occur in a vacuum. To achieve some-
thing truly great, we need others to challenge and inspire us. Many of the ideas in
this dissertation were inspired by related work by other researchers. This chapter
compares and contrasts the material presented in this dissertation with related work.
It is split into six sections:
Section 7.1: Pointer Analysis
Section 7.2: BDDs for Program Analysis
Section 7.3: High-level Languages for Program Analysis
Section 7.4: Optimizing Datalog
Section 7.5: BDD Optimizations
Section 7.6: Applications of Program Analysis
Some work conceptually falls into more than one of these areas. In such cases,
we have tried to choose the section most relevant to the aspects we are comparing.
7.1 Pointer Analysis
Chapter 3 described a scalable cloning-based points-to analysis that is context-
sensitive, field-sensitive, inclusion-based and implemented using BDDs. We compare
to related work in three areas of pointer analysis: scalable pointer analysis (7.1.1),
context-sensitive pointer analysis (7.1.2), and BDD-based pointer analysis (7.1.3).
Additional background on pointer analysis can also be found in Section 2.2.
7.1.1 Scalable Pointer Analysis
Most of the scalable algorithms proposed are context-insensitive and flow-insensitive.
The first scalable pointer analysis proposed was a unification-based algorithm due
to Steensgaard[227]. Das et al. extended the unification-based approach to include
“one-level-flow”[75] and one level of context sensitivity[77]. Subsequently, a number
of inclusion-based algorithms have been shown to scale to large programs[29, 120,
151, 252].
A number of context-sensitive but flow-insensitive analyses have been developed
recently[88, 91]. The C pointer analysis due to Fahndrich et al.[88] has been
demonstrated to work on a 200K-line gcc program. They achieve moderate scalability
by sacrificing precision on other dimensions. Their algorithm is unification-based
and field-independent, meaning that fields in a structure are modeled as having the
same location. Their context-sensitive analysis discovers targets of function pointers
on-the-fly. Our algorithm first computes the call graph using a context-insensitive
pointer alias analysis; there are significantly more indirect calls in Java programs,
the target of our technique, due to virtual method invocations. Their algorithm uses
CFL-reachability queries to implement context sensitivity[202]. Instead of computing
context-sensitive solutions on demand, we compute all the context-sensitive results
and represent them in a form convenient for further analysis.
7.1.2 Context-sensitive Pointer Analysis
Some of the earlier attempts at context-sensitive analysis are flow-sensitive[86, 146,
255, 259]. Our analysis is similar to the work by Emami et al. in that they also com-
pute context-sensitive points-to results directly for all the different contexts. Their
analysis is flow-sensitive; ours uses flow sensitivity only in summarizing each method
intraprocedurally. While our technique treats all members of a strongly connected
component in a call graph as one unit, their technique only ignores subsequent in-
vocations in recursive cycles. On the other hand, their technique has only been
demonstrated to work for programs under 3,000 lines.
As discussed in Section 2.5.2, using summaries is another common approach to
context sensitivity. It is difficult to compute a compact summary if a fully flow-
sensitive result is desired. One solution is to use the concept of partial transfer
functions, which create summaries for observed calling contexts[257, 259]. The same
summary can be reused by multiple contexts that share the same relevant alias pat-
terns. This technique has been shown to handle C programs up to 20,000 lines.
Another solution is to allow only weak updates[255]; that is, a write to a variable only
adds a value to the contents of the variable without removing the previously held
value. This greatly reduces the power of a flow-sensitive analysis. This approach
has been used to handle programs up to 70,000 lines of code. However, on larger
programs the representation still becomes too large to deal with. Because the goal of
the prior work was escape analysis, it was not necessary to maintain precise points-to
relations for locations that escape, so the algorithm achieved scalability by collapsing
escaped nodes.
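To make the weak-update notion concrete, the following minimal sketch (hypothetical names, not the cited system's code) contrasts it with a strong update over a points-to map:

```python
# Strong vs. weak updates on a points-to map (variable -> set of
# abstract locations). Names and structure are illustrative only.

def strong_update(pts, var, targets):
    # A strong update kills the variable's previous points-to set.
    pts[var] = set(targets)

def weak_update(pts, var, targets):
    # A weak update only accumulates; old targets are never removed,
    # which is sound but sacrifices flow-sensitive precision.
    pts.setdefault(var, set()).update(targets)
```

For example, a weak update to a variable that already points to `a` with new target `b` yields the set {a, b}, whereas a strong update would yield just {b}.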
Lackwit is a tool for analyzing C programs using type inferencing[183]. It supports
context sensitivity via a relation called compatibility. Compatibility is an undirected
relation that is induced by the instantiation of polymorphic constraints. It has been
shown to scale to a hundred-thousand-line C program. Ajax includes a static analysis
engine for Java based on type inference with polymorphic recursion[182]. It provides
context-sensitive analysis of sizable Java programs, such as javac.
Atkinson and Griswold investigated the importance of controlling the time and
space complexity when using context sensitivity[15]. They used a user-specified con-
text sensitivity level, resulting in a graph where contexts are merged from a given
depth down.
In Agesen’s Cartesian product algorithm[3], each contour summarizes the effect
of a particular concrete type; if the receiver object of a method invocation can be
of multiple types, then its effect is summarized by the combination of all the
individual contours representing all the types held by the object. Chambers et al.[104]
experimented with different degrees of context-sensitivity based on the concept of
k-CFA[222].
7.1.3 BDD-based Pointer Analysis
Both Zhu and Berndl et al. used BDDs to implement context-insensitive inclusion-
based points-to analysis[29, 270]. Zhu extended his technique to support a summary-
based context sensitivity[271], whereas we developed a cloning-based context-sensitive
pointer analysis algorithm that relies heavily on the data sharing inherent in
BDDs[253]. Avots et al. extended our algorithm to support C programs with pointer
arithmetic[16].
Zhu proposed a summary-based context-sensitive points-to analysis for C pro-
grams, and reported preliminary experimental results on C programs with less than
5,000 lines[270]. Berndl et al. showed that BDDs can be used to compute context-
insensitive inclusion-based points-to results for large Java programs efficiently. Zhu
and Calman describe a cloning-based context-sensitive analysis for C pointers, as-
suming that only the safe C subset is used. The largest program reported in their
experiment has about 25,000 lines and 3 × 10^8 contexts[271]. These analyses use BDD
libraries directly, without additional abstraction to ease implementation.
7.2 BDDs for Program Analysis
BDDs have recently become quite popular in the program analysis community.
We compare our work with related work in using BDDs for model checking (7.2.1),
predicate abstraction (7.2.2), and BDDs as a data representation (7.2.3).
7.2.1 BDDs for Model Checking
BDDs originated within the hardware model-checking field, and have been used for
software model-checking as well. In model checking, BDDs are used to encode sets of
system states and the transition relation. Burch et al. were the first to show that model
checking could scale to large numbers of states[45]. Cadence SMV[166], NuSMV[64],
and CMU SMV[165] are all software model checkers that use BDDs to encode the set
of reached states.
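The symbolic reachability idea can be illustrated with a small sketch, with plain Python sets standing in for the BDD-encoded state sets and transition relation:

```python
# Symbolic reachability as a least fixed point: repeatedly apply the
# transition relation's image to the frontier until no new states
# appear. In a real model checker these sets are represented as BDDs.

def reachable(init, trans):
    reached = set(init)
    frontier = set(init)
    while frontier:
        # Image computation: successors of the current frontier.
        image = {t for (s, t) in trans if s in frontier}
        frontier = image - reached  # keep only newly reached states
        reached |= frontier
    return reached
```

With initial state 0 and transitions {(0,1), (1,2), (3,4)}, the fixed point contains exactly {0, 1, 2}; the disconnected transition (3,4) contributes nothing.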
Darga et al. present some novel techniques for checking the soundness of a type
system using a software model checker[74]. Their checker works by generating every type-correct
program state within a finite bound, executing the program one step forward, and
then checking if the resulting program state is also type-correct. It represents the
states using BDDs.
7.2.2 BDDs for Predicate Abstraction
BDDs are useful in predicate abstraction for the verification of systems with large or
infinite state spaces. The first use of BDDs for predicate abstraction was by Das et
al.[80, 78, 79]. The SLAM toolkit contains a tool called C2BP, which automatically
constructs a predicate abstraction of a C program using BDDs[21]. Bebop is a sym-
bolic model checker for boolean programs that uses BDDs to represent the implicit
states[22]. It has been used to validate critical safety properties of device drivers[23].
Lahiri et al. use a symbolic decision procedure for predicate abstraction[144]. They
provide two implementations: one based on BDDs and the other based on SAT solvers.
7.2.3 BDDs as a Data Representation
Jedd is a Java language extension that provides a relational algebra abstraction over
BDDs[150, 152]. They introduce relations as a first-class Java data type. Their
treatment of domain assignment as a constraint problem is similar to ours; they use
a SAT solver to find a legal domain assignment. They do not attempt to order the
constraints based on importance.
Sittampalam, de Moor, and Larsen formulate program analyses using conditions
on control flow paths[224]. These conditions contain free metavariables corresponding
to program elements (such as variables and constants). They use BDDs to efficiently
represent and search the large space of possible instantiations.
Manevich et al. use BDDs to represent sets for static analysis[161]. BDDs have
also been used in data representations for shape analysis[161, 264], class analyses of
object-oriented programs[30] and, in particular, pointer analysis[29, 253, 270, 271].
7.3 High-level Languages for Program Analysis
The idea of using high-level languages for program analysis is almost as old as pro-
gram analysis itself. We discuss related work in the specific areas of constraint
languages (7.3.1), user-specified program queries (7.3.2), and program analysis with
databases (7.3.3).
7.3.1 Constraint Languages
Many other systems have attempted to provide a uniform framework in which to
implement program analysis, including set- and type-constraints[9], graph rewrite
systems[14], and µ-calculus[197], which uses non-binary decision diagrams. The
Toupie system uses non-binary decision diagrams to implement the µ-calculus; it
is a general constraint system and incorporates little optimization of the constraint
resolution process. Horwitz and Teitelbaum store a program representation in
a relational database for the purpose of allowing the user to ask queries[128]. The
information they store is at the abstract syntax tree level, rather than a level suitable
for program analysis.
Kodumal discusses a constraint-based toolkit for constructing static analyses
called Banshee[142]. It takes advantage of scalable algorithms for CFL-reachability
and set constraints[199]. Like bddbddb, users can specify static analyses in a declar-
ative language based on set constraints. Their formulation of context sensitivity is
based on matching call-return pairs rather than cloning. In bddbddb, one can use
cloning to implement context sensitivity and rely on the efficiency of the BDD data
structure to collapse similar contexts.
7.3.2 User-specified Program Queries
Some early tools allowed users to search source code in a semantically meaningful
way. SCRUPLE is a system for semantic search of source code[187]. tawk can match
patterns in a program’s AST[102]. Like awk, an action can be associated with a
pattern for automated transformations.
Other systems allow the user to specify program analyses. Metal[113] and SLIC[24]
both define state machines with respect to variables. These machines are used to con-
figure a static analysis that searches the program for situations where error transitions
can occur. Metal supports finite state machines with flexible event definitions and
limited ability to handle pointers. The state transitions can execute arbitrary C code.
SLIC machines use a model that is capable of counting but cannot exploit pointer
information.
7.3.3 Program Analysis with Databases
The use of Datalog and other logic programming languages has previously been pro-
posed for describing program analyses[81, 200, 201, 238]. Ullman first suggested for-
mulating data-flow analysis as database queries[238]. Reps used a deductive database
for demand-driven interprocedural data-flow analysis[200, 201]. However, previous
results indicated that the technique was too inefficient to be practical, compared
to conventional program analysis implementation techniques[81, 201]. Our bddbddb
system implements Datalog using BDDs[256] and has been used to compute context-
sensitive points-to results and other advanced analyses. Other examples of systems
that translate program analyses and queries written in logic programming languages
into implementations using BDDs include Toupie[69] and CrocoPat[31]. CrocoPat,
also based on BDDs, allows users to query program relations using a language based
on predicate calculus[31].
OMEGA is a tool that parses source code into a database, then allows the user
to browse the data and construct different program views[155]. The C Information
Abstraction (CIA) system is another proposal to store source code in a database[60].
CIA used an incremental database construction to deal with small changes to the
source program. Neither of these systems was expressive enough to express transitive
closure.
JQuery is a hierarchical source code browsing tool for Java programs[164]. It
allows the user to write queries in a Prolog-like language. It uses the TyRuBa logic
programming engine to evaluate these queries[82]. Unfortunately, TyRuBa has several
performance problems when dealing with large programs[109]. CodeQuest is a source
code querying tool that uses Datalog[110, 111]. It uses a traditional RDBMS to store
source code facts, and an optimizing compiler from Datalog to SQL to evaluate queries.
Paul and Prakash present a proposal for querying source code using an algebraic
query language[188]. They distinguish between three different types of queries: global
structural queries, statement-level structural queries, and flow information queries.
Their proposed Source Code Algebra (SCA) system combines a high level of ab-
straction with the expressive power of relational algebra plus transitive closure and
sequence operations.
Jarzabek presents PQL, an SQL-like query language with special primitives for
code queries[134]. It is implemented on top of a Prolog system. ASTLog is a program
query language for inspecting abstract syntax trees[72]. Besson and Jensen describe a
framework that uses Datalog to specify a variety of class analyses for object oriented
programs[30].
The Java Tools Language (JTL) is a language for querying Java programs[66]. It
uses a simply-typed relational database for program representation, and a Datalog-
like syntax and semantics for queries. It supports queries formulated in first-order
predicate logic augmented with transitive closure.
pmd is a popular system for enforcing constraints on the AST of Java programs[68].
Constraints can be specified with either a visitor or an XPATH expression. Checkstyle
is another popular system that uses XPATH expressions[47].
7.4 Optimizing Datalog
There has been quite a bit of work in the database community on optimizing the
execution of Datalog programs. We discuss three areas of related work: Datalog im-
plementations (7.4.1), logic programming with BDDs (7.4.2), and Datalog evaluation
strategies (7.4.3).
7.4.1 Datalog Implementations
Liu and Stoller described a method for transforming Datalog rules into an efficient
implementation based on indexed and linked data structures[156]. They proved that
their technique has “optimal” run time, in the sense that the combinations of
facts that make all hypotheses of a rule simultaneously true are considered
exactly once. They did not present experimental results. Their formulation also
greatly simplified the complexity analysis of Datalog programs. However, their tech-
nique does not apply when using BDDs, as the cost of BDD operations does not
depend upon combinations of facts, but rather upon the number of nodes in the BDD
representation and the nature of the relations.
There has been work on transforming Datalog programs to reduce the amount of
work necessary to compute a solution. Magic sets is a general algorithm for rewriting
logical rules to cut down on the number of irrelevant facts generated[25]. This idea was
extended to add better support for certain kinds of recursion[180]. Sagiv presented
an algorithm for optimizing a Datalog program under uniform equivalence[212]. Zhou
and Sato present several optimization techniques for fast computation of fix-points
by avoiding redundant evaluation of subgoals[269].
Halevy et al. describe the query-tree, a data structure that is useful in the opti-
mization of Datalog programs[112]. The query-tree encodes all symbolic derivation
trees that satisfy some property.
Execution and optimization of Datalog programs has a long history, surveyed in
[194]. Recent work has included optimizing queries based on properties of the input
relations[147] and optimizing Datalog programs with negation[112]. In bddbddb, we
incorporate several known techniques, such as semi-naïve evaluation, translating them
to the specific implementation technology of BDDs, and incorporate optimizations
that are specific to BDDs.
7.4.2 Logic programming with BDDs
Hu et al. [129] described a system for compiling a high-level specification into BDDs
for verification. Their specification supports structured data, sequential assignment
and conditional statements. They noted the importance of interleaving BDD variables
that are functionally related and partition the transition relation.
Iwaihara et al. described a technique for using BDDs for logic programming[133].
They compared two different relation encodings: logarithmic encoding, which is the
encoding we use in this dissertation, and linear encoding, which encodes elements or
parts of elements as their own BDD variable. The Toupie system translates logic pro-
gramming queries into an implementation based on decision diagrams[69]. Crocopat
is a tool for relational computation that is used for structural analysis of software
systems[31]. Like bddbddb, they use BDDs to represent relations.
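The difference between the two relation encodings can be sketched as follows (the helper names are hypothetical; a real implementation maps these bit patterns onto the BDD variables of a domain):

```python
import math

# Logarithmic encoding: an element of a domain of size N becomes a
# pattern over ceil(log2(N)) boolean variables -- the encoding used
# for relations in this dissertation.
def log_encode(element, domain_size):
    nbits = max(1, math.ceil(math.log2(domain_size)))
    return tuple((element >> i) & 1 for i in range(nbits))

# Linear (one-hot) encoding: one dedicated boolean variable per
# element of the domain.
def linear_encode(element, domain_size):
    return tuple(1 if i == element else 0 for i in range(domain_size))
```

The logarithmic encoding needs only 10 variables for a domain of 1,024 elements, where the linear encoding needs 1,024, which is why the logarithmic form is usually preferred for large program-analysis domains.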
7.4.3 Datalog Evaluation Strategies
There has been much research on optimizing Datalog evaluation strategies; for
example, semi-naïve evaluation[20], bottom-up evaluation[54, 179, 237], top-down with
tabling[59, 230], the role of rule ordering in computing fixed-points[192], etc. We use
an evaluation strategy geared towards the peculiarities of the BDDs — for example,
to maximize cache locality, we iterate around inner loops first.
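For concreteness, semi-naïve evaluation of the standard transitive-closure program can be sketched as follows (a plain-set illustration, not the BDD-based implementation used by bddbddb):

```python
# Semi-naive evaluation of:
#   path(x, y) :- edge(x, y).
#   path(x, z) :- path(x, y), edge(y, z).
# Each iteration joins only the delta (facts new in the previous
# round) with edge, so no derivation is recomputed.

def transitive_closure(edge):
    path = set(edge)   # facts from the non-recursive rule
    delta = set(edge)
    while delta:
        derived = {(x, z) for (x, y) in delta for (u, z) in edge if y == u}
        delta = derived - path  # keep only genuinely new facts
        path |= delta
    return path
```

Naïve evaluation would instead join the entire path relation with edge on every iteration, rederiving all previously known facts each time.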
7.5 BDD Optimizations
The BDD is likely the most widely used non-traditional data structure. It appears in
a plethora of areas from formal verification to data compression. In this section we
discuss related work on optimizing BDDs. We start with describing the various BDD
libraries that are available (7.5.1), and then discuss various techniques for finding
BDD variable orders (7.5.2).
7.5.1 BDD Libraries
One of the reasons for the success of BDDs is that there are many efficient BDD pack-
ages available. CUDD stands for CU Decision Diagram[226]. It is the most widely-
used BDD package as it is very feature-rich; it supports BDDs, ZDDs, ADDs, as well
as many operations that are not available in other BDD packages. It is also very
well-documented. BuDDy is another popular BDD package[153]. BuDDy does not
have as many features as CUDD, but for our application it appears to be the fastest
BDD library, so we used it to collect the results in this dissertation. CAL is a BDD library
based on breadth-first manipulation[215]. Breadth-first algorithms do not seem to
perform as well as the traditional depth-first traversal on single-processor machines
and CAL is missing some essential operations like relational product. BDDNOW is
a parallel BDD package targeted at a network of workstations[172]. It was able to
show parallel speedup even on networked commodity PCs. It implements a subset
of the CUDD API and runs on the PVM package. ABCD is a BDD package where
the BDD nodes are stored in only 64 bits[32]. It does not support dynamic variable
reordering and the performance does not seem as good as CUDD or BuDDy. Jav-
aBDD is a Java front-end to many BDD packages, including BuDDy, CUDD, and
CAL[248]. It also includes a port of BuDDy written in 100% Java. We use JavaBDD
to interface to the BuDDy library. PBF is a parallel breadth-first BDD package that
is sensitive to memory locality[262]. We have not experimented with it yet. Other
work includes implementing BDDs for massively parallel SIMD architectures[49] and
vector processors[184]. Yang et al. published a comprehensive comparison of many
BDD packages when used for model checking[261].
7.5.2 BDD Variable Ordering
BDD variable order optimization has been an active area of research. Over the years,
researchers have proposed static heuristics, dynamic reordering techniques and, more
recently, search and sample techniques. Static approaches seek to develop heuristics
by which BDD orders can be statically synthesized by structural inspection of the
boolean function itself (typically a circuit description). Fujita and others have pro-
posed techniques that inspect properties such as variable depth and pair-wise variable
distance[94, 63]. While static techniques are computationally inexpensive to apply,
they do not produce high quality results for many applications. In our case, the prim-
itive properties that these methods inspect are not readily producible for our Datalog
programs. Moreover, these methods would fail to capture the dynamic properties of
our problem.
Dynamic reordering techniques apply variable order modifications to the BDD as
it is manipulated in real-time. Sifting, the canonical dynamic technique, continually
swaps adjacent BDD variables until a reasonable order is found[205]. Others have
developed more advanced modification algorithms, such as variable grouping[185], for
the sifting framework. While these methods have found many practical applications,
they are extremely expensive in both memory consumption and computation time
and often get stuck in local minima. In addition, these methods fail to provide an
intuitive decomposition of variable orders into dominant features. Further, in our
results, we have empirically observed that sifting, at least, performs poorly.
Recent work has focused on sampling-based search and learning processes. Bollig
improved on previous efforts to develop a simulated annealing solution[36, 167]. Other
researchers have also used genetic algorithms[84] and scatter search[130] to find
good variable orders. While these approaches can generate large sets of variable orders
from which one could extract similarities, they fail to explicitly provide any intuitive
abstraction of the problem. Moreover, these algorithms are tailored for optimizing
one BDD as opposed to the dynamic, multi-BDD environment we must deal with.
However, their use of static heuristics to bootstrap the search process and of dynamic
reordering techniques to define sample neighborhoods could complement our learning
approach.
Machine Learning Variable Orders
Recently, Grumberg et al. have tackled the BDD variable order problem with ma-
chine learning[105]. Their framework is inherently different from ours, as they lack
the higher level notion of domains and, instead, seek to order individual BDD vari-
ables. Our search space is therefore smaller and we can search more effectively. Their
implementation also differs significantly; while they mention active selection of train-
ing data as a possibility, their results are based on the selection of relatively arbitrary
orders. bddbddb employs active learning to maximize the value of each trial run[101].
Also, they use low level BDD properties (e.g. variable connectedness) as features in
their machine learning algorithms, while we have chosen the pairwise ordering of el-
ements. Moreover, they evaluate the performance of an order based on the size of
the BDD it creates, while we have noted that smaller BDDs do not necessarily mean
faster BDD operations.
7.6 Applications of Program Analysis
In this section we compare with program analyses that are related to those presented
in Chapter 5. We compare to work in call graph discovery (7.6.1), analyzing reflec-
tion (7.6.2), and finding errors in programs (7.6.3).
7.6.1 Call Graph Discovery
A lot of effort has been spent on analyzing function pointers in C[86, 168, 170] as well
as virtual method calls in C++[7, 18, 51, 186] and Java[103, 104, 198, 229, 235].
Function Pointers in C
Emami et al. describe how a context-sensitive pointer analysis for C can be integrated with
call graph construction in the presence of function pointers[86]. Their approach in-
troduces the notion of call graph discovery when the call graph is unavailable in
advance.
Milanova et al. evaluate the precision of call graph construction in the presence of
function pointers using an inexpensive pointer analysis approach[266] and conclude
that it is sufficient for most cases[168, 170].
Virtual Calls in C++
Bacon et al. compare the “unique name”, RTA, and CHA virtual call resolution
approaches[18, 19]. They conclude that RTA is both fast and effective and able to
resolve 71% of virtual calls on average.
Aigner and Holzle investigate the effect that virtual call elimination using CHA
has on the runtime of large C++ programs and report a median 18% performance
improvement over the original programs[7]. The number of virtual function calls is
reduced by a median factor of five.
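The gap between CHA and RTA that Bacon et al. measure can be sketched on a toy hierarchy (hypothetical class names; a small model of the two analyses, not the cited implementations):

```python
# Toy class-hierarchy model: resolve a virtual call under CHA and RTA.
# CHA considers every overriding subtype of the receiver's static type;
# RTA additionally restricts targets to classes actually instantiated.

subclasses = {                      # hypothetical hierarchy
    'Shape': ['Circle', 'Square', 'Triangle'],
    'Circle': [], 'Square': [], 'Triangle': [],
}

def subtree(c):
    result = {c}
    for s in subclasses[c]:
        result |= subtree(s)
    return result

def cha_targets(static_type, overriders):
    # CHA: every subtype that overrides the method is a possible target
    return subtree(static_type) & overriders

def rta_targets(static_type, overriders, instantiated):
    # RTA: only classes the program actually allocates can be receivers
    return cha_targets(static_type, overriders) & instantiated

overriders = {'Circle', 'Square', 'Triangle'}   # classes defining draw()
instantiated = {'Circle'}                        # program only does `new Circle()`

print(sorted(cha_targets('Shape', overriders)))               # ['Circle', 'Square', 'Triangle']
print(sorted(rta_targets('Shape', overriders, instantiated))) # ['Circle']
```

Here CHA must keep three possible targets, while RTA resolves the call to the single class the program instantiates, which is why RTA is both cheap and effective in practice.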
Virtual Calls in Java
Grove et al. present a parameterized algorithmic framework for call graph
construction[103, 104]. They empirically assess a multitude of call graph construction
algorithms by applying them to a suite of medium-sized programs written in Cecil
and Java. Their experience suggests that using context sensitivity for call graph
construction in Java yields only moderate improvements.
Tip and Palsberg propose a propagation-based algorithm for call graph construc-
tion and investigate the design space between existing algorithms such as 0-CFA
and RTA, evaluating RTA, CHA, and four new algorithms[235].
Sundaresan et al. go beyond the traditional RTA and CHA approaches in Java
and use type propagation for the purpose of obtaining a more precise call graph[229].
Their approach of using variable type analysis (VTA) is able to uniquely determine
the targets of potentially polymorphic call sites in 32% to 94% of the cases.
Agrawal et al. propose a demand-driven algorithm for call graph construction[4].
Their work is motivated by the need for just-in-time or dynamic compilation as well as
program analysis used as part of a software development environment. They demon-
strate that their demand-driven technique has the same accuracy as the corresponding
exhaustive technique. The reduction in the graph construction time depends upon
the ratio of the cardinality of the set of influencing nodes to the set of all nodes.
7.6.2 Analyzing Reflection
General treatments of reflection in Java are given in Forman and Forman[90] and
Gueheneuc et al.[106].
The metadata and reflection community has a long line of research originating in
languages such as Scheme[233]. We only mention a few relevant projects here. The
static analysis project closest to ours that we are aware of is the work by Braux and Noye on
applying partial evaluation to reflection resolution for the purpose of optimization[38].
Their paper describes extensions to a standard partial evaluator to offer reflection
support. The idea is to “compile away” reflective calls in Java programs, turning them
into regular operations on objects and methods, given constraints on the concrete
types of the object involved. The type constraints for performing specialization are
provided by hand.
Our static analysis can be thought of as a tool for inferring such constraints;
however, as our experimental results show, in many cases the targets of reflective
calls cannot be uniquely determined, so the benefits of specialization for optimizing
program execution may be limited. Braux and Noye describe how their specialization
approach may work on examples extracted from the JDK, but they lack a
comprehensive experimental evaluation. In related work for languages other than
Java, Ruf explores the use of partial evaluation as an optimization technique in the
context of CLOS[206].
Specifying reflective targets is explicitly addressed in Jax[234]. Jax is concerned
with reducing the size of Java applications in order to reduce download time; it reads
in the class files that constitute a Java application, and performs a whole-program
analysis to determine the components of the application that must be retained in
order to preserve program behavior. Clearly, information about the true call graph is
necessary to ensure that no relevant parts of the application are pruned away. Jax’s
approach to reflection is to employ user-provided specifications of reflective calls.
To assist the user with writing complete specification files, Jax relies on dynamic
instrumentation to discover the missing targets of reflective calls. Our analysis based
on points-to information can be thought of as a tool for determining where to insert
reflection specifications.
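A sketch of the points-to-based view of reflection described here (hypothetical facts and helper names, not the dissertation's actual relations): when every string reaching a Class.forName call points to known constants, the reflective targets are resolved; otherwise the call site is flagged as needing a user-written specification.

```python
# Sketch: resolving reflective Class.forName calls from points-to facts on
# strings. Hypothetical program facts for illustration only.

# points_to[v] = set of abstract string objects variable v may hold
points_to = {
    's1': {'"java.util.ArrayList"'},
    's2': {'"java.util.ArrayList"', '"java.util.LinkedList"'},
    's3': {'<user input>'},          # non-constant: unknown at analysis time
}

def resolve_forName(arg_var):
    targets = points_to[arg_var]
    if any(not t.startswith('"') for t in targets):
        return None                  # unresolved: flag for a specification
    return {t.strip('"') for t in targets}

print(resolve_forName('s1'))  # {'java.util.ArrayList'}
print(resolve_forName('s3'))  # None -> insert a reflection specification here
```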
A recent paper by Hirzel, Diwan, and Hind addresses the issues of dynamic class
loading, native methods, and reflection in order to deal with the full complexity
of Java in the implementation of a common pointer analysis[125]. Their approach
involves converting the pointer analysis[12] into an online algorithm: they add con-
straints between analysis nodes as they are discovered at runtime. Newly generated
constraints cause re-computation and the results are propagated to analysis clients
such as a method inliner and a garbage collector at runtime. Their approach leverages
the class hierarchy analysis (CHA) to update the call graph. Our technique uses a
more precise pointer analysis-based approach to call graph construction.
7.6.3 Finding Program Errors
Much attention has been given recently to the topic of detecting errors in programs.
Penetration testing, which involves attempting to determine input values that exploit
holes in an application, is the current typical approach[13, 44, 217]. This approach
cannot guarantee that all vulnerabilities will be located. Penetration testers often
use “fuzzing” tools, which generate random input, to help locate vulnerabilities.
Others have attempted to locate errors statically; Chess and McGraw provide
an overview of these approaches[61]. These tools range from simple lexical anal-
ysis to sophisticated program analyses. Of the latter, most practical tools make
unsound assumptions regarding pointer aliasing. Intrinsa and Metal, which locate
errors in C and C++ programs, and WebSSARI, which finds errors in PHP code,
fall into this category[48, 113]. Other approaches, such as those based on type
qualifiers[135, 220, 240] may suffer from imprecision or a need for user annotations.
Our approach is novel in that it is a sound static analysis with a low false positive rate.
Using powerful pointer alias analysis allows us to avoid making unsound assumptions
without the imprecision that would produce many false positives.
SATurn is a framework for finding bugs that uses a SAT solver[260]. Like our work,
they use an efficient off-the-shelf technology to solve a difficult problem in program
analysis. Their technique is not sound as it cannot handle loops; they unroll loops a
constant number of times.
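The bounded unrolling that makes such SAT-based checking unsound can be illustrated with a toy checker (a sketch of the idea only, not Saturn's actual SAT encoding):

```python
# Sketch of bounded loop unrolling: replace a loop by k guarded copies of
# its body, then check the property on the unrolled program. Behavior
# beyond k iterations is simply not checked, which is the source of
# unsoundness noted above.

def check_unrolled(x0, k):
    # original program:  while x < 8: x = x + 3
    #                    assert x <= 8  (after the loop)
    x = x0
    for _ in range(k):               # k unrolled copies of "if x < 8: x += 3"
        if x < 8:
            x = x + 3
    return x <= 8                    # the property, checked after k copies

print(check_unrolled(0, 1))   # True: 0 -> 3, bug missed with too few unrollings
print(check_unrolled(0, 3))   # False: 0 -> 3 -> 6 -> 9, violation found
```

With k = 1 the checker reports the property as holding even though the real loop drives x to 9; only once k covers the loop's actual iteration count does the violation surface.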
EXE[50, 263] is a system for automatically finding bugs in system code using
symbolic execution. As values are observed through conditional statements and other
checks, EXE adds symbolic constraints to a constraint solver, which generates
concrete test cases. Thus, EXE can automatically generate test cases that exhibit
bugs.
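The EXE workflow, observing branch conditions as path constraints and solving them for concrete inputs, can be sketched as follows (a brute-force search stands in for EXE's constraint solver, and the program under test is a toy, not EXE itself):

```python
# Sketch of EXE-style test generation: run a program, record the branch
# constraints taken along each path, and "solve" them to produce a concrete
# input that drives execution down the path reaching a bug.

def program(x):
    # returns the path constraints taken and whether the buggy line is reached
    path = []
    if x > 100:
        path.append('x > 100')
        if x % 2 == 0:
            path.append('x % 2 == 0')
            return path, True        # "bug" reached on this path
    return path, False

def solve_for_bug(domain):
    # stand-in solver: search the domain for an input satisfying the bug path
    for x in domain:
        path, buggy = program(x)
        if buggy:
            return x, path
    return None, []

x, path = solve_for_bug(range(0, 200))
print(x, path)    # 102 ['x > 100', 'x % 2 == 0']
```

The generated input (here 102) is exactly a concrete test case exhibiting the bug, which is what makes the technique attractive for systems code.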
Chapter 8
Conclusions
As we go forward and rely on complex software systems for more and more aspects
of our lives, software reliability will become more and more important. Program
analysis will become an even more important part of the software development pro-
cess, helping developers build more reliable software systems more quickly and with
fewer bugs and security defects.
The usefulness of a program analysis tool is largely determined by its accuracy
and ease of use. In this dissertation we presented bddbddb, a tool that gives the user
an easy way of specifying advanced program analyses. bddbddb greatly simplifies the
implementation of program analyses. The program is represented as relations in a
relational database — an easily understandable metaphor — and the inference rules
for the analysis are specified in Datalog, a concise and expressive logic programming
language. bddbddb automatically translates the Datalog specification into an efficient
implementation using BDDs. The resulting analyses are very efficient, even beating
hand-coded analyses that had been tuned for months. Owing to the power of the
BDD data structure, bddbddb can even solve analysis problems that were previously
intractable, such as context-sensitive inclusion-based pointer analysis for large pro-
grams. The formulation of context sensitivity is also extremely simple, and allows
the user to query for the set of contexts under which something is true. The results
of an analysis are easily accessible as database relations.
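The Datalog-to-implementation pipeline can be illustrated on a tiny reachability analysis, with Python sets standing in for the BDD-backed relations bddbddb would generate (a sketch of semi-naive evaluation, not bddbddb's generated code):

```python
# A minimal Datalog-style fixpoint. Rules for reachability:
#   reach(x, y) :- edge(x, y).
#   reach(x, z) :- reach(x, y), edge(y, z).
# Semi-naive evaluation joins only the newly derived tuples each round,
# the counterpart of the incrementalized relational operations bddbddb emits.

edge = {('a', 'b'), ('b', 'c'), ('c', 'd')}

reach = set(edge)                    # first rule seeds the relation
delta = set(edge)
while delta:
    # join delta with edge on the shared variable y (the "relprod" step)
    new = {(x, z) for (x, y) in delta for (y2, z) in edge if y == y2}
    delta = new - reach              # keep only genuinely new tuples
    reach |= delta

print(sorted(reach))
# [('a', 'b'), ('a', 'c'), ('a', 'd'), ('b', 'c'), ('b', 'd'), ('c', 'd')]
```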
bddbddb attempts to address some of the challenges in building an effective static
analysis tool. It makes it easier to develop efficient analyses. It provides the program
data in a relational database form convenient for analysis. It allows easy development
of domain-specific and application-specific program analyses. It makes it easier to
collaborate with others.
Perhaps the strongest evidence of the usefulness of using bddbddb, Datalog, and
BDDs for program analysis is the fact that others were able to pick it up quickly
and use it to develop sophisticated analyses of their own. Section 5.4 contains some
examples of the analyses that people have built using bddbddb. The feedback we
have received has been overwhelmingly positive, and people have commented that
they “can never go back” to the old way of implementing program analyses.
8.1 Future Work
We firmly believe that the general idea of implementing program analysis by trans-
lating from a high-level specification into an efficient implementation is the future of
program analysis. However, there are still many improvements that could be made
to bddbddb and the method of specifying program analyses to make it easier to use
and give better results.
8.1.1 Other Data Representations
BDDs have proven to be useful for program analysis problems where there are large
sets of redundant data, such as with pointer analysis. They do not fare as well in
solving problems with many small steps, such as a dataflow analysis. For bddbddb
to be a general program analysis tool, it should support different data representation
backends, such as bit vectors, hash-consing, union-find data structures, etc. It would
also be interesting to experiment with other types of decision diagrams, such as BEDs,
BMDs, or ZDDs.
Ideally, bddbddb would be able to choose an appropriate data representation based
on the nature of the analysis and the input data. For example, if we determine a
relation is transitive, reflexive, and symmetric (an equivalence relation), bddbddb could automatically use a
union-find data structure to represent that relation, leading to efficiency gains. Small
relations could automatically use a bit vector representation for efficiency.
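The representation choice above can be made concrete: storing an equivalence relation as explicit tuples costs quadratic space per equivalence class, while a union-find forest needs only one parent entry per element (a generic union-find sketch, not a bddbddb feature):

```python
# Equivalence classes: explicit pairs vs. a union-find forest.

class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, x, y):
        self.parent[self.find(x)] = self.find(y)

uf = UnionFind()
elems = ['v%d' % i for i in range(100)]
for a, b in zip(elems, elems[1:]):   # merge all 100 elements into one class
    uf.union(a, b)

explicit_pairs = len(elems) ** 2     # tuples needed as an explicit relation
assert len({uf.find(e) for e in elems}) == 1   # one equivalence class
print(len(uf.parent), explicit_pairs)          # 100 entries vs 10000 tuples
```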
8.1.2 Beyond Datalog
Datalog is a powerful and succinct language for specifying inference rules. However, it
is still rather unapproachable to the average programmer. Also, certain natural con-
structs, like control-flow, are a bit awkward to express in Datalog. We envision other
more natural techniques for programmers to specify program analyses and design
rules.
One example Martin et al. have pursued in this direction is PQL, which pro-
vides syntactic sugar that allows an important subset of analyses to be specified
intuitively[163]. Programmers need not learn the names or meanings of bddbddb re-
lations; they can simply write code patterns of interest in Java, and the patterns are
automatically translated into Datalog. With this approach, they were able to find
dozens of previously unknown security errors in a variety of large, widely-used open-
source applications relatively easily. We think there are many more opportunities
for high-level specifications of program analysis that are natural to write and
return useful information quickly.
8.1.3 Integration in the Software Development Process
bddbddb is a standalone tool that is mostly tailored towards people who are already
familiar with compilers and program analysis. It would be an interesting experiment
to try to integrate bddbddb into a software development process. It would certainly
expose weaknesses in the tool and suggest new directions for improving the usability
and usefulness of bddbddb.
Bibliography
[1] Serge Abiteboul and Richard Hull. Data functions, Datalog and negation. In
SIGMOD ’88: Proceedings of the 1988 ACM SIGMOD International Conference
on Management of Data, pages 143–153, New York, NY, USA, 1988. ACM
Press.
[2] Foto Afrati, Stavros S. Cosmadakis, and Mihalis Yannakakis. On Datalog
vs. polynomial time (extended abstract). In Proceedings of the Tenth ACM
SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems,
pages 13–25, New York, NY, USA, 1991. ACM Press.
[3] Ole Agesen. The cartesian product algorithm: Simple and precise type inference
of parametric polymorphism. In ECOOP’95 - Object-Oriented Programming,
9th European Conference, pages 2–26, 1995.
[4] Gagan Agrawal, Jinqian Li, and Qi Su. Evaluating a demand driven technique
for call graph construction. In Computational Complexity, pages 29–45, 2002.
[5] Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. Compilers: Principles, Tech-
niques, and Tools. Addison-Wesley, Reading, MA, 1986.
[6] Alfred V. Aho and Jeffrey D. Ullman. Universality of data retrieval languages.
In Conference Record of the Sixth Annual ACM Symposium on Principles of
Programming Languages, pages 110–119, New York, NY, USA, 1979. ACM
Press.
BIBLIOGRAPHY 165
[7] Gerald Aigner and Urs Holzle. Eliminating virtual function calls in C++ pro-
grams. In ECOOP’96 - Object-Oriented Programming, 10th European Confer-
ence, pages 142–166. Springer-Verlag, 1996.
[8] Alexander Aiken. Introduction to set constraint-based program analysis. Sci-
ence of Computer Programming, 35(2-3):79–111, 1999.
[9] Alexander Aiken, Manuel Fahndrich, Jeffrey S. Foster, and Zhendong Su. A
toolkit for constructing type- and constraint-based program analyses. In Types
in Compilation, pages 78–96, 1998.
[10] Bowen Alpern, Dick Attanasio, John Barton, Michael Burke, Perry Cheng,
Jong-Deok Choi, Anthony Cocchi, Stephen Fink, David Grove, Michael Hind,
Susan Flynn Hummel, Derek Lieber, Vassily Litvinov, Ton Ngo, Mark Mergen,
Vivek Sarkar, Mauricio Serrano, Janice Shepherd, Stephen Smith, V. C. Sreed-
har, Harini Srinivasan, and John Whaley. The Jalapeno virtual machine. IBM
Systems Journal, Java Performance Issue, 39(1), 2000.
[11] Henrik Reif Andersen and Henrik Hulgaard. Boolean expression diagrams. In-
formation and Computation, 179(2):194–212, 2002.
[12] Lars Ole Andersen. Program analysis and specialization for the C programming
language. PhD thesis, DIKU, University of Copenhagen, May 1994.
[13] Brad Arkin, Scott Stender, and Gary McGraw. Software penetration testing.
Proceedings of the IEEE Symposium on Security and Privacy, 3(1):84–87, 2005.
[14] Uwe Aßmann. How to uniformly specify program analysis and transformation
with graph rewrite systems. Lecture Notes in Computer Science, 1060:121–135,
April 1996.
[15] Darren C. Atkinson and William G. Griswold. The design of whole program
analysis tools. In 18th International Conference on Software Engineering, pages
16–27. IEEE Computer Society Press, 1996.
[16] Dzintars Avots, Michael Dalton, V. Benjamin Livshits, and Monica S. Lam.
Improving software security with a C pointer analysis. In 27th International
Conference on Software Engineering (ICSE 2005), pages 332–341, 2005.
[17] David F. Bacon, Clement R. Attanasio, Han B. Lee, V. T. Rajan, and Stephen
Smith. Java without the coffee breaks: a nonintrusive multiprocessor garbage
collector. In Proceedings of the 2001 ACM SIGPLAN Conference on Program-
ming Language Design and Implementation (PLDI), pages 92–103, New York,
NY, USA, 2001. ACM Press.
[18] David F. Bacon and Peter F. Sweeney. Fast static analysis of C++ virtual
function calls. In Proceedings of the 1996 ACM SIGPLAN Conference on
Object-Oriented Programming Systems, Languages & Applications (OOPSLA
’96), pages 324–341, October 1996.
[19] David Francis Bacon. Fast and Effective Optimization of Statically Typed
Object-Oriented Languages. PhD thesis, University of California at Berkeley,
1998.
[20] Isaac Balbin and Kotagiri Ramamohanarao. A generalization of the differen-
tial approach to recursive query optimization. Journal of Logic Programming,
4(3):259–262, September 1987.
[21] Thomas Ball, Rupak Majumdar, Todd D. Millstein, and Sriram K. Rajamani.
Automatic predicate abstraction of C programs. In Proceedings of the 2001
ACM SIGPLAN Conference on Programming Language Design and Implemen-
tation (PLDI), pages 203–213, June 2001.
[22] Thomas Ball and Sriram K. Rajamani. Bebop: A symbolic model checker
for boolean programs. In Proceedings of the 7th International SPIN Workshop
on SPIN Model Checking and Software Verification, pages 113–130. Springer-
Verlag, 2000.
[23] Thomas Ball and Sriram K. Rajamani. Automatically validating temporal
safety properties of interfaces. In SPIN ’01: Proceedings of the 8th International
SPIN Workshop on Model Checking of Software, pages 103–122. Springer-Verlag
New York, Inc., 2001.
[24] Thomas Ball and Sriram K. Rajamani. SLIC: A specification language for inter-
face checking (of C). Technical Report MSR-TR-2001-21, Microsoft Research,
January 2002.
[25] Francois Bancilhon, David Maier, Yehoshua Sagiv, and Jeffrey D. Ullman.
Magic sets and other strange ways to implement logic programs (extended ab-
stract). In Proceedings of the Fifth ACM SIGACT-SIGMOD Symposium on
Principles of Database Systems, pages 1–15, 1986.
[26] John J. Barton and John Whaley. A real-time performance visualizer for Java.
Dr. Dobb’s Journal of Software Tools, 23(3):44, 46–48, 105, March 1998.
[27] Eric Bauer and Ron Kohavi. An empirical comparison of voting classification
algorithms: Bagging, boosting, and variants. Machine Learning, 36(1-2):105–
139, 1999.
[28] Michael Benedikt, Guozhu Dong, Leonid Libkin, and Limsoon Wong. Relational
expressive power of constraint query languages. J. ACM, 45(1):1–34, 1998.
[29] Marc Berndl, Ondrej Lhotak, Feng Qian, Laurie J. Hendren, and Navindra
Umanee. Points-to analysis using BDDs. In Proceedings of the ACM SIGPLAN
2003 Conference on Programming Language Design and Implementation 2003,
pages 103–114, June 2003.
[30] Frederic Besson and Thomas Jensen. Modular class analysis with Datalog. In
Static Analysis, 10th International Symposium, SAS 2003, pages 19–36, June
2003.
[31] Dirk Beyer, Andreas Noack, and Claus Lewerentz. Simple and efficient rela-
tional querying of software structures. In Proceedings of the 10th IEEE Working
Conference on Reverse Engineering, pages 216–225, November 2003.
[32] Armin Biere. ABCD: an experimental BDD library. http://fmv.jku.at/abcd/,
1998.
[33] Kevin Bierhoff. Alias analysis with bddbddb. Star Project Report, 17-754
Analysis of Software Artifacts, 2005.
[34] Bloor Research. CAST Tools Report, 1996.
[35] Rastislav Bodik and Sadun Anik. Path-sensitive value-flow analysis. In POPL
’98, Proceedings of the 25th ACM SIGPLAN-SIGACT Symposium on Principles
of Programming Languages, pages 237–251, 1998.
[36] Beate Bollig, Martin Lobbig, and Ingo Wegener. Simulated annealing to improve
variable orderings for OBDDs. In Proceedings of the International Workshop
on Logic Synthesis, May 1995.
[37] Beate Bollig and Ingo Wegener. Improving the variable ordering of OBDDs is
NP-complete. IEEE Transactions on Computers, 45(9):993–1002, September
1996.
[38] Mathias Braux and Jacques Noye. Towards partially evaluating reflection
in Java. In Proceedings of the ACM Workshop on Partial Evaluation and
Semantics-based Program Manipulation, pages 2–11, 1999.
[39] Leo Breiman. Bagging predictors. Machine Learning, 24(2):123–140, 1996.
[40] David Brumley and James Newsome. Alias analysis for assembly. Technical Re-
port CMU-CS-06-180, Carnegie Mellon University School of Computer Science,
2006.
[41] Randal E. Bryant. Graph-based algorithms for boolean function manipulation.
IEEE Transactions on Computers, 35(8):677–691, 1986.
[42] Randal E. Bryant. On the complexity of VLSI implementations and graph
representations of Boolean functions with application to integer multiplication.
IEEE Transactions on Computers, 40(2):205–213, 1991.
[43] Randal E. Bryant and Yirng-An Chen. Verification of arithmetic circuits with
binary moment diagrams. In DAC ’95: Proceedings of the 32nd ACM/IEEE
conference on Design automation, pages 535–541, New York, NY, USA, 1995.
ACM Press.
[44] Brian Buege, Randy Layman, and Art Taylor. Hacking Exposed: J2EE
and Java: Developing Secure Applications with Java Technology. McGraw-
Hill/Osborne, 2002.
[45] J. R. Burch, E. M. Clarke, K. L. McMillan, D. L. Dill, and L. J. Hwang. Sym-
bolic Model Checking: 10^20 States and Beyond. In Proceedings of the Fifth
Annual IEEE Symposium on Logic in Computer Science, pages 1–33, Washing-
ton, D.C., 1990. IEEE Computer Society Press.
[46] Michael G. Burke, Jong-Deok Choi, Stephen Fink, David Grove, Michael Hind,
Vivek Sarkar, Mauricio J. Serrano, V. C. Sreedhar, Harini Srinivasan, and John
Whaley. The Jalapeno dynamic optimizing compiler for Java. In Proceedings
of the ACM SIGPLAN ’99 Java Grande Conference, June 12–14, 1999.
[47] Oliver Burn. Checkstyle, 2003. http://checkstyle.sourceforge.net.
[48] William R. Bush, Jonathan D. Pincus, and David J. Sielaff. A static analyzer
for finding dynamic programming errors. Software - Practice and Experience
(SPE), 30:775–802, 2000.
[49] G. P. Cabodi, S. Gai, M. Rebaudengo, and M. Sonza Reorda. A BDD package
for massively parallel SIMD architectures. In ICVC93: IEEE International
Conference on VLSI and CAD, pages 332–335, November 1993.
[50] Cristian Cadar, Vijay Ganesh, Peter M. Pawlowski, David L. Dill, and Daw-
son R. Engler. EXE: automatically generating inputs of death. In CCS ’06:
Proceedings of the 13th ACM Conference on Computer and Communications
Security, pages 322–335, New York, NY, USA, 2006. ACM Press.
[51] Brad Calder and Dirk Grunwald. Reducing indirect function call overhead in
C++ programs. In Conference Record of POPL’94: The 21st ACM SIGPLAN-
SIGACT Symposium on Principles of Programming Languages, pages 397–408,
Portland, Oregon, 1994.
[52] Michael Carbin, John Whaley, and Monica Lam. Learning effective variable
orderings for BDD-based program analysis. To be submitted for publication,
2006.
[53] Stefano Ceri, Georg Gottlob, and Letizia Tanca. What you always wanted to
know about Datalog (and never dared to ask). IEEE Transactions on Knowledge
and Data Engineering, 1(1):146–166, 1989.
[54] Stefano Ceri, Georg Gottlob, and Letizia Tanca. Logic programming and
databases. Springer-Verlag New York, Inc., 1990.
[55] CERT. CERT/CC statistics 1988–2006, vulnerabilities reported. http://www.cert.org/stats/, 2006.
[56] Venkatesan T. Chakaravarthy. New results on the computability and com-
plexity of points-to analysis. In Conference Record of POPL 2003: The
30th SIGPLAN-SIGACT Symposium on Principles of Programming Languages,
pages 115–125, New York, NY, USA, 2003. ACM Press.
[57] Ashok Chandra and David Harel. Horn clauses and generalizations. Journal of
Logic Programming, 2(1):1–15, 1985.
[58] Ramkrishna Chatterjee, Barbara G. Ryder, and William Landi. Relevant con-
text inference. In POPL ’99, Proceedings of the 26th ACM SIGPLAN-SIGACT
Symposium on Principles of Programming Languages, pages 133–146, 1999.
[59] Weidong Chen and David S. Warren. Tabled evaluation with delaying for gen-
eral logic programs. J. ACM, 43(1):20–74, 1996.
[60] Yih-Farn Chen, Michael Y. Nishimoto, and C. V. Ramamoorthy. The C in-
formation abstraction system. IEEE Transactions on Software Engineering,
16(3):325–334, 1990.
[61] Brian Chess and Gary McGraw. Static analysis for security. Proceedings of the
IEEE Symposium on Security and Privacy, 2(6):76–79, 2004.
[62] Jong-Deok Choi, Manish Gupta, Mauricio J. Serrano, Vugranam C. Sreedhar,
and Samuel P. Midkiff. Escape analysis for Java. In Proceedings of the 1999
ACM SIGPLAN Conference on Object-Oriented Programming Systems, Lan-
guages & Applications (OOPSLA ’99), pages 1–19, November 1999.
[63] Pi-Yu Chung, Ibrahim N. Hajj, and Janak H. Patel. Efficient variable ordering
heuristics for shared ROBDD. In ISCAS, pages 1690–1693, 1993.
[64] Alessandro Cimatti, Edmund M. Clarke, Fausto Giunchiglia, and Marco Roveri.
NuSMV: A new symbolic model verifier. In CAV ’99: Proceedings of the 11th
International Conference on Computer Aided Verification, pages 495–499, Lon-
don, UK, 1999. Springer-Verlag.
[65] W. F. Clocksin and Christopher S. Mellish. Programming in Prolog (2nd ed.).
Springer-Verlag New York, Inc., New York, NY, USA, 1984.
[66] Tal Cohen, Joseph (Yossi) Gil, and Itay Maman. JTL: the Java tools lan-
guage. In Proceedings of the 21st Annual ACM SIGPLAN Conference on
Object-Oriented Programming, Systems, Languages, and Applications, OOP-
SLA 2006, pages 89–108, New York, NY, USA, 2006. ACM Press.
[67] Brian Cole, Daniel Hakim, David Hovemeyer, Reuven Lazarus, William Pugh,
and Kristin Stephens. Improving your software using static analysis to find
bugs. In Proceedings of the 21st Annual ACM SIGPLAN Conference on Object-
Oriented Programming, Systems, Languages, and Applications, OOPSLA 2006,
pages 673–674, New York, NY, USA, 2006. ACM Press.
[68] Tom Copeland. PMD Applied. Centennial Books, November 2005.
[69] Marc-Michel Corsini, Kaninda Musumbu, Antoine Rauzy, and Baudouin Le
Charlier. Efficient bottom-up abstract interpretation of prolog by means of
constraint solving over symbolic finite domains. In PLILP ’93: Proceedings
of the 5th International Symposium on Programming Language Implementation
and Logic Programming, pages 75–91, 1993.
[70] Oliver Coudert and Jean Christophe Madre. A unified framework for the formal
verification of sequential circuits. In Proceedings of the IEEE International
Conference on Computer Aided Design, pages 126–129, November 1990.
[71] Patrick Cousot and Radhia Cousot. Abstract interpretation: a unified lattice
model for static analysis of programs by construction or approximation of fix-
points. In Conference Record of the Fourth ACM Symposium on Principles
of Programming Languages, pages 238–252, New York, NY, USA, 1977. ACM
Press.
[72] Roger F. Crew. ASTLOG: A language for examining abstract syntax trees. In
Proceedings of the USENIX Conference on Domain-Specific Languages (DSL-
97), pages 229–242, October 1997.
[73] Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, and F. Ken-
neth Zadeck. Efficiently computing Static Single Assignment form and the
control dependence graph. ACM Transactions on Programming Languages and
Systems, 13(4):451–490, October 1991.
[74] Paul T. Darga, Michael Roberson, Melanie Agnew, and Chandrasekhar Boyap-
ati. Efficient software model checking of soundness of type systems. Submitted
for publication, 2007.
[75] Manuvir Das. Unification-based pointer analysis with directional assignments.
In Proceedings of the 2000 ACM SIGPLAN Conference on Programming Lan-
guage Design and Implementation (PLDI), pages 35–46, New York, NY, USA,
June 2000. ACM Press.
[76] Manuvir Das, Sorin Lerner, and Mark Seigle. Esp: path-sensitive program ver-
ification in polynomial time. In Proceedings of the 2002 ACM SIGPLAN Con-
ference on Programming Language Design and Implementation (PLDI), pages
57–68, New York, NY, USA, 2002. ACM Press.
[77] Manuvir Das, Ben Liblit, Manuel Fahndrich, and Jakob Rehof. Estimating
the impact of scalable pointer analysis on optimization. In Static Analysis, 8th
International Symposium, SAS 2001, pages 260–278, July 2001.
[78] Satyaki Das and David L. Dill. Successive approximation of abstract transition
relations. In Proceedings of the Sixteenth Annual IEEE Symposium on Logic in
Computer Science, June 2001.
[79] Satyaki Das and David L. Dill. Counter-example based predicate discovery in
predicate abstraction. In Formal Methods in Computer-Aided Design. Springer-
Verlag, November 2002.
[80] Satyaki Das, David L. Dill, and Seungjoon Park. Experience with predicate
abstraction. In 11th International Conference on Computer-Aided Verification.
Springer-Verlag, July 1999.
[81] Steven Dawson, C. R. Ramakrishnan, and David S. Warren. Practical program
analysis using general purpose logic programming systems—a case study. In
Proceedings of the ACM SIGPLAN’96 Conference on Programming Language
Design and Implementation (PLDI), pages 117–126, May 1996.
[82] Kris de Volder. Type-Oriented Logic Meta Programming. PhD thesis, Vrije
Universiteit Brussel, Programming Technology Laboratory, June 1998.
[83] Jeffrey Dean, David Grove, and Craig Chambers. Optimization of object-
oriented programs using static class hierarchy analysis. In ECOOP’95 - Object-
Oriented Programming, 9th European Conference, pages 77–101, August 1995.
[84] Rolf Drechsler, Nicole Gockel, and Bernd Becker. Learning heuristics for OBDD
minimization by evolutionary algorithms. In PPSN IV: Proceedings of the 4th
International Conference on Parallel Problem Solving from Nature, pages 730–
739, London, UK, 1996. Springer-Verlag.
[85] Thomas Eiter, Georg Gottlob, and Heikki Mannila. Disjunctive Datalog. ACM
Transactions on Database Systems, 22(3):364–418, 1997.
[86] Maryam Emami, Rakesh Ghiya, and Laurie J. Hendren. Context-sensitive inter-
procedural points-to analysis in the presence of function pointers. In Proceedings
of the ACM SIGPLAN’94 Conference on Programming Language Design and
Implementation (PLDI), pages 242–256, June 1994.
[87] Dawson Engler, Benjamin Chelf, Andy Chou, and Seth Hallem. Checking sys-
tem rules using system-specific, programmer-written compiler extensions. In
Proceedings of the Fourth Symposium on Operating Systems Design and Imple-
mentation, 2000.
[88] Manuel Fahndrich, Jakob Rehof, and Manuvir Das. Scalable context-sensitive
flow analysis using instantiation constraints. In Proceedings of the 2000 ACM
SIGPLAN Conference on Programming Language Design and Implementation
(PLDI), pages 253–263, June 2000.
[89] Cormac Flanagan and Shaz Qadeer. A type and effect system for atomicity. In
Proceedings of the ACM SIGPLAN 2003 Conference on Programming Language
Design and Implementation 2003, pages 338–349, New York, NY, USA, 2003.
ACM Press.
[90] Ira R. Forman and Nate Forman. Java Reflection in Action. Manning Publi-
cations, 2004.
[91] Jeffrey S. Foster, Manuel Fahndrich, and Alexander Aiken. Polymorphic versus
monomorphic flow-insensitive points-to analysis for C. In Static Analysis, 7th
International Symposium, SAS 2000, April 2000.
[92] S. J. Friedman and K. J. Supowit. Finding the optimal variable ordering for
binary decision diagrams. In DAC ’87: Proceedings of the 24th ACM/IEEE
Conference on Design Automation, pages 348–356, New York, NY, USA, 1987.
ACM Press.
[93] Norbert Fuhr. Probabilistic Datalog — a logic for powerful retrieval methods.
In SIGIR ’95: Proceedings of the 18th Annual International ACM SIGIR Con-
ference on Research and Development in Information Retrieval, pages 282–290,
New York, NY, USA, 1995. ACM Press.
[94] Masahiro Fujita, Yusuke Matsunaga, and Taeko Kakuda. On variable ordering
of binary decision diagrams for the application of multi-level logic synthesis.
In EURO-DAC ’91: Proceedings of the Conference on European Design Au-
tomation, pages 50–54, Los Alamitos, CA, USA, 1991. IEEE Computer Society
Press.
[95] John Gallagher, Kim S. Henriksen, and Gourinath Banda. Techniques for Scal-
ing Up Analyses Based on Pre-interpretations. In Maurizio Gabbrielli and Gopal
Gupta, editors, Logic Programming, 21st International Conference (ICLP 2005),
pages 280–296. Springer-Verlag, October 2005.
[96] John P. Gallagher. Program analysis and specialisation using tree automata.
Proceedings of Program Analysis and Transformation (PAT2005), June 2005.
[97] Hervé Gallaire and Jack Minker. Logic and Data Bases. Plenum Press, New
York, 1978.
[98] Allen Van Gelder. The alternating fixpoint of logic programs with negation.
In Proceedings of the Eighth ACM SIGACT-SIGMOD-SIGART Symposium on
Principles of Database Systems, pages 185–221, San Diego, CA, USA, 1993.
Academic Press Professional, Inc.
[99] Michael Gelfond and Vladimir Lifschitz. The stable model semantics for logic
programming. In Robert A. Kowalski and Kenneth Bowen, editors, Proceedings
of the Fifth International Conference on Logic Programming, pages 1070–1080,
Cambridge, Massachusetts, 1988. The MIT Press.
[100] Sergio Greco, Domenico Sacca, and Carlo Zaniolo. Extending stratified Dat-
alog to capture complexity classes ranging from P to QH. Acta Informatica,
37(10):699–725, 2001.
[101] Russell Greiner, Adam J. Grove, and Dan Roth. Learning cost-sensitive active
classifiers. Artificial Intelligence, 139(2):137–174, 2002.
[102] William G. Griswold, Darren C. Atkinson, and Collin McCurdy. Fast, flexible
syntactic pattern matching and processing. In WPC ’96: Proceedings of the 4th
International Workshop on Program Comprehension (WPC ’96), pages 144–
153, Washington, DC, USA, March 1996. IEEE Computer Society.
[103] David Grove and Craig Chambers. A framework for call graph construc-
tion algorithms. ACM Transactions on Programming Languages and Systems,
23(6):685–746, 2001.
[104] David Grove, Greg DeFouw, Jeffrey Dean, and Craig Chambers. Call graph
construction in object-oriented languages. In Proceedings of the 1997 ACM
SIGPLAN Conference on Object-Oriented Programming Systems, Languages &
Applications (OOPSLA ’97), pages 108–124, 1997.
[105] Orna Grumberg, Shlomi Livne, and Shaul Markovitch. Learning to order BDD
variables in verification. Journal of Artificial Intelligence Research, 18:83–116,
2003.
[106] Yann-Gaël Guéhéneuc, Pierre Cointe, and Marc Ségura-Devillechaise. Java re-
flection exercises, correction, and FAQs. http://www.yann-gael.gueheneuc.
net/Work/Teaching/Documents/Practical-ReflectionCourse.doc.pdf,
2002.
[107] Hans W. Guesgen and Anne Philpott. Heuristics for solving fuzzy constraint
satisfaction problems. In ANNES ’95: Proceedings of the 2nd New Zealand
Two-Stream International Conference on Artificial Neural Networks and Expert
Systems, pages 132–135, Washington, DC, USA, 1995. IEEE Computer Society.
[108] Peter Gulutzan and Trudy Pelzer. SQL-99 Complete: Really! R&D Books,
Lawrence, Kansas, 1999.
[109] Elnar Hajiyev. CodeQuest - source code querying with Datalog. Master’s thesis,
Programming Tools Group, University of Oxford, UK, September 2005.
[110] Elnar Hajiyev, Mathieu Verbaere, and Oege de Moor. CodeQuest: scalable
source code queries with Datalog. In ECOOP 2006 - Object-Oriented Program-
ming, 20th European Conference, Lecture Notes in Computer Science, pages
2–27, Nantes, France, July 2006. Springer.
[111] Elnar Hajiyev, Mathieu Verbaere, Oege de Moor, and Kris de Volder. Code-
Quest: querying source code with Datalog. In Proceedings of the 20th Annual
ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Lan-
guages, and Applications, OOPSLA 2005, pages 102–103, New York, NY, USA,
2005. ACM Press.
[112] Alon Y. Halevy, Inderpal Singh Mumick, Yehoshua Sagiv, and Oded Shmueli.
Static analysis in Datalog extensions. J. ACM, 48(5):971–1012, 2001.
[113] Seth Hallem, Benjamin Chelf, Yichen Xie, and Dawson Engler. A system and
language for building system-specific, static analyses. In Proceedings of the 2002
ACM SIGPLAN Conference on Programming Language Design and Implemen-
tation (PLDI), pages 69–82, 2002.
[114] Christian Hammer. Parallelitätsanalyse für Slicing von Java Threads. In Proceed-
ings of the 7th Workshop Software Reengineering der Gesellschaft für Informatik
(WSR 2005), May 2005.
[115] Justin E. Harlow and Franc Brglez. Design of experiments in BDD variable
ordering: lessons learned. In ICCAD ’98: Proceedings of the 1998 IEEE/ACM
International Conference on Computer-Aided Design, pages 646–652, New York,
NY, USA, 1998. ACM Press.
[116] Timothy Harris. Early storage reclamation in a tracing garbage collector. ACM
SIGPLAN Notices, 34(4):46–53, 1999.
[117] Timothy Harris. Dynamic adaptive pre-tenuring. In International Symposium
on Memory Management (ISMM ’00), volume 36(1), pages 127–136, 2001.
[118] Rebecca Hasti and Susan Horwitz. Using static single assignment form to
improve flow-insensitive pointer analysis. In Proceedings of the ACM SIG-
PLAN’98 Conference on Programming Language Design and Implementation
(PLDI), pages 97–105, New York, NY, USA, 1998. ACM Press.
[119] Les Hatton. Balancing static and dynamic testing: some observations from
measurement. Invited as part of visiting scientists series, Nokia Research Labs,
Helsinki, 2000.
[120] Nevin Heintze and Olivier Tardieu. Ultra-fast aliasing analysis using CLA: A
million lines of C code in a second. In Proceedings of the 2001 ACM SIGPLAN
Conference on Programming Language Design and Implementation (PLDI),
pages 254–263, June 2001.
[121] Kim Henriksen and John Gallagher. A web-based tool combining different type
analyses. In 16th Workshop on Logic-based Methods in Programming Environ-
ments (WLPE2006), pages 184–196, Seattle, Washington, August 2006.
[122] Kim S. Henriksen and John P. Gallagher. Abstract interpretation of PIC
programs through logic programming. In SCAM ’06: Proceedings of the
Sixth IEEE International Workshop on Source Code Analysis and Manipula-
tion (SCAM’06), pages 184–196, Washington, DC, USA, 2006. IEEE Computer
Society.
[123] Michael Hind. Pointer analysis: Haven’t we solved this problem yet? In Pro-
ceedings of the 2001 ACM SIGPLAN-SIGSOFT Workshop on Program Analysis
For Software Tools and Engineering, PASTE’01, Snowbird, UT, 2001.
[124] Martin Hirzel, Amer Diwan, and Matthew Hertz. Connectivity-based garbage
collection. In Proceedings of the 2003 ACM SIGPLAN Conference on Object-
Oriented Programming Systems, Languages and Applications, OOPSLA 2003,
pages 359–373, October 2003.
[125] Martin Hirzel, Amer Diwan, and Michael Hind. Pointer analysis in the presence
of dynamic class loading. In ECOOP 2004 - Object-Oriented Programming, 18th
European Conference, pages 96–122, 2004.
[126] Martin Hirzel, Johannes Henkel, Amer Diwan, and Michael Hind. Understand-
ing the connectivity of heap objects. In International Symposium on Memory
Management (ISMM), pages 36–39, June 2002.
[127] Susan Horwitz. Precise flow-insensitive may-alias analysis is NP-hard. ACM
Transactions on Programming Languages and Systems, 19(1):1–6, 1997.
[128] Susan Horwitz and Tim Teitelbaum. Generating editing environments based on
relations and attributes. ACM Transactions on Programming Languages and
Systems, 8(4), 1986.
[129] Alan J. Hu, David L. Dill, Andreas J. Drexler, and C. Han Yang. Higher-level
specification and verification with BDDs. In Workshop on Computer-Aided
Verification, volume 663 of Lecture Notes in Computer Science, pages 82–95,
1992.
[130] William N.N. Hung and Xiaoyu Song. BDD variable ordering by scatter search.
In ICCD ’01: Proceedings of the International Conference on Computer Design:
VLSI in Computers & Processors, pages 368–373, Washington, DC, USA, 2001.
IEEE Computer Society.
[131] Takashi Ishio, Ryusuke Niitani, and Katsuro Inoue. Towards locating a func-
tional concern based on a program slicing technique. In 2nd Asian Workshop
on Aspect-Oriented Software Development (AOAsia 2006), 2006.
[132] Nagisa Ishiura, Hiroshi Sawada, and Shuzo Yajima. Minimization of binary
decision diagrams based on exchanges of variables. In IEEE/ACM International
Conference on Computer-Aided Design, ICCAD-91, pages 472–475, 1991.
[133] Mizuho Iwaihara and Yusaku Inoue. Bottom-up evaluation of logic programs
using binary decision diagrams. In ICDE ’95: Proceedings of the Eleventh
International Conference on Data Engineering, pages 467–474, 1995.
[134] Stan Jarzabek. Design of flexible static program analyzers with PQL. IEEE
Transactions on Software Engineering, 24(3):197–215, 1998.
[135] Rob Johnson and David Wagner. Finding user/kernel pointer bugs with type
inference. In Proceedings of the 2004 USENIX Security Conference, pages 119–
134, 2004.
[136] Capers Jones. Software productivity group, 2003.
[137] David B. Kemp, Divesh Srivastava, and Peter J. Stuckey. Magic sets and
bottom-up evaluation of well-founded models. In Vijay Saraswat and Kazunori
Ueda, editors, Logic Programming, Proceedings of the 1991 International Sym-
posium, pages 337–354, San Diego, USA, 1991. The MIT Press.
[138] David B. Kemp, Divesh Srivastava, and Peter J. Stuckey. Bottom-up evaluation
and query optimization of well-founded models. Theoretical Computer Science,
146(1–2):145–184, 1995.
[139] Gary A. Kildall. A unified approach to global program optimization. In Confer-
ence Record of the ACM Symposium on Principles of Programming Languages,
pages 194–206, New York, NY, USA, 1973. ACM Press.
[140] Aleks Kissinger and John C. Hale. Lopol: A deductive database approach to
policy analysis and rewriting. In Proceedings of the Second Annual Security-
enhanced Linux Symposium, Baltimore, Maryland, USA, March 2006.
[141] Aleks Kissinger and David Wheeler. bddshell - a frontend for deductive program
analysis. http://bddshell.sourceforge.net, 2005.
[142] John Kodumal. Program Analysis with Regularly Annotated Constraints. PhD
thesis, University of California, Berkeley, May 2006.
[143] Phokion G. Kolaitis. The expressive power of stratified logic programs. Infor-
mation and Computation, 90(1):50–66, 1991.
[144] Shuvendu K. Lahiri, Randal E. Bryant, and Byron Cook. A symbolic approach
to predicate abstraction. In Proceedings of the International Conference on
Computer-Aided Verification (CAV’03), 2003.
[145] Monica S. Lam. Challenges and new directions in program analysis. Keynote
speech at Parallel Architectures and Compilation Techniques (PACT), Septem-
ber 2003.
[146] William Landi, Barbara G. Ryder, and Sean Zhang. Interprocedural modifi-
cation side effect analysis with pointer aliasing. In Proceedings of the ACM
SIGPLAN’93 Conference on Programming Language Design and Implementa-
tion (PLDI), pages 56–67, June 1993.
[147] Alon Y. Levy and Yehoshua Sagiv. Semantic query optimization in Datalog
programs. In Proceedings of the Fourteenth ACM SIGACT-SIGMOD-SIGART
Symposium on Principles of Database Systems, pages 163–173, 1995.
[148] David D. Lewis and William A. Gale. A sequential algorithm for training
text classifiers. In W. Bruce Croft and Cornelis J. van Rijsbergen, editors,
Proceedings of SIGIR-94, 17th ACM International Conference on Research and
Development in Information Retrieval, pages 3–12, Dublin, IE, 1994. Springer
Verlag, Heidelberg, DE.
[149] Ondřej Lhoták and Laurie J. Hendren. Context-sensitive points-to analysis: Is it
worth it? In Alan Mycroft and Andreas Zeller, editors, Compiler Construction,
15th International Conference, CC 2006, Held as Part of the Joint European
Conferences on Theory and Practice of Software, ETAPS 2006, volume 3923 of
Lecture Notes in Computer Science, pages 47–64. Springer, 2006.
[150] Ondřej Lhoták. Program Analysis using Binary Decision Diagrams. PhD thesis,
McGill University, January 2006.
[151] Ondřej Lhoták and Laurie J. Hendren. Scaling Java points-to analysis using
Spark. In Proceedings of the 12th International Conference on Compiler Con-
struction, pages 153–169, April 2003.
[152] Ondřej Lhoták and Laurie J. Hendren. Jedd: a BDD-based relational extension
of Java. In Proceedings of the ACM SIGPLAN 2004 Conference on Program-
ming Language Design and Implementation 2004, pages 158–169, 2004.
[153] Jørn Lind-Nielsen. BuDDy, a binary decision diagram package. http://www.
itu.dk/research/buddy/, 2004.
[154] Tim Lindholm and Frank Yellin. The Java Virtual Machine Specification. Ad-
dison-Wesley, 2nd edition, 1999.
[155] Mark A. Linton. Implementing relational views of programs. In SDE 1: Proceed-
ings of the First ACM SIGSOFT/SIGPLAN Software Engineering Symposium
on Practical Software Development Environments, pages 132–140, New York,
NY, USA, 1984. ACM Press.
[156] Yanhong A. Liu and Scott D. Stoller. From Datalog rules to efficient programs
with time and space guarantees. In PPDP ’03: Proceedings of the 5th ACM
SIGPLAN International Conference on Principles and Practice of Declarative
Programming, pages 172–183. ACM Press, 2003.
[157] Benjamin Livshits, John Whaley, and Monica S. Lam. Reflection analysis for
Java. Technical report, Stanford University, November 2005.
[158] Benjamin Livshits, John Whaley, and Monica S. Lam. Reflection analysis for
Java. In Proceedings of Programming Languages and Systems: Third Asian
Symposium, November 2005.
[159] V. Benjamin Livshits and Monica S. Lam. Finding security errors in Java
programs with static analysis. In Proceedings of the 14th USENIX Security
Symposium, pages 271–286, August 2005.
[160] V. Benjamin Livshits and Monica S. Lam. Finding security vulnerabilities in
Java applications with static analysis. In 14th USENIX Security Symposium.
USENIX, August 2005.
[161] Roman Manevich, G. Ramalingam, John Field, Deepak Goyal, and Mooly Sa-
giv. Compactly representing first-order structures for static analysis. In Static
Analysis, 9th International Symposium, SAS 2002, pages 196–212, September
2002.
[162] Wiktor Marek and Mirosław Truszczyński. Autoepistemic logic. J. ACM,
38(3):587–618, 1991.
[163] Michael C. Martin, V. Benjamin Livshits, and Monica S. Lam. Finding applica-
tion errors using PQL: a program query language. In Proceedings of the 20th An-
nual ACM SIGPLAN Conference on Object-Oriented Programming, Systems,
Languages, and Applications, OOPSLA 2005, October 2005.
[164] Edward McCormick and Kris De Volder. JQuery: finding your way through
tangled code. In Proceedings of the 19th Annual ACM SIGPLAN Conference
on Object-Oriented Programming, Systems, Languages, and Applications, OOP-
SLA 2004, pages 9–10, New York, NY, USA, 2004. ACM Press.
[165] Ken McMillan. Carnegie Mellon SMV. http://www.cs.cmu.edu/
~modelcheck/smv.html, 1998.
[166] Ken McMillan. Cadence SMV. http://www.cadence.com/webforms/cbl
software/, 2002.
[167] M. Ray Mercer, Rohit Kapur, and Don E. Ross. Functional approaches to gener-
ating orderings for efficient symbolic representations. In DAC ’92: Proceedings
of the 29th ACM/IEEE Conference on Design Automation, pages 624–627, Los
Alamitos, CA, USA, 1992. IEEE Computer Society Press.
[168] Ana Milanova, Atanas Rountev, and Barbara G. Ryder. Precise call graph
construction in the presence of function pointers. Technical report, Rutgers
University, 2001.
[169] Ana Milanova, Atanas Rountev, and Barbara G. Ryder. Parameterized object
sensitivity for points-to and side-effect analyses for Java. In Proceedings of the
9th International Symposium on Software Testing and Analysis, pages 1–11,
July 2002.
[170] Ana Milanova, Atanas Rountev, and Barbara G. Ryder. Precise and efficient call
graph construction for programs with function pointers. Journal of Automated
Software Engineering, 2004.
[171] Ana Milanova, Atanas Rountev, and Barbara G. Ryder. Parameterized ob-
ject sensitivity for points-to analysis for Java. ACM Transactions on Software
Engineering and Methodology, 14(1):1–41, 2005.
[172] Kim Milvang-Jensen and Alan J. Hu. BDDNOW: A parallel BDD package.
In FMCAD ’98: Proceedings of the Second International Conference on For-
mal Methods in Computer-Aided Design, pages 501–507, London, UK, 1998.
Springer-Verlag.
[173] Shinichi Minato. Zero-suppressed BDDs for set manipulation in combinatorial
problems. In DAC ’93: Proceedings of the 30th International Conference on
Design Automation, pages 272–277, New York, NY, USA, 1993. ACM Press.
[174] Jack Minker. Logic and databases: A 20 year retrospective. In LID ’96: Proceed-
ings of the International Workshop on Logic in Databases, pages 3–57, London,
UK, 1996. Springer-Verlag.
[175] Brian R. Murphy and Monica S. Lam. Program analysis with partial transfer
functions. In PEPM ’00: Proceedings of the 2000 ACM SIGPLAN Workshop on
Partial Evaluation and Semantics-based Program Manipulation, pages 94–103,
New York, NY, USA, 1999. ACM Press.
[176] Mayur Naik and Alex Aiken. Conditional must not aliasing for static race
detection. In Proceedings of the 34th ACM SIGPLAN-SIGACT Symposium on
Principles of Programming Languages, POPL 2007, pages 327–338, New York,
NY, USA, 2007. ACM Press.
[177] Mayur Naik, Alex Aiken, and John Whaley. Effective static race detection for
Java. In Proceedings of the ACM SIGPLAN 2006 Conference on Programming
Language Design and Implementation, pages 308–319, New York, NY, USA,
2006. ACM Press.
[178] National Institute of Standards and Technology (NIST), Department of Com-
merce. Software errors cost U.S. economy $59.5 billion annually. NIST News
Release 2002-10, 2002.
[179] Jeffrey F. Naughton and Raghu Ramakrishnan. Bottom-up evaluation of logic
programs. In Computational Logic — Essays in Honor of Alan Robinson, pages
640–700, 1991.
[180] Jeffrey F. Naughton, Raghu Ramakrishnan, Yehoshua Sagiv, and Jeffrey D.
Ullman. Efficient evaluation of right-, left-, and multi-linear rules. In SIG-
MOD ’89: Proceedings of the 1989 ACM SIGMOD International Conference
on Management of Data, pages 235–242. ACM Press, 1989.
[181] Erik M. Nystrom, Hong-Seok Kim, and Wen-mei W. Hwu. Importance of heap
specialization in pointer analysis. In Proceedings of the 2004 ACM SIGPLAN-
SIGSOFT Workshop on Program Analysis For Software Tools and Engineering,
PASTE’04, pages 43–48, New York, NY, USA, 2004. ACM Press.
[182] Robert O’Callahan. Generalized Aliasing as a Basis for Program Analysis Tools.
PhD thesis, Carnegie Mellon University, November 2000.
[183] Robert O’Callahan and Daniel Jackson. Lackwit: A program understanding
tool based on type inference. In Proceedings of the 19th International Conference
on Software Engineering, pages 338–348, May 1997.
[184] Hiroyuki Ochi, Shuzo Yajima, and Nagisa Ishiura. A vector algorithm for ma-
nipulating Boolean functions based on shared binary decision diagrams. Super-
computer, 8(6):101–118, November 1991.
[185] Shipra Panda and Fabio Somenzi. Who are the variables in your neighborhood.
In ICCAD ’95: Proceedings of the 1995 IEEE/ACM International Conference
on Computer-Aided Design, pages 74–77, Washington, DC, USA, 1995. IEEE
Computer Society.
[186] Hemant D. Pande and Barbara G. Ryder. Data-flow-based virtual function
resolution. In Static Analysis, Third International Symposium, SAS’96, pages
238–254. Springer-Verlag, 1996.
[187] Santanu Paul. SCRUPLE: a reengineer’s tool for source code search. In CAS-
CON ’92: Proceedings of the 1992 Conference of the Centre for Advanced Stud-
ies on Collaborative Research, pages 329–346. IBM Press, 1992.
[188] Santanu Paul and Ataul Prakash. A query algebra for program databases. IEEE
Transactions on Software Engineering, 22(3):202–217, 1996.
[189] David J. Pearce, Paul H. J. Kelly, and Chris Hankin. Efficient field-
sensitive pointer analysis for C. In Proceedings of the 2004 ACM SIGPLAN-
SIGSOFT Workshop on Program Analysis For Software Tools and Engineering,
PASTE’04, pages 37–42, New York, NY, USA, 2004. ACM Press.
[190] Prakash Prabhu and Priti Shankar. Field flow sensitive pointer and escape
analysis for Java as logic programs using SSA. Technical Report IISc-CSA-TR-
2006-10, CSA, IISc, India, September 2006.
[191] J. Ross Quinlan. Induction of decision trees. Machine Learning, 1(1):81–106,
1986.
BIBLIOGRAPHY 187
[192] Raghu Ramakrishnan, Divesh Srivastava, and S. Sudarshan. Rule ordering in
bottom-up fixpoint evaluation of logic programs. In Proceedings of the 16th
International Conference on Very Large Data Bases, pages 359–371, 1990.
[193] Raghu Ramakrishnan, Divesh Srivastava, and S. Sudarshan. Controlling the
search in bottom-up evaluation. In Proceedings of the Joint International Con-
ference and Symposium on Logic Programming, pages 273–287, Washington,
USA, 1992. The MIT Press.
[194] Raghu Ramakrishnan and Jeffrey D. Ullman. A survey of research on deductive
database systems. Journal of Logic Programming, 23(2):125–149, 1995.
[195] G. Ramalingam. Identifying loops in almost linear time. ACM Transactons on
Programming Languages and Systems, 21(2):175–188, March 1999.
[196] Rajeev K. Ranjan, Adnan Aziz, Robert K. Brayton, Bernard Plessier, and
Carl Pixley. Efficient BDD algorithms for FSM synthesis and verification. In
Proceedings of the International Workshop on Logic Synthesis, May 1995. Poster
presentation.
[197] A. Rauzy. Toupie: A constraint language for model checking. In A. Podelski,
editor, Constraint Programming: Basics and Trends, pages 193–208. Springer,
1995.
[198] Derek Rayside, Steve Reuss, Erik Hedges, and Kostas Kontogiannis. The effect
of call graph construction algorithms for object-oriented programs on automatic
clustering. In Proceedings of the 8th International Workshop on Program Com-
prehension, pages 191–200. IEEE Computer Society, 2000.
[199] Jakob Rehof and Manuel Fähndrich. Type-based flow analysis: From poly-
morphic subtyping to CFL-reachability. In Conference Record of POPL 2001:
The 28th ACM SIGPLAN-SIGACT Symposium on Principles of Programming
Languages, pages 44–56, 2001.
[200] Thomas Reps. Demand interprocedural program analysis using logic databases.
Applications of Logic Databases, pages 163–196, 1994.
[201] Thomas Reps. Solving demand versions of interprocedural analysis problems.
In Proceedings of the Fifth International Conference on Compiler Construction,
pages 389–403, April 1994.
[202] Thomas Reps, Susan Horwitz, and Mooly Sagiv. Precise interprocedural
dataflow analysis via graph reachability. In Conference Record of POPL’95:
The 22nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming
Languages, pages 49–61, San Francisco, California, January 1995.
[203] Martin Rinard and John Whaley. Compositional pointer and escape analysis
for multithreaded Java programs. Technical Report MIT-LCS-TR-795, MIT
Laboratory for Computer Science, November 1999.
[204] Atanas Rountev, Ana Milanova, and Barbara G. Ryder. Points-to analysis for
Java using annotated constraints. In Proceedings of the 2001 ACM SIGPLAN
Conference on Object-Oriented Programming Systems, Languages and Applica-
tions, OOPSLA 2001, October 2001.
[205] Richard Rudell. Dynamic variable ordering for ordered binary decision dia-
grams. In ICCAD ’93: Proceedings of the 1993 IEEE/ACM International Con-
ference on Computer-Aided Design, pages 42–47, 1993.
[206] Erik Ruf. Partial evaluation in reflective system implementations. In Workshop
on Reflection and Metalevel Architecture, October 1993.
[207] Erik Ruf. Context-insensitive alias analysis reconsidered. In Proceedings of the
ACM SIGPLAN’95 Conference on Programming Language Design and Imple-
mentation (PLDI), pages 13–22, 1995.
[208] Radu Rugina and Martin Rinard. Automatic parallelization of divide and con-
quer algorithms. In PPoPP ’99: Proceedings of the Seventh ACM SIGPLAN
Symposium on Principles and Practice of Parallel Programming, pages 72–83,
New York, NY, USA, 1999. ACM Press.
[209] Zsofi Ruttkay. Fuzzy constraint satisfaction. In Proceedings of the 1st IEEE
Conference on Evolutionary Computing, pages 542–547, Orlando, 1994.
[210] Alan Ryan. Publication to appear, 2007.
[211] Barbara G. Ryder. Dimensions of precision in reference analysis of object-
oriented programming languages. In Görel Hedin, editor, Compiler Construction,
12th International Conference, CC 2003, volume 2622 of
[212] Yehoshua Sagiv. Optimizing Datalog programs. In Proceedings of the Sixth
ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Sys-
tems, pages 349–362. ACM Press, 1987.
[213] Konstantinos Sagonas, Terrance Swift, and David S. Warren. XSB as an efficient
deductive database engine. In SIGMOD ’94: Proceedings of the 1994 ACM
SIGMOD International Conference on Management of Data, pages 442–453.
ACM Press, 1994.
[214] Gerard Salton and Chris Buckley. Improving retrieval performance by relevance
feedback. Technical report, Cornell University, 1988.
[215] Jagesh V. Sanghavi, Rajeev K. Ranjan, Robert K. Brayton, and Alberto
Sangiovanni-Vincentelli. High performance BDD package by exploiting memory
hierarchy. In DAC ’96: Proceedings of the 33rd Annual Conference on Design
Automation, pages 635–640, New York, NY, USA, 1996. ACM Press.
[216] Patrick W. Sathyanathan. Interprocedural Dataflow Analysis - Alias Analysis.
PhD thesis, Stanford University, June 2001.
[217] Joel Scambray and Mike Shema. Web Applications (Hacking Exposed). Addison-
Wesley Professional, 2002.
[218] John S. Schlipf. The expressive powers of the logic programming semantics
(extended abstract). In Proceedings of the Ninth ACM SIGACT-SIGMOD-
SIGART Symposium on Principles of Database Systems, pages 196–204, New
York, NY, USA, 1990. ACM Press.
[219] Johann Schumann. Efficient Theorem Provers based on an Abstract Machine.
PhD thesis, Technische Universitat Munchen, 1991.
[220] Umesh Shankar, Kunal Talwar, Jeffrey S. Foster, and David Wagner. Detecting
format string vulnerabilities with type qualifiers. In Proceedings of the 10th
USENIX Security Symposium, pages 201–220, August 2001.
[221] Micha Sharir and Amir Pnueli. Two approaches to interprocedural data flow
analysis. In Program Flow Analysis: Theory and Applications, pages 189–233.
Prentice-Hall, Englewood Cliffs, N.J., 1981.
[222] Olin Shivers. Control-Flow Analysis of Higher-Order Languages. PhD thesis,
Carnegie Mellon University, May 1991.
[223] Detlef Sieling. The nonapproximability of OBDD minimization. Information
and Computation, 172(2):103–138, 2002.
[224] Ganesh Sittampalam, Oege de Moor, and Ken Friis Larsen. Incremental ex-
ecution of transformation specifications. In Proceedings of the 31st ACM
SIGPLAN-SIGACT Symposium on Principles of Programming Languages,
POPL 2004, pages 26–38. ACM Press, 2004.
[225] N. J. A. Sloane. The on-line encyclopedia of integer sequences: A000670.
http://www.research.att.com/as/sequences, 2003.
[226] Fabio Somenzi. CUDD: CU decision diagram package, 1998.
[227] Bjarne Steensgaard. Points-to analysis in almost linear time. In Conference
Record of POPL’96: The 23rd ACM SIGPLAN-SIGACT Symposium on Prin-
ciples of Programming Languages, pages 32–41, January 1996.
[228] Sun Microsystems. Java Cryptography Extension (JCE). http://java.sun.
com/products/jce, 2003.
[229] Vijay Sundaresan, Laurie J. Hendren, Chrislain Razafimahefa, Raja Vallee-Rai,
Patrick Lam, Etienne Gagnon, and Charles Godin. Practical virtual method call
resolution for Java. In Proceedings of the 2000 ACM SIGPLAN Conference on
Object-Oriented Programming Systems, Languages & Applications (OOPSLA
2000), pages 264–280, 2000.
[230] Hisao Tamaki and Taisuke Sato. OLD resolution with tabulation. In Third
International Conference on Logic Programming, pages 84–98, July 1986.
[231] Robert E. Tarjan. Testing flow graph reducibility. Journal of Computer and
System Sciences, 9(3):355–365, December 1974.
[232] Robert Endre Tarjan. Data structures and network algorithms. Society for
Industrial and Applied Mathematics, Philadelphia, PA, USA, 1983.
[233] Peter Thiemann. Towards partial evaluation of full Scheme. In Reflection ’96,
1996.
[234] Frank Tip, Chris Laffra, Peter F. Sweeney, and David Streeter. Practical
experience with an application extractor for Java. ACM SIGPLAN Notices,
34(10):292–305, 1999.
[235] Frank Tip and Jens Palsberg. Scalable propagation-based call graph construc-
tion algorithms. ACM SIGPLAN Notices, 35(10):281–293, 2000.
[236] Aaron Tomb and Cormac Flanagan. Automatic type inference via partial eval-
uation. In PPDP ’05: Proceedings of the 7th ACM SIGPLAN International
Conference on Principles and Practice of Declarative Programming, pages 106–
116, New York, NY, USA, 2005. ACM Press.
[237] Jeffrey D. Ullman. Bottom-up beats top-down for Datalog. In Proceedings
of the Eighth ACM SIGACT-SIGMOD-SIGART Symposium on Principles of
Database Systems, pages 140–149, 1989.
[238] Jeffrey D. Ullman. Principles of Database and Knowledge-Base Systems, Vol-
ume II. Computer Science Press, Rockville, MD, 1989.
[239] Allen Van Gelder, Kenneth Ross, and John S. Schlipf. The well-founded seman-
tics for general logic programs. Journal of the ACM, 38(3):620–650, 1991.
[240] David Wagner, Jeffrey S. Foster, Eric A. Brewer, and Alexander Aiken. A
first step towards automated detection of buffer overrun vulnerabilities. In
Proceedings of Network and Distributed Systems Security Symposium, pages 3–
17, 2000.
[241] William E. Weihl. Interprocedural data flow analysis in the presence of pointers,
procedure variables, and label variables. In Conference Record of the Seventh
Annual ACM Symposium on Principles of Programming Languages, pages 83–
94, New York, NY, USA, 1980. ACM Press.
[242] W. Peter Weijland. Semantics for logic programs without occur check. Theo-
retical Computer Science, 71(1):155–174, 1990.
[243] John Whaley. Dynamic optimization thru the use of automatic runtime spe-
cialization. Master’s thesis, Massachusetts Institute of Technology, May 1999.
[244] John Whaley. A portable sampling-based profiler for Java virtual machines. In
Proceedings of the ACM 2000 Conference on Java Grande, pages 78–87. ACM
Press, June 2000.
[245] John Whaley. Partial method compilation using dynamic profile information. In
Proceedings of the 2001 ACM SIGPLAN Conference on Object-Oriented Pro-
gramming Systems, Languages and Applications, OOPSLA 2001, pages 166–
179. ACM Press, October 2001.
[246] John Whaley. System checkpointing using reflection and program analysis.
In A. Yonezawa and S. Matsuoka, editors, Proceedings of Reflection 2001, the
Third International Conference on Metalevel Architectures and Separation of
Crosscutting Concerns, volume 2192 of Lecture Notes in Computer Science,
pages 44–51, Kyoto, Japan, September 2001. Springer-Verlag.
[247] John Whaley. Joeq: A virtual machine and compiler infrastructure. In Pro-
ceedings of the SIGPLAN Workshop on Interpreters, Virtual Machines, and
Emulators, pages 58–66, June 2003.
[248] John Whaley. JavaBDD library, 2004. http://javabdd.sourceforge.net.
[249] John Whaley. Joeq: A virtual machine and compiler infrastructure. Science of
Computer Programming, 57(3):339–356, 2005.
[250] John Whaley, Dzintars Avots, Michael Carbin, and Monica S. Lam. Using
Datalog with binary decision diagrams for program analysis. In Proceedings
of Programming Languages and Systems: Third Asian Symposium, November
2005.
[251] John Whaley and Christos Kozyrakis. Heuristics for profile-driven method-level
speculative parallelization. In Proceedings of the 33rd International Conference
on Parallel Processing (ICPP 2005). IEEE Computer Society, June 2005.
[252] John Whaley and Monica S. Lam. An efficient inclusion-based points-to analysis
for strictly-typed languages. In Proceedings of the 9th International Symposium
on Software Testing and Analysis, pages 180–195, July 2002.
[253] John Whaley and Monica S. Lam. Cloning-based context-sensitive pointer alias
analysis using binary decision diagrams. In Proceedings of the ACM SIGPLAN
2004 Conference on Programming Language Design and Implementation 2004,
pages 131–144. ACM Press, 2004.
[254] John Whaley, Michael C. Martin, and Monica S. Lam. Automatic extraction
of object-oriented component interfaces. In Proceedings of the International
Symposium on Software Testing and Analysis, pages 218–228. ACM Press, July
2002.
[255] John Whaley and Martin Rinard. Compositional pointer and escape analysis
for Java programs. In Proceedings of the 1999 ACM SIGPLAN Conference on
Object-Oriented Programming Systems, Languages & Applications (OOPSLA
’99), pages 187–206, November 1999.
[256] John Whaley, Christopher Unkel, and Monica S. Lam. A BDD-based deductive
database for program analysis. http://suif.stanford.edu/bddbddb, 2004.
[257] Robert P. Wilson. Efficient, context-sensitive pointer analysis for C programs.
PhD thesis, Stanford University, 1997.
[258] Robert P. Wilson, Robert S. French, Christopher S. Wilson, Saman P. Amaras-
inghe, Jennifer-Ann M. Anderson, Steven W. K. Tjiang, Shih-Wei Liao, Chau-
Wen Tseng, Mary W. Hall, Monica S. Lam, and John L. Hennessy. SUIF:
An infrastructure for research on parallelizing and optimizing compilers. ACM
SIGPLAN Notices, 29(12):31–37, December 1994.
[259] Robert P. Wilson and Monica S. Lam. Efficient context-sensitive pointer anal-
ysis for C programs. In Proceedings of the ACM SIGPLAN’95 Conference on
Programming Language Design and Implementation (PLDI), pages 1–12, June
1995.
[260] Yichen Xie and Alex Aiken. Context- and path-sensitive memory leak detec-
tion. In Proceedings of the 10th European Software Engineering Conference held
jointly with 13th ACM SIGSOFT International Symposium on Foundations of
Software Engineering, pages 115–125, New York, NY, USA, 2005. ACM Press.
[261] Bwolen Yang, Randal E. Bryant, David R. O’Hallaron, Armin Biere, Olivier
Coudert, Geert Janssen, Rajeev K. Ranjan, and Fabio Somenzi. A performance
study of BDD-based model checking. In FMCAD ’98: Proceedings of the Second
International Conference on Formal Methods in Computer-Aided Design, pages
255–289, London, UK, 1998. Springer-Verlag.
[262] Bwolen Yang, Yirng-An Chen, Randal E. Bryant, and David R. O’Hallaron.
Space- and time-efficient BDD construction via working set control. In Asia
and South Pacific Design Automation Conference, pages 423–432, 1998.
[263] Junfeng Yang, Can Sar, Paul Twohey, Cristian Cadar, and Dawson Engler.
Automatically generating malicious disks using symbolic execution. In SP ’06:
Proceedings of the 2006 IEEE Symposium on Security and Privacy (S&P’06),
pages 243–257, Washington, DC, USA, 2006. IEEE Computer Society.
[264] Tuba Yavuz-Kahveci and Tevfik Bultan. Automated verification of concurrent
linked lists with counters. In Static Analysis, 9th International Symposium,
SAS 2002, pages 69–84, September 2002.
[265] Suan Hsi Yong, Susan Horwitz, and Thomas Reps. Pointer analysis for programs
with structures and casting. In Proceedings of the 1999 ACM SIGPLAN Con-
ference on Programming Language Design and Implementation (PLDI), pages
91–103, New York, NY, USA, 1999. ACM Press.
[266] Sean Zhang. Practical Pointer Aliasing Analyses for C. PhD thesis, Rutgers
University, August 1998.
[267] Weilei Zhang and Barbara G. Ryder. Constructing accurate application call
graphs for Java to model library callbacks. In SCAM ’06: Proceedings of the
Sixth IEEE International Workshop on Source Code Analysis and Manipula-
tion, pages 63–74, Washington, DC, USA, 2006. IEEE Computer Society.
[268] Weilei Zhang and Barbara G. Ryder. A semantics-based definition for interclass
test dependence. Technical Report DCS-TR-597, Rutgers University, January
2006.
[269] Neng-Fa Zhou and Taisuke Sato. Efficient fixpoint computation in linear
tabling. In PPDP ’03: Proceedings of the 5th ACM SIGPLAN International
Conference on Principles and Practice of Declarative Programming, pages 275–
283. ACM Press, 2003.
[270] Jianwen Zhu. Symbolic pointer analysis. In Proceedings of the International
Conference in Computer-Aided Design, pages 150–157, November 2002.
[271] Jianwen Zhu and Silvian Calman. Symbolic pointer analysis revisited. In Pro-
ceedings of the ACM SIGPLAN 2004 Conference on Programming Language
Design and Implementation, June 2004.