crystal-izing sophisticated code analyses

39
Crystal-izing Sophisticated Code Analyses Ciera Jaspan Kevin Bierhoff Jonathan Aldrich http://code.google.com/p/ crystalsaf

Upload: hewitt

Post on 02-Feb-2016

25 views

Category:

Documents


0 download

DESCRIPTION

Crystal-izing Sophisticated Code Analyses. Ciera Jaspan Kevin Bierhoff Jonathan Aldrich http://code.google.com/p/crystalsaf. Installation. Pre-requisites Eclipse with Java Development Tools (JDT) Plugin Development Environment (PDE) Crystal Available from our Eclipse update site - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Crystal-izing Sophisticated Code Analyses

Crystal-izing SophisticatedCode Analyses

Ciera JaspanKevin BierhoffJonathan Aldrich

http://code.google.com/p/crystalsaf

Page 2: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

2

Installation

•Pre-requisites•Eclipse with

• Java Development Tools (JDT)• Plugin Development Environment (PDE)

•Crystal• Available from our Eclipse update site• http://crystalsaf.googlecode.com/svn/trunk/EclipseUpdate/

•The USB drive contains:•A version of Eclipse with Crystal already installed

•Samples of several null pointer analyses•Sample test code to run the null analysis on

•Tutorial Notes

Audience pre-reqs: familiarity withabstract interpretation!

Page 3: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

3

Crystal

•A framework for creating static analyses•AST walkers•Simple dataflow•Branch-sensitive dataflow•Use of specifications

•Primarily for educational purposes•Easy startup into a static analysis•Direct from theory to implementation•Incremental development of analyses•“Wow factor”: power of Eclipse

• Also found to be useful for research prototypes

Page 4: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

4

Crystal in the classroom

•Crystal is used in a professional masters program in software engineering•Course: Analysis of Software Artifacts•Students: Real-world experience, may not have (or want!) a theoretical background

•How can program analysis be used in industry, now and in the near future?

•Students will learn•What can affect the usability and precision of a static analysis

•What kind of problems static analysis can solve

Page 5: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

5

Crystal in research

•High transferability from paper to code made Crystal natural choice for research•Currently 4 research analyses written in Crystal•3 are published (OOPSLA 08, ECOOP 09, OOPSLA 09)

•Allows incremental development to more sophisticated features•Annotations with custom parsers•Branch-sensitivity•Automated testing•Widening v. regular join

Page 6: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

6

An incremental approach

•Students typically answer questions on paper first•We’ll note questions we can ask students in red!

•Then, students transfer that knowledge directly to code•We’ll note how to code the answer in blue!

•Use an incremental approach•Instructions on the wiki with verification points

•Today, everyone will make a simple flow analysis•We’ll move a little faster to create a smart flow analysis

•You can follow along with the code samples

Page 7: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

7

This tutorial

•Install Crystal•Register an analysis•Create a simple AST walker for nullness•Add a simple flow analysis•Add annotations•Add branch-sensitivity

•We’ll provide sample analyses and sample assignments for classes

•For more information, visit our wiki

Page 8: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

8

Everything installed right?

•Crystal menu is where analyses appear•Several built-in ones already available

Page 9: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

9

Steps for making an analysis

•Install Crystal•Register an analysis•Create a simple AST walker for nullness•Add a simple flow analysis•Add annotations•Add branch-sensitivity

Page 10: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

10

Register and Run

•Create a new plugin project•Make it depend on Crystal and JDT

•Implement ICrystalAnalysis•Register with the extension-point CrystalAnalysis in the plugin.xml file

•Select Run -> Run Configuration…•Make a new Eclipse configuration•Run! Your analysis name should appear in the Crystal menu.

Page 11: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

11

Steps for making an analysis

•Install Crystal•Register an analysis•Create a simple AST walker for nullness•Add a simple flow analysis•Add annotations•Add branch-sensitivity

Page 12: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

12

Use an AST walker

•We’re ready to make an analysis now.•What kinds of expressions do we want to check for an null pointer analysis?•Method calls•Field access•Array access

•In analyzeMethod(), create an ASTVisitor that gives an error when it encounters these operations.

Page 13: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

13

Everything running?

•Create a new project in the child Eclipse•Add code to analyze•Test with Crystal->MyAnalysis

•Can also run automated tests in JUnit•Will not cover that today•See wiki for more information

Page 14: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

14

Steps for making an analysis

•Install Crystal•Register an analysis•Create a simple AST walker for nullness•Add a simple flow analysis•Add annotations•Add branch-sensitivity

Page 15: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

15

Abstract interpretation concepts

•Lattice•The abstract states the program can be in

•Control flow graph•The order of control flow through the nodes of the AST

•Transfer functions•How the states change as the analysis encounters new program instructions

•Worklist algorithm•Traverses the control flow graph and runs the transfer functions

Page 16: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

16

•Lattice•A finite lattice of the abstract states the program can be in

•Control flow graph•The order of control flow through the nodes of the AST

•Transfer functions•How the states change as the analysis encounters new program instructions

•Worklist algorithm•Traverses the control flow graph and runs the transfer functions

Abstract interpretation concepts

Page 17: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

17

Lattice review

•Top of lattice represents least precise info•Bottom of lattice represents an unanalyzed element•Must have finite height to ensure termination•Unique least upper bound must exist for any two elements

a b

Page 18: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

18

Transfer function review

•Given•An instruction•An incoming lattice element

•Produce•An outgoing lattice element ’

•(instr, ) = ’•Make a different transfer function on each type of instruction

Page 19: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

19

•Map all variables to an element in the lattice above

•Tuple Lattice: A lattice which maps a key to an element in another lattice

A simple null analysis

is a map of every program variable to a null lattice element

• (x = null, ) =[xNULL]

• (x = y, ) =[x(y)]

• (x = new C(), ) =[xNOT_NULL]

• (x = y.m(z1,…,zn), ) =

[yNOT_NULL]•(array access, field access, etc.)

MAYBE_NULL

NOT_NULL NULL

Page 20: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

20

The lattice

•What are the elements in the lattice?•Bottom, null, not null, and maybe null

•Create a type which represents the elements•Crystal allows this to be an arbitrary type•This is likely an immutable type, like an enum

•Tuple Lattices•We will use TupleLatticeElement•Just create the type which represents the sub-lattice

public enum NullLatticeElement { BOTTOM, NULL, NOT_NULL, MAYBE_NULL;}

Page 21: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

21

The lattice

•What is the bottom-most element?•What is the the top-most?•What is the ordering of elements?•What does the join operation look like?

•Extend SimpleLatticeOperations<LE>•LE bottom()•boolean atLeastAsPrecise(LE, LE)•LE join(LE, LE)•LE copy(LE)

Page 22: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

22

Setting up the flow analysis

•What is the lattice element at the start of the method?•Everything may be null (except this is not null)

•Extend AbstractingTransferFunction•Implement createEntryValue() and getLatticeOperations()

•(Don’t override the transfer functions yet)

•We’ll also now have the visitor call the flow analysis

Page 23: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

23

Transfer functions

•Which instructions cause the lattice element to change? How do they change the lattice element?•Null: makes target null•Constructor: makes target non-null •Copying assignment: makes target same as right side

•In the derived transfer function, override the relevant instructions

Page 24: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

24

You now have a Crystal flow analysis

•And that’s it!•From here, we’re just going to improve it•Annotations: teach students power of specifications

•Branch-sensitivity: teach students power of abstractions closer to code

•We’ll move a little faster•Assistants are ready to help if you wish to follow along

Page 25: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

25

Why Three Address Code

•Does mean students work with both Eclipse AST and TAC•However•TAC has no sub-expressions•TAC has many fewer kinds of nodes

•Students able to understand TAC as it matched what they wrote down on paper

Page 26: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

26

Relevant packages

•edu.cmu.cs.crystal•Core package for analyses

•org.eclipse.jdt.core.dom•The Eclipse AST

•edu.cmu.cs.crystal.simple•Simple interfaces for flow analyses

•edu.cmu.cs.crystal.tac.model•The interfaces for three address code instructions

Page 27: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

27

Steps for making an analysis

•Install Crystal•Register an analysis•Create a simple AST walker for nullness•Add a simple flow analysis•Add annotations•Add branch-sensitivity

Page 28: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

28

Annotations

•What specifications could we add to make the analysis more precise?•Non-null on method parameters

•Create a Java annotation•Put it in a jar separate from your analysis

•Make it available to the code being analyzed

@Target({ElementType.METHOD, ElementType.PARAMETER})public @interface NonNull {}

Page 29: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

29

Annotations

•What transfer functions can use this annotation to improve precision?• Initial lattice information• Method call instruction (return value)

•Annotations are available from the Eclipse AST• But hard to get• No desugaring

•Use the AnnotationDatabase• Pass in an AnnotationDatabase to the transfer functions

• Query it to find instances of the @NotNull annotation

•Can also give Crystal a custom parser for complex annotations• @Invariant(“x == foo and y != bar”)

Page 30: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

30

Annotations

•Where can the visitor use annotations for additional checking?•Method call parameters and return value

•Constructor call parameters

•Use the AnnotationDatabase•Query it to find instances of the @NotNull annotation

•Check that parameter is not null

Page 31: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

31

Steps for making an analysis

•Install Crystal•Register an analysis•Create a simple AST walker for nullness•Add a simple flow analysis•Add annotations•Add branch-sensitivity

Page 32: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

32

Branch-sensitivity

•Take advantage of knowledge gained through tests

•Specify different exit paths through a method• An invariant that doesn’t hold on exceptional exit

•Labeled branches let us distinguish these

•Must return different lattice elements for each label

if (x != null) { //hey, it’s safe //to use x in here!}else { //but it’s an //error in here!}

Page 33: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

33

On paper…

•No branch sensitivity(x == y, ) =

•Branch sensitivityT(x == y, ) =

if ((x) < MAYBE_NULL )[y(x)]

else if ((y) < MAYBE_NULL )[x(y)]

else

•Separate definition for the false branch F(x == y, )

Page 34: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

34

Branching example

foo() if (foo()) { a; } else { b; } c;

public boolean foo()

a b

c

true false

Page 35: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

35

Branching example, with exceptions

foo()try { if (foo()) { a; } else { b; } c;}catch (MyException exp) { d;}e;

public boolean foo() throws MyException;

a b

c

d

e

true false

MyException

Page 36: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

36

Types of labels

•True/false•All conditionals (if, while, ?:, etc.)•Methods calls that return a boolean•Binary relational operators (&&, <, ==, etc.)

•Exceptional•Methods calls that throw exceptions•Throw statements•Catch and Finally statements

•Switch (used on switch)•Iterator (used on enhanced for)•Normal

Page 37: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

37

Changing to branch-sensitive analyses

1. Implement AbstractTACBranchSensitiveTransferFunction

2. Change signatures on transfer functions

3. Wrap return lattice in an IResult

At this point, transfer functions run as they did before

public LE transfer(TACInstruction instr, LE value)

public IResult<LE> transfer(TACInstruction instr, List<ILabel> labels, LE value)

return value;

return LabeledSingleResult.createResult(value, labels);

Page 38: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

38

Using the branches

•Which instructions can provide different information on each branch?•x == y•x != y

•Create a new LabeledResult with the labels and a default value•Copy the lattice element for each branch•Change the lattice elements•Put them into the labeled result with the right label

Page 39: Crystal-izing Sophisticated Code Analyses

http://code.google.com/p/crystalsaf

39

Crystal Static Analysis Framework

•Fast startup into a simple analysis•Direct from theory to implementation•Incremental sophistication of analysis•Full power of Eclipse infrastructure•Proven useful for both teaching and research•4 research analyses•Used for several years in a professional master’s course