

Clousot: A static contract checker based on abstract interpretation

Francesco Logozzo

Microsoft Research, Redmond, WA

Demo!

Code Contracts

Idea: Use the IL as the contract representation
Use calls to static methods of a contract library

Language agnostic: same for C#, VB, F# …

Code Contracts tools

Documentation generation (ccdoc)
  Automatic generation of documentation
Runtime checking (ccrewrite)
  Binary rewriting
Static checking (Clousot)

Abstract Interpretation

Theory of approximations
Semantics are ordered according to their precision
  The more precise the semantics, the more properties it captures
• A static analysis is a semantics
• Precise enough to capture the properties of interest
• Rough enough to be computable
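The precision ordering can be illustrated with two toy abstractions of the same set of values. This is only a sketch: the function names and the string encoding of signs are assumptions made for the illustration, not part of Clousot.

```python
def alpha_sign(values):
    """Abstract a non-empty collection of integers to its sign (coarse)."""
    if all(v > 0 for v in values):
        return "pos"
    if all(v < 0 for v in values):
        return "neg"
    if all(v == 0 for v in values):
        return "zero"
    return "top"  # no sign information


def alpha_interval(values):
    """Abstract the same collection to its tightest interval (more precise)."""
    return (min(values), max(values))


def interval_proves_at_most(interval, bound):
    """An interval proves x <= bound when its upper end is at most bound."""
    return interval[1] <= bound


values = [1, 2, 3]
print(alpha_sign(values))      # sign only knows the values are positive
print(alpha_interval(values))  # the interval also bounds them from above
```

Both abstractions can show that the values are positive, but only the interval can show x ≤ 4. The more precise semantics captures more properties, at a higher cost.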

Clousot
Based on Abstract Interpretation
  ≠ Usual approaches based on theorem provers
Advantages
  Automatic
    Inference of loop invariants, preconditions, postconditions, object invariants
  Predictable
    No quantifier instantiation
    No easy proofs by contradictory axioms
  Scalable
    Tune-up for the properties of interest

Clousot: The big picture

A.dll  B.dll  C.dll  …  Z.dll

Call Graph Construction

Contract Extraction

Analysis Inference

Assertion Checking

Method Analysis

1. Analyze the method
2. Collect the proof obligations
   Explicit: Pre/Post, assertions
   Implicit: Array bounds, non-null …
3. Discharge proof obligations
   If a proof fails, emit a warning message
4. Propagate inferred contracts

Bytecode

Stack language

Why the bytecode???

More faithful
  Closer to what gets executed
  Clear semantics of the instructions
Exploit the work of the compiler
  Name resolution, type inference, generics, LINQ…
Language agnostic
  Bytecode does not change!
  Languages do: C# 2.0 → C# 3.0 → C# 4.0

Drawbacks

Explicit stack
Program structure lost
Expressions chunked out…
Need a program normalization!

Clousot: Analysis structure

Analyses: bounds, nonnull, arrays…

Expression analysis

Heap analysis

Stack analysis

Source: z = x + y

Expression recovery

Assume x + y ≤ 4
  High level: easy!
  Low level: problem!

Eager expression reconstruction?
  MDTransform: 9000 straight line instructions

Lazy expression recovery
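A minimal sketch of the idea, assuming a toy stack language: the evaluation stack is run symbolically so that every pushed result gets a named temporary, and a source-level expression is rebuilt only when an analysis asks for it, and only to the requested depth. The instruction names and the tuple encoding are illustrative assumptions, not Clousot's internal representation.

```python
SYMBOL = {"add": "+", "sub": "-", "mul": "*"}


def to_three_address(instrs):
    """Symbolically run the evaluation stack, naming each computed value."""
    stack, defs, stores, counter = [], {}, {}, 0
    for op, *args in instrs:
        if op == "ldloc":        # push a local variable
            stack.append(args[0])
        elif op == "ldc":        # push a constant
            stack.append(str(args[0]))
        elif op in SYMBOL:       # binary operation on the two topmost values
            right, left = stack.pop(), stack.pop()
            temp = f"t{counter}"
            counter += 1
            defs[temp] = (op, left, right)
            stack.append(temp)
        elif op == "stloc":      # pop into a local variable
            stores[args[0]] = stack.pop()
        else:
            raise ValueError(f"unsupported instruction: {op}")
    return defs, stores


def recover(name, defs, depth=2):
    """Expand a temporary into an expression on demand, up to `depth` levels."""
    if name not in defs or depth == 0:
        return name
    op, left, right = defs[name]
    lhs = recover(left, defs, depth - 1)
    rhs = recover(right, defs, depth - 1)
    return f"({lhs} {SYMBOL[op]} {rhs})"


# Stack code for: z = x + y
defs, stores = to_three_address(
    [("ldloc", "x"), ("ldloc", "y"), ("add",), ("stloc", "z")]
)
print("z =", recover(stores["z"], defs))
```

Nothing is expanded until `recover` is called, so a long straight-line method costs little unless an analysis actually needs the expression behind a temporary.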

Value Analyses

Nonnull: Is a reference null?
Bounds: Array bounds, numerical values …
Arithmetic: Division by zero, negation of MinInt …
Unsafe: Buffer overrun
Array content (with P. & R. Cousot)
Strings
Object invariants (M. Monereau)
Iterators (S. Xia)

Pietro Ferrara, Francesco Logozzo and Manuel Fahndrich, Safer Unsafe Code in .NET, in OOPSLA 2008

1. Numerical Abstract Domains

Abstract domains

Can the domain prove 0 ≤ index < array.Length?
(Each plot shows index against a.Length.)

Domain      Cost     Constraints             Proves it?
Intervals   O(n)     a ≤ x ≤ b               No
Pentagons   O(n)     a ≤ x ≤ b & x < y       Yes
Octagons    O(n³)    ± x ± y ≤ a             Yes
Polyhedra   O(2ⁿ)    Σ ai xi ≤ b             Yes
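The difference between the first two domains can be sketched as follows. This is a toy model, assuming integer variables and a hypothetical `Pentagon` class that pairs intervals with a set of strict upper bounds per variable. It is not the Clousot implementation.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Pentagon:
    intervals: dict                                # var -> (lo, hi)
    less_than: dict = field(default_factory=dict)  # var -> vars strictly greater

    def assume_less(self, x, y):
        """Refine the state with the guard x < y (integers assumed)."""
        xlo, xhi = self.intervals[x]
        ylo, yhi = self.intervals[y]
        intervals = dict(self.intervals)
        intervals[x] = (xlo, min(xhi, yhi - 1))
        intervals[y] = (max(ylo, xlo + 1), yhi)
        less_than = {k: set(v) for k, v in self.less_than.items()}
        less_than.setdefault(x, set()).add(y)
        return Pentagon(intervals, less_than)

    def intervals_prove_less(self, x, y):
        """Intervals alone: x < y only if every x is below every y."""
        return self.intervals[x][1] < self.intervals[y][0]

    def proves_less(self, x, y):
        """Pentagons: also use the symbolic fact x < y."""
        return y in self.less_than.get(x, set()) or self.intervals_prove_less(x, y)


# index >= 0, and the array length is unknown but non-negative.
state = Pentagon({"index": (0, math.inf), "length": (0, math.inf)})
state = state.assume_less("index", "length")   # loop guard: index < a.Length

print(state.intervals_prove_less("index", "length"))  # intervals cannot conclude
print(state.proves_less("index", "length"))           # the relational part can
```

With an unbounded length, the guard does not tighten the interval of `index`, so only the symbolic part `index < length` carries the upper-bound proof.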

Numerical domains in Clousot

Basic
  Intervals, Pentagons, Leq, Karr, Octagons, Simple Disequalities, Stripes, Subpolyhedra …
Combinations thereof
  Tree of domains
Incremental analysis
  First analyze with “cheap” domains
  Move to more expensive ones if the proof fails

Domain D1

Domain D2

Domain D3
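The incremental strategy can be sketched as a loop over domains ordered by cost. Everything here is hypothetical: the provers are stand-ins working on hard-coded facts, where a real analysis would re-run the method with each domain.

```python
import math

# Hypothetical facts known at a program point (assumed for the illustration).
BOUNDS = {"index": (0, math.inf), "length": (1, math.inf)}
STRICT = {("index", "length")}  # symbolic facts of the form x < y


def intervals_prover(obligation):
    """Cheap: prove x < y from the interval bounds alone."""
    x, y = obligation
    return BOUNDS[x][1] < BOUNDS[y][0]


def relational_prover(obligation):
    """More expensive: also consult the symbolic x < y facts."""
    return obligation in STRICT or intervals_prover(obligation)


def prove_incrementally(obligation, domains):
    """Try the domains in order of cost and stop at the first success."""
    for name, prover in domains:
        if prover(obligation):
            return name
    return None  # unproven: this is where a warning would be emitted


DOMAINS = [("intervals", intervals_prover), ("pentagons", relational_prover)]
print(prove_incrementally(("index", "length"), DOMAINS))
```

The expensive domain is paid for only on the obligations that the cheap one leaves open.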

Why Subpolyhedra?

Proving an “easy” precondition often requires complex reasoning

From StringBuilder:

Subpolyhedra

∑ai xi ≤ k ⇔ ∑ai xi = β ⋀ β ≤ k
Reduced product of
  Intervals: Scalable, fast…
  Linear Equalities: Precise join, fast …
Challenge: Have a precise Join

Vincent Laviron and Francesco Logozzo, Subpolyhedra: A (more) scalable approach to the inference of linear inequalities, in VMCAI 2009

Naive join

Branch 1: assume x <= y      ⟨x - y == β, β ∈ [-∞, 0]⟩
Branch 2: x = 0; y = 1       ⟨T, x ∈ [0,0] ⋀ y ∈ [1,1]⟩
After the join:              ⟨T, T⟩

assert x <= y   (cannot be proven from ⟨T, T⟩)

Join algorithm: SubPolyhedra
1. Uniform slack variables
2. Reduce the states
3. Do the pair-wise join
4. Recover precision using deleted equalities
5. Recover precision using hints
   • Templates, 2D Convex Hull, Annotations

Vincent Laviron and Francesco Logozzo, Refining Abstract Interpretation-based Static Analyses with Hints, in APLAS 2009

Example: Join Step 1

Entry state:
  s0 : ⟨x - y == β, β ∈ [-∞, 0]⟩
  s1 : ⟨T, x ∈ [0,0] ⋀ y ∈ [1,1]⟩

Step 1 (uniform slack variables)
  s’0 : ⟨x - y == β, β ∈ [-∞, 0]⟩
  s’1 : ⟨x - y == β, x ∈ [0,0] ⋀ y ∈ [1,1]⟩

Example: Join Steps 2-3

Step 2 (Reduction)
  s’’0 : ⟨x - y == β, β ∈ [-∞, 0]⟩
  s’’1 : ⟨x - y == β, x ∈ [0,0] ⋀ y ∈ [1,1] ⋀ β ∈ [-1,-1]⟩

Step 3 (Pair-wise join)
  s2 : ⟨x - y == β, β ∈ [-∞, 0]⟩

Example: Join Step 4

Recover lost relations

Branch 1: assume x == y      ⟨x - y == 0, T⟩
Branch 2: x = 0; y = 1       ⟨T, x ∈ [0,0] ⋀ y ∈ [1,1]⟩
Pair-wise join alone:        ⟨T, T⟩
With the recovered equality: ⟨x - y == β, β ∈ [-1, 0]⟩

assert x <= y
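Both join examples can be replayed with a small sketch of steps 1 to 4, assuming a single linear form x - y with non-zero coefficients. A state is modelled as a pair (interval of β or `None`, intervals of the variables). The encoding and the function names are assumptions made for this illustration, and the reduction is a plain interval evaluation of the linear form, not the basis exploration described in the talk.

```python
import math

TOP = (-math.inf, math.inf)


def join_interval(a, b):
    return (min(a[0], b[0]), max(a[1], b[1]))


def eval_form(coeffs, intervals):
    """Interval of sum(c * v); a variable with no entry is unbounded."""
    lo = hi = 0.0
    for var, c in coeffs.items():          # coefficients assumed non-zero
        vlo, vhi = intervals.get(var, TOP)
        lo += c * vlo if c > 0 else c * vhi
        hi += c * vhi if c > 0 else c * vlo
    return (lo, hi)


def reduce_slack(coeffs, state):
    """Steps 1-2: give the state the slack variable and tighten its interval."""
    beta, intervals = state
    implied = eval_form(coeffs, intervals)
    if beta is None:
        return implied
    return (max(beta[0], implied[0]), min(beta[1], implied[1]))


def naive_join(s0, s1):
    """Join component-wise; a relation missing on one side is lost."""
    (beta0, iv0), (beta1, iv1) = s0, s1
    beta = None if beta0 is None or beta1 is None else join_interval(beta0, beta1)
    common = iv0.keys() & iv1.keys()
    return beta, {v: join_interval(iv0[v], iv1[v]) for v in common}


def subpoly_join(coeffs, s0, s1):
    """Reduce both states first, then join pair-wise (step 3)."""
    beta = join_interval(reduce_slack(coeffs, s0), reduce_slack(coeffs, s1))
    _, intervals = naive_join(s0, s1)
    return beta, intervals


X_MINUS_Y = {"x": 1, "y": -1}
after_assign = (None, {"x": (0, 0), "y": (1, 1)})   # x = 0; y = 1
after_leq = ((-math.inf, 0), {})                    # assume x <= y
after_eq = ((0, 0), {})                             # assume x == y

print(naive_join(after_leq, after_assign))              # relation lost
print(subpoly_join(X_MINUS_Y, after_leq, after_assign)) # beta in [-inf, 0]
print(subpoly_join(X_MINUS_Y, after_eq, after_assign))  # beta in [-1, 0]
```

In both cases the joined β has an upper bound of 0, which is what the final `assert x <= y` needs.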

Critical operation: Reduction

Infer tightest bounds
  Instance of a Linear programming problem
  Solution in polynomial time
Drawbacks:
  Numerical instability, Rounding errors
  Simplex too slow for our purposes
Basis exploration (new)
  Based on static basis exploration
  Less concerned about numerical instability
  Abstract when an error is detected
    E.g. In a row operation, delete the row

To sum up on Subpolyhedra

Infer arbitrary linear inequalities
  Scales to hundreds of variables
  Precisely propagate linear inequalities
  Give up some of the inference power
Family of abstract domains
  Two precision axes
    Hints: Tune the inference power at join points
    Reduction: Infer the tightest intervals

2. Abstract domain for array content inference

Inferring array contents…

public void Init(int N)
{
    Contract.Requires(N > 0);

    int[] a = new int[N];
    int i = 0;

    while (i < N)
    {
        a[i] = 222;
        i = i + 1;
    }

    Contract.Assert(∀ k ∈ [0, N). a[k] == 222);
}

At the loop head:
  If i == 0 then a not initialized
  else if i > 0 then a[0] == … == a[i-1] == 222
  else impossible

Challenge 1: Effective handling of disjunction
Challenge 2: No overapproximation (can be unsound)
  (no hole, all the elements are initialized)
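The disjunctive loop invariant can be checked dynamically on a Python transliteration of `Init`. This is a runtime check on concrete runs, written only to make the invariant concrete. It is not the static inference that Clousot performs.

```python
def init(n):
    assert n > 0                # Contract.Requires(N > 0)
    a = [0] * n                 # new int[N] starts zero-filled
    i = 0
    while True:
        # Loop-head invariant: a[0..i) holds 222, a[i..n) is still untouched.
        assert i >= 0
        assert all(a[k] == 222 for k in range(i))
        assert all(a[k] == 0 for k in range(i, n))
        if not i < n:
            break
        a[i] = 222
        i = i + 1
    # At exit i == n, so the initialized prefix covers the whole array.
    assert all(a[k] == 222 for k in range(n))
    return a


print(init(3))
```

When i == 0 the first `all` ranges over nothing, which is the "a not initialized" case of the disjunction. There is no hole because the two ranges always cover [0, n) together.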

Our idea

Precise and very very fast!
Basis: Array segments

  {0} [222, 222] {i, k}? [0, 0] {N}

  Segment bounds: {0}, {i, k}, {N}
  Uniform content abstraction: [222, 222], [0, 0]
  ?: the segment before the bound may be empty (disjunction)
  0 ≤ i, 0 ≤ k    i == k    i < N, k < N

Example

Contract.Requires(N > 0);
int[] a = new int[N];
int i = 0;

Loop body:
  assume i < N
  a[i] = 222;
  j = i+1;
  i -> _   j -> i   N -> N

Loop exit:
  assume i ≥ N

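Segmentations ({bounds} content {bounds}, ? = possibly empty segment):
  After int[] a = new int[N]:   {0} 0 {N}
  After int i = 0:              {0,i} 0 {N}
  After assume i < N:           {0,i} 0 {N}
  After a[i] = 222:             {0,i} 222 {1,i+1} 0 {N}?
  After j = i+1:                {0,i} 222 {1,i+1,j} 0 {N}?
  After the renaming:           {0} 222 {1,i} 0 {N}?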

Segment unification
  {0,i} 0 {N}          {0} 222 {1,i} 0 {N}?
  {0} ⊥ {i}? 0 {N}     {0} 222 {1,i} 0 {N}?
  {0} ⊥ {i}? 0 {N}     {0} 222 {i} 0 {N}?

Join
  {0} 222 {i}? 0 {N}?

Can be empty segments! (Disjunction)

Example (continued)

(Figure: the same loop, re-analyzed from the joined loop-head segmentation {0} 222 {i}? 0 {N}?)

And so on up to a fixpoint

At loop exit: {0} 222 {i, N}

Remove doubts (i == N && N > 0)

We visited all the elements in [0, N)
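A sketch of how a segmentation can be represented and queried, under an assumed encoding: bounds are sets of symbolic expressions given as strings, each segment carries one content interval and a "?" flag, and the hypothetical method `proves_all_equal` checks the final assertion. Assignment, unification and join are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    content: tuple              # interval (lo, hi) shared by all its elements
    upper: frozenset            # expressions equal to the segment's upper bound
    may_be_empty: bool = False  # the "?" mark


@dataclass(frozen=True)
class Segmentation:
    lower: frozenset            # expressions equal to the first bound
    segments: tuple             # consecutive segments, left to right

    def proves_all_equal(self, lo, hi, value):
        """True if every element in [lo, hi) is known to equal `value`."""
        covers = lo in self.lower and hi in self.segments[-1].upper
        # A possibly empty segment is harmless: it contributes no element
        # when it is empty, and only `value` elements when it is not.
        return covers and all(s.content == (value, value) for s in self.segments)


# Loop head after the first join: {0} 222 {i}? 0 {N}?
loop_head = Segmentation(
    frozenset({"0"}),
    (
        Segment((222, 222), frozenset({"i"}), may_be_empty=True),
        Segment((0, 0), frozenset({"N"}), may_be_empty=True),
    ),
)

# Loop exit, where i == N and N > 0: {0} 222 {i, N}
loop_exit = Segmentation(
    frozenset({"0"}),
    (Segment((222, 222), frozenset({"i", "N"})),),
)

print(loop_head.proves_all_equal("0", "N", 222))
print(loop_exit.proves_all_equal("0", "N", 222))
```

Only the exit segmentation discharges the assertion, because a single 222 segment spans from 0 to N.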

Other…

Intra-modular inference
  Pre/Post/Object invariants
  Reduce annotation burden
  Can make the analysis brittle
  Serialize to C#
Backward analysis for disjunctions
Safe floating points in parameters
Selective verification
Ranking of warnings
…

TODO

Collections
  Experimenting with the handling of arrays
  Extend to iterators, List<T> …
Strings
  Need good domains to approximate strings
Modular overflow checking
Combine with automatic test generation
  PEX
Make Clousot parallel
…

Conclusions

Programmers are willing to write annotations
  SAL, ESP … at Microsoft, CodeContracts Forum, PDC …
We should provide valuable tools
  Automatic, predictable, fast!!!!
Clousot is a step in that direction
  Download it today at: http://msdn.microsoft.com/en-us/devlabs/
  (Academic and Commercial license)

Thanks!!!!

Example: Join Step 5

assume y >= 0;
x = 0;
while x < y
  x++;
assert x == y;

States:
  ⟨T, x ∈ [0,1] ⋀ y ∈ [0,+∞]⟩
  ⟨T, x ∈ [0,0] ⋀ y ∈ [0,+∞]⟩
  ⟨T, x ∈ [0,0] ⋀ y ∈ [0,+∞]⟩
  ⟨T, x ∈ [0,0] ⋀ y ∈ [1,+∞]⟩
  ⟨T, x ∈ [1,1] ⋀ y ∈ [1,+∞]⟩
  ⟨T, x ∈ [0,0] ⋀ y ∈ [0,0]⟩
  ⟨x - y == β’, x ∈ [0,1] ⋀ y ∈ [0,1] ⋀ β’ ∈ [0,0]⟩
  ⟨x - y == β, x ∈ [0,1] ⋀ y ∈ [0,1] ⋀ β ∈ [0,+∞]⟩
  ⟨x - y == β’, x ∈ [0,1] ⋀ y ∈ [0,+∞] ⋀ β’ ∈ [-∞,0]⟩