
Abstract domains for the static analysis

of programs manipulating complex data-structures

Xavier Rival

INRIA

August, 26th. 2011


Introduction

Software verification

My research topic: software verification

Safety (absence of runtime errors, undesired behaviors), functional properties...

Those are all undecidable properties

Abstract interpretation approach: sound, automatic but incomplete

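Abstraction:

use of conservative approximations

[Figure: abstraction function α, mapping a set of concrete points to an abstract element]

Abstract transfer functions,

preserving conservative approximations

[Figure: a concrete function f and its abstract counterpart f♯]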

Widening: terminating over-approximation of concrete join

Abstract domain: abstraction + transfer function + widening
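The three ingredients named above can be illustrated on the simplest numerical domain. The following is a minimal sketch, not taken from the talk: the names `interval`, `itv_join`, `itv_add_const`, `itv_widen` and `analyze_counter_loop` are hypothetical, and `LONG_MIN` / `LONG_MAX` are assumed to stand for infinite bounds. It analyzes the loop `x = 0; while (...) x = x + 1;`.

```c
#include <assert.h>
#include <limits.h>

/* An interval [lo, hi]; LONG_MIN / LONG_MAX stand for -infinity / +infinity
   (an assumption of this sketch). */
typedef struct { long lo, hi; } interval;

/* Abstraction of a single concrete value. */
interval itv_const(long v) { return (interval){ v, v }; }

/* Join: smallest interval containing both arguments. */
interval itv_join(interval a, interval b) {
    return (interval){ a.lo < b.lo ? a.lo : b.lo, a.hi > b.hi ? a.hi : b.hi };
}

/* Addition that keeps infinite bounds infinite and saturates on overflow. */
long sat_add(long x, long k) {
    if (x == LONG_MIN || x == LONG_MAX) return x;
    if (k > 0 && x > LONG_MAX - k) return LONG_MAX;
    if (k < 0 && x < LONG_MIN - k) return LONG_MIN;
    return x + k;
}

/* Abstract transfer function for the assignment "x = x + k". */
interval itv_add_const(interval a, long k) {
    return (interval){ sat_add(a.lo, k), sat_add(a.hi, k) };
}

/* Widening: any bound that is not stable jumps to infinity. */
interval itv_widen(interval old, interval cur) {
    return (interval){ cur.lo < old.lo ? LONG_MIN : old.lo,
                       cur.hi > old.hi ? LONG_MAX : old.hi };
}

/* Invariant at the head of "x = 0; while (...) x = x + 1;". */
interval analyze_counter_loop(void) {
    interval inv = itv_const(0);
    for (;;) {
        interval body = itv_add_const(inv, 1);
        interval next = itv_widen(inv, itv_join(inv, body));
        if (next.lo == inv.lo && next.hi == inv.hi) return inv;
        inv = next;
    }
}
```

Without the widening, the iteration would go through [0, 0], [0, 1], [0, 2], ... and never stabilize; with it, the second iterate is already stable.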


Verification of numerical properties

Numerical abstraction

Abstraction of P(X → Z) or P(X → F)

Transfer functions: assignments, numerical conditions...

A great variety of abstractions available:

◮ domain of intervals
◮ octagons
◮ convex polyhedra

[Figures: a set of points abstracted by an interval box, by an octagon, and by a convex polyhedron]

Available as libraries of domains: Apron...

Astrée analyzer: verification of highly critical software: avionics

◮ up to 1 million lines C applications
◮ modular numerical abstract domain: reduced product of simpler abstractions, tailored to applications


Memory abstraction difficulties

Memory abstraction

Abstraction of P( Environment × Stack × Heap )

Operations to analyze: C statements, including pointer arithmetic, memory management...

Operations to verify: memory safety, structure preservation...

Difficulties:

1 Complex concrete semantics: memory states, pointers...

2 Huge variety of properties to abstract:

◮ linked structures: lists, trees...
◮ arrays
◮ structures involving relations (sorted lists, balanced trees) or sharing
◮ combination of numerical and memory properties

3 Expensive analysis algorithms to infer such abstractions


...


Main contributions

Long term goal

Set up a general purpose library of memory abstract domains

1 Choice of a concrete semantics

should not limit the scope of the analysis

2 Abstraction based on inductive structural properties

expresses a wide range of properties, incl. shape + numerical

3 Static analysis algorithms to infer complex properties

instantiations for low level programs and recursive procedures

prototype implementation

Collaboration with Bor-Yuh Evan Chang (U. of Colorado, USA)

Internships of Vincent Laviron, Suzanne Renard and Antoine Toubhans


A parametric abstraction

Outline

1 Introduction

2 A parametric abstraction

3 Static analysis algorithms

4 Instantiation to the analysis of low-level programs

5 Instantiation to interprocedural analysis

6 Perspectives and implementation


Overview of the abstraction

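Memory partitioning into regions

[Figure: concrete memory cells of a linked list reachable from &t and terminated by 0x0]

Graph abstraction:

◮ values, addresses −→ nodes
◮ cells −→ edges

[Figure: shape graph rooted at &t, with a next edge and a data edge for each of the three list elements]

Region summarization:

[Figure: shape graph rooted at &t, with the next and data edges of the first element followed by a list summary edge]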

◮ abstraction parameterized by a set of inductive definitions

Defines a concretization relation


Concrete semantics

[Figure: a stack holding the cells of x, y, z (sizes nx, ny, nz) and a heap divided into allocated blocks (sizes n0, n1) and free blocks]

A very concrete concrete semantics:

division of the heap in blocks, either allocated or free

memory cells have a numeric address and a numeric size

pointers are numeric values: base + offset


Contiguous regions

Shape graphs

Edges: denote memory regions

Nodes: denote values, i.e. addresses or cell contents

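Points-to edges denote contiguous memory regions

Notation: α · f ↦ β

Abstract and concrete views:

[Figure: abstract edge labelled f from node α to node β; concrete cell at address ν(α) + offset(f) containing ν(β)]

Concretization:

\[ \gamma_S(\alpha \cdot f \mapsto \beta) = \{ ([\nu(\alpha) + \mathrm{offset}(f) \mapsto \nu(\beta)],\ \nu) \mid \nu : \{\alpha, \beta, \ldots\} \to \mathbb{N} \} \]

◮ ν: bridge between memory and values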


Separating conjunction

Shape graphs and separation

Distinct edges denote disjoint heap regions: ∗ conjunction

Advantage: allows local reasoning

e.g., easier analysis of updates

Disadvantage: maintaining separation makes some analysis operations harder to design

We use a field splitting model

i.e., separation impacts edges / fields, not pointers

e.g., α · f ↦ β0 ∗ α · g ↦ β1 stands for stores where the cells at ν(α) + offset(f) and ν(α) + offset(g) are disjoint, whether the two stored values differ or ν(β0) = ν(β1)

[Figure: two concrete stores for this graph, one with distinct values ν(β0) and ν(β1), one with ν(β0) = ν(β1)]


Inductive structures I: need for summarization

Infinitely many concrete states

A global structure invariant: x points to a list

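Concrete: all lists pointed to by x

[Figure: several concrete stores in which &x points to a list of some length terminated by 0x0]

Abstraction:

x ↦ α ∗ α · list

Graph:

[Figure: node &x with a points-to edge to node α, which carries a list inductive edge]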


Inductive structures II: inductive definitions in separation logic

List definition

\[ \alpha \cdot \mathbf{list} ::= (\mathbf{emp},\ \alpha = 0) \;\mid\; (\alpha \cdot \mathtt{next} \mapsto \beta_0 \ast \alpha \cdot \mathtt{data} \mapsto \beta_1 \ast \beta_0 \cdot \mathbf{list},\ \alpha \neq 0) \]

where emp denotes the empty heap

List structures abstracted by α · list

Summarization: a finite formula describes infinitely many heaps

Practical implementation in verification/analysis tools

Verification: hand-written definitions

Analysis: either built-in (Smallfoot, Space Invader) or user-supplied (TVLA, Xisa, [POPL’08]), or partly inferred (Xisa, [POPL’11])
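To make the two rules of the list definition concrete, here is a small sketch of a checker that follows them on an actual C store. It is an illustration only: `struct cell`, `is_list` and the `fuel` bound are hypothetical names, and the bound is an assumption added so that the check terminates on cyclic stores, which the separating conjunction rules out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Concrete list cell with the two fields named in the definition. */
struct cell { struct cell *next; int data; };

/* Returns true if at most `fuel` unfoldings of the list definition
   describe the store reachable from a. */
bool is_list(const struct cell *a, size_t fuel) {
    if (a == NULL) return true;          /* rule (emp, alpha = 0) */
    if (fuel == 0) return false;         /* budget exhausted: reject */
    return is_list(a->next, fuel - 1);   /* rule with alpha != 0: one cell, then a list */
}
```

Each recursive call corresponds to one application of the second rule, so a store with n cells is accepted after n unfoldings followed by the empty rule.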


Inductive structures III: concretization

Concretization as a least fixpoint

Given an inductive definition ι

\[ \gamma_S(\alpha \cdot \iota) = \bigcup \{ \gamma_S(F) \mid \alpha \cdot \iota \xrightarrow{U} F \} \]

Alternate approach: index inductive applications with induction depth

allows reasoning about the length of structures
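As a worked instance, and assuming the arrow labelled U denotes one unfolding step, instantiating the equation with the list definition gives a recursive equation with one term per rule:

```latex
\begin{aligned}
\gamma_S(\alpha \cdot \mathbf{list})
  ={}& \gamma_S(\mathbf{emp},\ \alpha = 0) \\
  &\cup\ \gamma_S(\alpha \cdot \mathtt{next} \mapsto \beta_0 \ast \alpha \cdot \mathtt{data} \mapsto \beta_1 \ast \beta_0 \cdot \mathbf{list},\ \alpha \neq 0)
\end{aligned}
```

The second term mentions β0 · list again, so the equation is solved as a least fixpoint: iterating from the empty set adds, at step n, the stores made of lists with fewer than n cells.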


Inductive structures IV: a few instances

More complex shapes: trees

α · tree  −U→  α · left ↦ β0 ∗ α · right ↦ β1 ∗ β0 · tree ∗ β1 · tree

Relations among pointers: doubly-linked lists

α · dll(δ)  −U→  α · next ↦ β ∗ α · prev ↦ δ ∗ β · dll(α)

Relations between pointers and numerical: sorted lists

α · lsort(δ)  −U→  α · next ↦ β0 ∗ α · data ↦ β1 ∗ β0 · lsort(β1), δ ≤ β1

All together: red-black trees with parent pointers [POPL’08]
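The parameters of dll and lsort can be read as extra arguments threaded through the recursion. The sketch below checks both predicates on a concrete store. It is illustrative only: `struct dcell`, `is_dll`, `is_lsort` and the `fuel` bound are hypothetical, and the empty case (null pointer accepted) is assumed by analogy with the list definition, since only the non-empty rule is shown above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct dcell { struct dcell *next; struct dcell *prev; int data; };

/* alpha . dll(delta): the prev field of the cell at alpha holds delta,
   and the tail is a dll whose parameter is alpha itself. */
bool is_dll(const struct dcell *a, const struct dcell *delta, size_t fuel) {
    if (a == NULL) return true;                    /* assumed empty rule */
    if (fuel == 0 || a->prev != delta) return false;
    return is_dll(a->next, a, fuel - 1);
}

/* alpha . lsort(delta): the data value is at least delta,
   and becomes the bound for the tail. */
bool is_lsort(const struct dcell *a, int delta, size_t fuel) {
    if (a == NULL) return true;                    /* assumed empty rule */
    if (fuel == 0 || delta > a->data) return false;
    return is_lsort(a->next, a->data, fuel - 1);
}
```

The pointer parameter of `is_dll` expresses a relation among addresses, while the integer parameter of `is_lsort` expresses a relation between the shape and a numerical value, which is the part handled by the numerical domain.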


Inductive segments

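A frequent pattern

[Figure: a list starting at &x and ending with 0x0, with &y pointing to an inner cell at address ν(π)]

Could be expressed directly as an inductive with a parameter:

\[ \alpha \cdot \mathbf{list\_endp}(\pi) ::= (\mathbf{emp},\ \alpha = \pi) \;\mid\; (\alpha \cdot \mathtt{next} \mapsto \beta_0 \ast \alpha \cdot \mathtt{data} \mapsto \beta_1 \ast \beta_0 \cdot \mathbf{list\_endp}(\pi),\ \alpha \neq 0) \]

This definition would derive from list

Thus, we make segments part of the fundamental predicates of the domain

[Figure: list segment edge from the node pointed to by &x to the node pointed to by &y, followed by a list edge]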

Multi-segments: possible, but harder for analysis
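The parameterized definition list_endp(π) given above differs from list only in its base case, which the following sketch makes visible. The names `struct cell`, `is_list_seg` and the `fuel` bound are hypothetical and serve illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct cell { struct cell *next; int data; };

/* Checks that following next fields from a reaches pi within `fuel` cells. */
bool is_list_seg(const struct cell *a, const struct cell *pi, size_t fuel) {
    if (a == pi) return true;                  /* rule (emp, alpha = pi) */
    if (a == NULL || fuel == 0) return false;  /* second rule requires alpha != 0 */
    return is_list_seg(a->next, pi, fuel - 1);
}
```

Calling `is_list_seg(a, NULL, fuel)` gives back the plain list predicate, which is one way to read the remark that the segment definition derives from list.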


Product abstraction

How to abstract both memory and contents properties?

We need a kind of product abstraction

Example: partly unfolded sorted list

[Figure: a cell with next and data fields, data node β0, followed by a second cell with next and data fields, data node β1, then an lsort(β1) edge]

Numerical domain variables: (some) nodes

Cofibered domain structure [Venet, 1996]

◮ a lattice D♯_S of shape graphs
◮ for each shape graph S ∈ D♯_S, a distinct numerical lattice D♯_N⟨S⟩
◮ abstract values are pairs (S, N) where N ∈ D♯_N⟨S⟩
◮ analysis should use conversion functions

[Figure: shape graphs S0, S1, S2, each with its own numerical lattice D♯_N⟨S0⟩, D♯_N⟨S1⟩, D♯_N⟨S2⟩]

Static analysis algorithms

Static analysis overview

A list insertion function:

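list * l;   // assumed to point to a list
list * t;   // assumed to point to a list element
list * c = l;
while(c != NULL && c -> next != NULL && (. . .)){
    c = c -> next;
}
t -> next = c -> next;
c -> next = t;

list inductive structure def.

Abstract precondition:

[Figure: &l points to a list; &t points to a cell with next and data fields; &c is also shown]

Result of the (interprocedural) analysis

Over-approximations of reachable concrete states

e.g., at the loop exit:

[Figure: &l points to a list segment ending at the cell pointed to by &c, which has next and data fields and is followed by a list; &t points to a cell with next and data fields]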
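For readers who want to run the insertion function, here is a self-contained sketch. The third loop condition is elided on the slide; the comparison used below (keep the list sorted by `data`) is purely an assumption for illustration, as are the `list` type layout and the function name `insert`.

```c
#include <assert.h>
#include <stddef.h>

typedef struct list { struct list *next; int data; } list;

/* Inserts the element t into the list l, after the cell where the
   traversal stops. */
void insert(list *l, list *t) {
    list *c = l;
    /* Third conjunct: hypothetical stand-in for the elided condition. */
    while (c != NULL && c->next != NULL && c->next->data < t->data) {
        c = c->next;
    }
    /* As on the slide, c is dereferenced here, so l must not be NULL. */
    t->next = c->next;
    c->next = t;
}
```

The two final assignments only touch the cell pointed to by c and the cell pointed to by t, which is why the analysis only needs those two cells materialized at the loop exit.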


The algorithms for static analysis

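Unfolding: case analysis on summaries

[Figure: a list segment from x to y followed by a list edge at y unfolds into a disjunction: either the segment, a materialized cell at y (next and data fields) and a list edge, or the segment with y = 0x0]

Abstract postconditions, on “exact” regions, e.g. insertion

[Figure: shape graph over x and y with list edges and two materialized cells (next and data fields), before and after the insertion]

Widening: builds summaries and ensures termination

[Figure: (list segment from x to y, list edge at y) ▽ (list segment from x, one materialized cell with next and data fields, y, list edge) =⇒ (list segment from x to y, list edge at y)]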


Unfolding as a local case analysis

Unfolding principle

Case analysis, based on the inductive definition

Generates symbolic disjunctions

analysis performed in a disjunction domain

Example, for lists:

α · list  −U→  emp, α = 0

α · list  −U→  α · next ↦ α′ ∗ α · data ↦ β ∗ α′ · list, α ≠ 0

Numeric predicates: approximated in the numerical domain

Soundness: by definition of the concretization of inductive structures

\[ \gamma_S(S) \subseteq \bigcup \{ \gamma_S(S_0) \mid S \xrightarrow{U} S_0 \} \]
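The case analysis can be sketched on a toy graph representation. Everything below is hypothetical and much simpler than an actual shape domain: nodes are integers, the only inductive definition is list, and the numerical side is reduced to remembering one node known to be null and one node known to be non-null.

```c
#include <assert.h>
#include <stddef.h>

enum edge_kind { PT_NEXT, PT_DATA, IND_LIST };

/* dst is unused (-1) for inductive edges. */
struct edge { enum edge_kind kind; int src; int dst; };

#define MAX_EDGES 16

struct graph {
    struct edge edges[MAX_EDGES];
    size_t n_edges;
    int n_nodes;        /* nodes are numbered 0 .. n_nodes - 1 */
    int null_node;      /* node known to be equal to 0, or -1 */
    int nonnull_node;   /* node known to differ from 0, or -1 */
};

/* Removes edge i (the order of the remaining edges is not preserved). */
void remove_edge(struct graph *g, size_t i) {
    g->edges[i] = g->edges[g->n_edges - 1];
    g->n_edges--;
}

void add_edge(struct graph *g, enum edge_kind k, int src, int dst) {
    g->edges[g->n_edges++] = (struct edge){ k, src, dst };
}

/* Unfolds the list edge at index i into the two disjuncts of the
   definition. Returns 0 on success, -1 if i is not a list edge or the
   result would not fit. */
int unfold_list(const struct graph *g, size_t i,
                struct graph *empty, struct graph *nonempty) {
    if (i >= g->n_edges || g->edges[i].kind != IND_LIST) return -1;
    if (g->n_edges + 2 > MAX_EDGES) return -1;
    int a = g->edges[i].src;

    /* Rule (emp, alpha = 0): the edge disappears, alpha is null. */
    *empty = *g;
    remove_edge(empty, i);
    empty->null_node = a;

    /* Rule with alpha != 0: two points-to edges and a list edge on the tail. */
    *nonempty = *g;
    remove_edge(nonempty, i);
    int succ = nonempty->n_nodes++;
    int val = nonempty->n_nodes++;
    add_edge(nonempty, PT_NEXT, a, succ);
    add_edge(nonempty, PT_DATA, a, val);
    add_edge(nonempty, IND_LIST, succ, -1);
    nonempty->nonnull_node = a;
    return 0;
}
```

An analysis would then discard any disjunct whose numerical constraint contradicts what is already known about the node, for instance the empty disjunct when the node is known to be non-null.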


Local reasoning

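Before the assignment c = c -> next;:

[Figure: node α, pointed to by l and c, carrying a list edge; numerical invariant α ≠ 0]

1 Result of the unfolding: two rules to consider

◮ empty list does not need to be considered: contradiction with num. invariant α ≠ 0
◮ non-empty list case:

[Figure: node α, pointed to by l and c, with a next edge to α′ and a data edge to β; α′ carries a list edge]

2 Result of the assignment:

[Figure: same graph, with l pointing to α and c now pointing to α′]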

note: sound analysis of the assignment in itself is trivial (frame rule)


Unfolding and degenerated cases

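assume(l points to a dll)
c = l;
① while(c != NULL && condition)
    c = c -> next;
② if(c != 0 && c -> prev != 0)
    c = c -> prev -> prev;

at ①: [Figure: node α0, pointed to by l and c, carrying a dll edge]

at ②: [Figure: dll segment from α0 (pointed to by l) to α1 (pointed to by c), followed by a dll edge at α1]

⇒ non trivial unfolding

Materialization of c -> prev:

[Figure: shape graph over nodes α, α′, α′′ and β′, with dll edges dll(β) and dll(β′) and one materialized cell with next and prev fields]

Segment splitting lemma: basis for segment unfolding

a segment from α (inductive ι) to α′ (inductive ι′) of length i + j describes the same set of stores as a segment from α (ι) to α′′ (ι′′) of length i followed by a segment from α′′ (ι′′) to α′ (ι′) of length j

Materialization of c -> prev -> prev:

[Figure: shape graph over nodes α, α′, α′′, β′ and β′′, with dll edges dll(β), dll(β′), dll(β′′) and two materialized cells with next and prev fields]

Implementation issue: discover which inductive edge to unfold

not decidable!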


Widening I: need for a folding operation

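Back to the list traversal example...

assume(l points to a list)
c = l;
while(c != NULL){
    c = c -> next;
}

First iterates in the loop:

◮ at iteration 0 (before entering the loop):

[Figure: node α0, pointed to by l and c, carrying a list edge]

◮ at iteration 1:

[Figure: α0 (pointed to by l) with a next edge to α1 (pointed to by c) and a data edge to β1; list edge at α1]

◮ at iteration 2:

[Figure: α0 (pointed to by l), α1 and α2 (pointed to by c) chained by next edges, with data edges to β1 and β2; list edge at α2]

How to guarantee termination of the analysis?

How to introduce segment edges / perform abstraction?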


Widening II: algorithm overview

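Takes two abstract values as inputs

[Figure: left operand with nodes x, y and two list edges; ▽; right operand with nodes x, y, list edges and materialized cells with next and data fields]

Region matching (non unique choice: use of strategies)

[Figure: the same two operands, with matched regions highlighted]

Semantic rules for per region weakening

[Figure: a list edge ▽ a region made of a list edge and a cell with next and data fields = a list edge]

Widening:

[Figure: result with nodes x, y and two list edges]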


Widening III: an example of a widening meta-rule

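Segment introduction meta rule (for all ι)

if S_lft = emp, with β0 and β1 both matched to the same node α,

and S_rgh ⊑ (ι segment from β0 to β1),

then S_lft ▽ S_rgh = (ι segment from δ0 to δ1)

Application to list traversal, at the end of iteration 1:

◮ before iteration 0:

[Figure: node α0, pointed to by l and c, carrying a list edge]

◮ end of iteration 0:

[Figure: β0 (pointed to by l) with a next edge to β1 (pointed to by c) and a data edge to β2; list edge at β1]

◮ join, before iteration 1:

[Figure: list segment from δ0 (pointed to by l) to δ1 (pointed to by c), followed by a list edge at δ1]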


Widening IV: properties

Rewrite system properties

A set of sound pair-wise weakening rules

Weakening terminates: widening is computable

Weakening rules are not confluent: need for adequate strategies

otherwise: risk of very imprecise results

Widening properties

Sound: returns an over-approximation of concrete join

Terminating:

◮ introduction of segments / inductive edges consumes points-to edges
◮ after finitely many iterates, the size of graphs decreases

Soundness and termination also hold in the cofibered domain

assumption: use of a widening in the numeric domain


Widening and other folding operators

Canonicalization (TVLA, SmallFoot)

Another kind of widening operator:

Ignores its first argument: unary weakening operator

thus, does not depend on history

Returns values in a finite lattice

Applies not only for fixpoint computation: incremental weakening

Widening (Xisa) exploits the analysis history

◮ quick convergence, few disjuncts
◮ applies mainly on fixpoint computation

Ideally, a local abstraction would be useful together with history-dependent widening

◮ e.g., to weaken locally abstract elements
◮ no requirement to return in a finite lattice
◮ still benefit from widening increased precision at loop heads

Instantiation to the analysis of low-level programs

Issues inherent in the analysis of C programs

Toward the analysis of a more complex language

So far, we assumed a simple, Java-like memory model

Yet, we are interested in complex C applications...

An abstract syntax tree type

typedef struct arith {
    char op;
    union {
        struct { double v; } cst;
        struct {
            struct arith * l;
            struct arith * r;
        } bin;
    } n;
} arith;

[Figure: memory layouts of the two variants: op with n · cst · v, and op with n · bin · l and n · bin · r]

Nested aggregates

Size of fields to take into account

Existence of pointers to fields

Memory management

Vincent Laviron’s master internship, [ESOP’10]


Abstraction of contiguous regions (II)

Points-to edges are of the form α · f ↦ β

How to instantiate our framework / choose f?

typedef struct {
    int a;
    struct {
        int x;
        int y;
    } b;
} tt;

[Figure: three cells at fields a, b · x, b · y]

Offsets as sequences

o ::= ε        null offset
    | o · f    field dereference

int[]

[Figure: three cells at offsets 0, 4, 8]

Numeric offsets

o ∈ N    numerical value

Only offsets need be modified (instead of field names)

Changes impact points-to edges only

Inductive structures and analysis algorithms are unchanged
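The two offset models can be related in plain C: a field path such as b · y is a symbolic offset, and the standard `offsetof` macro gives the matching numeric offset. The sketch below is illustrative; `read_int_at` and `offset_b_y` are hypothetical helpers, and the nested member designator `b.y` in `offsetof` is accepted by common C compilers such as GCC and Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    int a;
    struct {
        int x;
        int y;
    } b;
} tt;

/* Numeric offset corresponding to the field sequence b . y. */
size_t offset_b_y(void) { return offsetof(tt, b.y); }

/* Reads the int stored in the cell at address base + off. */
int read_int_at(const void *base, size_t off) {
    int v;
    memcpy(&v, (const unsigned char *)base + off, sizeof v);
    return v;
}
```

With numeric offsets the same `read_int_at` also covers array cells, for example offset `2 * sizeof(int)` for the third element of an `int[]`, which is why only the offset component of points-to edges has to change between the two models.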


Pointer models

How to model pointers to fields precisely and concisely?

Solution 1 (poor): rely on the numerical domain for offset relations

Our solution: label edge destinations with offsets

α · f ↦ β · g

[Figure: concrete cell at ν(α) + offset(f) containing the address ν(β) + offset(g); abstract edge from α to β labelled with source offset f and destination offset g]

Similar techniques used by Kreiker for an analysis based on TVLA

Classification, with Pascal Sotin and Bertrand Jeannet [NSAD’10]

Two independent choices, four models:

◮ numeric or symbolic offsets
◮ pointers to fields allowed/disallowed

Xisa allows all four models


Abstracting multiple views

Analysis of unions/pointer casts in real applications

Distinct access may not invalidate other views in user semantics

How to maintain several views on a memory chunk?

struct {
    char c;
    short s;
} a;

[Figure: memory cells for a · c and a · s]

struct {
    int i;
} b;

[Figure: memory cell for b · i]

Miné [LCTES’06], Astrée: analysis maintains dynamic sets of views

Extends to our abstraction: local non separating conjunction

i.e., ∧ applies only to points-to edges

Limited fragment of separation logic with inductive structures

◮ local conjunctions preserve analysis efficiency

[Figure: structure of formulas, combining points-to edges (pt), inductive edges (ind) and local conjunctions (∧) of points-to edges]


Dynamic memory management

How to verify the safety of free(p)?

Only base addresses of allocated regions can be freed safely

Concrete states:

[Figure: allocation table with entries (x0, n0) and (x1, n1); heap with allocated blocks at x0 (size n0) and x1 (size n1), and free blocks]

Static analysis:

Abstract the allocation table

Track the location of nodes representing valid addresses

Track the size and base address of heap allocated regions

Inductive structures also need to summarize such properties

Benefit from the very concrete base semantics
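The concrete property being verified can be sketched with an explicit allocation table. The representation below is hypothetical (`struct block`, `free_is_safe`, `is_valid_address` are illustrative names) and only shows the difference between an address that is valid and an address that may be freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of the allocation table: base address and size of a block. */
struct block { uintptr_t base; size_t size; bool allocated; };

/* free(p) is safe only if p is exactly the base address of an allocated block. */
bool free_is_safe(const struct block *table, size_t n, uintptr_t p) {
    for (size_t i = 0; i < n; i++) {
        if (table[i].allocated && table[i].base == p) return true;
    }
    return false;
}

/* p is a valid address if it lies inside some allocated block. */
bool is_valid_address(const struct block *table, size_t n, uintptr_t p) {
    for (size_t i = 0; i < n; i++) {
        if (table[i].allocated && p >= table[i].base
            && p - table[i].base < table[i].size) return true;
    }
    return false;
}
```

A pointer to a field in the middle of a block is valid for reading but not for `free`, so an analysis has to track base addresses and sizes, not just validity.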

Instantiation to interprocedural analysis

Approaches to interprocedural analysis

“relational” approach                      “inlining” approach

analyze each definition                    analyze each call
abstracts P(S̄ → S̄)                         abstracts P(S)
+ modularity                               - not modular
+ reuse of invariants                      - re-analysis in different contexts
- deals with state relations               + deals with states
- complex higher order iteration strategy  + straightforward iteration
challenge: frame problem                   challenge: unbounded calls


Challenges in interprocedural analysis

void main(){
    dll * l; // assume l points to a sll
    l = fix(l, NULL);
}
dll * fix(dll * c, dll * p){
    dll * ret;
    if(c != NULL){
        c -> prev = p;
◮       c -> next = fix(c -> next, c);
        if(check(c -> data)){
            ret = c -> next;
            remove(c);
⊲       } else ret = c;
    }
    return ret;
}

fix:

turns a linked list into a doubly linked list

removes some elements

[Figure: two concrete call stacks, each made of main (variable l) and three fix activation records (ret, c, p) linked by frame pointers fp, shown with a heap holding the values 3, 8, 11, 2]

Heap is unbounded, needs abstraction (shape analysis)

But stack may also grow unbounded, needs abstraction

Complex relations between both stack and heap
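The recursive function fix shown above can be exercised with a small self-contained sketch. The helpers are stand-ins, since the slide does not define them: `check` is assumed here to select negative values, and `remove_cell` (renamed to avoid the standard `remove` from `<stdio.h>`) is assumed to unlink a cell from its neighbours. `ret` is also initialised to NULL so that the empty case returns a defined value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct dll { struct dll *next; struct dll *prev; int data; } dll;

/* Hypothetical stand-in for check(): selects negative values for removal. */
bool check(int data) { return data < 0; }

/* Hypothetical stand-in for remove(): unlinks c from its neighbours. */
void remove_cell(dll *c) {
    if (c->prev != NULL) c->prev->next = c->next;
    if (c->next != NULL) c->next->prev = c->prev;
    c->next = NULL;
    c->prev = NULL;
}

/* Sets the prev pointers along the list starting at c (p is the
   predecessor) and drops the selected cells; returns the new head. */
dll *fix(dll *c, dll *p) {
    dll *ret = NULL;
    if (c != NULL) {
        c->prev = p;
        c->next = fix(c->next, c);
        if (check(c->data)) {
            ret = c->next;
            remove_cell(c);
        } else {
            ret = c;
        }
    }
    return ret;
}
```

Each pending recursive call holds pointers c and p into the part of the list that has already been given prev pointers, which is the stack/heap relation the analysis has to summarize.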


Calling contexts as shape graphs

[Figure: concrete call stack (activation records of main and of three nested calls to fix, with fields ret, c, p and frame pointers fp) and the corresponding shape graph, shown next to the heap holding the list l (next/prev edges, ending at 0x0).]

Concrete assembly call stack modelled in a separating shape graph together with the heap

◮ one node per activation record address
◮ explicit edges for frame pointers
◮ local variables turn into activation record fields
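The three modelling choices above can be pictured with a small sketch. It is only an illustration under assumptions: the `Node` class, the `build_stack_graph` helper, the `fn@depth` naming scheme and the heap node names `n0`, `n1` are hypothetical and do not come from the talk or from Xisa.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node of the shape graph: here, an activation record."""
    name: str
    fields: dict = field(default_factory=dict)  # field name -> target node name


def build_stack_graph(calls, locals_of):
    """Build one node per activation record, chained by explicit 'fp' edges.

    calls     -- function names, from the oldest call to the most recent one
    locals_of -- one dict per call: local variable name -> target node name
    """
    graph = {}
    previous = None
    for depth, (fn, local_vars) in enumerate(zip(calls, locals_of)):
        name = f"{fn}@{depth}"
        record = Node(name)
        record.fields["fp"] = previous    # frame pointer edge to the caller's record
        record.fields.update(local_vars)  # locals become fields of the record
        graph[name] = record
        previous = name
    return graph


# main calls fix, which calls fix again; c and p point to (hypothetical) list nodes.
graph = build_stack_graph(
    ["main", "fix", "fix"],
    [{"l": "n0"}, {"c": "n0", "p": None}, {"c": "n1", "p": "n0"}],
)
```

Stack and heap then live in the same graph, so one abstraction can cover both.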


Instantiation to interprocedural analysis

Inference of a call-stack inductive structure

Second and third iterates: a repeating pattern

[Figure: two shape graphs, one with activation records main, fix, fix and one with main, fix, fix, fix; records are chained by fp edges and their p and c fields point into the list l (next/prev edges, ending at 0x0).]

Computing an inductive rule for summarization: subtraction

◮ subtract top-most activation record
◮ subtract common stack region
◮ gather relations with next activation records: additional parameters
◮ collect numerical constraints

Inferred inductive rule

[Figure: the rule unfolds (U−→) the summary stk(β1, β2) for context fix::ctx into one fix activation record plus the summary stk(β0, β1) for context ctx; nodes β0, β1, β2; edges c, p, prev, next.]
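A toy sketch of the first two subtraction steps, assuming each iterate is simplified to a plain list of activation records (oldest first). The `subtract` function and the tuple encoding of records are hypothetical; the parameters β and the numerical constraints of the last two steps are not modelled.

```python
def subtract(prev_iter, next_iter):
    """Return the stack fragment present in next_iter but not in prev_iter.

    Each iterate is a list of (function, field names) pairs, oldest first.
    """
    # Step 1: subtract the top-most activation record of each iterate.
    prev_body, next_body = prev_iter[:-1], next_iter[:-1]
    # Step 2: subtract the common stack region (shared prefix from the bottom).
    common = 0
    while (common < len(prev_body) and common < len(next_body)
           and prev_body[common] == next_body[common]):
        common += 1
    return next_body[common:]


MAIN = ("main", ("l",))
FIX = ("fix", ("c", "p"))

shorter = [MAIN, FIX, FIX]
longer = [MAIN, FIX, FIX, FIX]
pattern = subtract(shorter, longer)  # the repeating fragment: one fix record
```

The remaining fragment is the candidate body of the inductive rule: one fix record, followed by a recursive occurrence of the summary.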


Instantiation to interprocedural analysis

Inference of a call-stack summary: widening iterates

Fixpoint at function entry:

first iterate: [Figure: shape graph with activation records main, fix, fix chained by fp edges; the c and p fields point into the list l (next/prev edges, ending at 0x0).]

second iterate: [Figure: same shape graph with one more fix activation record.]

widened iterate: [Figure: shape graph with main, two fix activation records and the stack summaries stk(β0, β1) and stk(β2, β3) (fix⋆), over nodes β0, β1, β2, β3; the c and p fields point into the list l.]

Fixpoint reached

Fixpoint upon function return:

◮ function return involves unfolding of stack summaries
◮ simpler widening sequence: no rule to infer
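The widening sequence and the unfolding on return can be mimicked on a deliberately simplified model, assuming an abstract stack is a list of frame names and `fix*` stands for a folded summary of zero or more fix records. The names `SUMMARY`, `widen` and `unfold_on_return` are hypothetical; the actual domain works on shape graphs with parameters, not on strings.

```python
SUMMARY = "fix*"  # hypothetical marker: folded sequence of zero or more fix records


def widen(older, newer):
    """Fold what `newer` adds below its top record into SUMMARY.

    Assumption: both stacks share their bottom and end with the current record.
    """
    if newer == older:
        return older
    bottom = []
    for a, b in zip(older[:-1], newer[:-1]):
        if a != b:
            break
        bottom.append(a)
    if bottom and bottom[-1] == SUMMARY:
        return bottom + [newer[-1]]       # growth absorbed by the existing summary
    return bottom + [SUMMARY, newer[-1]]  # introduce a summary for the growth


def unfold_on_return(stack):
    """Pop the top record; unfold SUMMARY if it becomes the top of the stack."""
    rest = stack[:-1]
    if rest and rest[-1] == SUMMARY:
        below = rest[:-1]
        # Case analysis: empty summary, or one more fix record above the summary.
        return [below, below + [SUMMARY, "fix"]]
    return [rest]


first = ["main", "fix", "fix"]
second = ["main", "fix", "fix", "fix"]
widened = widen(first, second)
stable = widen(widened, widened + ["fix"])  # one more call after widening
cases = unfold_on_return(widened)
```

In this model the second widening returns the same abstract stack, which is the "fixpoint reached" situation, and a return yields two cases to analyse.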


Perspectives and implementation

Outline

1 Introduction

2 A parametric abstraction

3 Static analysis algorithms

4 Instantiation to the analysis of low-level programs

5 Instantiation to interprocedural analysis

6 Perspectives and implementation


Perspectives and implementation

Foundations of our analysis

Concrete level choices have a huge impact on the analysis

Components that do not actively contribute to abstraction (summarization)

◮ separation: allows simpler analysis algorithms
◮ contiguous regions: very concrete model

High expressive power of inductive structures

◮ wide range of structures with or without pointer/numerical relations
◮ though, not adapted for everything: arrays, graphs... (more on this later)

Induction in the abstraction and in the analysis

◮ unfolding (aka focus, local abstraction): case analysis
◮ widening (aka folding): reverse operation and support for induction
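Unfolding and folding can be illustrated on a singly-linked list summary. This is a minimal sketch, assuming an abstract heap is a set of facts `("list", n)`, `("null", n)` and `("next", n, m)`; this encoding and the functions `unfold` and `fold` are hypothetical, and numerical relations are left out.

```python
from itertools import count

fresh = count()  # supplies fresh symbolic node names n0, n1, ...


def unfold(heap, node):
    """Case analysis on list(node): the empty list, or one cell plus a shorter list."""
    assert ("list", node) in heap
    rest = heap - {("list", node)}
    tail = f"n{next(fresh)}"
    empty_case = rest | {("null", node)}
    cell_case = rest | {("next", node, tail), ("list", tail)}
    return [empty_case, cell_case]


def fold(heap, node):
    """Reverse operation: rewrite null or a materialized cell back to list(node)."""
    if ("null", node) in heap:
        return (heap - {("null", node)}) | {("list", node)}
    for fact in heap:
        if fact[0] == "next" and fact[1] == node and ("list", fact[2]) in heap:
            return (heap - {fact, ("list", fact[2])}) | {("list", node)}
    return heap  # nothing to fold at this node


summary = frozenset({("list", "x")})
cases = unfold(summary, "x")
refolded = [fold(case, "x") for case in cases]
```

Folding either case gives back the original summary, which is the sense in which widening reverses unfolding here.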


Perspectives and implementation

References

Abstraction: [SAS’07], [POPL’08]

Analysis algorithms: [SAS’07], [POPL’08]

C memory model: [ESOP’10], [NSAD’10]

Call stack abstraction: [POPL’11]


Perspectives and implementation

Implementation

Xisa prototype (eXtensible Inductive Shape Analyzer)

takes user-supplied inductive definitions [SAS’07,POPL’08,ESOP’10]

infers inductive definitions for call-stack [POPL’11]

Example                             size (LOC)  time (ms)  peak ∨  iters

list reverse                                19          7       1      3
list remove element                         27         16       4      6
list insertion sort                         56         21       4      7

dll copy                                    50         53       2      3
dll insert                                  40         38       2      4
binary search tree find                     23         10       2      4
binary search tree insert                  150         83       5      5

distribution over arithmetic trees          41        144      14      2
move negations up in arith. trees          120        488      38      2

scull driver                               894       9710      16      4


Perspectives and implementation

Inferring appropriate inductive definitions

Choice of inductive definitions

User-supplied in Xisa

Ideally: it should be automatic

Solution 1: scan type definitions and assume no sharing

◮ e.g., type definition t, with one pointer to type t: assumed list...
◮ sound yet imprecise; the analysis will fail in presence of sharing

Solution 2: infer inductive definitions

◮ generalize the call-stack inference algorithm
◮ should apply well to constructor code, e.g. in object-oriented programs
◮ instance of a cofibered abstract domain [Venet, SAS, 1996]: two levels, one for inductive rules, and one for numerical values
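Solution 1 can be sketched as a scan over type definitions. This is a rough illustration, not Xisa behaviour: the function `guess_inductive_shape`, the dictionary encoding of struct types and the "binary tree" guess for two self-pointers are assumptions; only the one-pointer-means-list case comes from the slide.

```python
def guess_inductive_shape(type_defs):
    """Map each struct name to a guessed shape, assuming no sharing.

    type_defs -- struct name -> {field name: field type}, where a pointer to
                 struct t is written "struct t *" (simplified encoding).
    """
    guesses = {}
    for name, fields in type_defs.items():
        self_pointers = [f for f, ty in fields.items() if ty == f"struct {name} *"]
        if len(self_pointers) == 1:
            guesses[name] = "list"         # the case named on the slide
        elif len(self_pointers) == 2:
            guesses[name] = "binary tree"  # assumption, not stated in the talk
        else:
            guesses[name] = "unknown"
    return guesses


types = {
    "cell": {"data": "int", "next": "struct cell *"},
    "dll": {"next": "struct dll *", "prev": "struct dll *"},
}
guesses = guess_inductive_shape(types)
```

The doubly-linked list type gets the tree guess in this sketch, although its prev pointers share cells; this is one way the no-sharing assumption can go wrong.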


Perspectives and implementation

Ongoing work and extensions

Decide implication between different, folded inductive predicates

an abstract interpretation algorithm

Internal reduction: convert equivalent abstract predicates

Generalize the inductive definition inference scheme: e.g., for object languages, use constructor code


Perspectives and implementation

Longer term: towards modular heap abstraction

Notion of reduced product (non-separating conjunction): ongoing Master internship by Antoine Toubhans

Splitting combination (separating conjunction)...

[Figure: an actual memory state made of a contiguous region (sub-store) and other data structures. The contiguous region holds cells such as 1 “high”, 8 “middle”, 12 “low” and invalid “(old msg)” cells, and is split into two sub-regions: one with a sorted-list-specific abstraction, one with an independent abstraction for cells of the form “(old msg)”. The abstract side is a combination of simple abstract domains.]

MemCAD project: five years contract, Post-Doc positions
