What Can the SAT Experience Teach Us About Abstraction? Ken McMillan, Cadence Berkeley Labs


Post on 15-Jan-2016


What Can the SAT Experience Teach Us About Abstraction?

Ken McMillan

Cadence Berkeley Labs

Abstraction and SAT

• Abstraction is key to applying formal verification to real systems
  – Has allowed verification of simple properties of large systems
    • In hardware, > 20K registers
    • In software, > 100K lines of code
  – Extract just enough information from the system to prove the property
    • Exactly how to do this was far from clear at first
• Enabler: advances in Boolean SAT solvers
  – Exploit the ability of the solver to focus the proof on relevant facts

Counterexample guided localization

[Figure: flowchart. Choose initial abstraction → model check abstraction; if true, done; if Cex, ask "Concretize Cex?"; if yes, report Cex; if no, refine abstraction and model check again. Annotations: "Kurshan", "Apply SAT".]

Outline

• Careful look at modern SAT solvers
  – How do they work?
  – What general lessons about abstraction can we learn from the experience?
• Survey of current abstraction techniques
  – How various methods do or do not embody lessons from SAT
• A modest proposal
  – An attempt to apply the lessons of SAT to software verification

SAT solvers

• DP: variable elimination
• DLL: backtrack search
• DPLL (SATO, GRASP, CHAFF, etc.): combine search and deduction

DP: the Eager Approach

• Eliminate one variable at a time by exhaustive resolution

  Input clauses:  a ∨ b,  a ∨ c,  ¬a ∨ d,  ¬a ∨ e,  ¬b ∨ ¬c,  ¬d
  Eliminate a:    b ∨ d,  b ∨ e,  c ∨ d,  c ∨ e
  Eliminate b:    ¬c ∨ d,  ¬c ∨ e
  Eliminate c:    d,  d ∨ e,  e
  Eliminate d:    False

Drawback: the eager approach yields many irrelevant deductions
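The elimination on this slide can be replayed mechanically. The following is a minimal Python sketch, not the original DP implementation: it assumes the slide's clauses are a ∨ b, a ∨ c, ¬a ∨ d, ¬a ∨ e, ¬b ∨ ¬c, ¬d (negation signs restored), encodes a..e as the integers 1..5, and counts the clauses generated along the way. The names `eliminate` and `dp_refutes` are hypothetical.

```python
from itertools import product


def eliminate(clauses, var):
    """One DP step: replace all clauses mentioning `var` by their resolvents."""
    pos = [c for c in clauses if var in c]
    neg = [c for c in clauses if -var in c]
    result = {c for c in clauses if var not in c and -var not in c}
    for p, n in product(pos, neg):
        resolvent = (p - {var}) | (n - {-var})
        # Skip tautologies such as (x or not x or ...).
        if not any(-lit in resolvent for lit in resolvent):
            result.add(frozenset(resolvent))
    return result


def dp_refutes(clauses, order):
    """Return (refuted, number of new clauses derived) for the given variable order."""
    clauses = {frozenset(c) for c in clauses}
    derived = 0
    for var in order:
        before = clauses
        clauses = eliminate(clauses, var)
        derived += len(clauses - before)
        if frozenset() in clauses:      # empty clause = False
            return True, derived
    return False, derived


# a=1, b=2, c=3, d=4, e=5 (encoding chosen for this sketch)
SLIDE_CLAUSES = [{1, 2}, {1, 3}, {-1, 4}, {-1, 5}, {-2, -3}, {-4}]
print(dp_refutes(SLIDE_CLAUSES, [1, 2, 3, 4]))
```

With the order a, b, c, d the sketch derives every resolvent before reaching the empty clause, whether or not the refutation needs it, which is the drawback the slide names.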

DPLL: mixed approach

• Combination of model search and deduction
• Solvers characterized by
  – Exhaustive BCP
  – Conflict-driven learning (resolution)
  – Deduction-based decision heuristics

DPLL approach

Clauses: (¬a ∨ b) ∧ (¬b ∨ c ∨ d) ∧ (¬b ∨ ¬d)

Decisions: ¬c, a
BCP: b, d
Conflict!
Resolve: learned clause (¬b ∨ c)

• BCP guides clause learning by resolution
• Learning generalizes failures
• Learning guides decisions (VSIDS)
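The interplay of BCP and learning on this slide can be sketched in a few lines of Python. This is an illustration under assumptions, not a real solver: it assumes the slide's clauses are (¬a ∨ b), (¬b ∨ c ∨ d), (¬b ∨ ¬d) with the negation signs restored, encodes a..d as 1..4, and performs a single resolution step rather than a full conflict analysis. The helper names `bcp` and `resolve` are hypothetical.

```python
def bcp(clauses, assignment):
    """Close `assignment` under unit propagation.

    Returns (assignment, reasons, conflict): `reasons` maps each implied
    variable to the clause that forced it; `conflict` is a clause whose
    literals are all false, or None.
    """
    assignment = dict(assignment)
    reasons = {}
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            if any(assignment.get(abs(lit)) == (lit > 0) for lit in clause):
                continue                      # clause already satisfied
            free = [lit for lit in clause if abs(lit) not in assignment]
            if not free:
                return assignment, reasons, clause
            if len(free) == 1:                # unit clause: value is forced
                lit = free[0]
                assignment[abs(lit)] = lit > 0
                reasons[abs(lit)] = clause
                changed = True
    return assignment, reasons, None


def resolve(c1, c2, var):
    """Resolvent of two clauses on `var`."""
    return (frozenset(c1) | frozenset(c2)) - {var, -var}


A, B, C, D = 1, 2, 3, 4
CLAUSES = [(-A, B), (-B, C, D), (-B, -D)]

# Decisions: c = False, a = True. BCP then implies b and d and hits a conflict.
assignment, reasons, conflict = bcp(CLAUSES, {C: False, A: True})
# Generalize the failure: resolve the conflict clause with the reason for d.
learned = resolve(conflict, reasons[D], D)
print(conflict, sorted(learned))
```

The learned clause mentions neither a nor d, so it rules out the failure for every assignment with b true and c false, not just the one that was tried.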

Two kinds of deduction

[Figure: a loop connecting Case Splits, Propagation (case-based, lightweight, exhaustive) and Generalization (general, guided).]

• Closing this loop focuses solver on relevant deductions
  – Effectiveness of SAT solvers in guiding abstraction
  – Very good performance
• What general lessons can we learn from this architecture?

Lesson #1: Be Lazy

• DP approach
  – Eliminate variables by exhaustive resolution
  – Extremely eager: deduces all facts about remaining variables
  – Essentially quantifier elimination -- explodes.
• DPLL approach
  – Lazy: only resolves clauses when model search fails
  – Resolution used as a form of failure generalization
    • Learns general facts from model search failure

Implications:
1. Make expensive deductions only when their relevance can be justified.
2. Don't do quantifier elimination.

Lesson #2: Be Eager

• In a DPLL solver, we always close deduction under unit resolution (BCP) before making a decision.
  – Guides decision making in model search
  – Guides resolution steps in failure generalization
  – BCP updated after decision making and clause learning

Implications:
1. Be eager with inexpensive deduction.
2. Deduce all the cheap facts before trying any expensive ones.
3. Let the cheap deduction guide the expensive deduction.

Lesson #3: Learn from the Past

• Facts useful in one particular case are likely to be useful in other cases.
• This principle is embodied in
  – Clause learning
  – Deduction-based decision heuristics (e.g., VSIDS)

Implication: Deduce facts that have been useful in the past.
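A decision heuristic in the spirit of VSIDS can be sketched as follows. This is a simplified, variable-based illustration and not Chaff's actual scheme: the class name `ActivityHeuristic`, the decay factor of 0.5 and the tie-breaking rule are all assumptions made for the example.

```python
class ActivityHeuristic:
    """Prefer variables that appeared in recently learned clauses."""

    def __init__(self, decay=0.5):
        self.activity = {}
        self.decay = decay

    def on_learned_clause(self, clause):
        # Older evidence fades; variables in the new clause get a bump.
        for var in self.activity:
            self.activity[var] *= self.decay
        for lit in clause:
            var = abs(lit)
            self.activity[var] = self.activity.get(var, 0.0) + 1.0

    def pick(self, unassigned):
        # Highest activity first; ties go to the smaller variable index.
        return max(unassigned, key=lambda v: (self.activity.get(v, 0.0), -v))


heuristic = ActivityHeuristic()
heuristic.on_learned_clause([-2, 3])
heuristic.on_learned_clause([3, -4])
print(heuristic.pick([1, 2, 3, 4]))
```

The point of the sketch is only that decisions are steered by what deduction found useful before, so search stays near the variables involved in recent conflicts.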

Current abstraction methods

• Focus on software model checking
• Do these methods embody the SAT lessons?

Static Analysis

• Compute the least fixed-point of an abstract transformer
  – This is the strongest inductive invariant the analysis can provide
• Inexpensive analyses:
  – value set analysis
  – affine equalities, etc.
• These analyses lose information at a merge:

  [Figure: branches labeled x = y and x = z merge into a vertex labeled T.]

Be eager with inexpensive deductions

Be lazy with expensive deductions X

Learn from the past N/A
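The information loss at a merge can be made concrete with two toy joins. This is a minimal sketch, assuming a conjunctive domain in which an abstract state is a set of known equalities and a value-set domain in which each variable maps to a set of possible values; both function names are hypothetical.

```python
def join_equalities(left, right):
    """Conjunctive domain: only facts that hold on both branches survive."""
    return left & right


def join_value_sets(left, right):
    """Value-set domain: a variable may take any value seen on either branch.

    Variables tracked on only one branch become unconstrained and are dropped.
    """
    return {var: left[var] | right[var] for var in left.keys() & right.keys()}


# One branch knows x = y, the other knows x = z: the join knows nothing (T).
merged_equalities = join_equalities({("x", "y")}, {("x", "z")})
# One branch sets y = 1, the other y = 2: the join knows y is 1 or 2.
merged_values = join_value_sets({"y": {1}}, {"y": {2}})
print(merged_equalities, merged_values)
```

The empty set of equalities plays the role of T here: the disjunction x = y ∨ x = z is not expressible in the conjunctive domain, so the merge forgets it.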

Predicate abstraction

• Abstract transformer:
  – strongest Boolean postcondition over given predicates
• Advantage: does not lose information at a merge
  – join is disjunction

  [Figure: branches labeled x = y and x = z merge into a vertex labeled x=y ∨ x=z.]

• Disadvantage:
  – Abstract post is very expensive!
  – Computes information about predicates with no relevance justification

Be eager with inexpensive deductions X

Be lazy with expensive deductions X

Learn from the past N/A

PA with CEGAR loop

[Figure: flowchart. Choose initial T# → model check abstraction T#; if true, done; if Cex, ask "Can extend Cex from T# to T?"; if yes, report Cex; if no, add predicates to T# and model check again.]

• Choose predicates to refute cex's
  – Generalizes failures
  – Some relevance justification
• Still performs expensive deduction without justification
  – strongest Boolean postcondition
• Fails to learn from past
  – Start fresh each iteration
  – Forgets expensive deductions

Be eager with inexpensive deductions X

Be lazy with expensive deductions X+

Learn from the past X

Boolean Programs

• Abstract transformer
  – Weaker than predicate abstraction
  – Evaluates predicates independently -- loses correlations

  Predicate abstraction:  {T} x=y; {x=0 ⇔ y=0}
  Boolean programs:       {T} x=y; {T}

• Advantages
  – Computes less expensive information eagerly
• Disadvantages
  – Still computes expensive information without justification
  – Still uses CEGAR loop

Be eager with inexpensive deductions X

Be lazy with expensive deductions X++

Learn from the past X
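The difference between the two abstract transformers for the statement x=y can be checked by brute force. This is a minimal sketch under assumptions: it reads the slide's postcondition as x=0 ⇔ y=0, uses the predicates x=0 and y=0, and enumerates a small integer range in place of a theorem prover; all names are hypothetical.

```python
DOMAIN = range(-2, 3)
PREDICATES = {
    "x=0": lambda s: s["x"] == 0,
    "y=0": lambda s: s["y"] == 0,
}


def post_states():
    """States reachable by running `x = y` from precondition True."""
    return [{"x": y, "y": y} for y in DOMAIN]


def boolean_post(states):
    """Predicate abstraction: the exact set of reachable predicate valuations."""
    return {tuple(p(s) for p in PREDICATES.values()) for s in states}


def independent_post(states):
    """Boolean-program style: each predicate alone is True, False or None (unknown)."""
    result = {}
    for name, pred in PREDICATES.items():
        values = {pred(s) for s in states}
        result[name] = values.pop() if len(values) == 1 else None
    return result


print(boolean_post(post_states()))
print(independent_post(post_states()))
```

The first result contains only the valuations where both predicates agree, which is the correlation x=0 ⇔ y=0; the second reports both predicates as unknown, which is the postcondition T.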

Lazy Predicate Abstraction

• Unwind the program CFG into a tree
  – Refine paths as needed to refute errors

  [Figure: unwinding tree with an error vertex ERR!; vertices along the error path are labeled x=y, x=y, y=0. Caption: add predicates along path to allow refutation of error.]

• Refinement is local to an error path
• Search continues after refinement
  – Do not start fresh -- no big CEGAR loop
• Previously useful predicates applied to new vertices


Be eager with inexpensive deductions X

Be lazy with expensive deductions -

Learn from the past

SAT-based BMC

[Figure: pipeline. Program → loop unwinding → convert to bit level → SAT.]

• Inherits all the properties of SAT
• Deduction limited to propositional logic
  – Cannot directly infer facts like x ≤ y
  – Inexpensive deduction limited to BCP

Be eager with inexpensive deductions --

Be lazy with expensive deductions

Learn from the past

SAT-based with Static Analysis

[Figure: pipeline. Program → loop unwinding → convert to bit level → SAT, with a static analysis stage attached to SAT. Figure labels: "x=y; x=z;", "x=z", "decision".]

• Allows richer class of inexpensive deductions
• Inexpensive deductions not updated after decisions and clause learning
  – Coupling could be tighter
  – Perhaps using lazy decision procedures?

Be eager with inexpensive deductions -

Be lazy with expensive deductions

Learn from the past

Lazy abstraction and interpolants

• A way to apply the lessons of SAT to lazy abstraction
• Keep the advantages of lazy abstraction...
  – Local refinement (be lazy)
  – No "big loop" as in CEGAR (learn from the past)
• ...while avoiding the disadvantages of predicate abstraction...
  – no eager image computation
• ...and propagating inexpensive deductions eagerly
  – as in static analysis

Interpolation Lemma (Craig, 57)

• Notation: L(φ) is the set of FO formulas over the symbols of φ
• If A ∧ B = false, there exists an interpolant A' for (A,B) such that:
    A ⇒ A'
    A' ∧ B = false
    A' ∈ L(A) ∩ L(B)
• Example:
  – A = p ∧ q, B = ¬q ∧ r, A' = q
• Interpolants from proofs
  – in certain quantifier-free theories, we can obtain an interpolant for a pair A,B from a refutation in linear time.
  – in particular, we can have linear arithmetic, uninterpreted functions, and restricted use of arrays
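For propositional formulas the three interpolant conditions can be checked by truth table. The following is a minimal sketch, assuming the slide's example reads A = p ∧ q, B = ¬q ∧ r, A' = q; formulas are represented as Python functions over an environment, and the symbol sets are passed in explicitly because they cannot be recovered from a lambda. The function name `is_interpolant` is hypothetical.

```python
from itertools import product


def is_interpolant(a, b, itp, a_syms, b_syms, itp_syms):
    """Check A => A', A' and B unsatisfiable, and A' over the shared symbols."""
    symbols = sorted(set(a_syms) | set(b_syms))
    for values in product([False, True], repeat=len(symbols)):
        env = dict(zip(symbols, values))
        if a(env) and not itp(env):
            return False          # A does not imply A'
        if itp(env) and b(env):
            return False          # A' and B are jointly satisfiable
    return set(itp_syms) <= (set(a_syms) & set(b_syms))


A = lambda env: env["p"] and env["q"]
B = lambda env: (not env["q"]) and env["r"]

print(is_interpolant(A, B, lambda env: env["q"], "pq", "qr", "q"))
print(is_interpolant(A, B, lambda env: True, "pq", "qr", ""))
```

The second call shows why the middle condition matters: True is implied by A and uses no foreign symbols, but it is too weak to contradict B.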

Interpolants for sequences

• Let A1...An be a sequence of formulas
• A sequence A'0...A'n is an interpolant for A1...An when
  – A'0 = True
  – A'i-1 ∧ Ai ⇒ A'i, for i = 1..n
  – A'n = False
  – and finally, A'i ∈ L(A1...Ai) ∩ L(Ai+1...An)

  [Figure: the formulas A1, A2, A3, ..., Ak above the chain True ⇒ A'1 ⇒ A'2 ⇒ A'3 ⇒ ... ⇒ A'k-1 ⇒ False.]

In other words, the interpolant is a structured refutation of A1...An

Interpolants as Floyd-Hoare proofs

  Program path    SSA sequence    Interpolant
                                  True
  x=y;            x1 = y0
                                  x1 = y0
  y++;            y1 = y0+1
                                  y1 > x1
  [x=y]           x1 = y1
                                  False

1. Each formula implies the next
2. Each is over common symbols of prefix and suffix
3. Begins with true, ends with false

Path refinement procedure

[Figure: SSA sequence → Prover → proof → Interpolation → structured proof → Path Refinement.]
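The Floyd-Hoare reading of this interpolant can be checked by enumeration. This is a bounded sanity check and not a proof: it assumes the SSA sequence x1 = y0, y1 = y0+1, x1 = y1 with labels True, x1 = y0, y1 > x1, False, and tests the implication A'i-1 ∧ Ai ⇒ A'i over a small integer range. The symbol condition is not checked, and the name `is_sequence_interpolant` is hypothetical.

```python
from itertools import product

STEPS = [
    lambda s: s["x1"] == s["y0"],        # x = y
    lambda s: s["y1"] == s["y0"] + 1,    # y++
    lambda s: s["x1"] == s["y1"],        # [x = y]
]
LABELS = [
    lambda s: True,
    lambda s: s["x1"] == s["y0"],
    lambda s: s["y1"] > s["x1"],
    lambda s: False,
]


def is_sequence_interpolant(steps, labels, domain=range(-3, 4)):
    """Check that each label together with the next step implies the next label."""
    for x1, y0, y1 in product(domain, repeat=3):
        state = {"x1": x1, "y0": y0, "y1": y1}
        for i, step in enumerate(steps):
            if labels[i](state) and step(state) and not labels[i + 1](state):
                return False
    return True


print(is_sequence_interpolant(STEPS, LABELS))
```

Replacing the middle labels by True makes the check fail at the last step, since True ∧ x1 = y1 does not imply False: the intermediate facts are what carry the refutation along the path.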

Lazy abstraction -- an example

Program fragment:

  do {
    lock();
    old = new;
    if (*) {
      unlock();
      new++;
    }
  } while (new != old);

Control-flow graph (edge labels): L=0; [L!=0]; L=1; old=new; L=0; new++; [new==old]; [new!=old]
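To see what property is being verified, the fragment can be simulated directly. The following is a bounded explicit-state sketch, not the lazy abstraction procedure: it assumes L models the lock (0 free, 1 held), that calling lock() with L != 0 is the error, and that `if (*)` is a nondeterministic choice. The function name and the `branch_unlocks` switch (used to inject a bug) are hypothetical.

```python
def lock_error_reachable(max_iterations=10, branch_unlocks=True):
    """Return True if lock() can be reached with the lock already held."""
    frontier = {(0, 0)}                      # (L, new) at the loop head
    for _ in range(max_iterations):
        next_frontier = set()
        for lock, new in frontier:
            if lock != 0:
                return True                  # [L != 0] edge: error
            lock, old = 1, new               # lock(); old = new
            for branch_taken in (False, True):   # if (*)
                if branch_taken:
                    after_lock = 0 if branch_unlocks else lock
                    after_new = new + 1      # unlock(); new++
                else:
                    after_lock, after_new = lock, new
                if after_new != old:         # while (new != old): loop again
                    next_frontier.add((after_lock, after_new))
        frontier = next_frontier
    return False


print(lock_error_reachable())
print(lock_error_reachable(branch_unlocks=False))
```

The simulation only covers a fixed number of iterations and concrete values of new. The unwinding on the following slides establishes the same fact for all iterations by labeling the loop head with L=0.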

Unwinding the CFG

[Figure: the control-flow graph of the example next to the start of its unwinding: vertex 0, edge L=0 to vertex 1, edge [L!=0] to error vertex 2, all initially labeled T. After refinement the labels along the path are T, L=0, F.]

Label error state with false, by refining labels on path

Unwinding the CFG

[Figure: the control-flow graph next to the growing unwinding. Below vertex 1 the tree adds vertex 3 (edge L=1; old=new), vertex 4 (edge L=0; new++), vertex 5 (edge [new!=old]) and error vertex 6 (edge [L!=0]), all initially labeled T. Refining the path to vertex 6 labels it F and labels vertex 5 with L=0.]

Covering: state 5 is subsumed by state 1.

Unwinding the CFG

[Figure: the control-flow graph next to the unwinding, now extended with vertices 7 to 11 (edges include [new==old], [new!=old] and [L!=0]). Refinement introduces the label old=new and labels the infeasible vertices F.]

Another cover. Unwinding is now complete.

Covering step

• If ψ(x) ⇒ ψ(y)...
  – add covering arc x ▷ y
  – remove all z ▷ w for w descendant of x (these w are now covered)

  [Figure: a vertex labeled x=y is covered by a vertex labeled x ≤ y; an existing covering arc is crossed out (X).]

We restrict covers to be descending in a suitable total order on vertices. This prevents covering from diverging.
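The covering step can be sketched with labels represented as finite sets of states, so that implication becomes the subset test. This is an illustration under assumptions: it removes arcs whose head lies at or below the newly covered vertex x, which is what the later well-labeledness condition "if x ▷ y then y not covered" requires, and it omits the side conditions on CFG location and vertex order. The tree, vertex numbers and function names are hypothetical.

```python
def is_descendant_or_self(w, v, parent):
    """True if walking up from w through `parent` reaches v."""
    while w is not None:
        if w == v:
            return True
        w = parent.get(w)
    return False


def try_cover(x, y, label, parent, arcs):
    """Add covering arc (x, y) if label[x] implies label[y]; return the new arc set."""
    if not label[x] <= label[y]:             # implication as set inclusion
        return arcs
    # Vertices at or below x are now covered, so they may no longer cover others.
    kept = {(z, w) for (z, w) in arcs if not is_descendant_or_self(w, x, parent)}
    return kept | {(x, y)}


DOMAIN = range(3)
X_LE_Y = frozenset((a, b) for a in DOMAIN for b in DOMAIN if a <= b)
X_EQ_Y = frozenset((a, a) for a in DOMAIN)

parent = {0: None, 1: 0, 5: 1, 6: 5, 7: 1}   # child -> parent
label = {1: X_LE_Y, 5: X_EQ_Y}

arcs = try_cover(5, 1, label, parent, {(7, 6)})
print(arcs)
```

Vertex 5 (label x=y) is covered by vertex 1 (label x ≤ y), and the older arc into vertex 6 is dropped because 6 sits below the newly covered vertex.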

Refinement step

• Label an error vertex False by refining the path to that vertex with an interpolant for that path.
• By refining with interpolants, we avoid predicate image computation.

[Figure: unwinding tree with edges x = 0, [x=y], [x≠y], y++, [y=0], y=2; vertices initially labeled T. After refinement the error path carries the labels x=0, y=0, y≠0, F, and one covering arc is crossed out (X).]

Refinement may remove covers

Forced cover

• Try to refine a sub-path to force a cover
  – show that path from nearest common ancestor of x,y proves ψ(x) at y

[Figure: the unwinding tree from the refinement step; the sub-path marked "refine this path" is refined so that its end vertex receives the label y≠0.]

Forced cover allows us to efficiently handle nested control structure and is analogous to non-chronological backtracking

Incremental static analysis

• Update static analysis of unwinding incrementally
  – Static analysis can prevent many interpolant-based refinements
  – Interpolant-based refinements can refine static analysis

[Figure, left: branches [x=z] and [x≠z] assign y=1 and y=2 and merge with value set y ∈ {1,2}, followed by the guard [y=1 ∧ x≠z]. Figure, right: the unwinding tree from the previous slides, annotated with y=2 "from value set analysis", labels x=z, x=z, F along the path marked "refine this path", and y=2 marked "value set refined".]

Two kinds of deduction

• Case Splits = path splits
• Propagation = static analysis
  – case-based
  – lightweight
  – exhaustive
• Generalization = interpolation
  – general
  – guided by search

Applying the lessons from SAT

• Be lazy with expensive deductions
  – All path refinements justified
  – No eager predicate image computation
• Be eager with inexpensive deductions
  – Static analysis updated after all changes
  – Refinement and static analysis interact
• Learn from the past
  – Refinements incremental: no "big CEGAR loop"
  – Re-use of historically useful facts by forced covering

Windows driver benchmarks

• Windows device driver benchmarks from BLAST benchmark suite
• Compare
  – BLAST, a lazy predicate abstraction-based model checker
  – IMPACT, using lazy interpolation-based abstraction

Almost all BLAST time spent in predicate image operation.

name      source LOC  flat LOC  BLAST (s)  IMPACT (s)  BLAST/IMPACT
kbfiltr   12K         2.3K      11.9       0.35        34
diskperf  14K         3.9K      117        2.37        49
cdaudio   44K         6.3K      202        1.51        134
floppy    18K         8.7K      164        4.09        41
parclass  138K        8.8K      463        3.84        121
parport   61K         13K       324        6.47        50

Performance, Impact v. Blast

[Figure: log-log scatter plot of Impact runtime (s) against Blast runtime (s); both axes run from 0.01 to 1000.]

Effect of static analysis

[Figure: log-log scatter plot of Impact runtime (s) against Impact without static analysis (s); both axes run from 0.01 to 1000.]

Comparing Blast and Impact

• Similarities
  – Both based on lazy abstraction
  – Both use the same interpolating prover
• Differences
  – Interpolants instead of predicate abstraction
    • Avoids eager deduction
  – Re-use of facts by "forced covering"
    • Analogous to non-chronological backtracking
  – Static analysis
    • Interacts with interpolants

Fundamentally similar model checkers, but as in SAT, implementation details can make orders-of-magnitude difference.

Conclusions

• SAT solvers provide a paradigm for efficient abstraction
  – Interaction of case splitting, propagation and generalization
• Current abstraction methods do not fully apply this paradigm
  – Mostly indirectly, through use of SAT
  – (Same can be said of SMT solvers)
• To apply the SAT lessons in a given application, look for
  – Useful case splits (e.g., program paths)
  – Inexpensive propagation (e.g., value set analysis)
  – Guided generalization (e.g., by interpolation)
• Also, avoid over-eager deduction and the "big abstraction loop"

Result can be order-of-magnitude improvement, as in modern SAT solvers.

Conclusions

• Caveats
  – Comparing different implementations is dangerous
  – More and better software model checking benchmarks are needed
• Tentative conclusions
  – For control-dominated codes, predicate abstraction is too "eager"
    • better to be more lazy about expensive deductions
  – Propagating inexpensive deductions can produce substantial speedup
    • roughly one order of magnitude for Windows examples
  – Perhaps by applying the lessons of SAT, we can obtain the same kind of rapid performance improvements obtained in that area
    • Note 2-3 orders of magnitude speedup in lazy model checking in 6 months!

Future work

• Procedure summaries
  – Many similar subgraphs in unwinding due to procedure expansions
  – Cannot handle recursion
  – Can we use interpolants to compute approximate procedure summaries?
• Quantified interpolants
  – Can be used to generate program invariants with quantifiers
  – Works for simple examples, but need to prevent number of quantifiers from increasing without bound
• Richer theories
  – In this work, all program variables modeled by integers
  – Need an interpolating prover for bit vector theory
• Concurrency...

Unwinding the CFG

• An unwinding is a tree with an embedding in the CFG

[Figure: the example CFG (edges L=0; L=1; old=new; [L!=0]; L=0; new++; [new==old]; [new!=old]) next to an unwinding tree with vertices 0, 1, 2, 3, 4 and 8. The maps Mv and Me send unwinding vertices and edges to CFG vertices and edges.]

Expansion

• Every non-leaf vertex of the unwinding must be fully expanded...

[Figure: unwinding vertex 0 is mapped by Mv to a CFG vertex with outgoing edge L=0. If this (vertex 0) is not a leaf, and this (the CFG edge L=0) exists, then this (an unwinding edge L=0 from vertex 0 to vertex 1, mapped by Me) exists.]

...but we allow unexpanded leaves (i.e., we are building a finite prefix of the infinite unwinding)

Labeled unwinding

• A labeled unwinding is equipped with...
  – a labeling function ψ : V → L(S)
  – a covering relation ▷ ⊆ V × V

[Figure: the example unwinding with vertices 0 to 7, edge labels from the CFG, vertex labels T, L=0 and F, and covering arcs.]

These two nodes are covered (they have an ancestor at the tail of a covering arc).

Well-labeled unwinding

• An unwinding is well-labeled when...
  – ψ(ε) = True for the root vertex ε
  – every edge is a valid Hoare triple
  – if x ▷ y then y not covered

[Figure: the example unwinding with vertices 0 to 7, edge labels from the CFG, and vertex labels T, L=0 and F.]

Safe and complete

• An unwinding is
  – safe if every error vertex is labeled False
  – complete if every nonterminal leaf is covered

[Figure: the completed example unwinding with vertices 0 to 10, vertex labels T, L=0, old=new and F, and its covering arcs.]

Theorem: A CFG with a safe complete unwinding is safe.

Unwinding steps

• Three basic operations:
  – Expand a nonterminal leaf
  – Cover: add a covering arc
  – Refine: strengthen labels along a path so error vertex labeled False

Overall algorithm

1. Do as much covering as possible
2. If a leaf can't be covered, try forced covering
3. If the leaf still can't be covered, expand it
4. Label all error states False by refining with an interpolant
5. Continue until unwinding is safe and complete