Post on 14-Dec-2015
Efficient SAT/BDD-based Techniques for Predicate Abstraction
Shuvendu K. Lahiri
Joint work with Thomas Ball, Randy Bryant, Byron Cook,
Robert Nieuwenhuis, Albert Oliveras
Microsoft Research, Redmond
– 2 –
Program analysis and abstraction
Unbounded state space
  Unbounded integers, arrays, heap
  State exploration may not terminate
Abstraction
  Construct an overapproximation of program behavior
  Abstract domain/operators ensure that the analysis terminates
– 3 –
Automatic predicate abstraction
Graf & Saïdi, CAV ’97
  Underlying framework: Abstract Interpretation, Cousot & Cousot ’77
Idea
  Given a set of predicates P = {P1, …, Pk}
    Formulas describing properties of the system state
  Finite abstraction
    Abstraction(s) = the subset of {P1, …, Pk} that holds on s
    At most 2^k abstract states
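The abstraction function can be illustrated with a minimal Python sketch. The predicate set, the state encoding as a dictionary, and the function names are assumptions made for illustration; they do not come from the talk.

```python
from typing import Callable, Dict, FrozenSet, List, Set

State = Dict[str, int]
Predicate = Callable[[State], bool]

# Hypothetical predicate set P = {P1, P2, P3}; names and bodies are illustrative.
PREDICATES: Dict[str, Predicate] = {
    "x = 1": lambda s: s["x"] == 1,
    "x = y": lambda s: s["x"] == s["y"],
    "x < y + 2": lambda s: s["x"] < s["y"] + 2,
}


def abstraction(state: State) -> FrozenSet[str]:
    """Return the subset of predicates that hold on the concrete state."""
    return frozenset(name for name, pred in PREDICATES.items() if pred(state))


def abstract_states(states: List[State]) -> Set[FrozenSet[str]]:
    """Collect the distinct abstract states reached by a set of concrete states."""
    return {abstraction(s) for s in states}
```

However many concrete states are fed in, `abstract_states` can return at most 2^k distinct results for k predicates, which is what makes the abstraction finite.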
– 4 –
Predicate abstraction in practice
Boolean programs from C programs
  SLAM
Software model checking
  BLAST, MAGIC, …
Loop invariant synthesis for arrays and lists
  ESC-JAVA, …
Distributed protocol verification
  UCLID, Murphi, …
– 5 –
Definitions
Predicates
  Literals in some theory T
  P = {x = 1, x = y, x < y + 2, f(x) = f(y) + 2, …}
Formula
  Boolean combination of predicates, e.g. (x = 1 ∨ x < y + 2)
– 6 –
Fundamental Operation: Predicate Cover
P: Set of predicates
φ: Formula
F_P(φ): Predicate cover of φ
  Weakest expression over P that implies φ
A minterm over P
  A conjunction of predicates P_i or their negations
[Figure: the partitioning defined by the predicates, with the region F_P(φ)]
– 7 –
Example
P: {x < y, x = 2}
φ: y > 1
Minterms over P
  x ≥ y ∧ x ≠ 2
  x < y ∧ x ≠ 2
  x ≥ y ∧ x = 2
  x < y ∧ x = 2
[Figure: the region F_P(φ) among the four minterms]
– 8 –
Traditional approaches
P: Set of predicates
φ: Formula
F_P(φ): Predicate cover of φ
  Weakest expression over P that implies φ
Check which minterms imply φ
  Use a decision procedure to check the implication
  Exponential number of decision procedure calls
[Figure: the partitioning defined by the predicates, with the region F_P(φ)]
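The explicit approach can be sketched in Python on the earlier example (P = {x < y, x = 2}, φ: y > 1). The exhaustive search over a small integer range is an assumption that stands in for a real decision procedure; it happens to give the right answer here but is not sound in general. All names are illustrative.

```python
from itertools import product

# Assumption: a small integer range stands in for a real decision procedure.
DOMAIN = range(-4, 5)

PREDS = [
    ("x < y", lambda x, y: x < y),
    ("x = 2", lambda x, y: x == 2),
]


def phi(x, y):
    return y > 1


def implies(signs):
    """One 'decision procedure call': does the minterm given by signs imply phi?"""
    for x, y in product(DOMAIN, repeat=2):
        in_minterm = all(p(x, y) == sign for (_, p), sign in zip(PREDS, signs))
        if in_minterm and not phi(x, y):
            return False  # found a state in the minterm that violates phi
    return True


def predicate_cover():
    """Enumerate all 2^|P| minterms and keep those that imply phi."""
    cover, calls = [], 0
    for signs in product([True, False], repeat=len(PREDS)):
        calls += 1
        if implies(signs):
            cover.append(signs)
    return cover, calls
```

Here `predicate_cover()` keeps only the minterm x < y ∧ x = 2 and makes four implication checks, one per minterm; with k predicates the loop makes 2^k checks.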
– 9 –
Traditional approaches
Large number of decision procedure calls
  Worst case exponential in |P|
  Exponential behavior often seen in practice
  Each decision procedure call can be expensive
Limits scalability
  F_P(φ) invoked a few thousand times during a single software verification run
  Tools have to sacrifice precision for efficiency
– 10 –
Overview of the talk
Two approaches to predicate abstraction
  Symbolic Decision Procedures
  Satisfiability Modulo Theories (SMT) based
Symbolic decision procedures (SDP)
  [Lahiri, Ball, Cook CAV’05]
SMT-based predicate abstraction
  Eager [Lahiri, Bryant, Cook CAV’03]
  DPLL(T) based [Lahiri, Oliveras, Nieuwenhuis CAV’06]
Challenges ahead
– 11 –
Predicate Abstraction using Symbolic Decision Procedures
– 12 –
Overview of SDP
Symbolic Decision Procedures
  Predicate abstraction
SDP for Equality Logic
Combining SDP for two theories
– 13 –
Computing F_P(φ)
F_P distributes over conjunction
  F_P(φ1 ∧ φ2) = F_P(φ1) ∧ F_P(φ2)
Suffices to compute F_P(e1 ∨ e2 ∨ … ∨ en)
  Each ei is a literal
  First convert φ to an equivalent conjunctive normal form (CNF)
Rest of the talk, assume n = 1 (simplicity)
  Concentrate on computing F_P(e)
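One way to see why F_P distributes over conjunction, assuming F_P(φ) is read as the disjunction of all minterms over P that imply φ (the reading used in the predicate cover definition earlier in the talk):

```latex
\begin{align*}
F_P(\varphi) &= \bigvee \{\, m \mid m \text{ is a minterm over } P,\ m \Rightarrow \varphi \,\} \\
m \Rightarrow (\varphi_1 \wedge \varphi_2) &\iff (m \Rightarrow \varphi_1) \text{ and } (m \Rightarrow \varphi_2) \\
F_P(\varphi_1 \wedge \varphi_2) &= F_P(\varphi_1) \wedge F_P(\varphi_2)
\end{align*}
```

The last line uses the fact that distinct minterms are mutually exclusive, so the conjunction of two disjunctions of minterms is the disjunction of the minterms they have in common.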
– 14 –
Decision Procedure (DP)
Input
  A set G = {g1, …, gm} of literals
  A literal e
Output
  Is G ⇒ e valid?
Equivalently
  Is g1 ∧ … ∧ gm ∧ ¬e UNSAT?
  Is G ∪ {¬e} UNSAT?
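For the equality-logic fragment used later in the talk, the input/output contract of DP can be sketched as follows. The sketch uses union-find, which is a substitute for the inference-rule presentation in the slides, and the encoding of literals as `(op, lhs, rhs)` triples is an assumption.

```python
def find(parent, v):
    """Union-find lookup with path halving."""
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v


def dp(G, e):
    """Return True iff G ∪ {¬e} is UNSAT, i.e. G ⇒ e is valid.

    Literals are triples (op, lhs, rhs) with op in {"=", "!="}.
    """
    op, lhs, rhs = e
    negated_e = ("!=" if op == "=" else "=", lhs, rhs)
    literals = list(G) + [negated_e]

    parent = {}
    for _, a, b in literals:
        parent.setdefault(a, a)
        parent.setdefault(b, b)

    # Merge the classes of every asserted equality.
    for o, a, b in literals:
        if o == "=":
            parent[find(parent, a)] = find(parent, b)

    # UNSAT iff some asserted disequality joins two terms in the same class.
    return any(
        o == "!=" and find(parent, a) == find(parent, b) for o, a, b in literals
    )
```

For example, `dp([("=", "a", "b"), ("=", "b", "c")], ("=", "a", "c"))` holds because a = b, b = c and a ≠ c cannot be satisfied together.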
– 15 –
Symbolic Decision Procedure (SDP)
Input
  A set G = {g1, …, gm} of atomic expressions
  An atomic expression e
Output
  Representation for {G’ | G’ ⊆ G, and G’ ∪ {¬e} is UNSAT}
“Symbolic” Decision Procedure
  One run of SDP(G, e) represents an exponential (2^|G|) number of runs of DP(G, e)
– 16 –
Predicate Abstraction and SDP
PBar = {¬p | p ∈ P}
SDP(P ∪ PBar, e) represents F_P(e)
  F_P(e): all minterms over P ∪ PBar that imply e
  SDP(P ∪ PBar, e): {G’ | G’ ⊆ P ∪ PBar, and G’ ∪ {¬e} is UNSAT}
– 17 –
Overview of SDP
Symbolic Decision Procedures
  Predicate abstraction
SDP for Equality Logic
Combining SDP for two theories
– 18 –
A Decision Procedure for Equality Logic
Atomic expressions
  x = y, x ≠ y
Inference Rules (R)
  Reflexivity, Symmetry, Transitivity
  Contradiction: from x = y and x ≠ y, derive ⊥
An inference rule generates a new expression from existing expressions
– 19 –
A Decision Procedure for Equality Logic
Atomic expressions
  x = y, x ≠ y
Inference Rules (R)
  Reflexivity, Symmetry, Transitivity
  Contradiction: from x = y and x ≠ y, derive ⊥
An inference rule generates a new expression from existing expressions
Example: G = {a = b, b = c}; e: (a = c)
  Start from G ∪ {¬e}: a = b, b = c, a ≠ c
  From a = b and b = c, derive a = c
  a = c together with a ≠ c gives the contradiction
– 20 –
A Decision Procedure for Equality Logic
Atomic expressions
  x = y, x ≠ y
Inference Rules (R)
  Reflexivity, Symmetry, Transitivity
  Contradiction: from x = y and x ≠ y, derive ⊥
An inference rule generates a new expression from existing expressions
[Figure: flowchart. Starting from G ∪ {¬e}, the rules R are applied for lg(|G|) rounds; if the result contains ⊥ the answer is UNSAT (Yes), otherwise SAT]
– 21 –
Symbolic DP for Equality Logic
Modifications
  Introduce a Boolean variable [g] for each expression g in G
  Add “true” for ¬e
  Construct a “shared” expression for the derivations
Example: G = {a = b, b = c, a = d, d = c}; e: (a = c)
  Leaves: a = b ↦ [a = b], b = c ↦ [b = c], a = d ↦ [a = d], d = c ↦ [d = c], a ≠ c ↦ true
– 22 –
Symbolic DP for Equality Logic
Modifications
  Introduce a Boolean variable [g] for each expression g in G
  Add “true” for ¬e
  Construct a “shared” expression for the derivations
Example: G = {a = b, b = c, a = d, d = c}; e: (a = c)
  Leaves: a = b ↦ [a = b], b = c ↦ [b = c], a = d ↦ [a = d], d = c ↦ [d = c], a ≠ c ↦ true
  Derived: a = c, from a = b and b = c, and also from a = d and d = c
– 23 –
Symbolic DP for Equality Logic
Modifications
  Introduce a Boolean variable [g] for each expression g in G
  Add “true” for ¬e
  Construct a “shared” expression for the derivations
SDP(G, e)
  The expression representing “⊥” after lg(|G|) steps
Example: G = {a = b, b = c, a = d, d = c}; e: (a = c)
  Leaves: a = b ↦ [a = b], b = c ↦ [b = c], a = d ↦ [a = d], d = c ↦ [d = c], a ≠ c ↦ true
  Derived: a = c, from a = b and b = c, and also from a = d and d = c
  Derived: ⊥, from a = c and a ≠ c
– 24 –
Symbolic DP for Equality Logic
Output
  A shared Boolean expression with [.] variables in the leaves
Example: G = {a = b, b = c, a = d, d = c}; e: (a = c)
  Leaves: a = b ↦ [a = b], b = c ↦ [b = c], a = d ↦ [a = d], d = c ↦ [d = c], a ≠ c ↦ true
  Derived: a = c, from a = b and b = c, and also from a = d and d = c
  Derived: ⊥, from a = c and a ≠ c
– 25 –
SDP for Equality Logic
Expression representing “⊥” after lg(|G|) steps
  Shared expression for {G’ | G’ ⊆ G, and DP(G’, e) is UNSAT}
Shared expression can be computed in polynomial time
  Derivations repeated for lg(|G|) steps
  Each step has at most |V|^2 atomic expressions
    |V|: number of variables in G
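The construction can be sketched in Python for the running example G = {a = b, b = c, a = d, d = c}, e: (a = c). The node encoding, the helper names and the path-doubling formulation of transitivity are assumptions for illustration, not the authors' implementation. Nodes from one round are shared by reference in the next, so the result is a DAG rather than a tree.

```python
from math import ceil, log2

FALSE, TRUE = ("const", False), ("const", True)


def mk_and(a, b):
    if a is FALSE or b is FALSE:
        return FALSE
    if a is TRUE:
        return b
    if b is TRUE:
        return a
    return ("and", a, b)


def mk_or(nodes):
    seen, kept = set(), []
    for n in nodes:
        if n is TRUE:
            return TRUE
        if n is not FALSE and id(n) not in seen:
            seen.add(id(n))
            kept.append(n)
    if not kept:
        return FALSE
    return kept[0] if len(kept) == 1 else ("or", tuple(kept))


def sdp_equality(G, e):
    """Shared expression that is true exactly for the subsets G' of G
    such that G' ∪ {¬e} is UNSAT. Literals are (op, lhs, rhs) triples."""
    op, lhs, rhs = e
    neg_e = ("!=" if op == "=" else "=", lhs, rhs)
    # One Boolean variable [g] per g in G; the literal ¬e is labelled "true".
    labelled = [(g, ("var", g)) for g in G] + [(neg_e, TRUE)]
    V = sorted({t for (_, a, b), _ in labelled for t in (a, b)})

    # eq[u, v] is the expression under which u = v has been derived so far.
    eq = {(u, v): (TRUE if u == v else FALSE) for u in V for v in V}
    for (o, a, b), label in labelled:
        if o == "=":
            eq[a, b] = eq[b, a] = mk_or([eq[a, b], label])

    n_eq = sum(1 for (o, _, _), _ in labelled if o == "=")
    for _ in range(ceil(log2(max(n_eq, 1)))):  # lg(|G|) rounds of transitivity
        eq = {
            (u, v): mk_or([eq[u, v]] + [mk_and(eq[u, w], eq[w, v]) for w in V])
            for u in V
            for v in V
        }

    # Contradiction rule: a disequality u != v together with a derived u = v.
    return mk_or(
        [mk_and(label, eq[a, b]) for (o, a, b), label in labelled if o == "!="]
    )


def evaluate(node, chosen, memo=None):
    """Evaluate the DAG for the subset 'chosen' of G, memoising shared nodes."""
    memo = {} if memo is None else memo
    if id(node) in memo:
        return memo[id(node)]
    kind = node[0]
    if kind == "const":
        value = node[1]
    elif kind == "var":
        value = node[1] in chosen
    elif kind == "and":
        value = evaluate(node[1], chosen, memo) and evaluate(node[2], chosen, memo)
    else:
        value = any(evaluate(n, chosen, memo) for n in node[1])
    memo[id(node)] = value
    return value
```

On the running example, the returned expression is logically equivalent to ([a = b] ∧ [b = c]) ∨ ([a = d] ∧ [d = c]): one call to `sdp_equality` answers the query for all 16 subsets of G, each of which can then be read off with `evaluate`.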
– 26 –
SDP for other theories
Bounded-depth Saturating Theory T
  Decision procedure for T can be implemented by saturation
  Provide a function Depth: G → Nat, to denote the max. depth to iterate
[Figure: flowchart. Starting from G ∪ {¬e}, the rules R are applied for Depth(G) rounds; if the result contains ⊥ the answer is UNSAT (Yes), otherwise SAT (No)]
– 27 –
SDP for other theories
Equality with Uninterpreted Functions (EUF)
  Expressions: f(x) = f(g(y)), x = f(z)
  Depth(G) < 3m
    m is the number of terms in G
  Polynomial complexity of SDP
Difference Logic (DIFF)
  Expressions: x ≤ y + c
  Depth(G) < lg(|G|)
  Pseudo-polynomial complexity of SDP
    Depends on the size of constants in G
– 28 –
Overview of SDP
Predicate Abstraction
Symbolic Decision Procedures
  Predicate abstraction
SDP for Equality Logic
Combining SDP for two theories
– 29 –
Combining SDPs for two theories
Extend Nelson-Oppen method for combining decision procedures for two theories T1, T2
  [Nelson, Oppen TOPLAS ’79]
  The decision procedures communicate via equalities over shared variables
Given SDP1 and SDP2 for theories T1, T2
  Disjoint signatures, convex theories
  Each theory generates derivations of all equalities between variables
  Complexity of the resultant SDP (for T1 ∪ T2) only increases linearly in the number of variables
– 30 –
Combining SDP for two theories
[Figure: SDP1 (input G1), SDP2 (input G2) and SDP1 (input G1) applied in alternation, each stage passing derived equalities {x = y} to the next; N: number of shared variables]
– 31 –
Combining SDP for theories
Combined SDP for EUF + DIFF
  Pseudo-polynomial complexity
  Important fragment of most program verification queries (especially in SLAM)
– 32 –
SDP to Predicate Abstraction
Output of SDP is an Expression DAG
  Represents F_P(e)
  Can be used directly to construct Boolean programs (with intermediate variables)
To compute explicit expression for F_P(e)
  Construct a Binary Decision Diagram (BDD) from SDP, and enumerate prime-implicants
  BDDs crucial for exploiting the shared representation
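To make the notion of prime implicants concrete, the sketch below enumerates them by brute force from a set of minterms. It is only an illustration of the definition: it does not use BDDs, it is exponential in the number of predicates, and the cube encoding (a tuple with `None` for "don't care") is an assumption.

```python
from itertools import product


def prime_implicants(n, on_set):
    """Return the prime implicants of the function whose minterms are on_set.

    on_set: set of n-tuples of bools. A cube is an n-tuple over
    {True, False, None}, where None means the variable is unconstrained.
    """

    def is_implicant(cube):
        free = [i for i, v in enumerate(cube) if v is None]
        for bits in product([False, True], repeat=len(free)):
            point = list(cube)
            for i, b in zip(free, bits):
                point[i] = b
            if tuple(point) not in on_set:
                return False
        return True

    def is_prime(cube):
        # Prime: dropping any single literal leaves the on-set.
        for i, v in enumerate(cube):
            if v is not None:
                bigger = cube[:i] + (None,) + cube[i + 1:]
                if is_implicant(bigger):
                    return False
        return True

    cubes = product([True, False, None], repeat=n)
    return [c for c in cubes if is_implicant(c) and is_prime(c)]
```

For two predicates with minterms {p1 ∧ p2, p1 ∧ ¬p2}, the only prime implicant is the cube p1, so two minterms collapse into one cube.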
– 33 –
Evaluation
SLAM benchmarks
  Generated 665 predicate abstraction queries from device driver verification
  Decision Procedure (Zapato) based approach: 27904 sec
  SDP based approach: 273 sec
  100X speedup
– 34 –
Challenges
SDP for other interesting theories and combinations
  Linear arithmetic, non-convex theories
Incremental SDPs
  Useful for combining SDPs
Output sensitive predicate abstraction?
  Complexity is polynomial in the number of minterms in the output
– 35 –
Conclusion
Predicate abstraction via symbolic decision procedures
  Polynomial algorithms for useful theories
Modular combination of Symbolic Decision Procedures for theories
  Can design SDP for each theory in isolation
Simple prototype implementation
  Promising results on SLAM queries
– 36 –
Overview of the talk
Two approaches to predicate abstraction
  Symbolic Decision Procedures
  Satisfiability Modulo Theories (SMT) based
Symbolic decision procedures (SDP)
  [Lahiri, Ball, Cook CAV’05]
SMT-based predicate abstraction
  Eager [Lahiri, Bryant, Cook CAV’03]
  DPLL(T) based [Lahiri, Oliveras, Nieuwenhuis CAV’06]
Challenges ahead
– 38 –
Satisfiability Modulo Theories (SMT)
SMT
  Decide satisfiability of a (ground) first-order formula with respect to a background theory T
  Example (EUF): g(a) = c ∧ (f(g(a)) ≠ f(c) ∨ g(a) = d) ∧ c ≠ d
SMT-solvers
  Leverage the efficient Boolean search of Boolean satisfiability (SAT) solvers
– 39 –
SMT for predicate abstraction
Input
  A formula φ, a set of predicates P over a theory T
Output
  G_P(φ): External predicate cover of φ
  Same as ¬F_P(¬φ)
Main Idea [Lahiri et al. CAV’03, Clarke et al. FMSD ’04]
  1. Introduce fresh Boolean variables B = {b1, …, bn}
  2. Construct the formula φ ∧ (∧_i (b_i ⇔ P_i))
  3. Enumerate all the models over B
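The three steps can be sketched in Python. Exhaustive search over a small integer range is an assumption standing in for an SMT solver, and the formula φ: (y > 1 ∧ x = 2) and the predicates P = {x < y, x = 2} are chosen only for illustration.

```python
from itertools import product

# Assumption: a bounded integer range replaces the SMT solver's search.
DOMAIN = range(-4, 5)

# Predicates P1: x < y and P2: x = 2, paired with fresh Booleans b1, b2 (step 1).
PREDS = [lambda x, y: x < y, lambda x, y: x == 2]


def phi(x, y):
    return y > 1 and x == 2


def formula(x, y, b):
    """Step 2: phi ∧ ⋀_i (b_i ⇔ P_i)."""
    return phi(x, y) and all(bi == p(x, y) for bi, p in zip(b, PREDS))


def models():
    """All models (x, y, b) of the formula within the bounded domain."""
    for x, y in product(DOMAIN, repeat=2):
        for b in product([False, True], repeat=len(PREDS)):
            if formula(x, y, b):
                yield x, y, b


def models_over_B():
    """Step 3: project every model onto the Boolean variables B."""
    return {b for _, _, b in models()}
```

Here `models_over_B()` yields the assignments (b1, b2) = (true, true) and (false, true), so the resulting cover is b2, that is, x = 2.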
– 40 –
Eager SMT techniques
Methodology
  Translates a (ground) formula into an equisatisfiable Boolean formula
  Use off-the-shelf SAT solvers to check the satisfiability
Tools: UCLID
[Figure: an equisatisfiable translation maps φ(X, B) to φ_bool(A, B); A: variables introduced during translation]
– 41 –
Predicate abstraction using eager SMT techniques
Methodology [Lahiri, Bryant, Cook CAV’03]
  Translates a (ground) formula into Boolean formula
  Use off-the-shelf BDD or SAT solvers to perform AllSAT over B
  Implemented in UCLID
    Uses SATQE (Kroening)
[Figure: a translation that is equisatisfiable and also preserves solutions over the Boolean variables maps φ ∧ (∧_i (b_i ⇔ P_i)) to φ_bool(A, B); A: variables introduced during translation]
– 42 –
Advantage over explicit approach
Single call to SAT-based Quantification Engine
  Removes exponential number of calls to theorem prover
Learning in Incremental SAT
  Retains conflict clauses across different solutions
Leverage future advances in SAT
  Without any change to the framework
– 43 –
Evaluation
Compared with a black-box decision procedure based approach
  Das, Dill and Park, CAV’99
SLAM benchmarks
  Device driver verification
  Eager SMT technique improves 50-100X on many benchmarks
Distributed protocol verification (UCLID)
  Lahiri, Bryant VMCAI’04
  Decision procedure (SVC/CVC) based approach unable to finish on most examples
    > 10,000 theorem prover calls
– 44 –
Lazy SMT techniques
Integrate a theory T-solver with SAT solver
  Lazily rule out T-inconsistent Boolean models using theory solver
  CVC-Lite, Verifun, MathSAT, Barcelogic, …
Barcelogic Tool
  R. Nieuwenhuis and A. Oliveras CAV’05
  Optimizations (based on DPLL(T))
    1. Check partial Boolean models for T-inconsistency
    2. Upon T-inconsistency, use the explanation as a conflicting clause and perform backjump
    3. Theory (unit) propagation to generate implied facts
– 45 –
Predicate abstraction using lazy methods
Lahiri, Nieuwenhuis, Oliveras CAV’06, using Barcelogic
Enumerate all the models over B for ψ
  ψ := [φ ∧ (∧_i (b_i ⇔ P_i))]
while ψ is T-satisfiable do
  1. M := T-model for ψ using SMT-solver
  2. M_B := project M onto B
  3. Consider ¬M_B as a conflicting clause
     1. Perform conflict analysis to generate backjump clause
     2. Optionally add backjump clause
  4. Backjump and continue
return all models over B
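The enumeration loop can be approximated in Python as below. Two simplifications are assumptions of this sketch: a bounded exhaustive search plays the role of the SMT solver, and the search restarts from scratch after every model, which is precisely the cost that the backjump-and-continue scheme above avoids. The formula φ: (x ≥ y ∧ y > 1) and all names are illustrative.

```python
from itertools import product

# Assumption: a bounded integer range replaces the SMT solver's search.
DOMAIN = range(-4, 5)
PREDS = [lambda x, y: x < y, lambda x, y: x == 2]


def phi(x, y):
    return x >= y and y > 1


def find_model(blocked):
    """Stand-in for the SMT solver: return one model (x, y, b) of
    phi ∧ ⋀_i (b_i ⇔ P_i) that violates no blocking clause, or None."""
    for x, y in product(DOMAIN, repeat=2):
        if not phi(x, y):
            continue
        b = tuple(p(x, y) for p in PREDS)  # values forced by b_i ⇔ P_i
        if b not in blocked:
            return x, y, b
    return None


def all_sat():
    """Enumerate the models over B, blocking each projected model once found."""
    blocked = set()  # each entry plays the role of the clause ¬M_B
    solver_calls = 0
    while True:
        solver_calls += 1
        model = find_model(blocked)
        if model is None:  # formula plus blocking clauses is unsatisfiable
            break
        _, _, m_b = model  # project the model onto B
        blocked.add(m_b)
    return blocked, solver_calls
```

With these inputs, `all_sat()` finds the two models (false, true) and (false, false) in three solver calls: one per model plus a final unsatisfiable call.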
– 46 –
Experimental results
SLAM benchmarks
  ~5 seconds on 665 benchmarks
  > 100X improvement over SDP based approach
Hardware and protocol benchmarks [UCLID]
  7 sets of benchmarks
  22X – 143X improvement over Eager-SMT based approach
Linked list verification [Lahiri, Qadeer POPL’06]
  4 sets of benchmarks
  31X – 40X improvement over Eager-SMT based approach
SDP-based technique not applied on the latter two classes
  Need support for (sound) quantifier-reasoning
– 47 –
Hardware and protocol benchmarks
1. Theory propagation crucial for benchmarks with arithmetic
   E.g. 17X slowdown in OOO without it
2. Reusing lemmas and clauses improves 1.5X – 3X on most examples

Benchmark    Preds  Eager (secs)  Lazy (secs)  # minterms  # cubes
Aodv         21     657           4.6          2916        458
Bakery       32     245           11           426         294
BRP          22     3.5           0.1          30          24
Cache_ibm    16     34            1.3          326         123
Cache_ibm2   26     1119          23           2238        1022
Dlx          23     335           13           30808       2704
OOO          25     921           36           10728       242

# cubes: Number of prime-implicants in the BDD for the minterms
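The speedups implied by the table can be recomputed directly. The figures below are copied from the Eager, Lazy, # minterms and # cubes columns; only the variable names are new.

```python
# (eager secs, lazy secs, # minterms, # cubes), copied from the table above.
ROWS = {
    "Aodv": (657, 4.6, 2916, 458),
    "Bakery": (245, 11, 426, 294),
    "BRP": (3.5, 0.1, 30, 24),
    "Cache_ibm": (34, 1.3, 326, 123),
    "Cache_ibm2": (1119, 23, 2238, 1022),
    "Dlx": (335, 13, 30808, 2704),
    "OOO": (921, 36, 10728, 242),
}

# Speedup of the lazy approach over the eager one, per benchmark.
speedup = {name: eager / lazy for name, (eager, lazy, _, _) in ROWS.items()}

# Ratio of minterms to cubes, per benchmark.
minterms_per_cube = {name: m / c for name, (_, _, m, c) in ROWS.items()}

slowest_gain = min(speedup, key=speedup.get)
largest_gain = max(speedup, key=speedup.get)

for name in ROWS:
    print(f"{name:12s} speedup {speedup[name]:6.1f}X  "
          f"minterms/cubes {minterms_per_cube[name]:5.1f}")
```

The smallest ratio is Bakery (245 / 11 ≈ 22) and the largest is Aodv (657 / 4.6 ≈ 143), which matches the 22X – 143X range quoted for these benchmarks.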
– 48 –
Conclusions
Relatively easy to adapt an SMT solver to perform predicate abstraction
  Clear benefit from leveraging learned clauses and not restarting the search after each model
Improvements in SMT translate to the predicate abstraction case
– 49 –
Overview of the talk
Two approaches to predicate abstraction
  Symbolic Decision Procedures
  Satisfiability Modulo Theories (SMT) based
Symbolic decision procedures (SDP)
  [Lahiri, Ball, Cook CAV’05]
SMT-based predicate abstraction
  Eager [Lahiri, Bryant, Cook CAV’03]
  DPLL(T) based [Lahiri, Oliveras, Nieuwenhuis CAV’06]
Challenges ahead
– 50 –
Summary
Symbolic decision procedures
  Can construct DAG representation of output in polynomial time for useful theories
  Modular combination of SDPs
  Require more optimizations to make it practical
SMT-based procedures
  Can leverage SMT solvers without much effort
  ALLSAT using SAT-solvers (Eager) or SMT solvers (Lazy)
  Lazy approaches benefit from tighter SAT+theory reasoning
– 51 –
Challenges for predicate abstraction tools
Predicate abstraction with non-ground formulas
  Quantifiers were removed with simple instantiation techniques for UCLID/List verification benchmarks
Generate partial models during ALLSAT
  Should improve the performance when the ratio of #minterms : #cubes is large
Incremental refinement of approximations
  Construct refined approximation of F_P(φ) from coarser approximations, without repeating work
  Some initial directions in CAV’06 paper
Refining the abstraction (incrementally) with monotonically increasing set of predicates
– 54 –
Overview
Predicate Abstraction
Symbolic Decision Procedures (SDP)
  Predicate abstraction
SDP for Equality Logic
Combining SDP for two theories
Implementation and Results
Related Work
– 55 –
Zap Overview
[Ball, Lahiri, Musuvathi]
Many automated program analysis tools require symbolic reasoning
  e.g. Unit-testing, model checking, static analysis, …
Support symbolic operations for such tools
  Support richer operations, apart from validity checking
  Support useful theories for program analysis
  Leverage advances in SAT solving and theorem proving
[Figure: Zap (theorem prover) shown with MUTT (unit-testing), Zing (model checking), Boogie (static analysis) and SLAM/SDV]
– 56 –
Symbolic Reasoning for Automated Software Analysis
Validity / Satisfiability
Model generation
  Useful in test case generation
Quantifier elimination
  Image operation in model checking
Abstract interpretation operations
  Abstract transformers, join, widen
Interpolants
  For abstraction-refinement
– 57 –
Interesting Theories
Theories
  Equality with uninterpreted functions (EUF)
  Linear Arithmetic
  Arrays
  Bounded Integers
  Lists
  Sets
Combine the symbolic operations for different theories
– 58 –
Symbolic Reasoning for Automated Software Analysis
Validity / Satisfiability
Model generation
  Useful in test case generation
Quantifier elimination
  Image operation in model checking
Abstract interpretation operations
  Abstract transformers, join, widen
Interpolants
  For abstraction-refinement
– 60 –
Evaluation
SLAM benchmarks
  Generated 665 predicate abstraction queries from device driver verification
  Decision Procedure based approach: 27904 sec
  SDP based approach: 273 sec
  100X speedup
Synthetic benchmark
  Comparison with UCLID
  More than 100X speedup
– 61 –
Related Work
Decision Procedure Based
  Calls a decision procedure to check implication with each minterm
  [Das & Dill], [Saidi & Shankar], …
Boolean Quantifier Elimination Based
  [Lahiri, Bryant, Cook, CAV 03, Clarke et al., FMSD 04]
  Performs predicate abstraction by quantifier elimination
  Reduces restricted first-order quantifier elimination to Boolean quantifier elimination
– 62 –
Experimental Setup
Symbolic Method
  Incremental SAT-based method
    SATQE: Simple extension to Zchaff
      Built by Daniel Kroening at CMU
Explicit Method
  Algorithm of Das, Dill & Park, CAV’99
    Avoids exponential worst case in many cases in practice
    Uses SVC as a decision procedure
Device Driver Benchmarks from SLAM Toolkit
  Ball and Rajamani, MSR
  Queries during C to Boolean Program construction
– 63 –
Evaluation on SLAM-benchmarks
BDD based approach worse than SAT on larger benchmarks

                   Explicit               Symbolic
Example  # Preds   #Calls   Time (sec)    #Prop-vars  SAT-based time (sec)
Dr.10    19        >7576    >1000         115         9.9
Dr.13    20        >7351    >1000         234         44.7
Dr.15    23        >7237    >1000         336         68.2
Dr.17    15        3041     507           105         6.1
Dr.3     13        2023     355           125         7.0