
Applications of Logic in Software Engineering

CS402, Spring 2016 Shin Yoo

Acknowledgements

• I borrow slides from:

• Moonzoo Kim

• Theo C. Ruys (http://spinroot.com/spin/Doc/SpinTutorial.pdf)

• CBMC & Daniel Kroening (http://www.cprover.org/cbmc/doc/cbmc-slides.pdf)


What are computers good at?

• Logical arguments?

• Fast computation?

What is the battlefront in AI?

• Prolog?

• Big data + machine learning?

(a rhetorical) Question

• You are given a very long, very complex formula in propositional logic. You have to show its validity. How do you proceed?

• Preprocess(?) the formula as much as possible to make it simpler; try proof calculus.

• Start constructing the truth table.
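The truth-table option can be sketched in a few lines of C. This is only an illustration: the names is_valid, peirce and implication are hypothetical, and the formula is passed as a function pointer rather than parsed from text. The point is that the check is mechanical, but it visits up to 2^n rows.

```c
#include <stdbool.h>

/* A formula is modelled as a function over an array of truth values. */
typedef bool (*Formula)(const bool *vars);

/* Returns true if the formula holds in every row of its truth table.
 * Assumes 0 <= n <= 31; the loop visits all 2^n assignments. */
bool is_valid(Formula formula, int n)
{
    bool vars[32];
    for (unsigned long row = 0; row < (1UL << n); row++) {
        for (int i = 0; i < n; i++)
            vars[i] = (row >> i) & 1UL;
        if (!formula(vars))
            return false;          /* this row is a falsifying assignment */
    }
    return true;
}

/* Peirce's law: ((p -> q) -> p) -> p, with a -> b written as !a || b. */
bool peirce(const bool *v)
{
    bool p = v[0], q = v[1];
    bool p_implies_q = !p || q;
    bool inner = !p_implies_q || p;
    return !inner || p;
}

/* p -> q on its own: satisfiable but not valid. */
bool implication(const bool *v)
{
    return !v[0] || v[1];
}
```

Calling is_valid(peirce, 2) checks four rows; a formula over 6224 variables would need 2^6224 rows, which is why this approach does not scale on its own.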

How do you check whether your program is correct?

• Prove its correctness. That is, the program has to be correct (with respect to a set of specifications). This is called verification.

• Check its behaviour as thoroughly as possible. That is, execute the program with as many inputs as possible, and check that the behaviour conforms to the expectation. This is called validation (also, more commonly, testing).

• Check its behaviour with the input you had in mind. That is, execute the program with the given example input, and check that it does not crash. This is called.. umm…

Solving Various Problems using SAT Solver

[Figure: each problem (Sudoku Puzzle, Verify/Testing C Programs, Latin Square Problem, Traveling Salesman Problem, Optimal Path Planning) goes through its own encoding (Encoding 1, Encoding 2, Encoding 3, …, Encoding n) into a CNF SAT Formula, which is passed to the SAT Solver]

Moonzoo Kim, CS402, Spring 2013

Operational Semantics of Software

• A system execution σ is a sequence of states s0s1…
 – A state has an environment ρs: Var -> Val

• A system has its semantics as a set of system executions

[Figure: example state graph with states s0 (x:0,y:0), s1 (x:0,y:1), s2 (x:1,y:2), s3 (x:1,y:3), s4 (x:2,y:4); s11 (x:5,y:1), s12 (x:5,y:2), s13 (x:5,y:3), s14 (x:5,y:4); s21 (x:7,y:3), s22 (x:7,y:4)]

Theo C. Ruys - SPIN Beginners' Tutorial version: Friday, 13 September 2002

SPIN 2002 Workshop, Grenoble, 11-13 April 2002

What is Model Checking?

• [Clarke & Emerson 1981]:

“Model checking is an automated technique that, given a finite-state model of a system and a logical property, systematically checks whether this property holds for (a given initial state in) that model.”

• Model checking tools automatically verify whether M ⊨ φ holds, where M is a (finite-state) model of a system and property φ is stated in some formal notation.

• Problem: state space explosion! Although finite-state, the model of a system typically grows exponentially.

• SPIN [Holzmann 1991] is one of the most powerful model checkers. Based on [Vardi & Wolper 1986].


System Development

[Figure: classic “waterfall model” [Pressman 1996] with the phases System Engineering, Analysis, Design, Code, Testing and Maintenance, annotated with the phases at which “Classic” Model Checking and “Modern” Model Checking are applied]

Pros and Cons of Model Checking

• Pros
 – Fully automated and provides complete coverage
 – Concrete counterexamples
 – Full control over every detail of system behavior

• Highly effective for analyzing
 – embedded software
 – multi-threaded systems

• Cons
 – State explosion problem
 – An abstracted model may not fully reflect a real system
 – Needs to use a specialized modeling language
   • Modeling languages are similar to programming languages, but simpler and clearer

Example of Model Checking

    thread A() {
        unsigned char x;
    again:
        x++;
        goto again;
    }

[Figure: chain of 256 states x:0, x:1, x:2, …, x:255]

    thread A() {
        unsigned char x;
    again:
        x++;
        goto again;
    }

    thread B() {
        unsigned char y;
    again:
        y++;
        goto again;
    }

[Figure: 256 x 256 grid of states, from x:0,y:0 through x:255,y:0 in the first row to x:0,y:255 through x:255,y:255 in the last]
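The growth from 256 to 256 x 256 states can be checked by an explicit-state search, which is roughly what an explicit model checker does. The sketch below is an illustration only: the function name count_reachable_states is hypothetical, and it assumes both counters start at a known value (the slide leaves x and y uninitialised) and that either thread may take the next step.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { RANGE = 256 };   /* number of values an 8-bit unsigned char can hold */

/* Counts the states (x, y) reachable from (x0, y0) when, at each step,
 * either thread A executes x++ or thread B executes y++. */
size_t count_reachable_states(uint8_t x0, uint8_t y0)
{
    static bool visited[RANGE][RANGE];
    static uint16_t stack[RANGE * RANGE];   /* each state is pushed at most once */
    size_t top = 0, count = 0;

    memset(visited, 0, sizeof visited);
    visited[x0][y0] = true;
    stack[top++] = (uint16_t)((x0 << 8) | y0);

    while (top > 0) {
        uint16_t s = stack[--top];
        uint8_t x = (uint8_t)(s >> 8);
        uint8_t y = (uint8_t)(s & 0xFF);
        count++;

        uint8_t nx = (uint8_t)(x + 1);      /* thread A steps; 255 wraps to 0 */
        if (!visited[nx][y]) {
            visited[nx][y] = true;
            stack[top++] = (uint16_t)((nx << 8) | y);
        }

        uint8_t ny = (uint8_t)(y + 1);      /* thread B steps; 255 wraps to 0 */
        if (!visited[x][ny]) {
            visited[x][ny] = true;
            stack[top++] = (uint16_t)((x << 8) | ny);
        }
    }
    return count;
}
```

Each additional one-byte counter multiplies the number of states by 256, which is the state explosion problem in miniature.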

But wait… what?

• CS402 dealt with Greek letters, not C code. How do we reason about arbitrary programming languages using what we have learnt so far?

• My answer: but you also did not expect to solve Nonogram in CS402 - that has nothing to do with Greek letters either :)

Example. Sort (1/2)

• Suppose that we have an array of 4 elements, each of which is 1 byte long
 – unsigned char a[4];

• We want to verify that sort.c works correctly
 – main() { sort(); assert(a[0] <= a[1] && a[1] <= a[2] && a[2] <= a[3]); }

• Hash table based explicit model checker (ex. Spin) generates at least 2^32 (= 4x10^9 = 4G) states

• 4G states x 4 bytes = 16 Gbytes, no way…

• Binary Decision Diagram (BDD) based symbolic model checker (ex. NuSMV) takes 200 MB in 400 sec

[Figure: example array contents 9, 14, 2, 255]

Example. Sort (2/2)

    #include <stdio.h>
    #include <assert.h>
    #define N 5

    /* Declared without a body so that CBMC can treat its result as a nondeterministic input */
    int nondet_int(void);

    int main(){
        int data[N], i, j, tmp;
        /* Assign random values to the array */
        for (i=0; i<N; i++){
            data[i] = nondet_int();
        }
        /* It misses the last element, i.e., data[N-1] */
        for (i=0; i<N-1; i++)
            for (j=i+1; j<N-1; j++)
                if (data[i] > data[j]){
                    tmp = data[i];
                    data[i] = data[j];
                    data[j] = tmp;
                }
        /* Check the array is sorted */
        for (i=0; i<N-1; i++){
            assert(data[i] <= data[i+1]);
        }
        return 0;
    }

• SAT-based Bounded Model Checker
• Total 19099 CNF clauses with 6224 Boolean propositional variables
• Theoretically, 2^6224 choices should be evaluated!!!

                UNSAT VSIDS    SAT VSIDS
    Conflicts   35067          73
    Decisions   161406         2435
    Time(sec)   1.89           0.015

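Overview of SAT-based Bounded Model Checking

[Figure: two verification flows.
 Model checking: the Requirements become Formal Requirement Properties (F W) and the C Program becomes an Abstract Model; both are given to the Model Checker, which answers either Okay (satisfied) or Counter example (not satisfied).
 SAT-based bounded model checking: the Requirements become Formal Requirement Properties in C (ex. assert( x < a[i]); ); the C Program and the properties go through Translation to SAT formula and then to the SAT Solver, which answers either Okay (satisfied) or Counter example (not satisfied).]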

Software Model Checking as a SAT problem (1/4)

• Control-flow simplification
 – All side effects are removed
   • i++ => i=i+1;
 – Control flow is made explicit
   • continue, break => goto
 – Loop simplification
   • for(;;), do {…} while() => while()

Software Model Checking as a SAT problem (2/4)

• Unwinding Loop

Original code

    x=0;
    while(x < 2){
        y=y+x;
        x++;
    }

Unwinding the loop 1 time

    x=0;
    if (x < 2) { y=y+x; x++; }
    /* Unwinding assertion */
    assert(!(x < 2))

Unwinding the loop 3 times

    x=0;
    if (x < 2) { y=y+x; x++; }
    if (x < 2) { y=y+x; x++; }
    if (x < 2) { y=y+x; x++; }
    /* Unwinding assertion */
    assert(!(x < 2))
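The role of the unwinding assertion can be seen by running the variants side by side. In this sketch the function names are hypothetical and the unwinding assertion is returned as a flag instead of aborting, so that both bounds can be compared: the flag is true only when the chosen bound covers every iteration of the original loop.

```c
#include <stdbool.h>

/* The original loop from the slide, with y as its only input. */
int run_loop(int y)
{
    int x = 0;
    while (x < 2) { y = y + x; x++; }
    return y;
}

/* Loop body copied once; *covered reports the unwinding assertion !(x < 2). */
int run_unwound_once(int y, bool *covered)
{
    int x = 0;
    if (x < 2) { y = y + x; x++; }
    *covered = !(x < 2);
    return y;
}

/* Loop body copied three times; one copy more than this loop ever needs. */
int run_unwound_three(int y, bool *covered)
{
    int x = 0;
    if (x < 2) { y = y + x; x++; }
    if (x < 2) { y = y + x; x++; }
    if (x < 2) { y = y + x; x++; }
    *covered = !(x < 2);
    return y;
}
```

With a bound of 1 the flag is false, so any result obtained from that unwinding says nothing about the second iteration; with a bound of 3 the flag is true and the unwound code agrees with run_loop.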

Scalability of Path Search

Let’s consider the following CFG:

[Figure: control-flow graph with nodes L1, L2, L3 and L4]

This is a loop with an if inside.

Q: how many paths for n iterations?

CBMC: Bounded Model Checking for ANSI-C – http://www.cprover.org/ 14

Bounded Model Checking

• Bounded Model Checking (BMC) is the most successful formal validation technique in the hardware industry

• Advantages:
 – Fully automatic
 – Robust
 – Lots of subtle bugs found

• Idea: only look for bugs up to specific depth

• Good for many applications, e.g., embedded systems


Model Checking as a SAT problem (3/4)

• From C Code to SAT Formula

Original code

    x=x+y;
    if (x!=1)
        x=2;
    else
        x++;
    assert(x<=3);

Convert to static single assignment (SSA)

    x1=x0+y0;
    if (x1!=1)
        x2=2;
    else
        x3=x1+1;
    x4=(x1!=1)?x2:x3;
    assert(x4<=3);

Generate constraints

    C ≡ x1=x0+y0 ∧ x2=2 ∧ x3=x1+1 ∧ ((x1≠1 ∧ x4=x2) ∨ (x1=1 ∧ x4=x3))
    P ≡ x4 ≤ 3

Check if C ∧ ¬P is satisfiable; if it is, then the assertion is violated. C ∧ ¬P is converted to Boolean logic using a bit vector representation for the integer variables y0, x0, x1, x2, x3, x4.
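For tiny bit widths, the satisfiability check on C and not P can be imitated by enumeration. The sketch below is not how CBMC works internally (it hands a CNF formula to a SAT solver); it assumes 8-bit unsigned variables, the name find_violation is hypothetical, and the assertion bound is made a parameter so that both a satisfiable and an unsatisfiable case can be tried.

```c
#include <stdbool.h>
#include <stdint.h>

/* Searches all 8-bit values of x0 and y0 for an assignment that satisfies
 * C and not P, where P is "x4 <= limit". The SSA equations in C determine
 * x1..x4 from x0 and y0, so only the two inputs need to be enumerated.
 * Returns true and stores the inputs if a counterexample exists. */
bool find_violation(uint8_t limit, uint8_t *x0_out, uint8_t *y0_out)
{
    for (unsigned x0 = 0; x0 < 256; x0++) {
        for (unsigned y0 = 0; y0 < 256; y0++) {
            uint8_t x1 = (uint8_t)(x0 + y0);
            uint8_t x2 = 2;
            uint8_t x3 = (uint8_t)(x1 + 1);
            uint8_t x4 = (x1 != 1) ? x2 : x3;
            if (!(x4 <= limit)) {            /* not P holds: assertion violated */
                *x0_out = (uint8_t)x0;
                *y0_out = (uint8_t)y0;
                return true;
            }
        }
    }
    return false;                            /* C and not P is unsatisfiable */
}
```

With limit 3, as on the slide, no assignment is found because x4 is always 2, so the assertion holds. With a hypothetical limit of 1 the very first assignment (x0=0, y0=0) is a counterexample.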

Model Checking as a SAT problem (4/4)

• Example of arithmetic encoding into pure propositional formula

Assume that x, y, z are three-bit positive integers represented by propositions x0x1x2, y0y1y2, z0z1z2

    C ≡ z=x+y ≡ (z0 ↔ (x0⊕y0) ⊕ ((x1∧y1) ∨ ((x1⊕y1) ∧ (x2∧y2))))
              ∧ (z1 ↔ (x1⊕y1) ⊕ (x2∧y2))
              ∧ (z2 ↔ (x2⊕y2))

Eventually, everything is Boolean.
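The encoding can be sanity-checked against ordinary integer addition. The sketch below assumes, as the formula suggests, that index 0 is the most significant bit and that the carry out of the top bit is dropped, so the formula describes z = (x + y) mod 8. The function names are hypothetical.

```c
#include <stdbool.h>

/* Bit i of a 3-bit value, where index 0 is the most significant bit. */
static bool bit(unsigned v, int i)
{
    return (v >> (2 - i)) & 1u;
}

/* Evaluates the propositional formula for z = x + y on concrete bits. */
bool adder_formula(unsigned x, unsigned y, unsigned z)
{
    bool x0 = bit(x, 0), x1 = bit(x, 1), x2 = bit(x, 2);
    bool y0 = bit(y, 0), y1 = bit(y, 1), y2 = bit(y, 2);
    bool z0 = bit(z, 0), z1 = bit(z, 1), z2 = bit(z, 2);

    bool carry2 = x2 && y2;                               /* carry out of the lowest bit */
    bool carry1 = (x1 && y1) || ((x1 ^ y1) && carry2);    /* carry out of the middle bit */

    return (z0 == ((x0 ^ y0) ^ carry1))
        && (z1 == ((x1 ^ y1) ^ carry2))
        && (z2 == (x2 ^ y2));
}

/* Counts the triples (x, y, z) on which the formula disagrees with
 * the arithmetic statement z == (x + y) mod 8. */
int count_mismatches(void)
{
    int mismatches = 0;
    for (unsigned x = 0; x < 8; x++)
        for (unsigned y = 0; y < 8; y++)
            for (unsigned z = 0; z < 8; z++)
                if (adder_formula(x, y, z) != (((x + y) & 7u) == z))
                    mismatches++;
    return mismatches;
}
```

A mismatch count of zero over all 512 triples means the formula and the arithmetic agree on three-bit values.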



“Classic” Model Checking

[Figure: an Abstract Verification Model is obtained from the (initial) Design through (manual) abstractions and is given to the Model Checker; the Implementation is obtained from the design through refinement techniques]

“Modern” Model Checking

[Figure: a Verification Model is obtained from the Implementation through systematic abstraction techniques and is given to the Model Checker]

• Abstraction is the key activity in both approaches: it is needed to cope with the state space explosion.

• This talk deals with pure SPIN, i.e., the “classic” model checking approach.

But that looks extremely painful to do manually every time we want to prove something, doesn’t it?

SMT: Satisfiability Modulo Theories

• SMT problem: decision problem for logical formulas with respect to background theories in classical first-order logic.

• An SMT instance is a predicate logic formula; the aim is to determine whether such a formula is satisfiable. The underlying problem is still Boolean satisfiability.

SMT: The Eager Approach

• Immediately encode the first order formulas into Boolean SAT, and invoke SAT solvers.

• Can rely on advances in Boolean SAT solvers

• However, loss of high-level semantics means sometimes it struggles with obvious statements, such as x + y = y + x.

Enabling Technology: SAT

[Figure: number of variables of a typical, practical SAT instance that can be solved by the best solvers in that decade; horizontal axis from 1960 to 2010, vertical axis from 10 to 1,000,000 in powers of ten]


SMT: The Lazy Approach

• Davis-Putnam-Logemann-Loveland algorithm (DPLL) (1962) is a backtracking-based search algorithm that is used to determine satisfiability of propositional logic formulas in conjunctive normal form (i.e. CNF-SAT).

• Theory solver is concerned with feasibility of conjunctions of theory-specific predicates, infers new facts from known facts, and interacts with the SAT solver with respect to propagation and backtracking.
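The SAT side of this interaction can be sketched as a minimal DPLL procedure: unit propagation followed by a decision with backtracking. This is a teaching sketch under simplifying assumptions, not how a production solver or an SMT engine is written; there is no clause learning, no VSIDS heuristic and no theory solver, and the names Cnf, scan and dpll are hypothetical. Clauses use the DIMACS convention of signed integers terminated by 0.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

enum { MAX_VARS = 32 };

typedef struct {
    int num_vars;        /* variables are numbered 1..num_vars (at most MAX_VARS) */
    int num_clauses;
    const int *lits;     /* each clause is a run of literals ending in 0 */
} Cnf;

/* Value of a literal under a partial assignment: 1 true, -1 false, 0 unassigned. */
static int lit_value(int lit, const int *assign)
{
    int v = assign[abs(lit)];
    return lit > 0 ? v : -v;
}

/* Scans all clauses once. Returns -1 if some clause is falsified, 1 if every
 * clause is satisfied, and 0 otherwise. *unit receives a literal forced by
 * a unit clause, or 0 if no clause is unit. */
static int scan(const Cnf *f, const int *assign, int *unit)
{
    const int *p = f->lits;
    bool all_satisfied = true;
    *unit = 0;
    for (int c = 0; c < f->num_clauses; c++) {
        int unassigned = 0, last = 0;
        bool satisfied = false;
        for (; *p != 0; p++) {
            int v = lit_value(*p, assign);
            if (v > 0) satisfied = true;
            else if (v == 0) { unassigned++; last = *p; }
        }
        p++;                                   /* skip the terminating 0 */
        if (satisfied) continue;
        if (unassigned == 0) return -1;        /* conflict */
        all_satisfied = false;
        if (unassigned == 1 && *unit == 0) *unit = last;
    }
    return all_satisfied ? 1 : 0;
}

/* assign must have MAX_VARS + 1 entries, all 0 on the first call.
 * On success it holds a satisfying assignment; on failure its contents
 * are unspecified. */
bool dpll(const Cnf *f, int *assign)
{
    int unit, status;

    /* Unit propagation: assign forced literals until none remain. */
    while ((status = scan(f, assign, &unit)) == 0 && unit != 0)
        assign[abs(unit)] = unit > 0 ? 1 : -1;
    if (status != 0)
        return status > 0;

    /* Decision: pick the first unassigned variable and try both values. */
    int var = 1;
    while (var <= f->num_vars && assign[var] != 0)
        var++;
    if (var > f->num_vars)
        return false;                          /* not reached for well-formed input */

    for (int value = 1; value >= -1; value -= 2) {
        int trial[MAX_VARS + 1];
        memcpy(trial, assign, sizeof trial);
        trial[var] = value;
        if (dpll(f, trial)) {
            memcpy(assign, trial, sizeof trial);
            return true;
        }
    }
    return false;                              /* both values failed: backtrack */
}
```

In the lazy approach, a theory solver would additionally be consulted on each (partial) assignment, and a theory conflict would trigger backtracking in the same way as a falsified clause does here.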

Concolic Testing

• Suppose we can solve satisfiability problems, extended with theories about integers, lists, arrays, etc. What can we use it for in order to check our programs?

    void testme(int[] a) throws Exception {
        if (a == null) return;
        if (a.length > 0) {
            if (a[0] == 42)
                throw new Exception("bug");
        }
    }

[Figure: loop of Execute, negate the last condition and choose another path, then Solve; the decision tree branches on a==null, then a.length > 0, then a[0] == 42, each with a true and a false edge]

    Constraints to Solve                     Data    Observed Path Condition
    (none)                                   null    a==null
    a!=null                                  {}      a!=null && !(a.length > 0)
    a!=null && a.length > 0                  {0}     a!=null && a.length > 0 && a[0] != 42
    a!=null && a.length > 0 && a[0] == 42    {42}    a!=null && a.length > 0 && a[0] == 42

No more path!

Example

    typedef struct cell {
        int v;
        struct cell *next;
    } cell;

    int f(int v) { return 2*v + 1; }

    int testme(cell *p, int x) {
        if (x > 0)
            if (p != NULL)
                if (f(x) == p->v)
                    if (p->next == p)
                        Error();
        return 0;
    }

• Random Test Driver:

• random memory graph reachable from p

• random value for x

• Probability of reaching Error( ) is extremely low

Example from the slides “CUTE: A Concolic Unit Testing Engine for C” by K.Sen 2005

Concolic Testing

Each run of testme(p, x) is executed concretely and symbolically at the same time. The symbolic state is p=p0, x=x0, plus p->v=v0 and p->next=n0 once p points to a cell.

• Run 1
 – concrete state: x=236, p=NULL
 – constraints collected: x0>0, p0=NULL (i.e. !(p0!=NULL))
 – solve: x0>0 and p0≠NULL
 – solution: x0=236, p0 points to a cell with v=634, next=NULL

• Run 2
 – concrete state: x=236, p points to a cell with v=634, next=NULL
 – constraints collected: x0>0, p0≠NULL, 2x0+1≠v0
 – solve: x0>0 and p0≠NULL and 2x0+1=v0
 – solution: x0=1, p0 points to a cell with v=3, next=NULL

• Run 3
 – concrete state: x=1, p points to a cell with v=3, next=NULL
 – constraints collected: x0>0, p0≠NULL, 2x0+1=v0, n0≠p0
 – solve: x0>0 and p0≠NULL and 2x0+1=v0 and n0=p0
 – solution: x0=1, p0 points to a cell with v=3 whose next points back to itself

• Run 4
 – concrete state: x=1, p points to a cell with v=3 whose next points back to itself
 – constraints collected: x0>0, p0≠NULL, 2x0+1=v0, n0=p0
 – Error() reached
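The four inputs from the walk-through (x=236 with p=NULL; x=236 with a cell holding 634; x=1 with a cell holding 3; x=1 with a cell holding 3 that points to itself) can be replayed to confirm that each one gets one branch deeper. The sketch below copies the branch structure of testme but returns the depth reached instead of calling Error(); the name branch_depth is hypothetical, and the inputs are written by hand here, whereas a concolic engine would obtain them from a constraint solver.

```c
#include <stddef.h>

typedef struct cell {
    int v;
    struct cell *next;
} cell;

static int f(int v) { return 2 * v + 1; }

/* Same nested conditions as testme(); returns how many of them held.
 * A result of 4 means Error() would have been reached. */
int branch_depth(const cell *p, int x)
{
    if (!(x > 0))      return 0;
    if (p == NULL)     return 1;
    if (f(x) != p->v)  return 2;
    if (p->next != p)  return 3;
    return 4;
}
```

Replaying the inputs gives depths 1, 2, 3 and 4 in turn, matching the constraint sets collected in the four runs.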

Limitations

• Unlike model checking, we operate directly on top of program source code: no abstraction.

• There may exist formulas that SMT solvers cannot handle: for example, if( sin(x) + cos(x) == 0.3) { error(); }

• Some limitations on complex pointer and array operations.

• What if the aim of testing is not logical correctness but something else, such as execution time or memory usage?

Relative Strength

• Model checking: specify a property that needs to be checked, prove that either it is not violated, or there exists a counterexample

• Concolic testing: IF there is an explicit condition that you can check (e.g. assertions or exceptions), it can try to reach that condition. Otherwise, it produces a test input.

Teacher: add all numbers from 1 to 100!

Young Gauss: 1 + 2 + ... + 50 + 100 + 99 + … + 51

= 101 + 101 + … + 101 = 101 * 50 = 5050

Metaheuristic: Is it 1000?
Teacher: No.
Metaheuristic: Is it 1001?
Teacher: No.
… (after a while)
Metaheuristic: Is it 5050?
Teacher: Yes!

Metaheuristic

• Essentially smart trial and error

• Try a solution

• Get feedback on how good it was

• Move the solution in a “better” direction

• Repeat until the problem is solved
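The loop above can be sketched as a simple hill climber on the teacher's problem. One assumption goes beyond the dialogue: here the feedback is the distance between the guess and the answer, rather than a plain yes or no, because the search needs to know which direction is "better". The names feedback and hill_climb are hypothetical.

```c
#include <stdlib.h>

/* Feedback for a guess: distance to the hidden answer (0 means solved). */
static long feedback(long guess, long answer)
{
    return labs(guess - answer);
}

/* Hill climbing over integers: look at both neighbours of the current
 * guess and move to the one with the better feedback, until solved.
 * *moves receives the number of moves taken. */
long hill_climb(long start, long answer, long *moves)
{
    long current = start;
    *moves = 0;
    while (feedback(current, answer) != 0) {
        long left = current - 1;
        long right = current + 1;
        current = feedback(left, answer) < feedback(right, answer) ? left : right;
        (*moves)++;
    }
    return current;
}
```

Starting from 1000, this reaches 5050 after 4050 moves without ever using the structure of the sum, which is the contrast with young Gauss.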

Test Data Generation

• Fitness function (for branch coverage) = [approximation level] + normalise([branch distance])

• We want to execute a specific branch, but the current input value does not follow the required path. Then:

• Approximation level: number of nesting levels between current path and our target branch

• Branch distance: distance in the current predicate between desired and current status

[Figure: control-flow graph showing the branch we want to reach and the path executed by the current input; Approximation Level = 2, with the Branch Distance measured at the predicate where the two diverge]

Branch Distance

• Wait, predicates are Boolean. What do you mean, distance?

• To satisfy x == y, convert it to b = |x - y| and minimise b: when it becomes 0, x becomes equal to y.

• To satisfy y >= x, convert it to b = x - y + K and minimise b: when it becomes 0, y is greater than x by K.

• Normalise: bnorm = 1 - 1.001^(-b)

    Predicate    f            minimise until..
    a > b        b - a + K    f < 0
    a >= b       b - a + K    f <= 0
    a < b        a - b + K    f < 0
    a <= b       a - b + K    f <= 0
    a == b       |a - b|      f == 0
    a != b       -|a - b|     f < 0

B. Korel, “Automated software test data generation,” IEEE Trans. Softw. Eng., vol. 16, pp. 870–879, August 1990.

Branch Distance

    if(c >= 4)
        if(c <= 10)
            if(a == b)
                target

Fitness Function, for test input (a, b, c) with K = 1

    Test input     Diverges at         app. lvl    b. dist           f
    (11, 2, 1)     c >= 4 is False     2           4 - c + 1 = 4     2 + (1 - 1.001^-4) = 2.004
    (11, 2, 11)    c <= 10 is False    1           c - 10 + 1 = 2    1 + (1 - 1.001^-2) = 1.002
    (11, 2, 9)     a == b is False     0           |11 - 2| = 9      0 + (1 - 1.001^-9) = 0.009
    (2, 2, 9)      target reached      0           |2 - 2| = 0       0 + (1 - 1.001^0) = 0
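The fitness values for this example can be recomputed with a short function. This is a sketch specific to the three nested predicates above, with K = 1; the names normalise and fitness are hypothetical, and the power 1.001^(-b) is computed by repeated division because every branch distance in this example is a non-negative integer.

```c
#include <stdlib.h>

enum { K = 1 };

/* Normalised branch distance: 1 - 1.001^(-b), for an integer b >= 0. */
static double normalise(int b)
{
    double power = 1.0;                  /* becomes 1.001^(-b) */
    for (int i = 0; i < b; i++)
        power /= 1.001;
    return 1.0 - power;
}

/* Fitness of input (a, b, c) for reaching the target guarded by
 * c >= 4, c <= 10 and a == b: approximation level plus the normalised
 * branch distance at the first predicate that fails. */
double fitness(int a, int b, int c)
{
    if (!(c >= 4))
        return 2.0 + normalise(4 - c + K);
    if (!(c <= 10))
        return 1.0 + normalise(c - 10 + K);
    return 0.0 + normalise(abs(a - b));
}
```

A search algorithm would minimise this value; an input with fitness 0, such as (2, 2, 9), executes the target branch.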

Metaheuristics and Learning

• We arrive at the conclusion (i.e. qualifying test data) by observing individual instances (roughly speaking, individual assignments). This is, in a way, opposite to model checking.

• Advances in computational power empower both verification and validation.

• More about metaheuristics and learning in CS492 in Autumn 2016 (Search-Based Software Engineering)
