Global Value Numbering Using Random Interpretation

OSQ Retreat, May 2003

Sumit Gulwani, George Necula

EECS Department, University of California, Berkeley

May 15, 2003 OSQ Retreat 2003

Outline

• Value numbering on linear arithmetic (POPL ’03)

• How can we handle other operators? – Program Analysis

• How can we handle multiple occurrences of a conditional? – Model Checking

• How can we interpret conditionals? (CADE ’03) – Theorem Proving

Example 1

First conditional:
  T: a := 0; b := 1;
  F: a := 1; b := 0;

Second conditional:
  T: c := b – a; d := 1 – 2b;
  F: c := 2a + b; d := b – 2;

After the second join:
  assert (c + d = 0); assert (c = a + 1)

• Random testing: test the program on random inputs

• ¾ probability of unsoundness here (only one of the four paths violates the second assertion)

• $1 - (\tfrac{1}{2})^n$ in the worst case

• Want the same simplicity, with better odds

• We will execute the program once, in a way that captures the “effect” of all the paths

The Affine Join Operation

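• Execute both the branches

• Combine the values of the variables at joins using the affine join operation $\oplus_w$ for some randomly chosen $w$:
  $v_1 \oplus_w v_2 \equiv w \times v_1 + (1-w) \times v_2$

Example (w = 7): one branch executes a := 2; b := 3; and the other a := 4; b := 6;. At the join:
  $a = 2 \oplus_7 4 = -10$
  $b = 3 \oplus_7 6 = -15$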

Example 1

First conditional (w1 = 5):
  T: a := 0; b := 1;   state: a = 0, b = 1
  F: a := 1; b := 0;   state: a = 1, b = 0
  After the join: a = -4, b = 5

Second conditional (w2 = -3):
  T: c := b – a; d := 1 – 2b;   state: a = -4, b = 5, c = 9, d = -9
  F: c := 2a + b; d := b – 2;   state: a = -4, b = 5, c = -3, d = 3
  After the join: a = -4, b = 5, c = -39, d = 39

assert (c + d = 0); assert (c = a + 1)

• Choose a random weight for each join independently.

• All choices of random weights verify the first assertion

• Almost all choices contradict the second assertion.
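The walk-through of Example 1 can be replayed in a few lines. This is a minimal sketch, not the authors' implementation: the function names `affine_join`, `join_states`, and `run_example1` are made up for illustration, and states are plain dictionaries.

```python
def affine_join(w, v1, v2):
    """Affine join of two values: w*v1 + (1 - w)*v2."""
    return w * v1 + (1 - w) * v2


def join_states(w, s1, s2):
    """Join two program states variable by variable, using one weight."""
    return {x: affine_join(w, s1[x], s2[x]) for x in s1}


def run_example1(w1, w2):
    """Run Example 1 once, executing both branches of each conditional."""
    # First conditional: both branches run, then their states are joined.
    s = join_states(w1, {"a": 0, "b": 1}, {"a": 1, "b": 0})
    # Second conditional: both branches start from the joined state.
    t = dict(s, c=s["b"] - s["a"], d=1 - 2 * s["b"])
    f = dict(s, c=2 * s["a"] + s["b"], d=s["b"] - 2)
    s = join_states(w2, t, f)
    # Evaluate the two assertions on the single resulting state.
    return s, s["c"] + s["d"] == 0, s["c"] == s["a"] + 1


state, first_ok, second_ok = run_example1(5, -3)
print(state, first_ok, second_ok)
```

With w1 = 5 and w2 = -3 this reproduces a = -4, b = 5, c = -39, d = 39, so the first assertion holds and the second fails. Working the algebra by hand for this particular program, c - a - 1 comes out as 3·w2·(w1 - 1), so only weights with w1 = 1 or w2 = 0 would wrongly accept the second assertion.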

Outline

• Value numbering on linear arithmetic (POPL ’03)

→ How can we handle other operators? – Program Analysis

• How can we handle multiple occurrences of a conditional? – Model Checking

• How can we interpret conditionals? (CADE ’03) – Theorem Proving

Uninterpreted Functions

• Choose random interpretations

• Non-linear interpretation
  – Works for basic blocks
  – Loss of completeness at join points

• Naïve linear interpretation
  – Works for join points
  – Loss of soundness in basic blocks

• k linear interpretations
  – Fixes the above problems

Non-linear interpretation

• Model F(e) as $e^2$

• Works for basic blocks

• But incomplete for joins

Program:
  T: a := y; b := F(y);
  F: a := z; b := F(z);
  After the join: c := F(a); assert (b = c)

Values after the join (weight w):
  $a = w y + (1-w) z$
  $b = w y^2 + (1-w) z^2$
  $c = [w y + (1-w) z]^2 = w^2 y^2 + (1-w)^2 z^2 + w(1-w)(2yz)$
  $c = b$ only if $w = w^2$ and $(1-w) = (1-w)^2$ and $w(1-w) = 0$
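A quick numeric check of this incompleteness, as a sketch with made-up helper names (`F`, `join`, `run`): b and c are provably equal in the program, yet squaring does not commute with the affine join unless the weight is 0 or 1.

```python
def F(e):
    """Non-linear interpretation from the slide: model F(e) as e squared."""
    return e * e


def join(w, v1, v2):
    """Affine join w*v1 + (1 - w)*v2."""
    return w * v1 + (1 - w) * v2


def run(w, y, z):
    """Return (b, c) after the join for inputs y and z and join weight w."""
    a = join(w, y, z)        # a := y on one branch, a := z on the other
    b = join(w, F(y), F(z))  # b := F(y) on one branch, b := F(z) on the other
    c = F(a)                 # c := F(a) after the join
    return b, c


print(run(3, 2, 5))  # a generic weight: b and c disagree
print(run(1, 2, 5))  # w = 1 just selects the first branch: b and c agree
```

For w = 3, y = 2, z = 5 the join gives a = -4, b = -38, c = 16, so the valid assertion b = c is rejected.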

Naïve linear interpretation

• Encode $F(e_1, e_2) = r_1 e_1 + r_2 e_2$

• Complete for affine joins

• But unsound for basic blocks

  e  = F(F(a, b), F(c, d))
  e’ = F(F(a, c), F(b, d))

  $V(e) = r_1(r_1 a + r_2 b) + r_2(r_1 c + r_2 d) = r_1^2 a + r_1 r_2 (b + c) + r_2^2 d$
  $V(e') = r_1(r_1 a + r_2 c) + r_2(r_1 b + r_2 d) = r_1^2 a + r_1 r_2 (b + c) + r_2^2 d$

• V(e) = V(e’) even though e ≠ e’

• Too few random coefficients!
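The collision is easy to confirm numerically. A small sketch (the helper names `make_F` and `values` are illustrative only):

```python
def make_F(r1, r2):
    """Naive linear interpretation: F(e1, e2) = r1*e1 + r2*e2."""
    return lambda e1, e2: r1 * e1 + r2 * e2


def values(r1, r2, a, b, c, d):
    """Evaluate e = F(F(a,b), F(c,d)) and e' = F(F(a,c), F(b,d))."""
    F = make_F(r1, r2)
    e = F(F(a, b), F(c, d))
    e_prime = F(F(a, c), F(b, d))
    return e, e_prime


print(values(3, 5, 1, 2, 7, 11))
```

Both expressions evaluate to 419 here, and the same happens for any choice of r1 and r2, because b and c only ever appear with the shared coefficient r1·r2.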

k linear interpretations

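• Perform k runs in parallel

• Encode $F_i(e_1, e_2) = \sum_{j=1}^{k} r_{i,j}\, e_1^{j} + \sum_{j=1}^{k} r'_{i,j}\, e_2^{j}$, where $e^{j}$ is the value of $e$ in run $j$

• Each linear interpretation is linear in 2k terms

• Choose k linear random interpretations ⇒ $2k^2$ random variables

• We believe that $k = n^{0.5}$; perhaps $\log(n)^{0.5}$

[Figure: each of $F_1, \dots, F_k$ takes the inputs $e_1^1, e_1^2, \dots, e_1^k$ and $e_2^1, \dots, e_2^k$.]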

k linear interpretations: Example (with k=2)

[Figure: e = F(F(a, b), F(c, d)); the two runs compute the values $e^1$ and $e^2$ of e. Below, $e_1 = F(a, b)$ and $e_2 = F(c, d)$, and the superscript is the run index.]

• $V(e_1^1) = r_1 a + r_2 a + r_3 b + r_4 b$

• $V(e_1^2) = r_5 a + r_6 a + r_7 b + r_8 b$

• $V(e_2^1) = r_1 c + r_2 c + r_3 d + r_4 d$

• $V(e_2^2) = r_5 c + r_6 c + r_7 d + r_8 d$

• $V(e^1) = r_1 [r_1 a + r_2 a + r_3 b + r_4 b] + r_2 [r_5 a + r_6 a + r_7 b + r_8 b] + r_3 [r_1 c + r_2 c + r_3 d + r_4 d] + r_4 [r_5 c + r_6 c + r_7 d + r_8 d]$

• $V(e^2) = r_5 [r_1 a + r_2 a + r_3 b + r_4 b] + r_6 [r_5 a + r_6 a + r_7 b + r_8 b] + r_7 [r_1 c + r_2 c + r_3 d + r_4 d] + r_8 [r_5 c + r_6 c + r_7 d + r_8 d]$
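To see that two linear interpretations are enough to separate the pair that fooled the naive scheme, here is a sketch with k = 2. The names `apply_F` and `leaf` and the concrete coefficients 1..8 (standing in for the random r1..r8) are assumptions made for illustration; every value is a k-tuple, one component per run.

```python
def apply_F(r, rp, e1, e2):
    """Apply k linear interpretations of F to two k-tuples of values.

    Component i of the result is
    sum_j r[i][j]*e1[j] + sum_j rp[i][j]*e2[j].
    """
    k = len(r)
    return tuple(
        sum(r[i][j] * e1[j] for j in range(k))
        + sum(rp[i][j] * e2[j] for j in range(k))
        for i in range(k)
    )


def leaf(v, k):
    """A variable has the same value in all k runs."""
    return (v,) * k


k = 2
# Slide numbering: in F1, r1 and r2 weight the first argument and r3 and r4
# the second; in F2 the corresponding coefficients are r5, r6 and r7, r8.
r = [[1, 2], [5, 6]]
rp = [[3, 4], [7, 8]]
a, b, c, d = (leaf(v, k) for v in (1, 2, 7, 11))

e = apply_F(r, rp, apply_F(r, rp, a, b), apply_F(r, rp, c, d))
e_prime = apply_F(r, rp, apply_F(r, rp, a, c), apply_F(r, rp, b, d))
print(e, e_prime)
```

With these coefficients e evaluates to (1361, 2953) and e’ to (1281, 3033), so the two expressions are no longer confused.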

Outline

• Value numbering on linear arithmetic (POPL ’03)

• How can we handle other operators? – Program Analysis

→ How can we handle multiple occurrences of a conditional? – Model Checking

• How can we interpret conditionals? (CADE ’03) – Theorem Proving

Repeated Conditionals

Program:
  if B then a := 1; else a := 4;   (join weight w1)
  if B then b := 2; else b := 5;   (join weight w2)
  assert (b - a – 1 = 0)

With independent weights:
  $a = w_1 + 4(1-w_1) = 4 - 3w_1$
  $b = 2w_2 + 5(1-w_2) = 5 - 3w_2$
  $b - a - 1 = 3w_1 - 3w_2$

• Choose same random weights for equivalent conditionals

• Can’t really be so easy, as SAT can be encoded as such a problem!

Repeated Conditionals

Program:
  if B then a := 1; else a := 4;     (join weight w)
  if B then b := a+1; else b := 5;   (join weight w)
  assert (b - a – 1 = 0)

With the same weight at both joins:
  $a = w + 4(1-w) = 4 - 3w$
  $b = (4 - 3w + 1)w + 5(1-w) = 5 - 3w^2$
  $b - a - 1 = 3w - 3w^2$

• Lost completeness
  – We can verify the assert only if $w = w^2$, but we choose w from a large set for soundness

• Idea: Simplify the polynomial so that it does not contain terms like $w^2$
  – Need to maintain symbolic expressions
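The following sketch tracks the values symbolically as polynomials in w (a dictionary from degree to coefficient) and then applies one possible simplification: replacing every power of w by w itself. That particular rule is an assumption about what "simplify" means here, motivated by the fact that w stands for a true/false choice; the slides only state the goal of eliminating terms like w squared. All helper names are hypothetical.

```python
def add(p, q):
    """Add two polynomials in w, dropping zero coefficients."""
    out = dict(p)
    for deg, coef in q.items():
        out[deg] = out.get(deg, 0) + coef
    return {d: c for d, c in out.items() if c != 0}


def scale(p, k):
    """Multiply a polynomial by the constant k."""
    return {d: c * k for d, c in p.items() if c * k != 0}


def times_w(p):
    """Multiply a polynomial by w."""
    return {d + 1: c for d, c in p.items()}


def const(n):
    """The constant polynomial n."""
    return {0: n} if n != 0 else {}


def join(p, q):
    """Symbolic affine join w*p + (1 - w)*q, same weight w at every join."""
    return add(times_w(p), add(q, scale(times_w(q), -1)))


def reduce_powers(p):
    """Assumed simplification: rewrite w**k (k >= 1) to w, i.e. use w*w = w."""
    out = {}
    for deg, coef in p.items():
        d = min(deg, 1)
        out[d] = out.get(d, 0) + coef
    return {d: c for d, c in out.items() if c != 0}


a = join(const(1), const(4))                     # 4 - 3w
b = join(add(a, const(1)), const(5))             # 5 - 3w^2
residual = add(add(b, scale(a, -1)), const(-1))  # b - a - 1 = 3w - 3w^2
print(residual, reduce_powers(residual))
```

Before simplification the residual is 3w - 3w², which is non-zero for almost every w; after the rewrite it becomes the zero polynomial, so the assertion is verified.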

Repeated Conditionals

• A state maps a variable to an expression:
  E ::= n | E1 + E2 | if B then E1 else E2
  B ::= c | ¬c | B1 ∧ B2 | B1 ∨ B2

• Representation for expressions must satisfy:
  – Easy to construct representation of E from representations of its subexpressions
  – Easy to verify equivalence of two expressions

• How about Multi-valued ROBDDs?

• Free Conditional Expression DAGs (FCEDs)
  – Our representation

Multi-valued ROBDDs

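Program:
  if c1 then a := 2; else a := 3;
  if c2 then b := z; else b := 6;
  y := b + a;

[Figure: multi-valued ROBDDs. D(a) tests c1 and has leaves 2 and 3. D(b) tests c2 and has leaves z and 6. D(y) tests both c1 and c2 and has the four leaves z+2, 8, 3+z, 9.]

• |D(y)| = |D(a)| * |D(b)|

• D(y) does not share nodes with D(a) and D(b)

• Need a normal form for leaves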

FCEDs: Free Conditional Expression DAGs

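Program:
  if c1 then a := 2; else a := 3;
  if c2 then b := z; else b := 6;
  y := b + a;

[Figure: FCEDs. D(a) tests c1 and has leaves 2 and 3. D(b) tests c2 and has leaves z and 6. D(y) is a single + node whose children are D(a) and D(b).]

• |D(y)| = |D(a)| + |D(b)|

• D(y) does share nodes with D(a) and D(b)

• No need for normal form for arithmetic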

FCED Construction

[Figure: the FCED for y = b + a above, drawn with explicit node types: Plus(Choose(Guard(R(c1), 2), Guard(R(¬c1), 3)), Choose(Guard(R(c2), z), Guard(R(¬c2), 6))).]

Formalization

• D(x) = Leaf(x)

• D(n) = Leaf(n)

• D(e1 + e2) = Plus(D(e1), D(e2))

• D(if b then e1 else e2) = Choose(||R(b), D(e1)||, ||NOT R(b), D(e2)||)

Normalize Guard Operator

• ||g, f|| = Guard(g, f), if BV(g) ∩ BV(f) = ∅

• ||g, Plus(f1, f2)|| = Plus(||g, f1||, ||g, f2||)

• ||g, Choose(f1, f2)|| = Choose(||g, f1||, ||g, f2||)

• ||g1, Guard(g2, f)|| = Guard(g1, ||g2, f||), if BV(g1) ∩ BV(g2) = ∅

• ||g1, Guard(g2, f)|| = Guard(|| INTERSECT(g1, g2), f ||)

Example: Normalize Guard Operator

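Given f, construct ||c1, f||

[Figure: f = Plus(Choose(Guard(R(c1), 2), Guard(R(¬c1), 3)), Choose(Guard(R(c2), z), Guard(R(¬c2), 6))). In ||c1, f||, the guards of the first Choose become R(c1 ∧ c1) and R(¬c1 ∧ c1), and each guard node of the second Choose is placed under a new guard node with guard R(c1).]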

Randomized Equivalence testing for FCEDs

• V(Leaf(n)) = n

• V(Leaf(x)) = $r_x$

• V(Plus(f1, f2)) = V(f1) + V(f2)

• V(Choose(f1, f2)) = V(f1) + V(f2)

• V(Guard(g, f)) = V(g) * V(f)

• V(c(g1, g2)) = $r_c$ * V(g1) + (1 – $r_c$) * V(g2)

• V(0) = 0, V(1) = 1

• V(and(g1, g2)) = V(g1) * V(g2)

• V(or(g1, g2)) = V(g1) + V(g2)

• V(c) = $r_c$, V(¬c) = 1 – $r_c$
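These rules translate directly into a recursive evaluator. The sketch below uses an ad hoc tuple encoding of FCED nodes and guards (the tags "leaf", "plus", "choose", "guard", "var", "not", "and", "or", "ite" are invented for this example), and a dictionary `r` that supplies the value chosen for each variable and each condition. In practice those values would be drawn at random from a large set; fixed numbers are used here so the result is reproducible.

```python
def V_guard(g, r):
    """Value of a guard under the assignment r."""
    if g == 0 or g == 1:
        return g
    tag = g[0]
    if tag == "var":                      # V(c) = r_c
        return r[g[1]]
    if tag == "not":                      # V(not c) = 1 - r_c
        return 1 - r[g[1]]
    if tag == "and":
        return V_guard(g[1], r) * V_guard(g[2], r)
    if tag == "or":
        return V_guard(g[1], r) + V_guard(g[2], r)
    if tag == "ite":                      # node c(g1, g2)
        rc = r[g[1]]
        return rc * V_guard(g[2], r) + (1 - rc) * V_guard(g[3], r)
    raise ValueError(f"unknown guard: {g!r}")


def V(f, r):
    """Value of an FCED node under the assignment r."""
    tag = f[0]
    if tag == "leaf":                     # constants map to themselves
        return r[f[1]] if isinstance(f[1], str) else f[1]
    if tag in ("plus", "choose"):
        return V(f[1], r) + V(f[2], r)
    if tag == "guard":
        return V_guard(f[1], r) * V(f[2], r)
    raise ValueError(f"unknown node: {f!r}")


def ite(c, f_then, f_else):
    """FCED for 'if c then f_then else f_else' (c not used inside the arms)."""
    return ("choose", ("guard", ("var", c), f_then), ("guard", ("not", c), f_else))


z = ("leaf", "z")
# y = b + a, built as a sum of two independent conditionals ...
y_sum = ("plus", ite("c1", ("leaf", 2), ("leaf", 3)), ite("c2", z, ("leaf", 6)))
# ... and as one nested conditional with the four leaves z+2, 8, 3+z, 9.
y_nested = ite(
    "c1",
    ite("c2", ("plus", z, ("leaf", 2)), ("leaf", 8)),
    ite("c2", ("plus", ("leaf", 3), z), ("leaf", 9)),
)
r = {"c1": 7, "c2": -4, "z": 10}
print(V(y_sum, r), V(y_nested, r))
```

The two structurally different FCEDs for y receive the same value, while changing a single leaf of the nested form (for example 9 to 10) changes the value for this assignment.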

Outline

• Value numbering on linear arithmetic (POPL ’03)

• How can we handle other operators? – Program Analysis

• How can we handle multiple occurrences of a conditional? – Model Checking

→ How can we interpret conditionals? (CADE ’03) – Theorem Proving

Example

a := x + y

if (x = y) then b := a else b := 2 * x

assert (b = 2x)

• Affine join is not enough

• We need to make use of the conditional x = y on the true branch

The Adjust Operation

• Execute multiple runs of the program in parallel

• Sample = Collection of states at each program point

• “Adjust” the sample before a conditional (by taking affine joins of the states in the sample) such that
  – Adjustment preserves original relationships
  – Adjustment satisfies the equality in the conditional

• Use adjusted sample on the true branch
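One way to realize such an adjustment is sketched below. It is an assumption consistent with the description above rather than the authors' stated procedure: pick a pivot state that violates the equality e = 0, then replace every other state s by the affine join of s and the pivot whose weight makes e vanish. Affine joins preserve every linear equality that already holds in all states. The names `affine_join` and `adjust` and the sample values are illustrative.

```python
from fractions import Fraction


def affine_join(w, s1, s2):
    """Affine join of two states, variable by variable."""
    return {v: w * s1[v] + (1 - w) * s2[v] for v in s1}


def adjust(sample, e):
    """Return states that satisfy e(state) == 0, built from affine joins.

    `e` must be a linear (affine) expression of the state. The result has
    one state fewer than the input whenever some state violates e == 0.
    """
    pivot = next((s for s in sample if e(s) != 0), None)
    if pivot is None:
        return list(sample)          # every state already satisfies e == 0
    adjusted = []
    for s in sample:
        if s is pivot or e(s) == e(pivot):
            continue                 # no affine join with the pivot can zero e
        # Solve w*e(s) + (1 - w)*e(pivot) == 0 for w.
        w = Fraction(e(pivot)) / (e(pivot) - e(s))
        adjusted.append(affine_join(w, s, pivot))
    return adjusted


# Three runs of 'a := x + y' on arbitrary inputs (fixed here for clarity).
sample = [
    {"x": 1, "y": 4, "a": 5},
    {"x": 3, "y": 2, "a": 5},
    {"x": 7, "y": 0, "a": 7},
]
on_true_branch = adjust(sample, lambda s: s["x"] - s["y"])
for s in on_true_branch:
    b = s["a"]                       # true branch: b := a
    print(s, b == 2 * s["x"])
```

Every adjusted state satisfies x = y and still satisfies a = x + y, so on the true branch b = a equals 2x, which is what the assertion in the earlier example needs.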

Experience

• We built a randomized satisfiability procedure for linear equalities

• E.g., show that z = x + y ∧ x = y ⇒ z = 2x
  – Encode it as a program with “if … then … else”
  – We use Adjust but no Join here

• Compared with ICS (from SRI) on randomly generated examples
  – Randomized algorithm 60-100 times faster (for arith.)
  – Simple algorithm
  – Simple data structure: an array of states

(Caveat: our tool is written in C and ICS in OCaml)

Conclusion and Future Work

• Randomization can help achieve simplicity and efficiency at the expense of making soundness probabilistic

• Other interesting possible extensions:
  – Combination of uninterpreted functions with arithmetic
  – Partially interpreted functions like associative functions
  – Memory
  – Inequalities