
Global Value Numbering Using Random Interpretation

OSQ Retreat, May 2003

Sumit Gulwani, George Necula

EECS Department, University of California, Berkeley

May 15, 2003 OSQ Retreat 2003

Outline

• Value numbering on linear arithmetic (POPL ’03)

• How can we handle other operators? – Program Analysis

• How can we handle multiple occurrences of a conditional? – Model Checking

• How can we interpret conditionals? (CADE ’03) – Theorem Proving

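Example 1

[Figure: control-flow graph with two successive conditionals; T and F label the branch edges]

  First conditional, one branch:     a := 0; b := 1;
  First conditional, other branch:   a := 1; b := 0;

  Second conditional, one branch:    c := b - a; d := 1 - 2b;
  Second conditional, other branch:  c := 2a + b; d := b - 2;

  After the second join:             assert (c + d = 0); assert (c = a + 1)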

• Random testing: run the program on random inputs
  – Probability ¾ of unsoundness here: only one of the four paths violates the second assertion
  – $1 - (1/2)^n$ in the worst case, with n conditionals

• We want the same simplicity, with better odds

• We will execute the program once, in a way that captures the “effect” of all the paths


The Affine Join Operation

• Execute both the branches

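• Combine the values of the variables at joins using the affine join operation $\oplus_w$, for some randomly chosen weight $w$:

  $v_1 \oplus_w v_2 \equiv w \times v_1 + (1 - w) \times v_2$

[Figure: two branches,  a := 2; b := 3;  and  a := 4; b := 6;  joined with weight w = 7]

  $a = 2 \oplus_7 4$,  $b = 3 \oplus_7 6$   (w = 7)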
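The affine join is small enough to try directly. The following is a minimal sketch, assuming exact rational arithmetic; the function name `affine_join` is chosen here for illustration and does not come from the slides.

```python
from fractions import Fraction

def affine_join(v1, v2, w):
    """Return the affine combination w*v1 + (1 - w)*v2."""
    w = Fraction(w)
    return w * v1 + (1 - w) * v2

# Values from the slide: a is 2 on one branch and 4 on the other,
# b is 3 and 6, and the randomly chosen weight is w = 7.
a = affine_join(2, 4, 7)
b = affine_join(3, 6, 7)
print(a, b)            # -10 -15

# The relationship 2b = 3a holds on both branches and survives the join.
print(2 * b == 3 * a)  # True
```

The joined values (-10 and -15) occur on neither branch, yet every affine relationship common to both branches still holds for them.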

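Example 1

[Figure: the control-flow graph of Example 1, annotated with the values computed for the random weights w1 = 5 (first join) and w2 = -3 (second join)]

  a := 0; b := 1;             gives  a = 0, b = 1
  a := 1; b := 0;             gives  a = 1, b = 0
  First join (w1 = 5):        a = -4, b = 5

  c := b - a; d := 1 - 2b;    gives  a = -4, b = 5, c = 9, d = -9
  c := 2a + b; d := b - 2;    gives  a = -4, b = 5, c = -3, d = 3
  Second join (w2 = -3):      a = -4, b = 5, c = -39, d = 39

  assert (c + d = 0); assert (c = a + 1)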

• Choose a random weight for each join independently.

• All choices of random weights verify the first assertion

• Almost all choices contradict the second assertion.
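The run on this slide can be replayed in a few lines. This is a sketch under the assumption that integer weights are enough for illustration; `run_example1` is a hypothetical helper name, not something defined in the talk.

```python
def affine_join(v1, v2, w):
    """Affine combination w*v1 + (1 - w)*v2."""
    return w * v1 + (1 - w) * v2

def run_example1(w1, w2):
    """Execute Example 1 once, taking both branches of each conditional."""
    # First conditional: evaluate both branches, then join with weight w1.
    a = affine_join(0, 1, w1)
    b = affine_join(1, 0, w1)
    # Second conditional: both branches read the joined a and b.
    c = affine_join(b - a, 2 * a + b, w2)
    d = affine_join(1 - 2 * b, b - 2, w2)
    return a, b, c, d

a, b, c, d = run_example1(5, -3)   # the weights used on the slide
print(a, b, c, d)                  # -4 5 -39 39
print(c + d == 0)                  # True: first assertion verified
print(c == a + 1)                  # False: second assertion refuted
```

Expanding the polynomials gives c + d = 0 identically and c - a - 1 = 3·w2·(w1 - 1), so the second assertion is wrongly accepted only when w2 = 0 or w1 = 1.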


Outline


How can we handle other operators? – Program Analysis




Uninterpreted Functions

• Choose random interpretations

• Non-linear interpretation
  – Works for basic blocks
  – Loss of completeness at join points

• Naïve linear interpretation
  – Works for join points
  – Loss of soundness in basic blocks

• k linear interpretations
  – Fixes the above problems


Non-linear interpretation

• Model F(e) as $e^2$

• Works for basic blocks

• But incomplete for joins

[Figure: a conditional with branches  a := y; b := F(y);  and  a := z; b := F(z);  followed after the join by  c := F(a); assert (b = c)]

After the join with weight $w$:

  $a = w\,y + (1-w)\,z$
  $b = w\,y^2 + (1-w)\,z^2$
  $c = [w\,y + (1-w)\,z]^2 = w^2 y^2 + (1-w)^2 z^2 + w(1-w)(2yz)$

so $c = b$ only if $w = w^2$, $(1-w) = (1-w)^2$, and $w(1-w) = 0$.
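The incompleteness can be observed numerically. A minimal sketch, assuming the squaring interpretation from this slide and small integer values picked here only for illustration:

```python
def F(e):
    # Non-linear interpretation from the slide: F(e) = e^2.
    return e * e

def affine_join(v1, v2, w):
    return w * v1 + (1 - w) * v2

def assertion_holds(y, z, w):
    """Run both branches, join with weight w, then check assert (b = c)."""
    a = affine_join(y, z, w)
    b = affine_join(F(y), F(z), w)
    c = F(a)
    return b == c

print(assertion_holds(y=3, z=5, w=7))  # False, although b = c on every path
print(assertion_holds(y=3, z=5, w=1))  # True, but w = 1 just follows one branch
```

Here b - c = w(1 - w)(y - z)^2, so the true assertion is confirmed only for the degenerate weights 0 and 1 (or when y = z).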


Naïve linear interpretation

• Encode $F(e_1, e_2) = r_1 e_1 + r_2 e_2$

• Complete for affine joins

• But unsound for basic blocks

[Figure: two expression trees, e = F(F(a, b), F(c, d)) and e’ = F(F(a, c), F(b, d))]

• $V(e) = V(e')$ even though $e \neq e'$

• Too few random coefficients!

  $V(e) = r_1(r_1 a + r_2 b) + r_2(r_1 c + r_2 d) = r_1^2 a + r_1 r_2 (b + c) + r_2^2 d$
  $V(e') = r_1(r_1 a + r_2 c) + r_2(r_1 b + r_2 d) = r_1^2 a + r_1 r_2 (b + c) + r_2^2 d$
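The collision between e and e’ does not depend on which coefficients are drawn. A small sketch, assuming integer coefficients; the helper names are made up for this illustration:

```python
import random

def make_naive_F(r1, r2):
    """Naive linear interpretation: F(e1, e2) = r1*e1 + r2*e2."""
    return lambda e1, e2: r1 * e1 + r2 * e2

def value_e(F, a, b, c, d):
    return F(F(a, b), F(c, d))        # e  = F(F(a, b), F(c, d))

def value_e_prime(F, a, b, c, d):
    return F(F(a, c), F(b, d))        # e' = F(F(a, c), F(b, d))

rng = random.Random(0)                # fixed seed, only for reproducibility
F = make_naive_F(rng.randrange(1, 1000), rng.randrange(1, 1000))
a, b, c, d = (rng.randrange(1, 1000) for _ in range(4))

# The two distinct expressions always receive the same value.
print(value_e(F, a, b, c, d) == value_e_prime(F, a, b, c, d))  # True
```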


k linear interpretations

• Perform k runs in parallel

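• Encode $F_i(e_1, e_2) = \sum_{j=1}^{k} r_{i,j}\, e_{1j} + \sum_{j=1}^{k} r'_{i,j}\, e_{2j}$, where $e_{1j}$ and $e_{2j}$ are the values of $e_1$ and $e_2$ in run $j$

• Each linear interpretation is linear in 2k terms

• Choose k linear random interpretations $\Rightarrow$ $2k^2$ random variables

• We believe that $k = n^{0.5}$; perhaps $\log(n)^{0.5}$

[Figure: each of $F_1, \ldots, F_k$ takes as inputs $e_{11}, e_{12}, \ldots, e_{1k}$ and $e_{21}, \ldots, e_{2k}$]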


k linear interpretations: Example (with k=2)

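[Figure: expression tree e = F(F(a, b), F(c, d)), with labels e1 and e2]

Here $e_{11}, e_{12}$ are the values of F(a, b) in runs 1 and 2, $e_{21}, e_{22}$ are the values of F(c, d) in runs 1 and 2, and the last two bullets give the values of e in runs 1 and 2.

• $V(e_{11}) = r_1 a + r_2 a + r_3 b + r_4 b$

• $V(e_{12}) = r_5 a + r_6 a + r_7 b + r_8 b$

• $V(e_{21}) = r_1 c + r_2 c + r_3 d + r_4 d$

• $V(e_{22}) = r_5 c + r_6 c + r_7 d + r_8 d$

• $V(e_1) = r_1 [r_1 a + r_2 a + r_3 b + r_4 b] + r_2 [r_5 a + r_6 a + r_7 b + r_8 b] + r_3 [r_1 c + r_2 c + r_3 d + r_4 d] + r_4 [r_5 c + r_6 c + r_7 d + r_8 d]$

• $V(e_2) = r_5 [r_1 a + r_2 a + r_3 b + r_4 b] + r_6 [r_5 a + r_6 a + r_7 b + r_8 b] + r_7 [r_1 c + r_2 c + r_3 d + r_4 d] + r_8 [r_5 c + r_6 c + r_7 d + r_8 d]$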
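The k = 2 scheme can be checked against the pair of expressions that defeated the naive encoding. This is a sketch under assumptions: values are k-tuples (one component per run), a variable has the same value in every run as in the bullets above, and the coefficients 1..8 stand in for the random r1..r8 only to keep the output reproducible.

```python
def make_F(left, right):
    """left[i][j] and right[i][j] multiply run j of the first and second
    argument when computing run i of the result."""
    k = len(left)
    def F(u, v):
        return tuple(
            sum(left[i][j] * u[j] for j in range(k))
            + sum(right[i][j] * v[j] for j in range(k))
            for i in range(k)
        )
    return F

def leaf(value, k):
    # A variable carries the same value in all k runs (as on the slide).
    return (value,) * k

# k = 2: F_1 uses r1, r2 | r3, r4 and F_2 uses r5, r6 | r7, r8.
F2 = make_F(left=[[1, 2], [5, 6]], right=[[3, 4], [7, 8]])
a, b, c, d = (leaf(x, 2) for x in (2, 3, 5, 7))
e       = F2(F2(a, b), F2(c, d))
e_prime = F2(F2(a, c), F2(b, d))
print(e != e_prime)    # True: the two expressions are now told apart

# k = 1 degenerates to the naive encoding, which cannot separate them.
F1 = make_F(left=[[3]], right=[[11]])
a1, b1, c1, d1 = (leaf(x, 1) for x in (2, 3, 5, 7))
print(F1(F1(a1, b1), F1(c1, d1)) == F1(F1(a1, c1), F1(b1, d1)))  # True
```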


Outline



How can we handle multiple occurrences of a conditional? – Model Checking



Repeated Conditionals

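[Figure: the same conditional B occurs twice. At its first occurrence the branches are  a := 1;  and  a := 4;  joined with weight w1. At its second occurrence the branches are  b := 2;  and  b := 5;  joined with weight w2. After the second join: assert (b - a - 1 = 0)]

  $a = w_1 + 4(1 - w_1) = 4 - 3w_1$
  $b = 2w_2 + 5(1 - w_2) = 5 - 3w_2$
  $b - a - 1 = 3w_1 - 3w_2$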

• Choose the same random weight for equivalent conditionals

• It can’t really be this easy, since SAT can be encoded as such a problem!
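The effect of sharing the weight is easy to see on this example. A minimal sketch with integer weights; `residual` is a name introduced here for the quantity b - a - 1:

```python
def affine_join(v1, v2, w):
    return w * v1 + (1 - w) * v2

def residual(w1, w2):
    """Value of b - a - 1 after both joins (zero means the assert passes)."""
    a = affine_join(1, 4, w1)   # first occurrence of conditional B
    b = affine_join(2, 5, w2)   # second occurrence of conditional B
    return b - a - 1

print(residual(7, -2))  # 27 = 3*7 - 3*(-2): independent weights reject the assert
print(residual(7, 7))   # 0: the same weight for both occurrences verifies it
```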



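[Figure: the same conditional B occurs twice, and both joins use the same weight w. At its first occurrence the branches are  a := 1;  and  a := 4;. At its second occurrence the branches are  b := a+1;  and  b := 5;. After the second join: assert (b - a - 1 = 0)]

  $a = w + 4(1 - w) = 4 - 3w$
  $b = (4 - 3w + 1)\,w + 5(1 - w) = 5 - 3w^2$
  $b - a - 1 = 3w - 3w^2$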

• Lost completeness
  – We can verify the assert only if $w = w^2$, but we choose w from a large set for soundness

• Idea: simplify the polynomial so that it does not contain terms like $w^2$
  – Need to maintain symbolic expressions
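One reading of this idea can be sketched with polynomials in w stored as coefficient lists. The reduction used below (collapsing every power w^k, k ≥ 1, to w) is an assumption made for this illustration; the slides only say that terms like w² should be removed, and the helper names are hypothetical.

```python
def poly_add(p, q):
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(n)]

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def poly_eval(p, w):
    return sum(c * w**i for i, c in enumerate(p))

def collapse_powers(p):
    # Assumed simplification: w^k -> w for k >= 1 (w acts as a Boolean choice).
    return [p[0], sum(p[1:])]

W, ONE_MINUS_W = [0, 1], [1, -1]

# a = w*1 + (1-w)*4 and b = w*(a+1) + (1-w)*5, kept symbolic in w.
a = poly_add(poly_mul(W, [1]), poly_mul(ONE_MINUS_W, [4]))
b = poly_add(poly_mul(W, poly_add(a, [1])), poly_mul(ONE_MINUS_W, [5]))
residual = poly_add(poly_add(b, [-c for c in a]), [-1])   # b - a - 1

print(residual)                   # [0, 3, -3], i.e. 3w - 3w^2
print(poly_eval(residual, 7))     # -126: a random w rejects the true assert
print(collapse_powers(residual))  # [0, 0]: zero once w^2 is replaced by w
```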



• A state maps a variable to an expression:

  $E ::= n \mid E_1 + E_2 \mid \text{if } B \text{ then } E_1 \text{ else } E_2$
  $B ::= c \mid \neg c \mid B_1 \wedge B_2 \mid B_1 \vee B_2$

• Representation for expressions must satisfy:
  – Easy to construct the representation of E from the representations of its subexpressions
  – Easy to verify equivalence of two expressions

• How about multi-valued ROBDDs?

• Free Conditional Expression DAGs (FCEDs)
  – Our representation


Multi-valued ROBDDs

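[Figure: program with two conditionals. Conditional c1: a := 2; when c1 holds, a := 3; otherwise. Conditional c2: b := z; when c2 holds, b := 6; otherwise. Then y := b + a;]

  a = decision node c1 with leaves 2 and 3
  b = decision node c2 with leaves z and 6
  y = decision node c1 whose two children are c2 nodes, with leaves z+2, 8, 3+z, 9

• |D(y)| = |D(a)| * |D(b)|

• D(y) does not share nodes with D(a) and D(b)

• Need a normal form for leaves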


FCEDs: Free Conditional Expression DAGs

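[Figure: program with two conditionals. Conditional c1: a := 2; when c1 holds, a := 3; otherwise. Conditional c2: b := z; when c2 holds, b := 6; otherwise. Then y := b + a;]

  a = decision node c1 with leaves 2 and 3
  b = decision node c2 with leaves z and 6
  y = a + node whose children are the existing DAGs for a and b

• |D(y)| = |D(a)| + |D(b)|

• D(y) does share nodes with D(a) and D(b)

• No need for normal form for arithmetic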

FCED Construction

[Figure: the DAGs  c1 with leaves 2 and 3  and  c2 with leaves z and 6,  combined under a + node, are translated into
  Plus(Choose(Guard(R(c1), 2), Guard(R(¬c1), 3)), Choose(Guard(R(c2), z), Guard(R(¬c2), 6)))]

Formalization

• D(x) = Leaf(x)

• D(n) = Leaf(n)

• D(e1 + e2) = Plus(D(e1), D(e2))

• D(if b then e1 else e2) = Choose(||R(b), D(e1)||, ||NOT R(b), D(e2)||)
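The translation D can be sketched over nested tuples. This is an illustration under stated assumptions, not the authors' implementation: conditions are atomic names, R(c) and NOT R(c) are represented by the hypothetical tags ("pos", c) and ("neg", c), and the bodies never mention the condition that guards them, so the normalizing operator ||g, f|| reduces to a plain Guard(g, f).

```python
def D(e):
    """Translate an expression into an FCED built from tagged tuples.

    Expressions: ("num", n), ("var", x), ("add", e1, e2), ("if", c, e1, e2).
    """
    kind = e[0]
    if kind in ("num", "var"):
        return ("Leaf", e[1])
    if kind == "add":
        return ("Plus", D(e[1]), D(e[2]))
    if kind == "if":
        _, c, e1, e2 = e
        # Assumption: c does not occur in e1 or e2, so no normalization needed.
        return ("Choose",
                ("Guard", ("pos", c), D(e1)),
                ("Guard", ("neg", c), D(e2)))
    raise ValueError(f"unknown expression kind: {kind}")

# The running example: a and b are set under c1 and c2, then y := b + a.
a = ("if", "c1", ("num", 2), ("num", 3))
b = ("if", "c2", ("var", "z"), ("num", 6))
y = ("add", b, a)

print(D(y))
```

D(y) is a single Plus node over D(b) and D(a), so its size is additive in the sizes of its operands, in contrast to the multiplicative ROBDD construction.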


Normalize Guard Operator

• ||g, f|| = Guard(g, f), if BV(g) ∩ BV(f) = ∅

• ||g, Plus(f1, f2)|| = Plus(||g, f1||, ||g, f2||)

• ||g, Choose(f1, f2)|| = Choose(||g, f1||, ||g, f2||)

• ||g1, Guard(g2, f)|| = Guard(g1, ||g2, f||), if BV(g1) ∩ BV(g2) = ∅

• ||g1, Guard(g2, f)|| = Guard(||INTERSECT(g1, g2), f||)


Example: Normalize Guard Operator

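Given f, construct ||c1, f||

[Figure: f = Plus(Choose(Guard(R(c1), 2), Guard(R(¬c1), 3)), Choose(Guard(R(c2), z), Guard(R(¬c2), 6))). In ||c1, f||, the guards that mention c1 become R(c1 ∧ c1) and R(¬c1 ∧ c1), and each guard over c2 is placed under an additional guard node R(c1).]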


Randomized Equivalence testing for FCEDs

• V(Leaf(n)) = n

• V(Leaf(x)) = r_x

• V(Plus(f1, f2)) = V(f1) + V(f2)

• V(Choose(f1, f2)) = V(f1) + V(f2)

• V(Guard(g, f)) = V(g) * V(f)

• V(c(g1, g2)) = r_c * V(g1) + (1 - r_c) * V(g2)

• V(0) = 0, V(1) = 1

• V(and(g1, g2)) = V(g1) * V(g2)

• V(or(g1, g2)) = V(g1) + V(g2)

• V(c) = r_c, V(¬c) = 1 - r_c
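These rules translate almost line by line into an evaluator. The sketch below makes several assumptions for illustration: FCEDs and guards are tagged tuples, the tags ("pos", c), ("neg", c), ("ite", c, g1, g2) and ("const", n) are hypothetical encodings of c, ¬c, c(g1, g2) and 0/1, and plain Python integers are used instead of any particular finite field.

```python
import random

def V(f, r):
    """Evaluate FCED f under the random assignment r (a dict of integers)."""
    tag = f[0]
    if tag == "Leaf":
        x = f[1]
        return x if isinstance(x, int) else r[x]   # V(n) = n, V(x) = r_x
    if tag in ("Plus", "Choose"):
        return V(f[1], r) + V(f[2], r)
    if tag == "Guard":
        return V_guard(f[1], r) * V(f[2], r)
    raise ValueError(tag)

def V_guard(g, r):
    tag = g[0]
    if tag == "const":
        return g[1]                                # V(0) = 0, V(1) = 1
    if tag == "pos":
        return r[g[1]]                             # V(c) = r_c
    if tag == "neg":
        return 1 - r[g[1]]                         # V(not c) = 1 - r_c
    if tag == "and":
        return V_guard(g[1], r) * V_guard(g[2], r)
    if tag == "or":
        return V_guard(g[1], r) + V_guard(g[2], r)
    if tag == "ite":                               # decision node c(g1, g2)
        return r[g[1]] * V_guard(g[2], r) + (1 - r[g[1]]) * V_guard(g[3], r)
    raise ValueError(tag)

def choose(c, f_then, f_else):
    return ("Choose", ("Guard", ("pos", c), f_then), ("Guard", ("neg", c), f_else))

Db = choose("c2", ("Leaf", "z"), ("Leaf", 6))
f1 = ("Plus", choose("c1", ("Leaf", 2), ("Leaf", 3)), Db)            # a + b
f2 = choose("c1", ("Plus", ("Leaf", 2), Db), ("Plus", ("Leaf", 3), Db))
f3 = ("Plus", choose("c1", ("Leaf", 3), ("Leaf", 2)), Db)            # branches swapped

rng = random.Random(0)   # fixed seed, only for reproducibility
r = {name: rng.randrange(2, 10**9) for name in ("c1", "c2", "z")}
print(V(f1, r) == V(f2, r))   # True: equivalent expressions always agree
print(V(f1, r) == V(f3, r))   # False: inequivalent ones differ for this r
```

Equivalent FCEDs agree for every assignment, while inequivalent ones collide only for unlucky choices of the random values (here only if r_c1 were exactly 1/2, which no integer is).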


Outline




How can we interpret conditionals? (CADE ’03) – Theorem Proving


Example

[Figure: a := x + y; then the conditional If (x = y). True branch: b := a; false branch: b := 2 * x; after the join: assert (b = 2x)]

• Affine join is not enough

• We need to make use of the conditional x = y on the true branch


The Adjust Operation

• Execute multiple runs of the program in parallel

• Sample = Collection of states at each program point

• “Adjust” the sample before a conditional (by taking affine joins of the states in the sample) such that
  – Adjustment preserves original relationships
  – Adjustment satisfies the equality in the conditional

• Use adjusted sample on the true branch
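The slides describe Adjust only informally, so the following is one possible realization rather than the authors' algorithm. It assumes that the conditional is an affine equality e = 0, that states are dictionaries of rationals, and that one state with e ≠ 0 is used as a pivot and dropped; the names `adjust` and `pivot` are hypothetical.

```python
from fractions import Fraction

def affine_join(s1, s2, w):
    """Affine combination of two states over the same variables."""
    return {v: w * s1[v] + (1 - w) * s2[v] for v in s1}

def adjust(sample, e):
    """Return a sample in which every state satisfies e(state) == 0.

    e must be an affine function of the state, so that
    e(w*s + (1-w)*pivot) = w*e(s) + (1-w)*e(pivot).
    """
    pivot_index = next((i for i, s in enumerate(sample) if e(s) != 0), None)
    if pivot_index is None:
        return list(sample)          # the equality already holds everywhere
    pivot = sample[pivot_index]
    adjusted = []
    for i, s in enumerate(sample):
        if i == pivot_index:
            continue
        # Solve w*e(s) + (1-w)*e(pivot) = 0; assumes e(s) != e(pivot).
        w = Fraction(e(pivot), e(pivot) - e(s))
        adjusted.append(affine_join(s, pivot, w))
    return adjusted

# Example from the slides: a := x + y; if (x = y) then b := a; assert (b = 2x).
sample = [{"x": x, "y": y, "a": x + y} for x, y in [(1, 2), (4, 7), (0, 5)]]
true_branch = adjust(sample, lambda s: s["x"] - s["y"])
for s in true_branch:
    b = s["a"]                       # b := a on the true branch
    print(s["x"] == s["y"], s["a"] == s["x"] + s["y"], b == 2 * s["x"])
```

Because only affine joins are taken, the existing relationship a = x + y is preserved, while the new equality x = y is forced; together they give b = 2x on the true branch.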


Experience

• We built a randomized satisfiability procedure for linear equalities

• E.g., show that $z = x + y \wedge x = y \Rightarrow z = 2x$
  – Encode it as a program with “if … then … else”
  – We use Adjust but no Join here

• Compared with ICS (from SRI) on randomly generated examples
  – Randomized algorithm 60-100 times faster (for arith.)
  – Simple algorithm
  – Simple data structure: an array of states

(Caveat: our tool is written in C and ICS in OCaml)


Conclusion and Future Work

• Randomization can help achieve simplicity and efficiency at the expense of making soundness probabilistic

• Other interesting possible extensions:
  – Combination of uninterpreted functions with arithmetic
  – Partially interpreted functions like associative functions
  – Memory
  – Inequalities
