Page 1: Dimensions in Synthesis Sumit Gulwani sumitg@microsoft.com Microsoft Research, Redmond May 2012

Dimensions in Synthesis

Sumit Gulwani

sumitg@microsoft.com

Microsoft Research, Redmond

May 2012

Page 2: Program Synthesis

• Synthesize a program in some underlying language from user intent using some search technique.

• Why today?
  – Variety of (cheap) computational devices and platforms
    • Billions of non-experts have access to these devices!
  – Enabling technology is now available
    • Better search algorithms
    • Faster machines (good application for multi-cores)

Page 4: Dimensions in Synthesis

• Concept Language (Application)
  – Programs
    • Straight-line programs
  – Automata
  – Queries
  – Sequences

• User Intent (Ambiguity)
  – Logic, Natural Language
  – Examples, Demonstrations/Traces

• Search Technique (Algorithm)
  – SAT/SMT solvers (Formal Methods)
  – A*-style goal-directed search (AI)
  – Version space algebras (Machine Learning)

PPDP 2010: “Dimensions in Program Synthesis”, Gulwani.

Page 5: Compilers vs. Synthesizers

Dimension        | Compilers                                                 | Synthesizers
Concept Language | Executable Program                                        | Variety of concepts: Program, Automata, Query, Sequence
User Intent      | Structured language                                       | Variety/mixed form of constraints: logic, examples, traces
Search Technique | Syntax-directed translation (No new algorithmic insights) | Uses some kind of search (Discovers new algorithmic insights)

Page 6: Potential Users of Synthesis Technology

[Figure: user groups: Students and Teachers, End-Users, Algorithm Designers, Software Developers; annotations: “Most Transformational Target”, “Most Useful Target”]

• Vision for End-users: Enable people to have (automated) personal assistants.

• Vision for Education: Enable every student to have access to free & high-quality education.

Page 7: Organization

Lecture 1: Algorithms
• Synthesis of Straight-line Programs from Logic
  – Bit-vector Algorithms
  – Geometry Constructions

Lecture 2: Applications
• Intelligent Tutoring Systems

Lecture 3: Ambiguity
• Synthesis from Examples & Keywords

Page 8: Lab

Intelligent Tutoring Systems

Technical Goals:
• Identify a useful task that can be formalized as a synthesis problem.
• Propose an appropriate user interaction model.
• Propose an appropriate search technique.

Page 9: Synthesizing Bitvector Algorithms

PLDI 2011: Gulwani, Jha, Tiwari, Venkatesan

Page 11: Bitvector Algorithms

Straight-line programs that use
– Arithmetic Operators: +, -, *, /
– Logical Operators: Bitwise and/or/not, Shift left/right

Page 12: Examples of Bitvector Algorithms

Turn-off rightmost 1-bit: Z & (Z-1)

Z         = 1 0 1 0 1 1 0 0
Z-1       = 1 0 1 0 1 0 1 1
Z & (Z-1) = 1 0 1 0 1 0 0 0
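The bit rows above can be replayed directly. This is a minimal Python sketch, assuming 8-bit values as on the slide; the function name and the `width` parameter are illustrative and not part of the original deck.

```python
def turn_off_rightmost_one(z: int, width: int = 8) -> int:
    """Clear the lowest set bit of z, treating z as a width-bit vector."""
    mask = (1 << width) - 1          # keep results inside the bitvector width
    return z & (z - 1) & mask


if __name__ == "__main__":
    z = 0b10101100
    print(format(z, "08b"))                          # the slide's input Z
    print(format((z - 1) & 0xFF, "08b"))             # Z-1
    print(format(turn_off_rightmost_one(z), "08b"))  # Z & (Z-1)
```

Subtracting 1 flips the rightmost 1-bit and every 0 to its right, so the bitwise and keeps everything to the left of that bit and clears the rest.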

Page 13: Examples of Bitvector Algorithms

Turn-off rightmost contiguous sequence of 1-bits: Z & (1 + (Z | (Z-1)))

Z                     = 1 0 1 0 1 1 0 0
Z & (1 + (Z | (Z-1))) = 1 0 1 0 0 0 0 0

Ceil of average of two integers without overflowing: (Y|Z) - ((Y⊕Z) >> 1)
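Both formulas on this slide can be checked by brute force over small inputs. The sketch below assumes non-negative integers and an 8-bit width for the first function; the function names are illustrative. The second operator in the averaging formula is taken to be exclusive-or.

```python
def turn_off_rightmost_run(z: int, width: int = 8) -> int:
    """Clear the rightmost contiguous run of 1-bits in a width-bit vector."""
    mask = (1 << width) - 1
    return z & (1 + (z | (z - 1))) & mask


def ceil_average(y: int, z: int) -> int:
    """Ceiling of (y + z) / 2 without ever forming the sum y + z."""
    return (y | z) - ((y ^ z) >> 1)


if __name__ == "__main__":
    print(format(turn_off_rightmost_run(0b10101100), "08b"))
    print(ceil_average(3, 4))
```

The averaging identity follows from y + z = 2(y & z) + (y ^ z) and y | z = (y & z) + (y ^ z): the ceiling of half the sum is (y & z) plus the ceiling of half the xor, which equals (y | z) minus the floor of half the xor.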

Page 14: Examples of Bitvector Algorithms

Higher order half of product of x and y
o1 := and(x,0xFFFF);
o2 := shr(x,16);
o3 := and(y,0xFFFF);
o4 := shr(y,16);
o5 := mul(o1,o3);
o6 := mul(o2,o3);
o7 := mul(o1,o4);
o8 := mul(o2,o4);
o9 := shr(o5,16);
o10 := add(o6,o9);
o11 := and(o10,0xFFFF);
o12 := shr(o10,16);
o13 := add(o7,o11);
o14 := shr(o13,16);
o15 := add(o14,o12);
res := add(o15,o8);

Round up to next highest power of 2
o1 := sub(x,1);
o2 := shr(o1,1);
o3 := or(o1,o2);
o4 := shr(o3,2);
o5 := or(o3,o4);
o6 := shr(o5,4);
o7 := or(o5,o6);
o8 := shr(o7,8);
o9 := or(o7,o8);
o10 := shr(o9,16);
o11 := or(o9,o10);
res := add(o11,1);
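The two straight-line programs can be transcribed into Python and compared against a direct computation. This is a sketch assuming 32-bit unsigned inputs; the function names are illustrative. In the round-up program the final addition is applied to o11, the fully smeared value, since adding 1 to o10 (only the last shifted copy) would not yield a power of two.

```python
M32 = 0xFFFFFFFF  # 32-bit mask (assumed word size)


def high_half_of_product(x: int, y: int) -> int:
    """Upper 32 bits of the 64-bit product, using 16-bit partial products."""
    o1, o2 = x & 0xFFFF, x >> 16          # low and high halves of x
    o3, o4 = y & 0xFFFF, y >> 16          # low and high halves of y
    o5, o6, o7, o8 = o1 * o3, o2 * o3, o1 * o4, o2 * o4
    o9 = o5 >> 16                         # carry out of the low partial product
    o10 = o6 + o9
    o11, o12 = o10 & 0xFFFF, o10 >> 16
    o13 = o7 + o11
    o14 = o13 >> 16
    o15 = o14 + o12
    return (o15 + o8) & M32


def round_up_to_power_of_2(x: int) -> int:
    """Smallest power of two >= x, for 1 <= x <= 2**31."""
    o = (x - 1) & M32
    for shift in (1, 2, 4, 8, 16):        # smear the top 1-bit to the right
        o |= o >> shift
    return (o + 1) & M32


if __name__ == "__main__":
    print(hex(high_half_of_product(0xFFFFFFFF, 0xFFFFFFFF)))
    print(round_up_to_power_of_2(5))
```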

Page 15: Problem Definition

Given:
• Specification of desired functionality
• Specification of library components

Synthesize a straight-line program where
• Each argument variable on line j is either a program input or the output of some earlier line k, where k < j
• The order in which the components appear is a permutation of 1...n

that meets the desired specification.

Verification Constraint

Page 16: Problem Definition: Turn-off rightmost 1 bit

• Specification of desired functionality

• Specification of library components

Page 17: Synthesis Constraint

Verification Constraint

Synthesis Constraint

Page 18: Idea # 1: Reduce Second-order Quantification in Synthesis Constraint to First Order

The unknown program determines which component goes on which location (line #) and from which locations each component gets its input arguments. We encode this by location variables L.

Page 19: Example: Possible programs that use 2 components and their Representation using Location Variables

Page 20: Encoding Well-formedness of Programs

• Consistency Constraint: Every line in the program should have at most one component.

• Acyclicity Constraint: A variable should be initialized before being used.

The following constraint ensures that assignments to L correspond to well-formed programs.
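The two well-formedness conditions can be read as a check on a concrete assignment to the location variables. The sketch below is an illustration under stated assumptions, not the constraint from the slide: it assumes program inputs occupy locations 0..m-1, each component's output occupies one of the n following lines, and each component argument points at a location. The names `out_loc`, `in_locs`, and `is_well_formed` are hypothetical.

```python
from typing import Dict, List


def is_well_formed(out_loc: Dict[str, int],
                   in_locs: Dict[str, List[int]],
                   num_inputs: int) -> bool:
    """Check an assignment of location variables for well-formedness."""
    n = len(out_loc)
    total = num_inputs + n
    # Range: every component output sits on a line after the program inputs.
    if any(not (num_inputs <= loc < total) for loc in out_loc.values()):
        return False
    # Consistency: no two components are placed on the same line.
    if len(set(out_loc.values())) != n:
        return False
    # Acyclicity: each argument is read from a strictly earlier location.
    for comp, args in in_locs.items():
        if any(not (0 <= a < out_loc[comp]) for a in args):
            return False
    return True


if __name__ == "__main__":
    # Z is at location 0; "dec" computes Z-1 on line 1; "and" combines 0 and 1.
    good_out = {"dec": 1, "and": 2}
    good_in = {"dec": [0], "and": [0, 1]}
    print(is_well_formed(good_out, good_in, num_inputs=1))
```

In the synthesis setting these conditions are stated as constraints for a solver rather than checked on a fixed assignment; the check above only shows what a satisfying assignment looks like.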

Page 21: Encoding data-flow

The following constraint describes connections between inputs and outputs of various components.

Page 22: Idea # 1: Reduce Second-order Quantification in Synthesis Constraint to First Order

Page 23: Idea # 2: Using CEGIS style procedure to solve the Synthesis Constraint

Synthesis constraint is of the form: ∃L ∀Y F(L,Y)

1. Choose some values y1,...,yn for Y.

2. Finite Synthesis Step: solve ∃L F(L,y1) ∧ … ∧ F(L,yn)
   – No Solution: Failure
   – Solution L = S: go to the Verification Step

3. Verification Step: Does ∀Y F(S,Y) hold? Or, equivalently, is ∃Y ¬F(S,Y) unsatisfiable?
   – No Solution (no counterexample): return S
   – Solution Y = yn+1: add yn+1 to the chosen values and repeat the Finite Synthesis Step
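The loop between the finite synthesis step and the verification step can be sketched in a few lines. This is a toy illustration, not the authors' implementation: it assumes 8-bit inputs, replaces the SMT solver with brute-force enumeration, and searches a small hand-written candidate list instead of location variables. The names `spec`, `CANDIDATES`, and `cegis` are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

MASK = 0xFF  # 8-bit bitvectors (assumed width)


def spec(z: int) -> int:
    """Reference behaviour: clear the lowest set bit by scanning the bits."""
    for i in range(8):
        if z & (1 << i):
            return z & ~(1 << i) & MASK
    return 0


# Hypothetical candidate space standing in for "all programs over the library".
CANDIDATES: Dict[str, Callable[[int], int]] = {
    "Z | (Z-1)": lambda z: (z | (z - 1)) & MASK,
    "Z & (Z+1)": lambda z: (z & (z + 1)) & MASK,
    "Z & -Z": lambda z: (z & -z) & MASK,
    "Z & (Z-1)": lambda z: (z & (z - 1)) & MASK,
}


def finite_synthesis(examples: List[int]) -> Optional[str]:
    """Find a candidate that agrees with the spec on the chosen inputs only."""
    for name, f in CANDIDATES.items():
        if all(f(y) == spec(y) for y in examples):
            return name
    return None


def verify(name: str) -> Optional[int]:
    """Return a counterexample input, or None if the candidate is correct."""
    f = CANDIDATES[name]
    for y in range(256):
        if f(y) != spec(y):
            return y
    return None


def cegis(initial: List[int]) -> Tuple[Optional[str], int]:
    examples, iterations = list(initial), 0
    while True:
        iterations += 1
        candidate = finite_synthesis(examples)
        if candidate is None:
            return None, iterations          # Failure: no program fits
        counterexample = verify(candidate)
        if counterexample is None:
            return candidate, iterations     # verified on all inputs
        examples.append(counterexample)      # refine and try again


if __name__ == "__main__":
    print(cegis([0]))
```

Starting from the single input 0, the first candidate that fits is refuted by a counterexample, which is then added to the example set and rules that candidate out in the next round.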

Page 24: Experiments: Comparison with Brute-force Search

Program | Lines | Brahma iters | Brahma time | AHA time
P1      | 2     | 2            | 3           | 0.1
P2      | 2     | 3            | 3           | 0.1
P3      | 2     | 3            | 1           | 0.1
P4      | 2     | 2            | 3           | 0.1
P5      | 2     | 3            | 2           | 0.1
P6      | 2     | 2            | 2           | 0.1
P7      | 3     | 2            | 1           | 2
P8      | 3     | 2            | 1           | 1
P9      | 3     | 2            | 6           | 7
P10     | 3     | 14           | 76          | 10
P11     | 3     | 7            | 57          | 9
P12     | 3     | 9            | 67          | 10
P13     | 4     | 4            | 6           | X
P14     | 4     | 4            | 60          | X
P15     | 4     | 8            | 119         | X
P16     | 4     | 5            | 62          | X
P17     | 4     | 6            | 78          | 109
P18     | 6     | 5            | 46          | X
P19     | 6     | 5            | 35          | X
P20     | 7     | 6            | 108         | X
P21     | 8     | 5            | 28          | X
P22     | 8     | 8            | 279         | X
P23     | 10    | 8            | 1668        | X
P24     | 12    | 9            | 224         | X
P25     | 16    | 11           | 2779        | X

Page 25: Synthesizing Geometry Constructions

PLDI 2011: Gulwani, Korthikanti, Tiwari.

Page 27: Ruler/Compass based Geometry Constructions

Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.

[Figure: triangle XYZ, lines L1 and L2, point N, and circle C]

Page 28: Other Examples of Geometry Constructions

• Draw a regular hexagon given a side.

• Given 3 parallel lines, draw an equilateral triangle whose vertices lie on the parallel lines.

• Given 4 points, draw a square whose sides contain those points.

Page 29: Significance

• Good platform for teaching logical reasoning.
  – Visual Nature:
    • Makes it more accessible.
    • Exercises both logical/visual abilities of left/right brain.
  – Fun Aspect:
    • Ruler/compass restrictions make it fun, as in sports.

• Application in dynamic geometry or animations.
  – “Constructive” geometry macros (unlike numerical methods) enable fast re-computation of derived objects from free (moving) objects.

Page 30: Programming Language for Geometry Constructions

Types: Point, Line, Circle

Methods:
• Ruler(Point, Point) -> Line
• Compass(Point, Point) -> Circle
• Intersect(Circle, Circle) -> Pair of Points
• Intersect(Line, Circle) -> Pair of Points
• Intersect(Line, Line) -> Point

Geometry Program: A straight-line composition of the above methods.

Page 31: Example Problem: Program

Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.

1. C1 = Compass(X,Y);
2. C2 = Compass(Y,X);
3. <P1,P2> = Intersect(C1,C2);
4. L1 = Ruler(P1,P2);
5. D1 = Compass(Z,X);
6. D2 = Compass(X,Z);
7. <R1,R2> = Intersect(D1,D2);
8. L2 = Ruler(R1,R2);
9. N = Intersect(L1,L2);
10. C = Compass(N,X);

[Figure: triangle XYZ with circles C1, C2, D1, D2, intersection points P1, P2, R1, R2, lines L1, L2, point N, and circle C]
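The ten-step construction can be executed numerically. The sketch below is one possible floating-point reading of the geometry language, under these assumptions: a line is stored as two points on it, a circle as (center, radius), and Compass(P, Q) means the circle centered at P passing through Q. The lowercase function names are illustrative, and Intersect(Line, Circle) is omitted because this program does not use it.

```python
import math
from typing import Tuple

Point = Tuple[float, float]
Line = Tuple[Point, Point]       # two distinct points on the line
Circle = Tuple[Point, float]     # (center, radius)


def ruler(p: Point, q: Point) -> Line:
    return (p, q)


def compass(center: Point, through: Point) -> Circle:
    return (center, math.dist(center, through))


def intersect_circles(c1: Circle, c2: Circle) -> Tuple[Point, Point]:
    (x0, y0), r0 = c1
    (x1, y1), r1 = c2
    d = math.dist((x0, y0), (x1, y1))
    a = (r0 * r0 - r1 * r1 + d * d) / (2 * d)     # distance from c1 to the chord
    h = math.sqrt(max(r0 * r0 - a * a, 0.0))      # half the chord length
    mx, my = x0 + a * (x1 - x0) / d, y0 + a * (y1 - y0) / d
    ox, oy = h * (y1 - y0) / d, h * (x1 - x0) / d
    return (mx + ox, my - oy), (mx - ox, my + oy)


def intersect_lines(l1: Line, l2: Line) -> Point:
    (x1, y1), (x2, y2) = l1
    (x3, y3), (x4, y4) = l2
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a, b = x1 * y2 - y1 * x2, x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / den,
            (a * (y3 - y4) - (y1 - y2) * b) / den)


def circumcircle(X: Point, Y: Point, Z: Point) -> Circle:
    C1, C2 = compass(X, Y), compass(Y, X)
    P1, P2 = intersect_circles(C1, C2)
    L1 = ruler(P1, P2)                 # perpendicular bisector of XY
    D1, D2 = compass(Z, X), compass(X, Z)
    R1, R2 = intersect_circles(D1, D2)
    L2 = ruler(R1, R2)                 # perpendicular bisector of XZ
    N = intersect_lines(L1, L2)
    return compass(N, X)


if __name__ == "__main__":
    center, radius = circumcircle((0.0, 0.0), (4.0, 0.0), (1.0, 3.0))
    print(center, radius)
```

For the sample triangle, all three vertices end up at the same distance from the computed center, which is the property the construction is meant to establish.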

Page 32: Specification Language for Geometry Programs

Conjunction of predicates over arithmetic expressions

Predicates p := e1 = e2
              | e1 ≠ e2
              | e1 ≤ e2

Arithmetic Expressions e := Distance(Point, Point)
                         | Slope(Point, Point)
                         | e1 ± e2
                         | c

Page 33: Example Problem: Precondition/Postcondition

Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.

Precondition: Slope(X,Y) ≠ Slope(X,Z) ∧ Slope(X,Y) ≠ Slope(Z,X)

Postcondition: LiesOn(X,C) ∧ LiesOn(Y,C) ∧ LiesOn(Z,C)

Where LiesOn(X,C) ≡ Distance(X,Center(C)) = Radius(C)

Page 34: Verification/Synthesis Problem for Geometry Programs

• Let P be a geometry program that computes outputs O from inputs I.

• Verification Problem: Check the validity of the following Hoare triple.

    Assume Pre(I);
    P
    Assert Post(I,O);

• Synthesis Problem: Given Pre(I), Post(I,O), find P such that the above Hoare triple is valid.

Page 35: Approaches to Verification Problem

Pre(I), P, Post(I,O)

a) Symbolic decision procedures are complex.

Page 36: Randomized Polynomial Identity Testing

• Problem: Given two polynomials P1 and P2, determine whether they are equivalent.

• The naïve deterministic algorithm of expanding polynomials to compare them term-wise is exponential.

• A simple randomized test is probabilistically sufficient:
  – Choose random values r for polynomial variables x.
  – If P1(r) ≠ P2(r), then P1 is not equivalent to P2.
  – Otherwise P1 is equivalent to P2 with high probability.
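The randomized test can be sketched as follows. This is a minimal illustration, assuming the polynomials are given as Python callables with integer coefficients and that evaluation is done modulo the Mersenne prime 2^61 - 1; the function name, the number of trials, and the example polynomials are all illustrative choices.

```python
import random
from typing import Callable, Optional

PRIME = (1 << 61) - 1  # evaluate modulo a large prime (2**61 - 1)


def probably_equal(p1: Callable[..., int],
                   p2: Callable[..., int],
                   num_vars: int,
                   trials: int = 20,
                   rng: Optional[random.Random] = None) -> bool:
    """Return False if some random point separates p1 and p2, else True."""
    rng = rng or random.Random()
    for _ in range(trials):
        r = [rng.randrange(PRIME) for _ in range(num_vars)]
        if p1(*r) % PRIME != p2(*r) % PRIME:
            return False          # a witness: definitely not equivalent
    return True                   # equivalent with high probability


def lhs(x: int, y: int) -> int:
    return (x + y) ** 2


def rhs(x: int, y: int) -> int:
    return x * x + 2 * x * y + y * y


def wrong(x: int, y: int) -> int:
    return x * x + y * y


if __name__ == "__main__":
    print(probably_equal(lhs, rhs, 2))
    print(probably_equal(lhs, wrong, 2))
```

A "not equivalent" answer is always correct because a separating point was found. An "equivalent" answer can be wrong only if every sampled point happens to be a root of the difference polynomial, which is unlikely when the sampling domain is much larger than the degree.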

Page 37: Approaches to Verification Problem

Pre(I), P, Post(I,O)

a) Symbolic decision procedures are complex.

b) New efficient approach: Random Testing!
   1. Choose I’ randomly from the set { I | Pre(I) }.
   2. Compute O’ := P(I’).
   3. If O’ satisfies Post(I’,O’), output “Verified”.

Correctness Proof of (b):
• Objects constructed by P can be described using polynomial ops (+,-,*), square-root & division operator.
• The randomized polynomial identity testing algorithm lifts to square-root and division operators as well!

Page 38: Idea 1 (from Theory): Symbolic Reasoning -> Concrete

Synthesis Algorithm:
// First obtain a random input-output example.
1. Choose I’ randomly from the set { I | Pre(I) }.
2. Compute O’ s.t. Post(I’,O’) using numerical methods.
// Now obtain a construction that can generate O’ from I’ (using exhaustive search).
3. S := I’;
4. While (S does not contain O’)
5.     S := S ∪ { M(O1,O2) | Oi ∈ S, M ∈ Methods }
6. Output construction steps for O’.
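Steps 3 to 6 of the algorithm amount to a forward closure of the known objects under the available methods, remembering how each new object was built. The sketch below shows that loop on a deliberately small stand-in domain: points with exact rational coordinates and two hypothetical methods, `Midpoint` and `Reflect`, which are not the ruler/compass methods of the deck. The `max_rounds` bound is also an added assumption so that the search terminates when the target is unreachable.

```python
from fractions import Fraction
from itertools import product
from typing import Callable, Dict, Optional, Tuple

Point = Tuple[Fraction, Fraction]


def midpoint(a: Point, b: Point) -> Point:
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)


def reflect(a: Point, b: Point) -> Point:
    """Mirror image of a through the point b."""
    return (2 * b[0] - a[0], 2 * b[1] - a[1])


# Hypothetical method library standing in for "Methods" on the slide.
METHODS: Dict[str, Callable[[Point, Point], Point]] = {
    "Midpoint": midpoint,
    "Reflect": reflect,
}


def synthesize(inputs: Dict[str, Point], target: Point,
               max_rounds: int = 4) -> Optional[str]:
    """Grow S round by round until it contains the target object."""
    derivation = {pt: name for name, pt in inputs.items()}   # S, with provenance
    for _ in range(max_rounds):
        if target in derivation:
            return derivation[target]        # construction steps for O'
        known = list(derivation.items())     # freeze S for this round
        for (o1, d1), (o2, d2) in product(known, repeat=2):
            for name, method in METHODS.items():
                derivation.setdefault(method(o1, o2), f"{name}({d1},{d2})")
    return derivation.get(target)


if __name__ == "__main__":
    X = (Fraction(0), Fraction(0))
    Y = (Fraction(4), Fraction(0))
    print(synthesize({"X": X, "Y": Y}, (Fraction(6), Fraction(0))))
```

Exact rational arithmetic keeps the membership test reliable in this toy setting; with square roots, as in real ruler/compass constructions, the comparison would have to be numerical with a tolerance.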

Page 39: Error Probability of the algorithm is extremely low.

Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.

…
L1 = Ruler(P1,P2);
…
L2 = Ruler(R1,R2);
N = Intersect(L1,L2);
C = Compass(N,X);

• For an equilateral △XYZ, incenter coincides with circumcenter N.

• But what are the chances of choosing a random △XYZ to be an equilateral one?

[Figure: triangle XYZ, lines L1 and L2, point N, and circle C]

Page 40: Idea 2 (from PL): High-level Abstractions

Synthesis algorithm times out because programs are large.

• Identify a library of commonly used patterns (pattern = “sequence of geometry methods”)
  – E.g., perpendicular/angular bisector, midpoint, tangent, etc.

  Before: S := S ∪ { M(O1,O2) | Oi ∈ S, M ∈ Methods }
  After:  S := S ∪ { M(O1,O2) | Oi ∈ S, M ∈ LibMethods }

• Two key advantages:
  – Search space: large depth -> small depth
  – Easier to explain solutions to students.

Page 41: Use of high-level abstractions reduces program size

Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.

Using only the primitive methods:
1. C1 = Compass(X,Y);
2. C2 = Compass(Y,X);
3. <P1,P2> = Intersect(C1,C2);
4. L1 = Ruler(P1,P2);
5. D1 = Compass(Z,X);
6. D2 = Compass(X,Z);
7. <R1,R2> = Intersect(D1,D2);
8. L2 = Ruler(R1,R2);
9. N = Intersect(L1,L2);
10. C = Compass(N,X);

Using the library method PBisector:
1. L1 = PBisector(X,Y);
2. L2 = PBisector(X,Z);
3. N = Intersect(L1,L2);
4. C = Compass(N,X);

Page 42: Idea 3 (from AI): Goal Directed Search

Synthesis algorithm is inefficient because the search space is too wide and hence still huge.

• Prune forward search by using A* style heuristics.

  Before: S := S ∪ { M(O1,O2) | Oi ∈ S, M ∈ LibMethods }
  After:  S := S ∪ { M(O1,O2) | Oi ∈ S, M ∈ LibMethods, IsGood(M(O1,O2)) }

• Example: If a method constructs a line L that passes through a desired output point, then L is “good” (i.e., worth constructing).

Page 43: Effectiveness of Goal-directed search

Given a triangle XYZ, construct circle C such that C passes through X, Y, and Z.

• L1 and L2 are immediately constructed since they pass through output point N.

• On the other hand, other lines like angular bisectors are not eagerly constructed.

[Figure: triangle XYZ, lines L1 and L2, point N, and circle C]

Page 44: Experimental Results

25 benchmark problems.

• Such as: Construct a square whose extended sides pass through 4 given points.

• 18 problems took less than 1 second, 4 problems took 1-3 seconds, and 3 problems took 13-82 seconds.

• Idea 2 (high-level abstractions) reduces programs of size 3-45 to 3-13.

• Idea 3 (goal-directedness) improves performance by a factor of 10-1000 on most problems.

Page 45: Search space Exploration: With/without goal-directedness

Page 47: Optional Advance Preparation

• Lecture 2
  – Section 4 in WAMBSE 2012 keynote paper “Synthesis from Examples”, Gulwani.

• Lab
  – Section 4 in WAMBSE 2012 keynote paper.
  – NCERT Online Book Website. http://ncert.nic.in/NCERTS/textbook/textbook.htm

• Lecture 3
  – Sections 1-3 in WAMBSE 2012 keynote paper

Page 48: Intelligent Tutoring Systems

• Motivation
  – Online learning sites: Khan Academy, edX, Udacity, Coursera
    • Increasing class sizes with even less personal attention
  – New technologies: Tablets/Smartphones, NUI, Cloud

• Various Aspects
  – Solution Generation
  – Problem Generation
  – Automated Grading
  – Content Entry

• Various Domains
  – K-12: Mathematics, Physics, Chemistry
  – Undergraduate: Introductory Programming, Automata Theory
  – Language Learning