synthesizing and repairing expressions using types and weights
DESCRIPTION
Synthesizing and Repairing Expressions using Types and Weights. Tihomir Gvero, Viktor Kuncak , Ivan Kuraj and Ruzica Piskac. Motivation. Large APIs and libraries classes in Java 6.0 standard library Using those APIs (for the first time) can be Tedious Time consuming - PowerPoint PPT PresentationTRANSCRIPT
1
Synthesizing and Repairing Expressions using Types and
WeightsTihomir Gvero, Viktor Kuncak, Ivan Kuraj and Ruzica Piskac
2
Motivation
• Large APIs and libraries classes in Java 6.0 standard library
• Using those APIs (for the first time) can be • Tedious• Time consuming
• Developers should focus on solving creative tasks • Manual Solution
• Read Documentation• Inspect Examples
• Automation = Code synthesis + Code completion
3
Our Solution• InSynth: Interactive Synthesis of Code Snippets• Input:
• Scala partial program• Cursor point
• We automatically extract: • Declarations in scope (with/without statistics from corpus)• Desired type
• Algorithm• Complete• Efficient – output N expressions in less than T ms• Effective – favor useful expressions over obscure ones• Generates expressions with higher order functions
• Output• Ranked list of expressions
4
def main(args:Array[String]) = { var body:String = "email.txt" var sig:String = "signature.txt" var inStream:SeqInStr = …}
Sequence of Streams
5
def main(args:Array[String]) = { var body:String = "email.txt" var sig:String = "signature.txt" var inStream:SeqInStr = …}
Sequence of Streams
new SeqInStr(new FileInStr(sig), new FileInStr(sig))new SeqInStr(new FileInStr(sig), new FileInStr(body))new SeqInStr(new FileInStr(body), new FileInStr(sig))new SeqInStr(new FileInStr(body), new FileInStr(body))new SeqInStr(new FileInStr(sig), System.in)
6
def main(args:Array[String]) = { var body:String = "email.txt" var sig:String = "signature.txt" var inStream:SeqInStr = …}
Sequence of Streams
new SeqInStr(new FileInStr(sig), new FileInStr(sig))new SeqInStr(new FileInStr(sig), new FileInStr(body))new SeqInStr(new FileInStr(body), new FileInStr(sig))new SeqInStr(new FileInStr(body), new FileInStr(body))new SeqInStr(new FileInStr(sig), System.in)
7
def main(args:Array[String]) = { var body:String = "email.txt" var sig:String = "signature.txt" var inStream:SeqInStr = new SeqInStr(new FileInStr(sig), new FileInStr(body)) …}
Sequence of Streams
8
def main(args:Array[String]) = { var body:String = "email.txt" var sig:String = "signature.txt" var inStream:SeqInStr = new SeqInStr(new FileInStr(sig), new FileInStr(body)) …}
Sequence of Streams
Imported over 3300 declarations
Executed in less than 250ms
9
def filter(p: Tree => Boolean): List[Tree] = { val ft:FilterTreeTraverser = ft.traverse(tree) ft.hits.toList}
TreeFilter (HOF)
10
def filter(p: Tree => Boolean): List[Tree] = { val ft:FilterTreeTraverser = ft.traverse(tree) ft.hits.toList}
TreeFilter (HOF)
new FilterTreeTraverser(x => p(x))new FilterTreeTraverser(x => isType)new FilterTreeTraverser(x => p(tree))new FilterTreeTraverser(x => new Wrapper(x).isType)new FilterTreeTraverser(x => p(new Wrapper(x).tree))
11
def filter(p: Tree => Boolean): List[Tree] = { val ft:FilterTreeTraverser = ft.traverse(tree) ft.hits.toList}
TreeFilter (HOF)
new FilterTreeTraverser(x => p(x))new FilterTreeTraverser(x => isType)new FilterTreeTraverser(x => p(tree))new FilterTreeTraverser(x => new Wrapper(x).isType)new FilterTreeTraverser(x => p(new Wrapper(x).tree))
12
def filter(p: Tree => Boolean): List[Tree] = { val ft:FilterTreeTraverser = new FilterTreeTraverser(x => p(x)) ft.traverse(tree) ft.hits.toList}
TreeFilter (HOF)
13
def filter(p: Tree => Boolean): List[Tree] = { val ft:FilterTreeTraverser = new FilterTreeTraverser(x => p(x)) ft.traverse(tree) ft.hits.toList}
TreeFilter (HOF)
Imported over 4000 declarations
Executed in less than 300ms
14
COMPLETION = INHABITATION
15
COMPLETION = INHABITATION
val a: T = ?
def m1: T1
…def mn: Tn
16
COMPLETION = INHABITATION
val a: T = ?
def m1: T1
…def mn: Tn G={ m1: T1,…, mn: Tn}
17
COMPLETION = INHABITATION
val a: T = ?
def m1: T1
…def mn: Tn G={ m1: T1,…, mn: Tn}
ENVIRONMENT
18
COMPLETION = INHABITATION
val a: T = ? G ⊢ ? : T
def m1: T1
…def mn: Tn G={ m1: T1,…, mn: Tn}
ENVIRONMENT
19
COMPLETION = INHABITATION
val a: T = ? G ⊢ ? : T
def m1: T1
…def mn: Tn G={ m1: T1,…, mn: Tn}
DESIRED TYPE
ENVIRONMENT
20
Simply Typed Lambda Calculus
G ⊢ x : TAX
x : T G
G ⊢ e1(e2) : TAPP
G ⊢ e1 : T1T G e⊢ 2 : T1
G ⊢ λx.t : T1 TABS
G, x : T1 ⊢ t : T
21
Simply Typed Lambda Calculus
22
Simply Typed Lambda Calculus
G ⊢ ? : T
23
Simply Typed Lambda Calculus
G ⊢ ? : T
Backward Search
24
Simply Typed Lambda Calculus
G ⊢ ? : TAPP
G ⊢ ? : T1T G ⊢ ? : T1
25
Simply Typed Lambda Calculus
G ⊢ ? : TAPP
G ⊢ ? : T1T G ⊢ ? : T1
Infinitely many
26
Simply Typed Lambda Calculus
G ⊢ ? : TAPP
G ⊢ ? : T1T G ⊢ ? : T1
Infinitely many
No bound on types in derivation tree(s).
27
G ⊢ f(a1,…,an):TAPP
f : T1…TnT G G a⊢ 1: T1 … G a⊢ n: Tn
G ⊢ λ x1:T1 ,…, xn:Tn.t: T1 … Tn TABS
G, x1:T1 ,…, xn:Tn t: T⊢
Long Normal Form
28
G ⊢ f(a1,…,an):TAPP
f : T1…TnT G G a⊢ 1: T1 … G a⊢ n: Tn
G ⊢ e1(e2) : TAPP
G ⊢ e1 : T1T G e⊢ 2 : T1
Comparison between LNF and classic APPOLD
NEW
29
G ⊢ f(a1,…,an):TAPP
f : T1…TnT G G a⊢ 1: T1 … G a⊢ n: Tn
G ⊢ e1(e2) : TAPP
G ⊢ e1 : T1T G e⊢ 2 : T1
We derive EXPRESSION from G
Comparison between LNF and classic APP
30
G ⊢ f(a1,…,an):TAPP
f : T1…TnT G G a⊢ 1: T1 … G a⊢ n: Tn
G ⊢ e1(e2) : TAPP
G ⊢ e1 : T1T G e⊢ 2 : T1
We derive EXPRESSION from G
DECLARATION from G
Comparison between LNF and classic APP
31
Long Normal Form
32
Long Normal Form
G ⊢ ? :T
33
Long Normal Form
APPf : (T1 T2)T G G ⊢ ? : T1 T2
G ⊢ f(?):T
34
Long Normal Form
APPf : (T1 T2)T G G ⊢ ? : T1 T2
Only one
G ⊢ f(?):T
Narrows the search space
35
Long Normal Form
APPf : (T1 T2)T G G ⊢ λ x1:T1.? : T1 T2
G, x1:T1 ? : T⊢ 2
ABS
G ⊢ f(λ x1:T1.?):T
36
Long Normal Form
APPf : (T1 T2)T G G ⊢ λ x1:T1.e : T1 T2
G, x1:T1 e : T⊢ 2
. . . .
ABS
APP
G ⊢ f(λ x1:T1.e):T
37
Long Normal Form
APPf : (T1 T2)T G G ⊢ λ x1:T1.e : T1 T2
G, x1:T1 e : T⊢ 2
. . . .
ABS
APP
Finitely many types in derivation tree(s)
G ⊢ f(λ x1:T1.e):T
38
Algorithm• Algorithm builds finite graph (with cycles) that
• Represents all (infinitely many) solutions• Later we use it to construct expressions• Less than 10ms
• Algorithm Properties• Graph generation terminates
• Type inhabitation is decidable• Complete - generates all solutions• PSPACE-complete
• Techniques• Succinct type representation• Backward search• Weights mechanism
39
Weights and Corpus• Weight of a declaration based on:
• Frequency • Corpus based on 18 Scala projects (e.g. Scala compiler)• Over 7500 declarations, and over 90000 uses• Higher the frequency, lower the weight
• Proximity
Local symbolsMethod and field symbols
API symbols
Low High
40
Algorithm with Weights
APPf : (T1 T2)T G G ⊢ λ x1:T1.e : T1 T2
G, x1:T1 e : T⊢ 2
. . . .
ABS
APP
G ⊢ f(λ x1:T1.e):T
41
Algorithm with Weights
APPf : (T1 T2)T G G ⊢ λ x1:T1.e : T1 T2
G, x1:T1 e : T⊢ 2
. . . .
ABS
APP
G ⊢ f(λ x1:T1.e):T
Choice based on WEIGHT
42
Algorithm with Weights
APPf : (T1 T2)T G G ⊢ λ x1:T1.e : T1 T2
G, x1:T1 e : T⊢ 2
. . . .
ABS
APP
G ⊢ f(λ x1:T1.e):T
Choice based on WEIGHT
Ranking based on w(f(λ x1:T1.e)) = w(f) + w(x1) + w(e)
43
Benchmarks
• 50 Java examples translated into Scala• Illustrate correct usage of API functions
• We generalized the import statements • To include more declarations
• In every example:1. Arbitrarily chose some expression2. Removed it 3. Marked it as goal expression4. Measure whether InSynth can recover it
44
Results
• Without weights expected expression appears • Among top 10 suggestions in only 4 benchmarks (8%)
• With weights (only proximity)• Among top 10 suggestions in 48 benchmarks (96%)• As a top suggestion in 26 benchmarks (52%)
• With weights (proximity + frequency)• Among top 10 suggestions in 48 benchmarks (96%)• As a top suggestion in 32 benchmarks (64%)
• Average execution time 145ms
45
def main(args:Array[String]) = { var body:String = "email.txt" var sig:String = "signature.txt" var inStream:SeqInStr = …}
Sequence of Streams
46
def main(args:Array[String]) = { var body:String = "email.txt" var sig:String = "signature.txt" var inStream:SeqInStr = new SeqInStr( sig, body) …}
Sequence of Streams
47
def main(args:Array[String]) = { var body:String = "email.txt" var sig:String = "signature.txt" var inStream:SeqInStr = new SeqInStr( sig, body) …}
Sequence of StreamsUser Writes
48
def main(args:Array[String]) = { var body:String = "email.txt" var sig:String = "signature.txt" var inStream:SeqInStr = new SeqInStr( sig, body) …}
// new SeqInStr: FileInStr FileInStr SeqInStr
Sequence of Streams
Type Mismatch
49
def main(args:Array[String]) = { var body:String = "email.txt" var sig:String = "signature.txt" var inStream:SeqInStr = new SeqInStr( sig, body) …}
We propose polynomial algorithm that finds the best repair
// new SeqInStr: FileInStr FileInStr SeqInStr
Sequence of Streams
Type Mismatch
50
def main(args:Array[String]) = { var body:String = "email.txt" var sig:String = "signature.txt" var inStream:SeqInStr = new SeqInStr( sig, body) …}
// new SeqInStr: FileInStr FileInStr SeqInStr
Sequence of Streams
Backbone Expression
51
def main(args:Array[String]) = { var body:String = "email.txt" var sig:String = "signature.txt“ var inStream:SeqInStr = (?) new SeqInStr( (?) sig, (?) body) …}
// new SeqInStr: FileInStr FileInStr SeqInStr
Sequence of Streams
52
def main(args:Array[String]) = { var body:String = "email.txt" var sig:String = "signature.txt“ var inStream:SeqInStr = (?) new SeqInStr((x. new FileInStr(x)) sig, (?) body) …}
// new SeqInStr: FileInStr FileInStr SeqInStr
Sequence of Streams
Synthesize
53
def main(args:Array[String]) = { var body:String = "email.txt" var sig:String = "signature.txt“ var inStream:SeqInStr = (?) new SeqInStr((x. new FileInStr(x)) sig, (?) body) …}
Use G to synthesize function:(x:String. new FileInStr(x)) : String FileInStr
Constraint: Function “body” must contain exactly one variable “x”
// new SeqInStr: FileInStr FileInStr SeqInStr
Sequence of Streams
54
def main(args:Array[String]) = { var body:String = "email.txt" var sig:String = "signature.txt“ var inStream:SeqInStr = (?) new SeqInStr(new FileInStr(sig), (?) body) …}
Use G to synthesize function:(x:String. new FileInStr(x)) : String FileInStr
Constraint: Function “body” must contain exactly one variable “x”
// new SeqInStr: FileInStr FileInStr SeqInStr
Sequence of Streams
55
def main(args:Array[String]) = { var body:String = "email.txt" var sig:String = "signature.txt“ var inStream:SeqInStr = (?) new SeqInStr(new FileInStr(sig), (x. new FileInStr(x)) body) …}
// new SeqInStr: FileInStr FileInStr SeqInStr
Sequence of Streams
56
def main(args:Array[String]) = { var body:String = "email.txt" var sig:String = "signature.txt" var inStream:SeqInStr = (?) new SeqInStr(new FileInStr(sig), new FileInStr(body)) …}
// new SeqInStr: FileInStr FileInStr SeqInStr
Sequence of Streams
57
def main(args:Array[String]) = { var body:String = "email.txt" var sig:String = "signature.txt" var inStream:SeqInStr = (x. x) new SeqInStr(new FileInStr(sig), new FileInStr(body)) …}
// new SeqInStr: FileInStr FileInStr SeqInStr
Sequence of Streams
58
def main(args:Array[String]) = { var body:String = "email.txt" var sig:String = "signature.txt" var inStream:SeqInStr = new SeqInStr(new FileInStr(sig), new FileInStr(body)) …}
// new SeqInStr: FileInStr FileInStr SeqInStr
Sequence of Streams
59
def main(args:Array[String]) = { var off:Int = 8 var len:Int = 512 var size:Int = 1024 var b:Array[Byte] = args(0).getBytes()
var inStream:ByteArrayInStr = b …}
Byte Array Stream
60
def main(args:Array[String]) = { var off:Int = 8 var len:Int = 512 var size:Int = 1024 var b:Array[Byte] = args(0).getBytes()
var inStream:ByteArrayInStr = (?) b …}
Byte Array Stream
61
def main(args:Array[String]) = { var off:Int = 8 var len:Int = 512 var size:Int = 1024 var b:Array[Byte] = args(0).getBytes()
var inStream:ByteArrayInStr = (x. new ByteArrayInStr(x, off, len)) b …}
Byte Array Stream
62
def main(args:Array[String]) = { var off:Int = 8 var len:Int = 512 var size:Int = 1024 var b:Array[Byte] = args(0).getBytes()
var inStream:ByteArrayInStr = new ByteArrayInStr(b, off, len) …}
Byte Array Stream
63
Conclusion
• Code Completion = Type Inhabitation• InSynth: Interactive Synthesis of Code Snippets• Eclipse plugin (part of Scala IDE EcoSystem)• Website
http://lara.epfl.ch/w/insynth
• Repairing Code:• Polynomial Algorithm the finds the best solution
64
Thank you!