Type-Directed Completion of Partial Expressions
Daniel Perelman†
Sumit Gulwani‡ Thomas Ball‡ Dan Grossman†
†University of Washington
‡Microsoft Research Redmond
June 12, 2012
I want to shrink an image...
Document image = ...; Size newSize = ...;
image.Shrink(newSize)
I want to shrink an image...
Document image = ...; Size newSize = ...;
image.
I want to shrink an image...
Document image = ...; Size newSize = ...;
PaintDotNet.
I want to shrink an image...
Document image = ...; Size newSize = ...;
PaintDotNet.Document.
I want to shrink an image...
Document image = ...; Size newSize = ...;
PaintDotNet.Data.
I want to shrink an image...
Document image = ...; Size newSize = ...;
PaintDotNet.Actions.
I want to shrink an image...
Document image = ...; Size newSize = ...;
PaintDotNet.Actions.ResizeAction.
I want to shrink an image...
Document image = ...; Size newSize = ...;
PaintDotNet.Actions.CanvasSizeAction.
I want to shrink an image...
Document image = ...; Size newSize = ...;
PaintDotNet.Actions.CanvasSizeAction.ResizeDocument(
    /* PaintDotNet.Document image */,
    /* System.Drawing.Size size */,
    /* PaintDotNet.AnchorEdge edge */,
    /* PaintDotNet.ColorBgra bgColor */);
Programmer thought process
- I have a Document and a Size
- I want to shrink the Document
- There must be a method
- Current code completion
  - Left-to-right
  - Complete, alphabetic list of just next token
  - Very limited filtering
Proposed workflow
Document image = ...; Size newSize = ...;
var newImage = ?({image, newSize})
Programmer thought process
- I have a Document and a Size
- I want to shrink the Document
- There must be a method
- Query should contain what the programmer knows
  - Some values and types the expression should involve
  - Loose syntactic structure
- Query shouldn’t require what the programmer doesn’t know
  - Names
  - Argument order
  - Other arguments
- Show “best” results first
- Similar in spirit to Prospector [Mandelin et al., PLDI’05]
Overview
- Expression of API queries as partial expressions
- Algorithm to generate results quickly in ranked order
- Experiment showing simple queries represent real code well
Unknown method queries
- Ex. ?({image, size})
  ⇒ PaintDotNet.Actions.CanvasSizeAction.ResizeDocument(img, size, □, □)
  ⇒ PaintDotNet.Functional.Func.Bind(□, size, img)
  ⇒ PaintDotNet.Pair.Create(size, img)
  ⇒ PaintDotNet.Quadruple.Create(size, img, □, □)
  ⇒ PaintDotNet.Triple.Create(size, img, □)
  ⇒ PaintDotNet.PropertySystem.StaticListChoiceProperty.CreateForEnum(img, size, □)
  ⇒ System.Drawing.Size.Equals(size, img)
  ⇒ System.Object.ReferenceEquals(size, img)
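The unknown method query can be read as: find every method whose parameters accept the given arguments in some order, and leave the remaining parameters as holes. A minimal Python sketch of that reading follows. The type hierarchy, the method table and the helper names (`SUPERTYPES`, `METHODS`, `complete`) are hypothetical toy data, not the tool's real index, and no ranking is applied here.

```python
from itertools import permutations

HOLE = "□"  # printed for parameters the query leaves unfilled

# Hypothetical toy hierarchy: type name -> direct supertypes.
SUPERTYPES = {
    "Object": [],
    "Document": ["Object"],
    "Size": ["Object"],
    "AnchorEdge": ["Object"],
    "ColorBgra": ["Object"],
}

# Hypothetical method table: name -> parameter types (receiver counted as a parameter).
METHODS = {
    "CanvasSizeAction.ResizeDocument": ["Document", "Size", "AnchorEdge", "ColorBgra"],
    "Pair.Create": ["Object", "Object"],
    "Size.Equals": ["Size", "Object"],
}


def is_subtype(t, target):
    """True if t is target or target is reachable through t's supertypes."""
    return t == target or any(is_subtype(s, target) for s in SUPERTYPES[t])


def complete(args):
    """args: list of (name, type). Yield every type-correct call as a string."""
    for method, params in METHODS.items():
        # Try each way of placing the arguments into distinct parameter slots.
        for slots in permutations(range(len(params)), len(args)):
            if all(is_subtype(t, params[s]) for (_, t), s in zip(args, slots)):
                filled = [HOLE] * len(params)
                for (name, _), s in zip(args, slots):
                    filled[s] = name
                yield f"{method}({', '.join(filled)})"


for call in complete([("img", "Document"), ("size", "Size")]):
    print(call)
```

Because argument order is not fixed by the query, `Size.Equals(size, img)` is found even though the arguments were listed as `img, size`.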
Unknown lookup queries
- Ex. float f = pointPair.*
  ⇒ pointPair.P1.X
  ⇒ pointPair.P1.Y
  ⇒ pointPair.P2.X
  ⇒ pointPair.P2.Y
  ⇒ pointPair.Midpoint.X
  ⇒ pointPair.Midpoint.Y
  ⇒ pointPair.FirstValidValue().X
  ⇒ pointPair.Length
Unknown expression queries
- Ex. XmlReader xr = ?
  ⇒ System.Xml.XmlReader.Create(□)
  ⇒ new System.Xml.XmlNodeReader(□)
  ⇒ System.Data.SqlTypes.SqlXml.Null.CreateReader()
  ⇒ new System.Xml.XmlNodeReader(□).ReadSubtree()
  ⇒ new System.Xml.XmlValidatingReader(□).Reader
  ⇒ Microsoft.SqlServer.Server.SqlContext.TriggerContext.EventData.CreateReader()
  ⇒ new System.Xml.XmlValidatingReader(□).Reader.ReadSubtree()
Partial expression language
(a) e ::= call | varName | e.fieldName | e := e | e < e
    call ::= methodName(e1, ..., en)
(b) e ::= a | ? | □
    a ::= e | a.* | call | e := e | e < e
    call ::= ?({e1, ..., en}) | methodName(e1, ..., en)
- Ex. ?({strBuilder.*, e.*})
  ⇒ ?({strBuilder, e.StackTrace})
  ⇒ strBuilder.Append(e.StackTrace)
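One way to picture the two grammars is as a small syntax tree in which the partial forms (`?`, `□`, `a.*`, `?({...})`) are extra node kinds. The Python sketch below is an illustration only. The class names are invented here, and the assignment and comparison forms are left out for brevity.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Var:            # varName
    name: str


@dataclass(frozen=True)
class Field:          # e.fieldName
    target: object
    name: str


@dataclass(frozen=True)
class Call:           # methodName(e1, ..., en)
    method: str
    args: Tuple[object, ...]


@dataclass(frozen=True)
class Unknown:        # ?
    pass


@dataclass(frozen=True)
class Hole:           # □
    pass


@dataclass(frozen=True)
class Lookups:        # a.*
    target: object


@dataclass(frozen=True)
class UnknownCall:    # ?({e1, ..., en})
    args: Tuple[object, ...]


def is_complete(e):
    """True if e uses only the forms of grammar (a)."""
    if isinstance(e, Var):
        return True
    if isinstance(e, Field):
        return is_complete(e.target)
    if isinstance(e, Call):
        return all(is_complete(a) for a in e.args)
    return False  # Unknown, Hole, Lookups, UnknownCall


# The slide's example query and its final completion.
query = UnknownCall((Lookups(Var("strBuilder")), Lookups(Var("e"))))
result = Call("Append", (Var("strBuilder"), Field(Var("e"), "StackTrace")))
print(is_complete(query), is_complete(result))
```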
Algorithm
- Problem: given query, generate completions
Method index by parameter type
Object (2210 methods): Equals, GetHashCode, Registry.SetValue, Array.IndexOf, IList.Add, Console.WriteLine, ...
ICloneable (2211 methods): Clone
IList (2257 methods): Add, Remove, ...
ArrayList (2299 methods): BinarySearch, Reverse, ...
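The method counts on this slide grow from Object down to ArrayList. That is consistent with an index keyed by declared parameter type, where a lookup for a type also collects the entries of all its supertypes. The Python sketch below shows that reading on toy data. The hierarchy, the method list and the function names are assumptions made for illustration, not the tool's actual data structure.

```python
from collections import defaultdict

# Hypothetical hierarchy: type name -> direct supertypes.
SUPERTYPES = {
    "Object": [],
    "ICloneable": ["Object"],
    "IList": ["Object"],
    "ArrayList": ["IList", "ICloneable"],
}

# Hypothetical methods: (name, parameter types with the receiver first).
METHODS = [
    ("Object.Equals", ["Object", "Object"]),
    ("Object.GetHashCode", ["Object"]),
    ("Console.WriteLine", ["Object"]),
    ("ICloneable.Clone", ["ICloneable"]),
    ("IList.Add", ["IList", "Object"]),
    ("IList.Remove", ["IList", "Object"]),
    ("ArrayList.BinarySearch", ["ArrayList", "Object"]),
    ("ArrayList.Reverse", ["ArrayList"]),
]


def build_index(methods):
    """Map each declared parameter type to the methods that mention it."""
    index = defaultdict(set)
    for name, params in methods:
        for p in params:
            index[p].add(name)
    return index


def ancestors(t):
    """The type itself plus every type reachable through supertypes."""
    found = {t}
    for s in SUPERTYPES[t]:
        found |= ancestors(s)
    return found


def methods_accepting(t, index):
    """All methods with some parameter that a value of type t can fill."""
    result = set()
    for a in ancestors(t):
        result |= index.get(a, set())
    return result


INDEX = build_index(METHODS)
for t in SUPERTYPES:
    print(t, len(methods_accepting(t, INDEX)))
```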
Infinite results
- Problem: too many results
  - inefficient to generate thousands of results to show only 20 to the programmer
  - programmer does not want to look at every result
  - result set is often infinite
- Ex. var res = foo.*;
  ⇒ foo
  ⇒ foo.GetType()
  ⇒ foo.GetType().GetType()
  ⇒ foo.GetType().GetType().GetType()
  ⇒ foo.GetType().GetType().GetType().GetType()
  ⇒ ...
- Solution: generate in ranked order
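Generating in ranked order means the top few results can be taken from an infinite result set without enumerating it. The Python sketch below uses a best-first search over a priority queue, which is one standard way to get that behaviour. It is not claimed to be the authors' algorithm, and the `MEMBERS` table and unit step costs are hypothetical.

```python
import heapq
from itertools import islice

# Hypothetical member table: type -> list of (suffix, result type, cost of the step).
MEMBERS = {
    "Foo": [(".GetType()", "Type", 1), (".Name", "String", 1)],
    "Type": [(".GetType()", "Type", 1)],
    "String": [(".Length", "Int32", 1)],
    "Int32": [],
}


def completions(expr, expr_type):
    """Lazily yield (cost, expression, type) in nondecreasing cost order."""
    heap = [(0, expr, expr_type)]
    while heap:
        cost, text, typ = heapq.heappop(heap)
        yield cost, text, typ
        # Extend the popped expression by one lookup or call.
        for suffix, result_type, step in MEMBERS[typ]:
            heapq.heappush(heap, (cost + step, text + suffix, result_type))


# The generator never terminates on its own (Type.GetType() loops),
# so the caller decides how many results to take.
for cost, text, typ in islice(completions("foo", "Foo"), 5):
    print(cost, text)
```

Since every step has positive cost, a popped result can never be beaten by one generated later, which is what makes stopping after the first n results safe.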
Algorithm
- Simple structurally recursive algorithm
- Group by type to minimize redundant work
- Generate results in ranking order
- Allows determination of top n without computing all results
Heuristics: Type distance
[Figure: type hierarchy of Object, Shape, Rectangle and IDrawingElement, annotated with type distances: Object 2, Shape 1, Rectangle 0, IDrawingElement 2]
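Reading the figure's numbers as distances measured from Rectangle, type distance can be sketched as the fewest supertype steps from an argument's type to a parameter's type. The Python sketch below assumes that Shape implements IDrawingElement, which would explain the distance of 2 but is not stated on the slide. The breadth-first search is an illustrative choice.

```python
from collections import deque

# Hypothetical hierarchy: type name -> direct supertypes and interfaces.
SUPERTYPES = {
    "Rectangle": ["Shape"],
    "Shape": ["Object", "IDrawingElement"],  # assumption: Shape implements IDrawingElement
    "IDrawingElement": [],
    "Object": [],
}


def type_distance(source, target):
    """Fewest supertype steps from source up to target, or None if not convertible."""
    queue = deque([(source, 0)])
    seen = {source}
    while queue:
        t, d = queue.popleft()
        if t == target:
            return d
        for s in SUPERTYPES[t]:
            if s not in seen:
                seen.add(s)
                queue.append((s, d + 1))
    return None


for target in ("Rectangle", "Shape", "Object", "IDrawingElement"):
    print(target, type_distance("Rectangle", target))
```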
Heuristics: Length
- Number of field/property lookups or method calls added
- ?({strBuilder.*, e.*})
  Good (1): ⇒ strBuilder.Append(e.StackTrace)
  Bad (3): ⇒ strBuilder.Clear().Append(e.Data.Count)
Heuristics: Inferred abstract types
Example usages elsewhere in codebase:
string f = Path.GetTempFileName(); ...;
File.Delete(f);
File.Delete(Path.Combine(dir, filename));
if(File.Exists(Path.Combine(otherDir, file))) {...}
Query:
string p = Path.GetTempFileName();
?({p})
⇒ GetCursor(p)
⇒ File.Delete(p)
⇒ File.Exists(p)
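The idea behind abstract types is that values which flow into the same places probably mean the same thing, even if their declared type is just `string`. A rough Python sketch follows, using union-find to merge "slots" (return values and parameters) that are observed to be connected in existing code. Union-find is a choice made for this illustration and the slot naming scheme is invented. The slide does not say how the inference is implemented.

```python
class UnionFind:
    """Minimal union-find over hashable items."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)


# Flows observed in the slide's example usages (slot names are hypothetical).
FLOWS = [
    ("Path.GetTempFileName:return", "File.Delete:arg0"),  # File.Delete(f)
    ("Path.Combine:return", "File.Delete:arg0"),          # File.Delete(Path.Combine(...))
    ("Path.Combine:return", "File.Exists:arg0"),          # File.Exists(Path.Combine(...))
]

uf = UnionFind()
for producer, consumer in FLOWS:
    uf.union(producer, consumer)


def same_abstract_type(a, b):
    return uf.find(a) == uf.find(b)


# p comes from Path.GetTempFileName, so File.Exists looks like a better fit than GetCursor.
print(same_abstract_type("Path.GetTempFileName:return", "File.Exists:arg0"))
print(same_abstract_type("Path.GetTempFileName:return", "GetCursor:arg0"))
```

A ranking function could then penalise candidates such as `GetCursor(p)` whose parameter falls in a different group from `p`.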
Ranking function
- Linear combination of these and other heuristics
- Sensitivity analysis showed these are the most important heuristics and that the coefficients do not matter much
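A linear combination of heuristics can be sketched as a weighted sum of per-candidate feature values, with lower scores ranked first. In the Python sketch below the weights and feature names are hypothetical. Only the length values (1 and 3) are taken from the Length slide.

```python
# Hypothetical weights; lower total score ranks first.
WEIGHTS = {"length": 1.0, "type_distance": 1.0, "abstract_type_mismatch": 1.0}

# Feature values per candidate; features that are not listed count as 0.
CANDIDATES = {
    "strBuilder.Append(e.StackTrace)": {"length": 1},
    "strBuilder.Clear().Append(e.Data.Count)": {"length": 3},
}


def score(features, weights=WEIGHTS):
    """Weighted sum of the heuristic values for one candidate."""
    return sum(w * features.get(name, 0) for name, w in weights.items())


def rank(candidates, weights=WEIGHTS):
    """Candidate expressions ordered from best (lowest score) to worst."""
    return sorted(candidates, key=lambda c: score(candidates[c], weights))


print(rank(CANDIDATES))
# In this toy case, rescaling a coefficient leaves the order unchanged.
print(rank(CANDIDATES, {"length": 5.0, "type_distance": 0.5, "abstract_type_mismatch": 2.0}))
```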
Outline
Motivation
Approach
  Language
  Algorithm
  Ranking
Experiment
  Results
Related work
Conclusion
Experiment
- Automated test of expressiveness of partial expressions
- Generated queries for each call and looked at rank of actual call in query results
- Advantage: able to do many queries
- Disadvantage: many of the method calls are not ones a programmer would need API discovery for
Experiment
- Used Microsoft CCI to disassemble mature C# projects
- Converted every call with at least 3 arguments (including receiver) to a query with 1 or 2 arguments (including receiver)
- For ResizeDocument(document, size, anchorEdge, background) 16 queries would be generated:
  ⇒ ?(document)
  ⇒ ?(size)
  ⇒ ?(anchorEdge)
  ⇒ ?(background)
  ⇒ ?(document, size)
  ⇒ ?(document, background)
  ⇒ ...
- Report rank for best-performing query for each call
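The query generation step can be sketched in a few lines of Python. Counting ordered selections of one or two arguments gives 4 + 12 = 16 queries for a four-argument call, which matches the number on the slide. That the original tool enumerated queries this way is an assumption. The `best_rank` helper and its use of `None` for "not found" are likewise illustrative.

```python
from itertools import permutations


def queries_for_call(args):
    """All queries that keep one or two of the call's arguments (ordered selections)."""
    queries = []
    for k in (1, 2):
        for chosen in permutations(args, k):
            queries.append("?(" + ", ".join(chosen) + ")")
    return queries


def best_rank(ranks):
    """Rank of the real call under its best-performing query; None means never found."""
    found = [r for r in ranks if r is not None]
    return min(found) if found else None


qs = queries_for_call(["document", "size", "anchorEdge", "background"])
print(len(qs))
print(qs[:5])
```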
Projects used
- Paint.NET image editor
- Windows Installer XML library
- Gnome Do program launcher
- Banshee music player
- .NET core libraries
- Family.Show (WPF example application)
- LiveGeometry geometry visualizer
- Scale: .NET contains 280,000 methods in 30,000 types
- Analyzed 21,176 method calls in these applications
CDF of rank for best method query
[Figure: CDF plot. x-axis: rank of correct answer is < x (0 to 30); y-axis: proportion of analyzed calls (0 to 1). Curves: partial expressions, ?({foo, bar}); code completion, baz.]
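Each point on such a curve is the fraction of analyzed calls whose correct answer appeared at a rank below x. A minimal Python sketch of that computation is below. The sample ranks are made up for illustration and are not the experiment's data.

```python
def rank_cdf(ranks, max_x=30):
    """For x = 0..max_x, the proportion of calls whose correct answer has rank < x."""
    n = len(ranks)
    return [sum(1 for r in ranks if r < x) / n for x in range(max_x + 1)]


# Hypothetical best ranks for four calls; the last one falls outside the plotted range.
sample_ranks = [1, 3, 3, 50]
cdf = rank_cdf(sample_ranks, max_x=5)
print(cdf)
```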
CDF of rank for best method query (correct is static)
[Figure: CDF plot. x-axis: rank of correct answer is < x (0 to 30); y-axis: proportion of analyzed static calls (0 to 1). Curves: partial expressions, ?({foo, bar}); code completion, NS.Baz.]
CDF of rank for best method query
[Figure: CDF plot. x-axis: rank of correct answer is < x (0 to 30); y-axis: proportion of analyzed calls (0 to 1). Curves: using two arguments, ?({foo, bar}); using one argument, ?({foo}); code completion, baz.]
Other experiments
- Time: unknown method queries take under 0.1 second
- Ran similar experiments on other partial expression templates
- Similar results: one argument or one lookup could be predicted within the top 10 about 80% of the time
Related work
- Lots of other work on API discovery discussed in paper
- Prospector (for Java) [Mandelin et al., PLDI’05]
  - Input is target type
  - Similar to XmlReader xr = ? query
  - Uses mined expressions which convert from one type to another
  - Output is chain of mined expressions starting with some local variable
  - Advantage: able to synthesize larger expressions
  - Disadvantage: queries only specify a single input type and a single output type
Contributions
- Expressed API searches in terms of partial expressions
- Leveraged rich type structure to reduce information needed for queries
- Automated experiments across large codebases show small partial expressions often match real method calls
- Created Visual Studio plugin
  - https://pec.codeplex.com/