TRANSCRIPT
DEEP LEARNING FOR PROGRAM SYNTHESIS
Xinyun Chen (UC Berkeley), Stanford CS379C
2018-05-31
Overview

• Program synthesis from natural language descriptions (towards end-user friendly programming)
  • IFTTT programs (NIPS 2016) [1]
• Program synthesis from input-output examples (towards better generalization and synthesizing more complex programs)
  • Neural parser synthesis (ICLR 2018) [2]
• Program synthesis for software engineering applications
  • Neural program translation [3]

[1] Xinyun Chen, Chang Liu, Richard Shin, Dawn Song, Mingcheng Chen. Latent Attention for If-Then Program Synthesis. NIPS 2016.
[2] Xinyun Chen, Chang Liu, Dawn Song. Towards Synthesizing Complex Programs from Input-Output Examples. ICLR 2018.
[3] Xinyun Chen, Chang Liu, Dawn Song. Tree-to-tree Neural Networks for Program Translation. ICLR 2018 Workshop.
Neural Parser Synthesis
• Task: learning a parser, framed as program synthesis from input-output examples
Existing Approaches
• End-to-end neural networks: sequence-to-sequence based models
  • Do not generalize well
  • Require a lot of training samples
  • In our evaluation, the test accuracy of such models is 0% when the test samples are longer than the training ones
• Neural symbolic program synthesis: R3NN [4], RobustFill [5]
  • Expressiveness of the program DSL is limited
  • The lengths of the synthesized programs are up to 20
RobustFill model architecture [5]

[4] Parisotto et al. Neural-Symbolic Program Synthesis. ICLR 2017.
[5] Devlin et al. RobustFill: Neural Program Learning under Noisy I/O. ICML 2017.
Existing Approaches
• NPI-like approaches
  • Training requires supervision on execution traces
  • The complexity of the learned algorithm is limited
[6] Scott Reed, Nando de Freitas. Neural Programmer-Interpreters. ICLR 2016.
[7] Jonathon Cai, Richard Shin, Dawn Song. Making Neural Programming Architectures Generalize via Recursion. ICLR 2017.
Neural Programmer-Interpreter [6, 7]
Goals
• Supervision on I/O pairs only
  • No supervision on execution traces
• Full generalization
  • 100% accuracy on arbitrarily long inputs
• Train with a few examples
Our Approach
• Differentiable neural programs operating a non-differentiable machine
[Diagram: a neural parsing program interacting with a non-differentiable parsing machine, step by step constructing a parse tree such as Add(Id x, Id y).]
LL Machine
[Diagram: the LL machine. Its state consists of an input stream of tokens, a stack holding symbols such as (Id, T1) and (+, +) together with their partial parse trees, and a termination flag. Its instruction set includes Shift and Reduce for stack operations, Call and Return for parser functionality (recursion), and Final for termination. A simplified code sketch of such a machine follows below.]
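To make the machine concrete, below is a heavily simplified, hypothetical sketch of a shift-reduce-style parsing machine in Python. Only Shift, Reduce, and Final are modeled; Call and Return (which provide the recursion in the actual LL machine of [2]) are omitted, and all semantics here are illustrative assumptions rather than the paper's definition.

```python
# A minimal, simplified sketch of a shift-reduce-style parsing machine.
# The real LL machine in [2] also has Call/Return instructions and a richer
# notion of state; everything below is an illustrative assumption.

class ParsingMachine:
    def __init__(self, tokens):
        self.tokens = list(tokens)   # input stream
        self.pos = 0                 # index of the next token to read
        self.stack = []              # raw tokens and partial parse trees (tuples)
        self.done = False

    def observation(self):
        """The neural program only observes the stack top and the next token."""
        top = self.stack[-1] if self.stack else None
        nxt = self.tokens[self.pos] if self.pos < len(self.tokens) else None
        return top, nxt

    def execute(self, instr, arg=None):
        if instr == "SHIFT" and self.pos < len(self.tokens):
            self.stack.append(self.tokens[self.pos])      # push next token as a leaf
            self.pos += 1
        elif instr == "REDUCE" and len(self.stack) >= 3:
            right, _op, left = self.stack.pop(), self.stack.pop(), self.stack.pop()
            self.stack.append((arg, left, right))          # build a labeled subtree
        elif instr == "FINAL":
            self.done = True                               # stack top is the output
        # inapplicable instructions are treated as no-ops in this toy version

# Hand-written trace for "x + y" -> ("Add", "x", "y"); the point of [2] is to
# learn such traces from input-output pairs instead of writing them by hand.
m = ParsingMachine(["x", "+", "y"])
for instr, arg in [("SHIFT", None), ("SHIFT", None), ("SHIFT", None),
                   ("REDUCE", "Add"), ("FINAL", None)]:
    m.execute(instr, arg)
print(m.stack[-1])  # ('Add', 'x', 'y')
```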
LL Machine
Output prediction → trace prediction: the machine provides constraints on the parsing programs that can be learned.
Differentiable Neural Program
• Given the current state of the machine, predict which instruction to execute next
  • An LSTM predicts the instruction type and its arguments (a sketch follows below)
  • The prediction is based only on the stack top and the next token

Recursion!
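On the differentiable side, a rough sketch of the controller: embed the stack-top symbol and the next token, run an LSTM cell, and output distributions over instruction types and arguments. The layer sizes, vocabularies, and wiring below (in PyTorch) are assumptions for illustration; the network in [2] is more involved.

```python
import torch
import torch.nn as nn

class NeuralParsingProgram(nn.Module):
    """Sketch: predict the next machine instruction from (stack top, next token)."""
    def __init__(self, n_symbols, n_instr_types, n_args, dim=64):
        super().__init__()
        self.embed = nn.Embedding(n_symbols, dim)
        self.cell = nn.LSTMCell(2 * dim, dim)      # input: stack top + next token
        self.instr_head = nn.Linear(dim, n_instr_types)
        self.arg_head = nn.Linear(dim, n_args)

    def forward(self, stack_top_id, next_token_id, state):
        x = torch.cat([self.embed(stack_top_id), self.embed(next_token_id)], dim=-1)
        h, c = self.cell(x, state)
        return self.instr_head(h), self.arg_head(h), (h, c)

# One prediction step (batch of 1); the ids index a small symbol vocabulary.
prog = NeuralParsingProgram(n_symbols=10, n_instr_types=3, n_args=4)
state = (torch.zeros(1, 64), torch.zeros(1, 64))
instr_logits, arg_logits, state = prog(torch.tensor([3]), torch.tensor([7]), state)
```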
Challenge: execution traces are unknown!
• The space of possible execution traces is very large
  • Even for a very simple input (e.g., of length 3), 9 instructions are required to construct the parse tree
• Policy gradient can easily get stuck in a local optimum
Input length    Number of shortest valid execution traces (underestimate)
3               1,572
5               2,771,712
7               7,458,826,752
[Diagram: an example of an alternative trace that leads to the correct output. For illustration, the grammar here includes only addition and multiplication, a small subset of the grammars in our evaluation.]
Reinforcement Learning-based Two-phase Search Algorithm
[Diagram: the two-phase search. Phase I searches over instruction traces (e.g., for "x+y" → Add(Id x, Id y) and "x*y" → Mul(Id x, Id y)) with no supervision on the traces; the valid instruction traces found in Phase I then provide weak supervision for Phase II, which predicts the instruction arguments.]
Reinforcement Learning-based Two-phase Search Algorithm
• Two-phase learning (a sketch of the Phase I update follows below)
  • Phase I, learning from input-output pairs only: use policy gradient to sample instruction traces, and rely on the weakly supervised learner to verify whether a trace is good
  • Phase II, weakly supervised learning: assuming instruction traces are provided, train the argument prediction networks (policy gradient with specially designed reward functions)
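Putting the two sketches above together, Phase I could look roughly like the REINFORCE-style step below: sample an instruction trace from the policy, execute it on the non-differentiable machine, and reward the trace only if its output equals the target parse tree. The 0/1 reward, the toy instruction/argument vocabularies, and the `sym_id` helper are all assumptions; the actual search, reward design, and interaction with the Phase II learner in [2] are considerably more elaborate.

```python
import torch

# Reuses the ParsingMachine and NeuralParsingProgram sketches above.
INSTRS = ["SHIFT", "REDUCE", "FINAL"]
ARGS = ["Add", "Mul", "Id", "<none>"]

def sym_id(sym, vocab):
    """Map a machine symbol (token, subtree root, or None) to a vocabulary id."""
    if sym is None:
        return vocab["<none>"]
    key = sym[0] if isinstance(sym, tuple) else sym   # subtree -> its root label
    return vocab.get(key, vocab["<unk>"])

def phase1_step(prog, optimizer, tokens, target_tree, vocab, max_steps=8, dim=64):
    machine = ParsingMachine(tokens)
    state = (torch.zeros(1, dim), torch.zeros(1, dim))
    log_probs = []
    for _ in range(max_steps):
        top, nxt = machine.observation()
        top_id = torch.tensor([sym_id(top, vocab)])
        nxt_id = torch.tensor([sym_id(nxt, vocab)])
        instr_logits, arg_logits, state = prog(top_id, nxt_id, state)
        instr_dist = torch.distributions.Categorical(logits=instr_logits)
        arg_dist = torch.distributions.Categorical(logits=arg_logits)
        instr, arg = instr_dist.sample(), arg_dist.sample()
        log_probs.append(instr_dist.log_prob(instr) + arg_dist.log_prob(arg))
        machine.execute(INSTRS[instr.item()], ARGS[arg.item()])
        if machine.done:
            break
    # 0/1 reward: did the sampled trace build the expected parse tree?
    reward = 1.0 if machine.stack and machine.stack[-1] == target_tree else 0.0
    loss = -reward * torch.stack(log_probs).sum()     # REINFORCE
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return reward
```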
Evaluation
Takeaways
• Neural programs operating a non-differentiable machine can achieve 100% accuracy on test inputs 500× longer than the training inputs, while an end-to-end neural network's accuracy is 0%.
• The design of the non-differentiable machine is crucial for regularizing the programs that can be synthesized, and reinforcement learning is the key to training a neural network to learn complex programs.
If-Then Program Synthesis
Description: "Post photos in your Dropbox folder to Instagram"
Program: If (Dropbox.New_Photos) Then Instagram.Post_Photo

An if-then recipe (as on IFTTT.com or Zapier.com) pairs a trigger (channel and function) with an action (channel and function); a small illustration of this output structure follows below.
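Concretely, the synthesis target can be seen as four fields to predict from the description; the dataclass below is only a hypothetical illustration of that output structure, filled in with the example above.

```python
from dataclasses import dataclass

@dataclass
class IfThenRecipe:
    trigger_channel: str    # the "If" part: which service fires the event
    trigger_function: str
    action_channel: str     # the "Then" part: which service reacts
    action_function: str

description = "Post photos in your Dropbox folder to Instagram"
target = IfThenRecipe("Dropbox", "New_Photos", "Instagram", "Post_Photo")
```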
Motivation of Latent Attention
[Diagram: for the description "Post photos in your Dropbox folder to Instagram" and the program If (Dropbox.New_Photos) Then Instagram.Post_Photo, standard attention maps computed separately for predicting the If part and the Then part.]
Latent Attention
[Diagram: latent attention for the same example. An additional latent attention distribution over the description guides where the If attention and the Then attention look; a minimal sketch of this two-level attention follows below.]
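A minimal sketch of the two-level idea behind latent attention: a latent distribution over tokens selects which tokens matter for deciding where to attend, and it mixes per-token attention distributions into the final attention used for prediction. The parameterization, shapes, and single output head below are illustrative assumptions and do not reproduce the exact architecture of [1].

```python
import torch
import torch.nn as nn

class LatentAttentionSketch(nn.Module):
    """Two-level attention: latent weights over tokens gate an active attention."""
    def __init__(self, vocab_size, dim, n_labels):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.latent_query = nn.Parameter(torch.randn(dim))   # which tokens matter
        self.active = nn.Linear(dim, dim)                     # per-token attention query
        self.out = nn.Linear(dim, n_labels)                   # e.g. trigger function

    def forward(self, token_ids):                              # token_ids: (seq_len,)
        e = self.embed(token_ids)                              # (seq_len, dim)
        latent = torch.softmax(e @ self.latent_query, dim=0)   # (seq_len,)
        # Each token proposes its own attention distribution over the sequence.
        active = torch.softmax(self.active(e) @ e.t(), dim=-1) # (seq_len, seq_len)
        weights = latent @ active                               # final attention, (seq_len,)
        context = weights @ e                                    # (dim,)
        return self.out(context), weights

model = LatentAttentionSketch(vocab_size=100, dim=32, n_labels=10)
logits, attn = model(torch.tensor([5, 17, 3, 42, 8]))
```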
Results
[Table: accuracy comparison of Latent Attention against prior approaches [8]-[12] on If-Then program synthesis.]

[8] Quirk et al. Language to code: Learning semantic parsers for if-this-then-that recipes. ACL 2015.
[9] Beltagy et al. Improved Semantic Parsers For If-Then Statements. ACL 2016.
[10] Dong et al. Language to logical form with neural attention. ACL 2016.
[11] Alvarez-Melis et al. Tree-structured decoding with doubly-recurrent neural networks. ICLR 2017.
[12] Yin et al. A Syntactic Neural Model for General-Purpose Code Generation. ACL 2017.
Examples of the attention weights
Takeaways
• Natural language is a convenient way for human beings to describe the functionality of the desired programs.
• Deep neural networks are promising for handling noisy inputs (e.g., descriptions with typos).
Program Translation
• Programming languages have rigorous grammars and are not tolerant to typos and grammatical mistakes.
• Each program unambiguously corresponds to a unique parse tree (a tiny illustration follows after this list).
• A cross-compiler typically follows a modular procedure to translate the parse trees.
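As a quick illustration of this parse-tree view (using Python's built-in ast module rather than the CoffeeScript, Java, or C# used in the actual evaluation), the same small program always maps to the same tree, which is the representation a tree-to-tree model consumes and produces:

```python
import ast

# A syntactically valid program corresponds to a unique abstract syntax tree;
# a tree-to-tree translator encodes the source tree and decodes the target tree.
tree = ast.parse("d = a + b * c")
print(ast.dump(tree))
# Module(body=[Assign(targets=[Name(id='d', ...)],
#               value=BinOp(left=Name(id='a', ...), op=Add(),
#                           right=BinOp(left=Name(id='b', ...), op=Mult(),
#                                       right=Name(id='c', ...))))], ...)
```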
Tree-to-tree Neural Network
Results

Program accuracy of translation from CoffeeScript to JavaScript, where the CoffeeScript programs are generated by a pCFG-based program generator and the JavaScript programs are the corresponding translations produced by the CoffeeScript compiler.
Program accuracy of translation from Java to C# on real-world projects. This benchmark has also been evaluated in previous work on code migration [13].
[13] Nguyen et al. Divide-and-conquer Approach for Multi-phase Statistical Migration for Source Code (T), ASE 2015.
Summary
Specifications considered: natural language descriptions, input-output examples, reference programs
• Hybrid approaches of combining deep neural networks and symbolic program synthesis techniques are promising directions for exploration.
• Given incomplete or noisy specifications in real-world scenarios, deep learning techniques have the potential to perform program synthesis tasks better than hand-engineered rule-based systems.
• Deep neural networks can be leveraged to facilitate other applications of software engineering, such as code migration, program repair, fuzzing, code optimization, etc.
A good step towards AGI
Thanks!
Richard Shin, Dawn Song, Chang Liu