Learning Programs:
A Hierarchical Bayesian Approach
ICML - Haifa, Israel June 24, 2010
Percy Liang Michael I. Jordan Dan Klein
Motivating Application: Repetitive Text Editing
I like programs, but I wish programs would just program themselves since I don’t like programming. ⇒
I like <i>programs</i>, but I wish <i>programs</i> would just <i>program</i> themselves since I don’t like <i>programming</i>.
Goal: Programming by Demonstration
If the user demonstrates italicizing the first occurrence, can we generalize to the remaining?
Solution: represent task by a program to be learned
1. Move to next occurrence of word with prefix program
2. Insert <i>
3. Move to end of word
4. Insert </i>
Challenge: learn from very few examples
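The four-step editing program on this slide can be sketched in Python. This is only an illustration: the function name `italicize_words_with_prefix` and the decision to treat a word as a maximal run of alphanumeric characters are assumptions, not details given in the talk.

```python
def italicize_words_with_prefix(text, prefix):
    """Wrap every word that starts with `prefix` in <i>...</i> tags."""
    out = []
    i, n = 0, len(text)
    while i < n:
        # Step 1: are we at the start of a word that has the prefix?
        at_word_start = i == 0 or not text[i - 1].isalnum()
        if at_word_start and text.startswith(prefix, i):
            # Step 3: move to the end of the word.
            j = i
            while j < n and text[j].isalnum():
                j += 1
            # Steps 2 and 4: insert the opening and closing tags.
            out.append("<i>" + text[i:j] + "</i>")
            i = j
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


sentence = ("I like programs, but I wish programs would just program "
            "themselves since I don't like programming.")
print(italicize_words_with_prefix(sentence, "program"))
```

The learning problem in the talk is the inverse of this: recover such a program from a single demonstrated edit rather than write it by hand.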
General Setup
Goal:
Training data: (X_1, Y_1), ..., (X_n, Y_n) ⇒ Consistent program: Z such that (Z X_j) = Y_j
Challenge:
When n small, many programs consistent with training data.
I like <i>programs</i>, but I wish programs would just program themselves since I don’t like programming.
• Move to beginning of third word, ...
• Move to beginning of word after like, ...
• Move 7 spaces to the right, ...
• Move to word with prefix program, ...
• · · ·
Which program to choose?
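The ambiguity described on this slide can be made concrete with a small sketch. The four cursor-movement functions below are hypothetical stand-ins for the four candidate programs listed above; their names and the shortened example sentence are assumptions made for illustration.

```python
import re

TEXT = "I like programs, but I wish programs would just program themselves."


def word_starts(text):
    """Start index of every word (maximal run of word characters)."""
    return [m.start() for m in re.finditer(r"\w+", text)]


def third_word(text, cursor):
    return word_starts(text)[2]


def word_after_like(text, cursor):
    words = re.findall(r"\w+", text)
    return word_starts(text)[words.index("like") + 1]


def seven_right(text, cursor):
    return cursor + 7


def next_word_with_prefix_program(text, cursor):
    return next(s for s in word_starts(text)
                if s >= cursor and text.startswith("program", s))


CANDIDATES = [third_word, word_after_like, seven_right,
              next_word_with_prefix_program]


def positions(cursor):
    """Where each candidate program moves the cursor from `cursor`."""
    return [move(TEXT, cursor) for move in CANDIDATES]


# From the start of the text, all four programs agree with the demonstration.
print(positions(0))
# After the first edit (cursor at the comma), they disagree.
print(positions(TEXT.index(",")))
```

All candidates are consistent with the single training example, yet only the prefix-based one generalizes to the remaining occurrences, which is why a prior over programs is needed.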
Key Intuition
One task:
Want to choose a program which is simple (Occam’s razor).
Examples ⇒ Z
What’s the right complexity metric (prior)? No general answer.
Multiple tasks:
Task 1 examples ⇒ Z_1
· · ·
Task K examples ⇒ Z_K
Find programs that share common subprograms.
• Programs do tend to share common components.
• Penalize joint complexity of all K programs.
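Outline of Proposed Solution
Program representation: What are subprograms?
Combinatory logic
[Figure: combinator tree with leaves +, ∗, I, I, 1 and routers B, S, B, C]
Probabilistic model: Which programs are favorable?
Nonparametric Bayes
[Figure: graphical model over α0, p0, {Gt}, Zi, Yij, Xij with plates n and K]
Statistical inference: How do we search for good programs?
MCMC
[Figure: tree move over subtrees x, y, z, changing routers r0, r1 to r′0, r′1]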
Representation: What Language?
Goal: allow sharing of subprograms
Our language:
Combinatory logic [Schönfinkel, 1924]
+ higher-order combinators (new)
+ routing intuition, visual representation (new)
Properties: no mutation, no variables ⇒ simple semantics
Result:
• Programs are trees
• Subprograms are subtrees
Programs with No Arguments
Example: compute min(3, 4)
(if (< 3 4) 3 4) ⇒ (if true 3 4) ⇒ 3
[Figure: the same reduction drawn as application trees]
General:
A node with children x and y ⇒ result of applying function x to argument y
Arguments are curried
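A minimal sketch of this evaluation rule, assuming programs are stored as nested pairs `(function, argument)` and primitives are curried Python functions. The names `PRIMITIVES`, `app`, and `evaluate` are illustrative and not taken from the talk.

```python
# Curried primitives: each takes one argument at a time.
PRIMITIVES = {
    "if": lambda cond: lambda then: lambda other: then if cond else other,
    "<": lambda a: lambda b: a < b,
}


def app(function, *arguments):
    """Build a left-nested application tree: app(f, a, b) is ((f a) b)."""
    tree = function
    for argument in arguments:
        tree = (tree, argument)
    return tree


def evaluate(tree):
    """A pair (x, y) evaluates to the result of applying x to y."""
    if isinstance(tree, tuple):
        function, argument = tree
        return evaluate(function)(evaluate(argument))
    if isinstance(tree, str):
        return PRIMITIVES[tree]
    return tree  # numeric constant


min_3_4 = app("if", app("<", 3, 4), 3, 4)
print(evaluate(min_3_4))
```

Because every node is a binary application, every subtree is itself a valid program, which is the property the talk relies on for sharing subprograms. Note that this sketch evaluates both branches of `if` eagerly, which is harmless for constants.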
Programs with One Argument
Example: x ↦ x² + 1
Lambda calculus: λx. (+ (∗ x x) 1)
Combinatory logic: (C (B + (S (B ∗ I) I)) 1)
[Both are drawn as trees on the slide]
Intuition:
Combinators {B, C, S, I} encode placement of arguments
Semantics:
A node with router r and children x, y ⇔ (r x y), where r ∈ {B, C, S, I}
Rules: (B f g x) = (f (g x)), ...
Programs with One Argument
Example: Apply x ↦ x² + 1 to 5
Routing rules (a node with router r and children x, y, applied to z):
• (C x y z) ⇔ ((x z) y): route left
• (B x y z) ⇔ (x (y z)): route right
• (S x y z) ⇔ ((x z) (y z)): route left and right
• (I x) ⇔ x: stop
Reduction:
((C (B + (S (B ∗ I) I)) 1) 5)
⇒ ((B + (S (B ∗ I) I) 5) 1)   [C routes 5 left]
⇒ (+ (S (B ∗ I) I 5) 1)       [B routes 5 right]
⇒ (+ ((B ∗ I 5) (I 5)) 1)     [S routes 5 left and right]
⇒ (+ ((B ∗ I 5) 5) 1)         [I stops]
⇒ (+ (∗ 5 5) 1)               [B routes 5 right, I stops]
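The four routing rules on this slide can be checked with a short sketch. The combinators are written as curried Python functions, and the program is my reading of the slide's tree as the nested term (C (B + (S (B ∗ I) I)) 1). The helper names `add` and `mul` are assumptions.

```python
def B(f):
    """Route right: (B f g x) = (f (g x))."""
    return lambda g: lambda x: f(g(x))


def C(f):
    """Route left: (C f g x) = ((f x) g)."""
    return lambda g: lambda x: f(x)(g)


def S(f):
    """Route left and right: (S f g x) = ((f x) (g x))."""
    return lambda g: lambda x: f(x)(g(x))


def I(x):
    """Stop: (I x) = x."""
    return x


def add(a):
    return lambda b: a + b


def mul(a):
    return lambda b: a * b


# x -> x^2 + 1 with no variables, only routing combinators.
square_plus_one = C(B(add)(S(B(mul)(I))(I)))(1)

print(square_plus_one(5))
```

The program contains no variable named x; the argument 5 reaches the two leaves marked I purely through the routing decisions made at each node.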
Programs with Multiple Arguments
Example: (x, y) ↦ min(x, y)
Classical: first-order combinators {B, C, S, I}
Complete basis, so can implement min, but cumbersome
New: higher-order combinators {B, C, S, I}∗
Infinite basis, but resulting programs are more intuitive
e.g., CS routes 1st arg. left, 2nd arg. left and right
min as a tree of (r x y) nodes: (CS (SC (BB if <) I) I)
Applied to 3 and 4:
((CS (SC (BB if <) I) I) 3 4)
⇒ ((S (C (B if (< 3)) 3) I) 4)   [route 1st arg. 3]
⇒ (if (< 3 4) 3 4)               [route 2nd arg. 4]
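One way to read the higher-order routers is as a string with one routing letter per argument. The sketch below follows that reading. The function `node` and its string encoding of routers are assumptions made for illustration, not the authors' implementation.

```python
def node(router, left, right):
    """A tree node with a router string and two children.

    Each letter routes one incoming argument: C sends it to the left
    child, B to the right child, S to both. When the router is
    exhausted, the left result is applied to the right result.
    """
    if router == "":
        return left(right)
    head, rest = router[0], router[1:]

    def take(argument):
        new_left = left(argument) if head in ("C", "S") else left
        new_right = right(argument) if head in ("B", "S") else right
        return node(rest, new_left, new_right)

    return take


def I(x):
    return x


def IF(cond):
    return lambda then: lambda other: then if cond else other


def LT(a):
    return lambda b: a < b


# min as the tree (CS (SC (BB if <) I) I)
MIN = node("CS", node("SC", node("BB", IF, LT), I), I)

print(MIN(3)(4))
```

With single-letter routers this reduces to the first-order rules, for example `node("B", f, g)(x)` computes `f(g(x))`.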
Using Combinators for Refactoring
min: (CS (SC (BB if <) I) I)
max: (CS (SC (BB if >) I) I)
No significant sharing of subtrees (subprograms)
Refactored:
min: ((CCS (CSC (BBB if I) I) I) <)
max: ((CCS (CSC (BBB if I) I) I) >)
Fruitful sharing of subtrees (subprograms)
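The effect of refactoring on sharing can be counted directly. In this sketch a program tree is a tuple `(router, left, right)` and leaves are strings. The encoding, the empty router for a plain application, and the function names are assumptions for illustration.

```python
def internal_subtrees(tree):
    """Yield every internal node (router, left, right) of a program tree."""
    if isinstance(tree, tuple):
        yield tree
        _, left, right = tree
        yield from internal_subtrees(left)
        yield from internal_subtrees(right)


def shared(a, b):
    """Internal subtrees that occur in both programs."""
    return set(internal_subtrees(a)) & set(internal_subtrees(b))


def original_tree(comparison):
    # The comparison sits at the bottom, so every enclosing subtree differs.
    return ("CS", ("SC", ("BB", "if", comparison), "I"), "I")


def refactored_tree(comparison):
    # The comparison is pulled out as an extra argument at the top.
    core = ("CCS", ("CSC", ("BBB", "if", "I"), "I"), "I")
    return ("", core, comparison)


print(len(shared(original_tree("<"), original_tree(">"))))
print(len(shared(refactored_tree("<"), refactored_tree(">"))))
```

Before refactoring, min and max share no internal subtree. After it, the whole three-node core is common to both, which is the kind of reuse a joint prior over programs can reward.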
Summary
Introduced new combinatory logic basis (intuition: routing)
Purpose of these combinators:
• Represent multi-argument functions
• Allow refactoring to expose common substructures
Achieved uniformity: Every subtree is a subprogram
13
![Page 66: Learning Programs: A Hierarchical Bayesian Approachpliang/papers/programs-icml2010-talk.pdf · would just program themselves since I don't like programming](https://reader033.vdocument.in/reader033/viewer/2022042420/5f37024eebcb8963f605579c/html5/thumbnails/66.jpg)
Outline of Proposed Solution
Program representation: What are subprograms?
Combinatory logic
Probabilistic model: Which programs are favorable?
Nonparametric Bayes
Statistical inference: How do we search for good programs?
MCMC
Probabilistic Context-Free Grammars
GenIndep(t): [returns a combinator of type t]
  With probability λ0:
    Return a random primitive combinator (e.g., +, 3, I)
  Else:
    Choose a type s
    x ← GenIndep(s → t)
    y ← GenIndep(s)
    return (x, y)
Examples:
GenIndep(int → int) ⇒ + 1
GenIndep(int → int) ⇒ ∗ (− 3 1)
Problem: No encouragement to share subprograms
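The GenIndep procedure can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the type encoding (`INT`, `fn`), the primitive table `PRIMITIVES`, and the checker `type_of` are hypothetical, and the choice of argument type s is restricted to types for which a primitive of type s → t exists so that the recursion terminates.

```python
import random

INT = "int"

def fn(s, t):
    """Function type s -> t, encoded as a tuple."""
    return ("->", s, t)

# Hypothetical primitive combinators, grouped by type.
PRIMITIVES = {
    INT: ["1", "2", "3"],
    fn(INT, INT): ["I"],
    fn(INT, fn(INT, INT)): ["+", "-", "*"],
}
PRIM_TYPE = {p: t for t, ps in PRIMITIVES.items() for p in ps}

def gen_indep(t, rng, lam0=0.6):
    """Sample a combinator tree of type t; subtrees are drawn independently."""
    # Assumption: only argument types s for which s -> t has a primitive
    # are considered, which keeps the recursion finite.
    arg_types = [s for s in (INT,) if fn(s, t) in PRIMITIVES]
    if not arg_types or rng.random() < lam0:
        return rng.choice(PRIMITIVES[t])
    s = rng.choice(arg_types)
    x = gen_indep(fn(s, t), rng, lam0)
    y = gen_indep(s, rng, lam0)
    return (x, y)                      # application of x to y

def type_of(tree):
    """Return the type of a tree; fails if an application is ill-typed."""
    if isinstance(tree, str):
        return PRIM_TYPE[tree]
    x, y = tree
    arrow, s, t = type_of(x)
    assert arrow == "->" and s == type_of(y)
    return t

tree = gen_indep(fn(INT, INT), random.Random(0))
```

Each call draws its subtrees afresh, so two occurrences of the same subprogram are no more likely than any other pair of subtrees, which is the problem the slide points out.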
Adaptor Grammars [Johnson, 2007]
Ct ← [ ] for each type t [cached list of combinators]
(notation: return∗ c adds c to Ct and returns c)
GenCache(t): [returns a combinator of type t]
  With probability (α0 + Nt d) / (α0 + |Ct|):
    With probability λ0:
      Return∗ a random primitive combinator (e.g., +, 3, I)
    Else:
      Choose a type s
      x ← GenCache(s → t)
      y ← GenCache(s)
      Return∗ (x, y)
  Else:
    Return∗ z ∈ Ct with probability (Mz − d) / (|Ct| − Nt d)
Interpretation of cache Ct: library of generally useful (unnamed) subroutines which are reused.
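A minimal Python sketch of GenCache follows. It assumes the usual Pitman-Yor reading of the symbols, which the slide does not spell out: Nt is the number of distinct combinators in Ct and Mz is the number of times z occurs in Ct. The type encoding, the primitive table, and the default values of `alpha0`, `d`, and `lam0` are hypothetical, and the choice of s is restricted as a simplification so that the recursion terminates.

```python
import random
from collections import Counter

INT = "int"

def fn(s, t):
    """Function type s -> t, encoded as a tuple."""
    return ("->", s, t)

# Hypothetical primitive combinators, grouped by type.
PRIMITIVES = {
    INT: ["1", "2", "3"],
    fn(INT, INT): ["I"],
    fn(INT, fn(INT, INT)): ["+", "-", "*"],
}

def gen_cache(t, cache, rng, alpha0=1.0, d=0.5, lam0=0.6):
    """Sample a combinator of type t, reusing cached combinators when possible."""
    c_t = cache.setdefault(t, [])
    counts = Counter(c_t)                  # M_z for each distinct z (assumed meaning)
    n_t = len(counts)                      # N_t: distinct entries (assumed meaning)
    p_new = (alpha0 + n_t * d) / (alpha0 + len(c_t))
    if rng.random() < p_new:
        # Generate a fresh combinator, as in GenIndep but with cached recursion.
        arg_types = [s for s in (INT,) if fn(s, t) in PRIMITIVES]
        if not arg_types or rng.random() < lam0:
            z = rng.choice(PRIMITIVES[t])
        else:
            s = rng.choice(arg_types)
            x = gen_cache(fn(s, t), cache, rng, alpha0, d, lam0)
            y = gen_cache(s, cache, rng, alpha0, d, lam0)
            z = (x, y)
    else:
        # Reuse a cached combinator z with weight proportional to M_z - d.
        items = list(counts)
        weights = [counts[item] - d for item in items]
        z = rng.choices(items, weights=weights)[0]
    c_t.append(z)                          # "return*": add to the cache, then return
    return z

cache = {}
rng = random.Random(0)
draws = [gen_cache(fn(INT, INT), cache, rng) for _ in range(300)]
```

The weights Mz − d sum to |Ct| − Nt d over the distinct entries, so the reuse probabilities on the slide are normalized. Combinators that have been drawn often become more likely to be drawn again, which is what encourages sharing.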
Inference via MCMC
User provides tree structure that encodes set of programs U
Objective: sample from posterior given program in U
Use Metropolis-Hastings
Proposal: sample a random program transformation
Program transformations maintain invariant that
program is correct (likelihood is 1)
Two types of transformations:
1. Switching
2. Refactoring
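Since every transformation keeps the program correct, the likelihood equals 1 before and after a move, and the Metropolis-Hastings acceptance ratio reduces to a prior ratio times a proposal ratio. The sketch below shows one such step on a toy two-program state space. The function names, the toy prior, and the flip proposal are hypothetical and only illustrate the mechanics.

```python
import math
import random

def mh_step(program, log_prior, propose, rng):
    """One Metropolis-Hastings step over programs consistent with the data.

    Every proposed program is assumed to reproduce the training examples
    (likelihood 1), so only the prior and the proposal probabilities enter
    the acceptance ratio.
    """
    candidate, log_q_forward, log_q_backward = propose(program, rng)
    log_ratio = (log_prior(candidate) - log_prior(program)
                 + log_q_backward - log_q_forward)
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return candidate
    return program

# Toy usage: two hypothetical programs with prior mass 0.8 and 0.2.
PRIOR = {"p_short": 0.8, "p_long": 0.2}

def toy_log_prior(p):
    return math.log(PRIOR[p])

def toy_propose(p, rng):
    other = "p_long" if p == "p_short" else "p_short"
    return other, 0.0, 0.0             # deterministic flip: symmetric proposal

def run_chain(steps, seed=0):
    """Fraction of steps the chain spends in "p_short"."""
    rng = random.Random(seed)
    state, visits = "p_short", 0
    for _ in range(steps):
        state = mh_step(state, toy_log_prior, toy_propose, rng)
        visits += state == "p_short"
    return visits / steps
```

Over a long run the chain should spend roughly 80% of its steps in `p_short`, matching the toy prior; in the setting of the talk the states would be combinator trees and the proposals would be switching and refactoring moves.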
Program transformations (MCMC moves)
Switching: Change content, preserve empirical semantics
Data: {(2, 8)}
S ∗ (+ 2) [x ↦ x(x + 2)] ⇔ S ∗ (S ∗ I) [x ↦ x³]
Purpose: change generalization
Refactoring: Change form, preserve total semantics
S ∗ (+ 2) [x ↦ x(x + 2)] ⇔ (BS ∗ +) 2 [x ↦ x(x + 2)]
Purpose: expose different subprograms for sharing
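The difference between the two moves can be checked numerically. The sketch below is one reading of the slide's trees, which are flattened in this transcript, so the exact tree shapes are an assumption: `S` is the classical combinator, and `BS` is a hypothetical two-argument router that sends its first argument to the right child only and its second argument to both children. The helpers `add`, `mul`, and `consistent` are likewise illustrative.

```python
def I(x):
    return x

def S(f):
    # Route the argument x to both children.
    return lambda g: lambda x: f(x)(g(x))

def BS(f):
    # Hypothetical two-argument router: the first argument a goes to the
    # right child only (B), the second argument x goes to both children (S).
    return lambda g: lambda a: lambda x: f(x)(g(a)(x))

def add(a):
    return lambda b: a + b

def mul(a):
    return lambda b: a * b

DATA = [(2, 8)]

def consistent(program, data):
    """True if the program reproduces every (input, output) example."""
    return all(program(x) == y for x, y in data)

p_original = S(mul)(add(2))        # x -> x * (x + 2)
p_switched = S(mul)(S(mul)(I))     # x -> x ** 3
p_refactored = BS(mul)(add)(2)     # x -> x * (x + 2), different tree

# Switching: both programs fit the data but generalize differently.
switch_ok = consistent(p_original, DATA) and consistent(p_switched, DATA)
# Refactoring: the function is unchanged on every input tested.
refactor_ok = all(p_refactored(x) == p_original(x) for x in range(-5, 6))
```

Under this reading the refactored tree contains the subprogram `BS(mul)(add)`, a function of a constant a and an input x, which could be shared with another task that uses a different constant.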
Text Editing Experiments
Setup:
Dataset of [Lau et al., 2003]
K = 24 tasks
Each task: train on 2–5 examples, test on ≈ 13 examples
10 random trials
Example task:
Cardinals 5, Pirates 2. ⇒ GameScore[ winner ’Cardinals’; loser ’Pirates’; scores [5, 2]].
Experimental Results
[Figure: error (y-axis, 5 to 25) plotted over x-axis values 2 to 5 for the uniform prior, independent prior, and joint prior]
Observations:
• Independent prior is even worse than uniform prior
• Joint prior (multi-task learning) is effective
Summary
X ⇒ program ⇒ Y
Key challenge: learn programs from few examples
Main idea: share subprograms across multiple tasks
Tools:
• Combinatory logic: expose subprograms to be shared
• Adaptor grammars: encourage sharing of subprograms
• Metropolis-Hastings: proposals are program transformations