in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... ·...
TRANSCRIPT
![Page 1: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/1.jpg)
Compositional generalizationin artificial neural networks
and humans
Marco Baroni
Facebook AI Research
![Page 2: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/2.jpg)
Outline
• Recurrent neural networks
• A compositional challenge for recurrent neural networks
• How do humans dax twice?
1
![Page 3: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/3.jpg)
2
Brenden LakeFAIRNYU
João LoulaFAIRMIT
Tal LinzenJHU
![Page 4: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/4.jpg)
Artificial neural networks
input output
3
![Page 5: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/5.jpg)
Artificial neural networks
yorkshireterrier
4
![Page 6: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/6.jpg)
Artificial neural networks
yorkshireterrier
“training” consists in optimally setting
network weights to produce right output for
each example input
?
?
??
??
5
![Page 7: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/7.jpg)
The generality of neural networks
6
input output
I: images, O: object labelsI: documents, O: topics
I: pictures of cars, O: voting preferences …
training agnostic to nature of input and output
![Page 8: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/8.jpg)
Taking time into account with recurrent connections
7
externalinput
output
state of the network atthe previous time step
![Page 9: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/9.jpg)
Recurrent neural networksThe "unfolded" view
8
i4 i5 i6
o2
…
i1 i2 i3
o3o1 o4 o5 o6
time
recurrentconnections
Modern RNNs (e.g., LSTMs) possess gating mechanism that improve temporal information flow
![Page 10: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/10.jpg)
The generality of recurrent neural networks
9
i4 i5 i6
o2
…
i1 i2 i3
o3o1 o4 o5 o6
I: English sentences, O: French sentencesI: linguistic instructions, O: action sequences
I: video game states, O: next actions …
![Page 11: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/11.jpg)
Are we on the verge of general machine intelligence?
10
Lake et al. 2018
![Page 12: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/12.jpg)
When are humans fast at learning?
•When evolution has done the slow learning work for us• Perception and categorization, naïve physics and
psychology, motor skills, core language faculties, reasoning...
•When new problems can be solved by combining old tricks (compositionality)
11
![Page 13: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/13.jpg)
Compositional reasoning in 4-year olds
12Piantadosi and Aslin 2016
Screen combination seen in test phase only
![Page 14: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/14.jpg)
Outline
• Recurrent neural networks
• A compositional challenge for recurrent neural networks
• How do humans dax twice?
13
![Page 15: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/15.jpg)
14
• Brenden Lake and Marco Baroni. Generalization without systematicity: On the compositional skills of sequence-to-sequence recurrent networks. ICML 2018
• The SCAN challenge: https://github.com/brendenlake/SCAN/
Lots of earlier work on neural networks and compositionality, main novelty here is that we test latest-generation, state-of-the-art architectures!
![Page 16: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/16.jpg)
Systematic compositionalityFodor and Pylyshyn 1988, Marcus 2003, 2018...
• Walk• Walk twice• Run• Run twice
15
![Page 17: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/17.jpg)
Systematic compositionalityFodor and Pylyshyn 1988, Marcus 2003, 2018...
• Walk• Walk twice• Run• Run twice• Dax
16
![Page 18: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/18.jpg)
Systematic compositionalityFodor and Pylyshyn 1988, Marcus 2003, 2018...
• Walk• Walk twice• Run• Run twice• Dax• Dax twice
17
![Page 19: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/19.jpg)
Systematic compositionalityFodor and Pylyshyn 1988, Marcus 2003, 2018...
• Walk• Walk twice• Run• Run twice• Dax• Dax twice
18
[[X twice]] = [[X]][[X]][[dax]] = perform daxing action
... or perhaps meanings include algorithmic components such as: for (c=0,c<3,c++) {perform X}
![Page 20: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/20.jpg)
Systematic compositionalityin a simple grounded environment
19
walk and turn left!WALK
LTURN
![Page 21: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/21.jpg)
Testing generalization
20
walk and jump leftWALK LTURN JUMP
look right andwalk left
RTURN LOOKLTURN WALK
run around rightRTURN RUN RTURN RUN RTURN RUN RTURN RUN
walkWALK
jump after walkWALK JUMP
run thriceRUN RUN RUN
walk and runRUN WALK
jump aroundand run
TRAINING PHASE TEST TIME
![Page 22: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/22.jpg)
Sequence-to-sequence RNNsfor SCAN
jump
WALK
twice and walk <EOS>
JUMP JUMP
<SOS> JUMP JUMP WALK
<EOS>
21
![Page 23: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/23.jpg)
General methodology
• Train sequence-to-sequence RNN on 100k commands and corresponding action sequences
• At test time, only new composed commands presented
• Each test command presented once
• RNN must generate right action sequence at first try
• Training details: ADAM optimization with 0.001 learning rate and 50% teacher forcing
• Best model overall:• 2-layer LSTM with 200 hidden units per layer, no attention, 0.5 dropout 22
![Page 24: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/24.jpg)
Experiment 1: random train/test split
• Included in training tasks:• look around left twice• look around left twice and turn left• jump right twice• run twice and jump right twice
• Presented during testing:• look around left twice and jump right twice
23
![Page 25: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/25.jpg)
Random train/test split results
24
![Page 26: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/26.jpg)
Experiment 2: split by action length
• Train on commands requiring shorter action sequences (up to 22 actions)• jump around left twice (16 actions)• walk opposite right thrice (9 actions)• jump around left twice and walk opposite right twice (22 actions)
• Test on commands requiring longer actions sequences (from 24 to 48 actions)• jump around left twice and walk opposite right thrice (25 actions)
25
A grammar must reflect and explain the ability of a speaker to produce and understand new sentences which may be longer than any he has previously heard (Chomsky 1956)
![Page 27: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/27.jpg)
Length split results
26
![Page 28: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/28.jpg)
Experiment 3: generalizing composition of a primitive command (the "dax" experiment)
• Training set contains all possible commands with "run", "walk", look", "turn left", "turn right":• "run", "run twice", "turn left and run opposite thrice", "walk after run", ...
• but only a small set of composed "jump" commands:• "jump", "jump left", "run and jump", "jump around twice"
• System tested on all remaining "jump" commands:• jump twice• jump left and run opposite thrice• walk after jump• ...
27
![Page 29: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/29.jpg)
Composed-"jump" split results
28
RNN correctly executes "jump and run opposite right" but not, e.g., "jump and run"
![Page 30: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/30.jpg)
Experiment 4: generalizing the composition of familiar modifiers
• Training set includes all commands except those containing the around right combination:• "run", "run around left", "jump right and run around left thrice", "walk right
after jump left", ...• System tested on around right commands:• run around right• jump left and walk around right• ...
• Also less challenging splits in which all X around right commands are added to training set for 1, 2, 3 distinct fillers (verbs)
29João Loula, Marco Baroni and Brenden Lake. Rearranging the familiar: Testing compositional generalization in recurrent networks. Blackbox WS at EMNLP 2018.
![Page 31: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/31.jpg)
"Around right"-split results
30
![Page 32: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/32.jpg)
Seq2seq models: conclusions
• State-of-the-art "Seq2Seq" Recurrent Neural Networks achieve considerable degree of generalization (Exp 1)...
• ... but this generalization does not appear to be "systematically compositional" in the Fodorian sense (Exps 2-4)
31
![Page 33: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/33.jpg)
Outline
• Recurrent neural networks
• A compositional challenge for recurrent neural networks
• How do humans dax twice?
32
![Page 34: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/34.jpg)
33
zup wif blicket
dax
zup
blicket
tufa
blicket wif dax
dax wif tufa
TRAINING
TEST
![Page 35: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/35.jpg)
34
zup wif blicket
dax
zup
blicket
tufa
blicket wif dax
dax wif tufa
TRAINING
TEST
![Page 36: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/36.jpg)
Lessons learned
•Confirmed that humans are fast learners, up to the ability to combine two functional elements zero-shot
•However, they need full access to training set while solving the task, incremental curriculum and performance is not at 100%
• Systematic biases emerge in error patterns35
![Page 37: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/37.jpg)
36
![Page 38: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/38.jpg)
37
85% accuracy (N=25)
76% accuracy (N=20)
![Page 39: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/39.jpg)
Systematic biases in errors
38
zup wif blicket
dax
zup
blicket
tufa
blicket wif dax
TRAININGdax wif tufa
EXPECTED
dax wif tufa
ICONIC CONCATENATION
dax wif tufaONE-TO-ONE
![Page 40: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/40.jpg)
"Blank state" experiments (subjects N = 29)
39
POOL:
STIMULI: fep fep fepzup fep fep wif
fep dax fap kiki dax fepfep dax kiki
![Page 41: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/41.jpg)
(Consistent) iconic concatenation (79.3% of participants)
40
![Page 42: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/42.jpg)
One-to-one mapping (62.1% of participants)
41
![Page 43: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/43.jpg)
Mutual exclusivity(95.7% of consistent participants)
42
![Page 44: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/44.jpg)
58.6% of participants used words consistently and respected all biases
43
![Page 45: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/45.jpg)
Human compositional skills: conclusions
• Humans are much faster at generalize (although they are not perfect composers either)• They display some consistent biases in generalization• Are human biases useful for fast learning?• Can we get neural networks to display the same biases?
44
![Page 46: in artificial neural networks and humansmarcobaroni.org/publications/lectures/marco-horizon... · 2019. 4. 7. · yorkshire terrier 4. Artificial neural networks yorkshire terrier](https://reader034.vdocument.in/reader034/viewer/2022051822/5fecabfb86bf0e08f5057bab/html5/thumbnails/46.jpg)
45
thank yougrazie mille !
thank you !