neural architectures with memory - university of illinoisslazebni.cs.illinois.edu › spring17 ›...

89
Neural Architectures with Memory Nitish Gupta, Shreya Rajpal 25 th April, 2017 1

Upload: others

Post on 09-Jun-2020

1 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralArchitectureswithMemory

NitishGupta,ShreyaRajpal25th April,2017

1

Page 2: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

StoryComprehension

2

Joe went to the kitchen. Fred went to the kitchen. Joe picked up the milk. Joe travelled to his office. Joe left the milk. Joe went to the bathroom.

Q1 : Where is Joe?

Q2 : Where is the milk now?

Q3 : Where was Joe before the office?

Questions from Joe’s angry mother:

Page 3: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DialogueSystem

3

Hello! What can I do for you today?

I’d like to reserve a table for 6.

Sure! When would you like that reservation?

At 7 PM, please.

Okay. What cuisine would you like?

Actually make that 7:30 PM

Updated! What cuisine?

Is there anything better than a medium rare steak?

Nothing at all! Blackdog has a 4.7 on Yelp.

Sounds perfect! Also, add one more person.

Reservation done for 7, 7:30pm at Blackdog. Enjoy!

Mac

hine

Human

Page 4: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

MLmodelsneedmemory!

DeeperAItasksrequireexplicitmemoryandmulti-hopreasoningoverit

•RNNshaveshortmemory•Cannotincreasememorywithoutincreasingnumberofparameters•Needforcompartmentalizedmemory•Read/Writeshouldbeasynchronous

4

Page 5: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

MemoryNetworks(MemNN)

5

• ClassofModelswithmemory𝑚 - Arrayofobjects𝑚"

FourComponents:I- InputFeatureMap:InputmanipulationG- Generalization:MemoryManipulationO- OutputFeatureMap:OutputrepresentationgeneratorR- Response:ResponseGenerator

MemoryNetworks,Westonet.al.,ICLR2015

mi

Eachmemoryhereisadensevector

Page 6: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

MemNN

6

1. InputFeatureMap• Imagineinputasasequenceofsentences𝑥"

2. UpdateMemories

MemoryNetworks,Westonet.al.,ICLR2015

xi Input Feature Map I(xi)

I(xi)

Generalization

Page 7: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

MemNN

7

3. OutputRepresentation• Sayif𝑞 isaquestion,computeoutputrepresentation

4. GenerateAnswerResponse

MemoryNetworks,Westonet.al.,ICLR2015

Output Feature Map o

Response

o r

I(q)

Page 8: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

SimpleMemNNforText

8

1. InputFeatureMap - Bag-of-Words representation

MemoryNetworks,Westonet.al.,ICLR2015

xi

Sentence

Bag-of-Words

𝐼(𝑥")

Page 9: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

SimpleMemNNforText

9

2. Generalization:Storeinputinnewmemory

MemoryNetworks,Westonet.al.,ICLR2015

mi = I(xi)

Memoriestillnow(i=4) 𝐼(𝑥()

Memoriesafter5inputs

m m

Page 10: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

SimpleMemNNforText

10

3. Output: Using𝑘 = 2memoryhopswithquery𝑥

4. Response - SingleWordAnswer

MemoryNetworks,Westonet.al.,ICLR2015

Scoreallmemoriesagainstinput1st Maxscoring

memoryindex

Scoreallwordsagainstqueryand2supportingmemories

Maxscoringword

i = 1, . . . , N

o1 = O1(x,m) = argmax sO(I(x),mi)

i = 1, . . . , N

o2 = O2(x,m) = argmax sO([I(x),mo1 ],mi)

r = argmax

w2W

s

R

([I(x),mo1 ,mo2 ], w)

Scoreallmemoriesagainstinput&𝑜-

2nd Maxscoringmemoryindex

Page 11: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

ScoringFunction

• ScoringFunctionisanembeddingmodel

11

s(x, y) = x

TU

TUy

• Whatis𝑈𝑦 ?

Word Embedding Matrix

Sum of Word Embeddings

=

Uy = Uy

ScoringFunctionisjustdot-productbetweensumofwordembeddings!!!

MemoryNetworks,Westonet.al.,ICLR2015

Page 12: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

12

MemoryNetworks,Westonet.al.,ICLR2015

Joe went to the kitchen.Fred went to the kitchen.Joe picked up the milk.Joe travelled to his office.Joe left the milk. Joe went to the bathroom.

InputSentences Memories

Where is the milk now?

Question 1st supportingmemory

Where is the milk now?

Question 2nd supportingmemory

Where is the milk now?

Question

Office

Response

Page 13: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

TrainingObjective

13

X

f 6=mo1

max(0, � � sO(x,mo1) + sO(x, f)) +

Scorefortrue1st memory

Scoreforanegativememory

MemoryNetworks,Westonet.al.,ICLR2015

Page 14: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

TrainingObjective

14

X

f 6=mo1

max(0, � � sO(x,mo1) + sO(x, f)) +

X

f 0 6=mo2

max(0, � � sO([x,mo1 ],mo2) + sO([x,mo1 ], f0)) +

Scorefortrue2nd memory

Scoreforanegativememory

MemoryNetworks,Westonet.al.,ICLR2015

Page 15: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

TrainingObjective

15

X

f 6=mo1

max(0, � � sO(x,mo1) + sO(x, f)) +

X

f 0 6=mo2

max(0, � � sO([x,mo1 ],mo2) + sO([x,mo1 ], f0)) +

Scorefortrueresponse

Scoreforanegativeresponse

X

r 6=r

max(0, � � s

R

([x,mo1 ,mo2 ], r) + s

R

([x,mo1 ,mo2 ], r))

MemoryNetworks,Westonet.al.,ICLR2015

Page 16: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

Experiment

16

• Large– ScaleQA• 14MStatements– (subject,relation,object)•MemoryHops;𝑘 = 1• Onlyre-rankedcandidatesfromothersystem

Method F1

Faderet.al.2013 0.54

Bordeset.al. 2014b 0.73

MemoryNetworks (Thiswork) 0.72

WhydoesMemoryNetworkperformexactlyaspreviousmodel?

MemoryNetworks,Westonet.al.,ICLR2015

Storedasmemories

Outputishighestscoringmemory

Page 17: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

Experiment

17

• Large– ScaleQA• 14MStatements– (subject,relation,object)•MemoryHops;𝑘 = 1• Onlyre-rankedcandidatesfromothersystem

Method F1

Faderet.al.2013 0.54

Bordeset.al. 2014b 0.73

MemoryNetworks (Thiswork) 0.72

WhydoesMemoryNetworksnotperformaswell?

Page 18: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

UsefulExperiment

18

• SimulatedWorldQA• 4characters,3objects,5rooms• 7kstatements,3kquestionsfortrainingandsamefortesting• Difficulty1(5)– Entityinquestionismentionedinlast1(5)sentences• For𝑘 = 2,annotationhasintermediatebestmemoriesaswell

MemoryNetworks,Westonet.al.,ICLR2015

Page 19: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

Limitations

19

• SimpleBOWrepresentation

• SimulatedQuestionAnsweringdatasetistootrivial

• Strongsupervisioni.e.forintermediatememoriesisneeded

MemoryNetworks,Westonet.al.,ICLR2015

Page 20: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

End-to-EndMemoryNetworks(MemN2N)

20

•Whatiftheannotationis:• Inputsentences𝑥-, 𝑥2,… , 𝑥4• Query𝑞• Answer𝑎

•Modelperformsby:• Generatingmemoriesfrominputs• Transformingqueryintosuitablerepresentation• Processqueryandmemoriesjointlyusingmultiplehopstoproducetheanswer• Backpropagatethroughthewholeprocedure

End-To-EndMemoryNetworks,Sukhbaataret.al.,NIPS2015

Joe went to the kitchen. Fred went to the kitchen. Joe picked up the milk. Joe travelled to his office. Joe left the milk. Joe went to the bathroom.

Where is the milk now?

Office

Page 21: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

MemN2N

21

1. Convertinputtomemories𝑥" → 𝑚"

End-To-EndMemoryNetworks,Sukhbaataret.al.,NIPS2015

mi = Axi BOWinput

Word-EmbeddingMatrix

Sumofword-embeddings

2. Transformquery𝑞 intosamerepresentationspace

u = Bq

3. OutputVectors𝑥" → 𝑐"ci = Cxi

Page 22: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

MemN2N

22

3. Scoringmemoriesagainstquery

End-To-EndMemoryNetworks,Sukhbaataret.al.,NIPS2015

Memories

Query(transformed)

Scoreforinput/memory

4. Generateoutput

pi = Softmax(uTmi)

o =X

i

pici

Weightedaverageofallinputs(transformed)

Page 23: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

MemN2N

23

5. GeneratingResponse

End-To-EndMemoryNetworks,Sukhbaataret.al.,NIPS2015

a = Softmax(W (u+ o))

TrainingObjective– MaximumLikelihood/CrossEntropy

Distributionoverresponsewords

Averaged-outputQuery

ˆ

⇥ = argmax

NX

s=1

logP (as)

Page 24: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

24End-To-EndMemoryNetworks,Sukhbaataret.al.,NIPS2015

Generatememories

TransformQuery

Generateoutputs

Scorememories

Makeaveragedoutput

Response

Page 25: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

25End-To-EndMemoryNetworks,Sukhbaataret.al.,NIPS2015

Multi-hopMemN2N

u

k+1 = u

k + o

k

Hop1

Hop2

Hop3

DifferentMemoriesandOutputsforeachHop

Page 26: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

Experiments

26End-To-EndMemoryNetworks,Sukhbaataret.al.,NIPS2015

• SimulatedWorldQA

• 20TasksfrombAbIdataset- 1Kand10Kinstancespertask

• Vocabulary=177wordsonly!!!!!

• 60epochs

• LearningRateannealing

• LinearStartwithdifferentlearningrate

• “Modeldivergedveryoften,hencetrainedmultiplemodels”

Page 27: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

27End-To-EndMemoryNetworks,Sukhbaataret.al.,NIPS2015

MemNN MemN2N

Error%(1k) 6.7 12.4

Error%(10k) 3.2 7.5

Page 28: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

MovieTriviaTime!

28

• WhichwasStanleyKubricks’sfirstmovie?

• Whendid2001:ASpaceOdysseyrelease?

• AfterTheShining,whichmoviediditsdirectordirect?

FearandDesire

1968

FullMetalJacket

(2001:a_space_odyssey,directed_by,stanley_kubrick)(fear_and_dark, directed_by,stanley_kubrick)

…(fear_and_dark, released_in,1953)

(full_metal_jacket, released_in, 1987)

…(2001:a_space_odyssey, released_in,1968)

…(the_shining, directed_by,stanley_kubrick)

…(AI:artificial_intelligence,written_by,stanley_kubrick)

Subject Relation Object

KnowledgeBase

Page 29: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

KnowledgeBase?

29

Incomplete! TooChallenging!

CombineusingMemoryNetworks?

TextualKnowledge?(2001:a_space_odyssey,directed_by,stanley_kubrick)

(fear_and_dark, directed_by,stanley_kubrick)

…(fear_and_dark, released_in,1953)

(full_metal_jacket, released_in, 1987)

…(2001:a_space_odyssey, released_in,1968)

…(the_shining, directed_by,stanley_kubrick)

…(AI:artificial_intelligence,written_by,stanley_kubrick)

Page 30: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

Key-ValueMemNNs forReadingDocuments

30

• StructuredMemoriesasKey-ValuePairs• RegularMemNNs havesinglevectorforeachmemory• Keymorerelatedtoquestionandvaluestoanswer

Key-ValueMemoryNetworksforDirectlyReadingDocuments,Milleret.al.,EMNLP2016

Memories = (k1, v1), (k2, v2), . . . , (kN , vN )

KeysandValuescanbeWords,Sentences,Vectorsetc.

(𝑘:Kubrick’sfirstmoviewas,𝑣: FearandDark)

Page 31: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

KV-MemNN

31Key-ValueMemoryNetworksforDirectlyReadingDocuments,Milleret.al.,EMNLP2016

1. RetrieverelevantmemoriesusingHashingTechniques

q

(k1, v1)

(k2, v2)

(kN , vN ). . .

(kh1 , vh1)

(kh2 , vh2)

(khN , vhN ). . .

Useinvertedindex,localitysensitivehashing, somethingsensible

AllMemoriesRetrievedRelevantMemories

Page 32: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

KV-MemNN

32Key-ValueMemoryNetworksforDirectlyReadingDocuments,Milleret.al.,EMNLP2016

2. ScoreMemory-Keys

phi = Softmax(A�Q(q) ·A�K(khi))

KeyBOWSumof

Embeddings

Dot-ProdDistributionoverMemory-Keys

3. GenerateOutput

o =X

i

phiA�V (vhi)WeightedaverageofMemory-values

pi = Softmax(uTmi)

o =X

i

pici

Page 33: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

KV-MemNN- MultipleHops

33Key-ValueMemoryNetworksforDirectlyReadingDocuments,Milleret.al.,EMNLP2016

Inthe𝑗<= hop:

Queryrepresentation:

KeyAddressing

qj = Rj(A�Q(qj�1) + o)

GenerateResponse

phi = Softmax(A�Q(qj) ·A�K(khi))

FinalHop

a = Softmax(A�Q(qH+1) ·B�Y (yi))

Page 34: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

KV-MemNN– Whattostoreinmemories?

34Key-ValueMemoryNetworksforDirectlyReadingDocuments,Milleret.al.,EMNLP2016

1. KBBased:

Key:(subject,relation);Value:ObjectK:(2001:a_space_odyssey,directed_by);V:stanley_kubrick

2. DocumentBased

Foreachentityindocument,extract5-wordwindowaroundit

Key:window;Value:Entity

K:screenplaywrittenbyand;V:Hampton

Page 35: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

KV-MemNN– Experiments

35Key-ValueMemoryNetworksforDirectlyReadingDocuments,Milleret.al.,EMNLP2016

• WikiMovies Benchmark• Total100KQA-pairs• 10%fortesting

Method KB Doc

E2EMemoryNetwork 78.5 69.9

Key-ValueMemoryNetwork 93.9 76.2

Page 36: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

KV-MemNN

36Key-ValueMemoryNetworksforDirectlyReadingDocuments,Milleret.al.,EMNLP2016

RetrieverelevantMemories

ScorerelevantMemory-Keys

GenerateOutputusingAveragedMemory-Values

GenerateResponse

Page 37: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

KV-MemNN

37Key-ValueMemoryNetworksforDirectlyReadingDocuments,Milleret.al.,EMNLP2016

Page 38: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

CNN:ComputerVision ::________:NLP

38Key-ValueMemoryNetworksforDirectlyReadingDocuments,Milleret.al.,EMNLP2016

RNN

Page 39: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DynamicMemoryNetworks– TheBeast

39AskMeAnything:DynamicMemoryNetworksforNaturalLanguageProcessing,Kumaret.al.ICML2016

UseRNNs,specificallyGRUsforeverymodule

Page 40: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DMN

40AskMeAnything:DynamicMemoryNetworksforNaturalLanguageProcessing,Kumaret.al.ICML2016

ct = GRU(wit, c

i�1t )FinalGRUOutput

for𝑡<= sentence

Page 41: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DMN

41AskMeAnything:DynamicMemoryNetworksforNaturalLanguageProcessing,Kumaret.al.ICML2016

q = GRU(qiw, qi�1)

Page 42: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DMN

42AskMeAnything:DynamicMemoryNetworksforNaturalLanguageProcessing,Kumaret.al.ICML2016

ei = hiTC

𝑖 = 1

hit = gitGRU(ct, h

it�1) + (1� git)h

it�1𝐻𝑜𝑝 = 𝑖

Page 43: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DMN

43AskMeAnything:DynamicMemoryNetworksforNaturalLanguageProcessing,Kumaret.al.ICML2016

ei = hiTC

hit = gitGRU(ct, h

it�1) + (1� git)h

it�1

𝑖 = 2

𝐻𝑜𝑝 = 𝑖

Page 44: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DMN

44AskMeAnything:DynamicMemoryNetworksforNaturalLanguageProcessing,Kumaret.al.ICML2016

mi = GRU(ei,mi�1)

Page 45: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DMN

45AskMeAnything:DynamicMemoryNetworksforNaturalLanguageProcessing,Kumaret.al.ICML2016

yt = Softmax(W (a)↵t)

↵t = GRU([yt�1, q],↵t�1)

↵0 = mTm

Page 46: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DMN

46AskMeAnything:DynamicMemoryNetworksforNaturalLanguageProcessing,Kumaret.al.ICML2016

HowmanyGRUswereusedwith2hops?

Page 47: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DMN– QualitativeResults

47AskMeAnything:DynamicMemoryNetworksforNaturalLanguageProcessing,Kumaret.al.ICML2016

Page 48: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

48

AlgorithmLearning

Page 49: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachine

CopyTask:ImplementtheAlgorithm

Givenalistofnumbersatinput,reproducethelistatoutput

NeuralTuringMachineLearns:1. Whattowritetomemory2. Whentowritetomemory3. Whentostopwriting4. Whichmemorycelltoreadfrom5. Howtoconvertresultofreadintofinaloutput

49

Page 50: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachines

50

Controller

ExternalInput ExternalOutput

ReadHeads

Memory

NeuralTuringMachines,Graveset.al.,arXiv:1410.5401

‘Blurry’

WriteHeads

Page 51: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachines

‘Blurry’MemoryAddressing(attimeinstant‘t’)

51

N

M

Mt

wt

NeuralTuringMachines,Graveset.al.,arXiv:1410.5401

SoftAttention(Lectures2,3,20,24)

wt(0)=0.1 wt(1)=0.2 wt(2)=0.5 wt(3)=0.1 wt(4)=0.1

Page 52: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachines

Moreformally,

BlurryReadOperation

Given:Mt(memorymatrix)ofsizeNxMwt (weightvector)oflengthNt(timeindex)

52NeuralTuringMachines,Graveset.al.,arXiv:1410.5401

Page 53: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachines:BlurryWrites

BlurryWriteOperationDecomposedintoblurryerase +blurryadd

Given:Mt(memorymatrix)ofsizeNxMwt (weightvector)oflengthNt(timeindex)et(erasevector)oflengthMat(addvector)oflengthM

53

EraseComponent AddComponentNeuralTuringMachines,Graveset.al.,arXiv:1410.5401

Page 54: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachines:Erase

54

5 7 9 2 12

11 6 3 1 2

3 7 3 10 6

4 2 5 9 9

3 5 12 8 4

M0

w1(0)=0.1 w1(1)=0.2 w1(2)=0.5 w1(3)=0.1 w1(4)=0.1

1.0

0.7

0.2

0.5

0.0

e1

NeuralTuringMachines,Graveset.al.,arXiv:1410.5401

1xN Mx1

Page 55: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachines:Erase

55

4.5 5.6 4.5 1.8 10.8

10.23 5.16 1.95 0.93 1.86

2.94 6.72 2.7 9.8 5.88

3.8 1.8 3.75 8.55 8.55

3 5 12 8 4

w1(0)=0.1 w1(1)=0.2 w1(2)=0.5 w1(3)=0.1 w1(4)=0.1

NeuralTuringMachines,Graveset.al.,arXiv:1410.5401

Page 56: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachines:Addition

56

4.5 5.6 4.5 1.8 10.8

10.23 5.16 1.95 0.93 1.86

2.94 6.72 2.7 9.8 5.88

3.8 1.8 3.75 8.55 8.55

3 5 12 8 4

w1(0)=0.1 w1(1)=0.2 w1(2)=0.5 w1(3)=0.1 w1(4)=0.1

3

4

-2

0

2

a1

NeuralTuringMachines,Graveset.al.,arXiv:1410.5401

Page 57: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachines:BlurryWrites

57

4.8 6.2 6 2.1 11.1

10.63 5.96 3.95 1.33 2.26

2.74 6.32 1.7 9.6 5.68

3.8 1.8 3.75 8.55 8.55

3.2 5.4 13 8.2 4.2

M1

NeuralTuringMachines,Graveset.al.,arXiv:1410.5401

Page 58: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachines:Demo

58

Demonstration:TrainingonCopyTask

FigurefromSnipsAI'sMediumPost

NeuralTuringMachines,Graveset.al.,arXiv:1410.5401

Page 59: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachines:AttentionModel

59NeuralTuringMachines,Graveset.al.,arXiv:1410.5401

Generatingwt

ContentBased

Example:QATask

- ScoresentencesbysimilaritywithQuestion

- Weightsassoftmax ofsimilarityscores

LocationBased

Example:CopyTask

- Movetoaddress(i+1)afterwritingtoindex(i)

- Weights≈Transitionprobabilities

Page 60: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachine:AttentionModel

60

CA

Stepsforgeneratingwt1. ContentAddressing2. Peaking3. Interpolation4. ConvolutionalShift(Location

Addressing)5. Sharpening

I

CS

S

Prev.State

ControllerOutputs

NeuralTuringMachines,Graveset.al.,arXiv:1410.5401

Page 61: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachine:AttentionModel

Prev.State

ControllerOutputs

61

:Vector(lengthM)producedbyController

NeuralTuringMachines,Graveset.al.,arXiv:1410.5401

Page 62: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachine:AttentionModel

62

CA

Step1:ContentAddressing(CA)

NeuralTuringMachines,Graveset.al.,arXiv:1410.5401

Prev.State

ControllerOutputs

Page 63: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachine:AttentionModel

63

CA

Step2:PeakingPrev.State

ControllerOutputs

NeuralTuringMachines,Graveset.al.,arXiv:1410.5401

Page 64: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachine:AttentionModel

64

CA

Step3:Interpolation(I)

I

Prev.State

ControllerOutputs

NeuralTuringMachines,Graveset.al.,arXiv:1410.5401

Page 65: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachine:AttentionModel

65

CA

Step4:ConvolutionalShift(CS)• Controlleroutputs,anormalized

distributionoverallNpossibleshifts• Rotation-shiftedweightscomputedas:

I

CS

Prev.State

ControllerOutputs

NeuralTuringMachines,Graveset.al.,arXiv:1410.5401

Page 66: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachine:AttentionModel

66

CA

Step5:Sharpening(S)• Usestosharpenas:

I

CS

S

Prev.State

ControllerOutputs

NeuralTuringMachines,Graveset.al.,arXiv:1410.5401

Page 67: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachine:ControllerDesign

67

• Feed-forward:faster,moretransparency&interpretabilityaboutfunctionlearnt

• LSTM:moreexpressivepower,doesn’tlimitthenumberofcomputationspertimestep

Bothareend-to-enddifferentiable!1. Reading/Writing->ConvexSums2. wt generation->Smooth3. ControllerNetworks

NeuralTuringMachines,Graveset.al.,arXiv:1410.5401

Page 68: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachine:NetworkOverview

68

UnrolledFeed-forwardController

NeuralTuringMachines,Graveset.al.,arXiv:1410.5401

FigurefromSnipsAI'sMediumPost

Page 69: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachinesvs.MemNNs

MemNNs•Memoryisstatic,withfocusonretrieving(reading)informationfrommemory

NTMs•Memoryiscontinuouslywrittentoandreadfrom,withnetworklearningwhentoperformmemoryreadandwrite

69

Page 70: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachines:Experiments

TaskNetworkSize NumberofParameters

NTMw/LSTM* LSTM NTMw/LSTM LSTM

Copy 3x100 3x256 67K 1.3M

RepeatCopy 3x100 3x512 66K 5.3M

Associative 3x100 3x256 70K 1.3M

N-grams 3x100 3x128 61K 330K

PrioritySort 2x100 3x128 269K 385K

70NeuralTuringMachines,Graveset.al.,arXiv:1410.5401

Page 71: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachines:‘Copy’LearningCurve

Trainedon8-bitsequences,1<=sequencelength<=20

71NeuralTuringMachines,Graveset.al.,arXiv:1410.5401

Page 72: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachines:‘Copy’Performance

72

LSTM

NTM

NeuralTuringMachines,Graveset.al.,arXiv:1410.5401

Page 73: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

NeuralTuringMachinestriggeredanoutbreakofMemory

Architectures!

73

Page 74: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

StackAugmentedRecurrentNetworks

Learnalgorithmsbasedonstackimplementations(e.g.learningfixedsequencegenerators)

Usesastackdatastructuretostorememory(asopposedtoamemorymatrix)

74InferringAlgorithmicPatternswithStack-AugmentedRecurrentNets,Joulin et.al.,arXiv:1503.01007

Page 75: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DynamicNeuralTuringMachines

Experimentedwithaddressingschemes• DynamicAddresses:Addressesofmemorylocationslearntintraining– allowsnon-linearlocation-basedaddressing

• Leastrecentlyusedweighting:Preferleastrecentlyusedmemorylocations+interpolatewithcontent-basedaddressing

• DiscreteAddressing:Samplethememorylocationfromthecontent-baseddistributiontoobtainaone-hotaddress

• Multi-stepAddressing:Allowsmultiplehopsovermemory

Results:bAbI QATask

75DynamicNeuralTuringMachinewithSoftandHardAddressingSchemes,Gulchere et.al.,arXiv:1607.00036

Location NTM Content NTM Soft DNTM DiscreteDNTM

1-step 31.4% 33.6% 29.5% 27.9%

3-step 32.8% 32.7% 24.2% 21.7%

Page 76: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

StackAugmentedRecurrentNetworks

•Blurry‘push’and‘pop’onstack.E.g.:

• Someresults:

76InferringAlgorithmicPatternswithStack-AugmentedRecurrentNets,Joulin et.al.,arXiv:1503.01007

Page 77: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DifferentiableNeuralComputers

Advancedaddressingmechanisms:

• ContentBasedAddressing

• TemporalAddressing• Maintainsnotionofsequenceinaddressing• TemporalLinkMatrixL(sizeNxN),L[i,j]=degreetowhichlocation Iwaswrittentoafterlocation j.

• UsageBasedAddressing

77Hybridcomputingusinganeuralnetworkwithdynamicexternalmemory,Graveset.al.,Naturevol.538

Page 78: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DNC:UsageBasedAddressing

•Writingincreasesusageofcell,readingdecreasesusageofcell

• Leastusedlocationhashighestusage-basedweighting

• Interpolateb/wusage&contentbasedweightsforfinalwriteweights

78Hybridcomputingusinganeuralnetworkwithdynamicexternalmemory,Graveset.al.,Naturevol.538

Page 79: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DNC:Example

79Hybridcomputingusinganeuralnetworkwithdynamicexternalmemory,Graveset.al.,Naturevol.538

Page 80: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DNC:ImprovementsoverNTMs

NTM

• Largecontiguousblocksofmemoryneeded

• Nowaytofreeupmemorycellsafterwriting

DNC

•Memorylocationsnon-contiguous,usage-based

• Regularde-allotmentbasedonusage-tracking

80Hybridcomputingusinganeuralnetworkwithdynamicexternalmemory,Graveset.al.,Naturevol.538

Page 81: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DNC:Experiments

GraphTasks

GraphRepresentation:(source,edge,destination)tuples

Typesoftasks:

- Traversal:Performwalkongraphgivensource,listofedges

- ShortestPath:Givensource,destination

- Inference:Givensource,relationoveredges;finddestination

81Hybridcomputingusinganeuralnetworkwithdynamicexternalmemory,Graveset.al.,Naturevol.538

Page 82: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DNC:Experiments

GraphTasks

Trainingover3phases:

• Graphdescriptionphase:(source,edge,destination)tuplesfedintothegraph

• Queryphase:Shortestpath(source,____,destination),Inference(source,hybridrelation,___),Traversal(source,relation,relation…,___)

• Answerphase:Targetresponsesprovidedatoutput

Trainedonrandomgraphsofmaximumsize1000

82Hybridcomputingusinganeuralnetworkwithdynamicexternalmemory,Graveset.al.,Naturevol.538

Page 83: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DNC:Experiments

GraphTasks:LondonUnderground

83Hybridcomputingusinganeuralnetworkwithdynamicexternalmemory,Graveset.al.,Naturevol.538

Page 84: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DNC:Experiments

GraphTasks:LondonUnderground

84Hybridcomputingusinganeuralnetworkwithdynamicexternalmemory,Graveset.al.,Naturevol.538

InputPhase

Page 85: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DNC:Experiments

GraphTasks:LondonUnderground

85Hybridcomputingusinganeuralnetworkwithdynamicexternalmemory,Graveset.al.,Naturevol.538

TraversalTask

QueryPhase AnswerPhase

Page 86: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DNC:Experiments

GraphTasks:LondonUnderground

86Hybridcomputingusinganeuralnetworkwithdynamicexternalmemory,Graveset.al.,Naturevol.538

ShortestPathTask

QueryPhase AnswerPhase

Page 87: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

DNC:Experiments

87Hybridcomputingusinganeuralnetworkwithdynamicexternalmemory,Graveset.al.,Naturevol.538

GraphTasks:Freya’sFamilyTree

Page 88: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

Conclusion

•MachineLearningmodelsrequirememoryandmulti-hopreasoningtoperformAItasksbetter

•MemoryNetworksforTextareaninterestingdirectionbutverysimple

• Genericarchitectureswithmemory,suchasNeuralTuringMachine,limitedapplications shown

• FuturedirectionsshouldbefocusingonapplyinggenericneuralmodelswithmemorytomoreAITasks.

88Hybridcomputingusinganeuralnetworkwithdynamicexternalmemory,Graveset.al.,Naturevol.538

Page 89: Neural Architectures with Memory - University Of Illinoisslazebni.cs.illinois.edu › spring17 › lec27_memory.pdf · Hello! What can I do for you today? I’d like to reserve a

ReadingList

• KarolKurach,MarcinAndrychowicz &IlyaSutskever NeuralRandom-AccessMachines,ICLR,2016

• EmilioParisotto &Ruslan Salakhutdinov NeuralMap:StructuredMemoryforDeepReinforcementLearning,ArXiv,2017

• Pritzel et.al. NeuralEpisodicControl,ArXiv,2017• OriolVinyals,Meire Fortunato, Navdeep Jaitly PointerNetworks,ArXiv,2017

• JackWRaeetal.,ScalingMemory-AugmentedNeuralNetworkswithSparseReadsandWrites,ArXiv 2016

• AntoineBordes, Y-LanBoureau, JasonWeston,LearningEnd-to-EndGoal-OrientedDialog,ICLR2017

• JunhyukOh, ValliappaChockalingam, SatinderSingh, HonglakLee,ControlofMemory,ActivePerception,andActioninMinecraft,ICML2016

• WojciechZaremba, IlyaSutskever,ReinforcementLearningNeuralTuringMachines,ArXiv 2016

89