
Page 1: Visualizing and Understanding Recurrent Networksweb.cs.ucdavis.edu/~yjlee/teaching/ecs289g-fall2016/ismail2.pdf · [Visualizing and Understanding Recurrent Networks, Andrej Karpathy*,

Visualizing and Understanding Recurrent Networks

Andrej Karpathy, Justin Johnson, Li Fei-Fei

Presented by: Ismail


LSTM (Long Short-Term Memory)

RNN
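The slide's own equations did not survive in this transcript. As a reminder, the block below writes out the standard per-layer updates for a vanilla RNN and an LSTM in their usual textbook form. The symbols are assumptions, not taken from the slide: W^l is the weight matrix of layer l, h_t^l the hidden state, c_t^l the LSTM cell state, and sigm the logistic sigmoid. Biases are omitted.

```latex
% Vanilla RNN, layer l at time step t
h_t^l = \tanh\!\left( W^l \begin{pmatrix} h_t^{l-1} \\ h_{t-1}^l \end{pmatrix} \right)

% LSTM: input gate i, forget gate f, output gate o, candidate g
\begin{pmatrix} i \\ f \\ o \\ g \end{pmatrix}
  = \begin{pmatrix} \mathrm{sigm} \\ \mathrm{sigm} \\ \mathrm{sigm} \\ \tanh \end{pmatrix}
    W^l \begin{pmatrix} h_t^{l-1} \\ h_{t-1}^l \end{pmatrix}

% Cell state and hidden state
c_t^l = f \odot c_{t-1}^l + i \odot g
h_t^l = o \odot \tanh\!\left( c_t^l \right)
```

The additive cell update is the key difference from the plain RNN: it lets information and gradients flow across many time steps when the forget gate stays close to 1.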


Let’s pause for a moment...


Experiments in the paper

Dataset:

● Leo Tolstoy’s War and Peace (WP) novel -- 3,258,256 characters, K = 87
● Linux Kernel (LK) -- 6,206,996 characters, K = 101

Training (cross product of):

● type (LSTM/RNN/GRU)
● number of layers (1/2/3)
● number of parameters (4 settings)
● both datasets (WP & LK)
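To make the size of that grid concrete, here is a small sketch that enumerates the cross product. The variable names are illustrative, and the four parameter settings are stand-in indices because the slide gives only their count, not their values.

```python
from itertools import product

# Factors listed on the slide.
cell_types = ["LSTM", "RNN", "GRU"]
num_layers = [1, 2, 3]
# Placeholder indices: the slide says "4 settings" without listing them.
param_settings = [0, 1, 2, 3]
datasets = ["WP", "LK"]

# Every combination of the four factors is one training run.
configs = list(product(cell_types, num_layers, param_settings, datasets))

print(len(configs))  # 3 * 3 * 4 * 2 runs
```

Under this reading, the grid amounts to 72 trained models.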


Experiments in the paper

● depth >= 2 is beneficial
● LSTM, GRU >> RNN


Internal Mechanisms of LSTMs


Understanding long range interactions of LSTM


n-gram vs n-NN

- The best recurrent network outperforms the 20-gram model on test-set cross-entropy loss, lower being better (WP -- 1.077 vs 1.195; LK -- 0.84 vs 0.889)
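The numbers quoted above are read here as mean per-character cross-entropy losses. Below is a minimal sketch of that quantity, assuming natural logarithms (the slide does not state the base) and a hypothetical input list holding the probability each model assigned to the character that actually came next.

```python
import math

def mean_cross_entropy(probs_of_true_chars):
    """Average negative log-probability of the true next character.

    probs_of_true_chars: probabilities in (0, 1], one per position.
    Natural log is an assumption; the slide does not state the base.
    """
    total = sum(math.log(p) for p in probs_of_true_chars)
    return -total / len(probs_of_true_chars)

# Toy example: a model that is confident on two characters and unsure on one.
loss = mean_cross_entropy([0.9, 0.8, 0.25])
print(round(loss, 3))
```

A model that always puts probability 1 on the right character scores 0; one that always assigns 0.5 scores ln 2 under this convention.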


Error analysis

- A character counts as an error if the probability the model assigned to it at the previous time step is < 0.5
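The error criterion above can be sketched in a few lines. The function name and the parallel-list input format are hypothetical; only the 0.5 threshold comes from the slide.

```python
def find_errors(text, probs_of_true_chars, threshold=0.5):
    """Return (index, character) pairs the model got 'wrong'.

    probs_of_true_chars[i] is the probability the model assigned to
    text[i] one step earlier. A character is an error when that
    probability is strictly below the threshold.
    """
    return [
        (i, ch)
        for i, (ch, p) in enumerate(zip(text, probs_of_true_chars))
        if p < threshold
    ]

# Toy example: only the middle character falls below 0.5.
errors = find_errors("ab}", [0.9, 0.4, 0.5])
print(errors)
```

Note that a probability of exactly 0.5 is not counted as an error, since the comparison is strict.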


Unique errors


LSTM on “}”


Training dynamics


Other RNN-based applications


Feels like it was an RNN day. ;)

Questions?