first part: deep learning for speech recognition · 2017. 4. 4. · a guy on a skate board on the...

Post on 07-Oct-2020

First part: Deep Learning for Speech recognition

Deep Speech

Acoustic model: Hidden Markov Model

Decoding: Viterbi algorithm
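
A minimal sketch of Viterbi decoding over a toy HMM with discrete observations. The state names, the probabilities, and the function name `viterbi` are illustrative assumptions, not taken from the slides; a real recognizer would work in the log domain over acoustic frames.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (best_path, best_prob) for the most likely hidden-state sequence."""
    # best[t][s]: probability of the most likely path that ends in state s at time t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    # back[t][s]: predecessor of s on that most likely path
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, prob = max(
                ((p, best[t - 1][p] * trans_p[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = prob * emit_p[s][observations[t]]
            back[t][s] = prev
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical two-state model with three observation symbols.
states = ["s1", "s2"]
start_p = {"s1": 0.6, "s2": 0.4}
trans_p = {"s1": {"s1": 0.7, "s2": 0.3}, "s2": {"s1": 0.4, "s2": 0.6}}
emit_p = {
    "s1": {"x": 0.5, "y": 0.4, "z": 0.1},
    "s2": {"x": 0.1, "y": 0.3, "z": 0.6},
}
path, prob = viterbi(["x", "y", "z"], states, start_p, trans_p, emit_p)
print(path, prob)
```

For this toy model the best path is s1, s1, s2.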

Why DL now?

Sequence 2 Sequence

Compressing Neural Nets

ICLR 2016 Best Paper award

Deep Compression results

ImageNet Results

The First Deep Topologies

Deep-Compression algorithm

Pruning: connections with weights below a threshold are removed
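
A minimal sketch of magnitude pruning on a small weight matrix, assuming a single global threshold. The function name `prune_by_magnitude`, the threshold value, and the example weights are hypothetical. The returned mask is what a retraining step would typically use to keep the removed connections at zero.

```python
def prune_by_magnitude(weights, threshold):
    """Zero every weight whose absolute value is below `threshold`.

    Returns the pruned matrix and a 0/1 mask of the surviving connections.
    """
    mask = [[1 if abs(w) >= threshold else 0 for w in row] for row in weights]
    pruned = [
        [w if keep else 0.0 for w, keep in zip(row, mask_row)]
        for row, mask_row in zip(weights, mask)
    ]
    return pruned, mask


def sparsity(mask):
    """Fraction of connections that were removed."""
    total = sum(len(row) for row in mask)
    kept = sum(sum(row) for row in mask)
    return 1.0 - kept / total


# Hypothetical 2 x 3 layer.
layer = [[0.8, -0.05, 0.3], [0.01, -0.6, 0.02]]
pruned, mask = prune_by_magnitude(layer, threshold=0.1)
print(pruned, sparsity(mask))
```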

Retrain

Convolution as Matrix multiplication
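
One common way to express a convolution as a matrix multiplication is to unroll every input patch into a row ("im2col") and multiply by the flattened kernel. The sketch below assumes a single-channel image, stride 1, and no padding; the names `im2col` and `conv2d_as_matmul` are illustrative. As in most deep learning frameworks, the kernel is not flipped.

```python
def im2col(image, k):
    """Unroll every k x k patch of a 2-D image into one row (stride 1, no padding)."""
    h, w = len(image), len(image[0])
    return [
        [image[i + di][j + dj] for di in range(k) for dj in range(k)]
        for i in range(h - k + 1)
        for j in range(w - k + 1)
    ]


def conv2d_as_matmul(image, kernel):
    """Apply a square kernel as (patch matrix) x (flattened kernel vector)."""
    k = len(kernel)
    flat_kernel = [v for row in kernel for v in row]
    patches = im2col(image, k)
    out_w = len(image[0]) - k + 1
    flat_out = [sum(p * q for p, q in zip(patch, flat_kernel)) for patch in patches]
    # Fold the flat result back into a 2-D output map.
    return [flat_out[r:r + out_w] for r in range(0, len(flat_out), out_w)]


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
kernel = [[1, 0], [0, -1]]
print(conv2d_as_matmul(image, kernel))
```

Once the layer is written this way, its weights form an ordinary matrix, so matrix-oriented techniques can be applied to convolutional layers as well.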

Weight Sharing & Quantization
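
A minimal sketch of weight sharing, assuming weights are clustered with 1-D k-means so that each weight is stored as a small index into a shared codebook. The function names, the linear centroid initialization, and the iteration count are assumptions made for illustration; `k` must be at least 2.

```python
import math


def kmeans_1d(values, k, iterations=20):
    """Cluster scalar values into k centroids; returns (centroids, assignments)."""
    lo, hi = min(values), max(values)
    # Spread the initial centroids evenly over the weight range.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]

    def nearest(v):
        return min(range(k), key=lambda c: abs(v - centroids[c]))

    for _ in range(iterations):
        assign = [nearest(v) for v in values]
        for c in range(k):
            members = [v for v, a in zip(values, assign) if a == c]
            if members:  # leave an empty cluster's centroid unchanged
                centroids[c] = sum(members) / len(members)
    return centroids, [nearest(v) for v in values]


def share_weights(weights, k):
    """Return the codebook, the per-weight indices, and the bits needed per index."""
    codebook, indices = kmeans_1d(weights, k)
    bits_per_index = max(1, math.ceil(math.log2(k)))
    return codebook, indices, bits_per_index


codebook, indices, bits = share_weights([0.0, 0.1, 1.0, 1.1], k=2)
quantized = [codebook[i] for i in indices]  # weights as seen at inference time
print(codebook, indices, bits, quantized)
```

Storage drops from one float per weight to `bits` bits per weight plus the small codebook.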

Huffman Coding
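
A minimal sketch of Huffman coding applied to a list of quantized weight indices, assuming the symbols are small integers. The function name `huffman_code` and the example distribution are hypothetical. Frequent indices get shorter codes, so a skewed index distribution needs fewer bits than a fixed-width encoding.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code {symbol: bitstring} from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (count, unique tie-breaker, {symbol: code so far}).
    heap = [(n, i, {s: ""}) for i, (s, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        n1, _, codes1 = heapq.heappop(heap)
        n2, _, codes2 = heapq.heappop(heap)
        # Merge the two rarest subtrees, prefixing their codes with 0 and 1.
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (n1 + n2, next_id, merged))
        next_id += 1
    return heap[0][2]


# Hypothetical index stream: index 0 is far more common than the others.
indices = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 2
code = huffman_code(indices)
huffman_bits = sum(len(code[s]) for s in indices)
fixed_bits = 2 * len(indices)  # 4 distinct symbols need 2 bits each
print(code, huffman_bits, fixed_bits)
```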

In Summary

The main component is pruning. We can replace it with SVD:

Singular Value Decomposition

Splitting one layer into 3 layers

Original # parameters = m*n
New # parameters = m*r + r^2 + r*n
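
The two parameter counts can be compared directly. This is a small sketch with hypothetical layer sizes (m = n = 1024, r = 64); the function names are illustrative.

```python
def dense_params(m, n):
    """Parameters of a single m x n weight matrix."""
    return m * n


def svd_params(m, n, r):
    """Parameters after splitting into three layers: m x r, r x r, r x n."""
    return m * r + r * r + r * n


m, n, r = 1024, 1024, 64
original = dense_params(m, n)   # 1_048_576
factored = svd_params(m, n, r)  # 135_168
print(original, factored, original / factored)
```

The split only saves parameters when the rank r is small relative to m and n.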

2015

Squeeze-net (2015)
• Decompose the filter layers into smaller filters
• Instead of having 7x3x3 (147) parameters we have 3x1x1+4x1x1+4x3x3 (43)

Squeeze-net topology: late max pooling

Squeeze-net topology: Deep Compression

Squeeze-net Results

What about new topologies?

Compression barely works...

Why isn’t it working? The convolutional layers already use the “Squeeze-net” trick

Does compression have value? Weight Sharing & Quantization still work...

Does compression have value? Detection

Does compression have value? Segmentation

Does compression have value? Image Captioning

Why do shallow networks outperform the deeper ones?

Promising Directions