Convolutional Neural Networks
Overview
CNN use-cases
References
Convolutional Neural Networks: A brief explanation
Centre for Digital Music, Queen Mary University of London, UK
1 Overview: CNNs vs DNNs, CNN structures, Inside CNNs
2 CNN use-cases: Image, Music
3 References
CNNs: Convolutional Neural Networks
(Deep) Convolutional Neural Networks
deep = cascaded
convolutional = filters
[Figures omitted. Sources: 1 cns.org, 2 AlexNet]
CNNs vs. general DNNs
DNNs: fully-connected
CNNs: locally-connected, with shared weights
[Figure omitted. Source: http://cs231n.github.io/convolutional-networks/]
Convolution == filtering
Fully Connected Layer
Example: 200x200 image, 40K hidden units
~2B parameters (200 x 200 x 40,000 = 1.6 billion weights)
- Spatial correlation is local
- Waste of resources, and we do not have enough training samples anyway
(Source: Ranzato)
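The parameter count on this slide can be checked with a few lines of arithmetic. This is only a sketch: the function name `fully_connected_weights` is made up for illustration, and biases are ignored.

```python
def fully_connected_weights(height, width, hidden_units):
    """Weights in a layer where every hidden unit sees every pixel (biases ignored)."""
    return height * width * hidden_units


# Numbers from the slide: a 200x200 image and 40K hidden units.
n_weights = fully_connected_weights(200, 200, 40_000)
print(f"{n_weights:,}")
```

The result is 1.6 billion weights, which the slide rounds to ~2B.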
Convolution == filtering
Convolutional Layer
Share the same parameters across different locations (assuming the input is stationary):
convolutions with learned kernels
(Source: Ranzato)
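A minimal NumPy sketch of parameter sharing, under some assumptions: a single-channel input, one 10x10 kernel (the kernel size is hypothetical, and random values stand in for learned weights), stride 1 and no padding. It computes cross-correlation (no kernel flip), which is what CNN layers commonly compute under the name "convolution".

```python
import numpy as np


def conv2d_valid(image, kernel):
    """Slide one kernel over the image (stride 1, no padding, no kernel flip)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # The same kernel weights are reused at every location (i, j).
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


rng = np.random.default_rng(0)
image = rng.standard_normal((200, 200))   # same input size as the 200x200 example
kernel = rng.standard_normal((10, 10))    # stands in for a learned kernel

feature_map = conv2d_valid(image, kernel)
print(feature_map.shape)   # one response per kernel position
print(kernel.size)         # number of weights shared across all positions
```

However large the image is, this layer has only `kernel.size` weights per filter, because the weights do not depend on the location.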
Convolution == filtering
Convolutional Layer
(Source: Ranzato; Mathieu et al. “Fast training of CNNs through FFTs”, ICLR 2014)
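The FFT reference rests on the convolution theorem: convolution in the spatial domain is pointwise multiplication in the frequency domain. The sketch below only illustrates that identity with NumPy on a small random input. It is not the implementation from the cited paper, and the function names are made up for illustration.

```python
import numpy as np


def conv2d_full_direct(image, kernel):
    """Full 2-D linear convolution computed directly from the definition."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih + kh - 1, iw + kw - 1))
    for u in range(kh):
        for v in range(kw):
            # Each kernel tap adds a shifted, scaled copy of the image.
            out[u:u + ih, v:v + iw] += kernel[u, v] * image
    return out


def conv2d_full_fft(image, kernel):
    """Same result via the FFT: zero-pad, multiply the spectra, invert."""
    out_shape = (image.shape[0] + kernel.shape[0] - 1,
                 image.shape[1] + kernel.shape[1] - 1)
    spectrum = np.fft.rfft2(image, s=out_shape) * np.fft.rfft2(kernel, s=out_shape)
    return np.fft.irfft2(spectrum, s=out_shape)


rng = np.random.default_rng(0)
image = rng.standard_normal((8, 10))
kernel = rng.standard_normal((3, 3))

direct = conv2d_full_direct(image, kernel)
via_fft = conv2d_full_fft(image, kernel)
print(np.allclose(direct, via_fft))
```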
Convolution == filtering
Example: vertical edge detector
Convolutional Layer
[input image] * [kernel] = [edge map], with kernel
-1 0 1
-1 0 1
-1 0 1
(Source: Ranzato)
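The edge detector can be tried on a toy image. A minimal sketch, assuming a 5x6 image whose left half is dark (0) and right half is bright (1), and assuming the kernel is applied without flipping (cross-correlation); with a flipped kernel the sign of the response would change.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

# Kernel from the slide: responds to left-to-right intensity changes.
kernel = np.array([[-1, 0, 1],
                   [-1, 0, 1],
                   [-1, 0, 1]])

# Toy input (an assumption, not from the slide): dark left half, bright right half.
image = np.zeros((5, 6))
image[:, 3:] = 1.0

# All 3x3 patches of the image, then one weighted sum per patch.
windows = sliding_window_view(image, kernel.shape)   # shape (3, 4, 3, 3)
edges = np.einsum("ijkl,kl->ij", windows, kernel)
print(edges)
```

The output is non-zero only in the columns whose 3x3 patch straddles the dark/bright boundary, and zero in the flat regions.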
CNN structures
Convolutional layers + something else (1)
[6]
Many convolutional layers that learn filters,
and subsampling layers that reduce sizes and add invariances
CNN structures
Convolutional layers + something else (2)
[1]
Many convolutional layers that learn filters,
and subsampling layers that reduce sizes and add invariances
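One common subsampling layer is max pooling. The sketch below assumes non-overlapping 2x2 windows (the slides do not say which subsampling is used) and shows both effects named above: the map gets smaller, and a small shift of an activation inside a pooling window leaves the output unchanged.

```python
import numpy as np


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    h_out = feature_map.shape[0] // size
    w_out = feature_map.shape[1] // size
    trimmed = feature_map[:h_out * size, :w_out * size]
    blocks = trimmed.reshape(h_out, size, w_out, size)
    return blocks.max(axis=(1, 3))


# Two 4x4 maps with one activation each, one pixel apart diagonally.
a = np.zeros((4, 4))
a[0, 0] = 1.0
b = np.zeros((4, 4))
b[1, 1] = 1.0

print(max_pool2d(a))                                 # 2x2 output: size is halved
print(np.array_equal(max_pool2d(a), max_pool2d(b)))  # same output despite the shift
```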
Hierarchical features
Hierarchical feature learning
Each layer learns features at a different level of the hierarchy
High-level features are built on low-level features
E.g.
Layer 1: Edges (low-level, concrete)
Layer 2: Simple shapes
Layer 3: Complex shapes
Layer 4: More complex shapes
Layer 5: Shapes of target objects (high-level, abstract)
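One way to see why deeper layers can represent larger structures is to track how much of the input each unit can see (its receptive field). The sketch below uses the standard recurrence for stacked layers. The stack itself is hypothetical (five stages of 3x3 convolution followed by 2x2 pooling with stride 2) and is not taken from any network in these slides.

```python
def receptive_fields(layers):
    """Receptive field size (in input pixels) after each (kernel_size, stride) layer."""
    size, jump = 1, 1   # jump = distance in input pixels between neighbouring units
    sizes = []
    for kernel_size, stride in layers:
        size += (kernel_size - 1) * jump
        jump *= stride
        sizes.append(size)
    return sizes


# Hypothetical stack: five stages of (3x3 conv, stride 1) + (2x2 pooling, stride 2).
stack = [(3, 1), (2, 2)] * 5
print(receptive_fields(stack))
```

The sizes grow with depth, so units in later layers can combine the outputs of many earlier, more local features.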
What is learned in CNNs?
in an image recognition task
[Figures omitted; see [11]]
What is learned in CNNs? (1/2)
in a music genre classification task
Layer 1/5 (figure panels for four excerpts: Bach, Dream, Toy, Eminem)
Original
[Feature 1-9], Crude onset detector
[Feature 1-27], Onset detector
[2], blog demo
Layer 2/5 (figure panels: Bach, Dream, Toy, Eminem)
Original
[Feature 2-0], Good onset detector
[Feature 2-1], Bass note selector
[Feature 2-10], Harmonic selector
[Feature 2-48], Melody (large energy)
[2]
Layer 3/5 (figure panels: Bach, Dream, Toy, Eminem)
Original
[Feature 3-1], Better onset detector
[Feature 3-7], Melody (top note)
[Feature 3-38], Kick drum extractor
[Feature 3-40], Percussive eraser
[2]
Layer 4/5 (figure panels: Bach, Dream, Toy, Eminem)
Original
[Feature 4-5], Lowest notes selector
[Feature 4-11], Vertical line eraser
[Feature 4-30], Long horizontal line selector
[2]
Layer 5/5 (figure panels: Bach, Dream, Toy, Eminem)
Original
[Feature 5-11], texture 1
[Feature 5-15], texture 2
[Feature 5-56], Harmo-Rhythmic structure
[Feature 5-33], texture 3
[2]
What is learned in CNNs? (2/2)
in a music tagging task: learn the transform!
Audio → 2-D representation
[3]
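A sketch of what "learn the transform" can mean: a strided 1-D convolution over raw audio turns a waveform into a 2-D (filters x frames) map. With fixed Fourier-basis filters this reproduces a magnitude spectrogram, and an end-to-end model would learn the filters instead. All names, sizes and the 440 Hz test tone are assumptions for illustration, and random filters stand in for learned ones; this is not the architecture of [3].

```python
import numpy as np


def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames, shape (n_frames, frame_len)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]


def strided_filterbank(x, filters, hop):
    """Apply every filter at every hop (a strided 1-D conv layer without bias).

    filters has shape (n_filters, frame_len); the result is (n_filters, n_frames).
    """
    frames = frame_signal(x, filters.shape[1], hop)
    return filters @ frames.T


sr = 8000
t = np.arange(sr) / sr
audio = np.sin(2 * np.pi * 440.0 * t)          # 1 second of a 440 Hz tone

frame_len, hop = 256, 128

# Fixed transform: Fourier-basis filters give a (rectangular-window) spectrogram.
n = np.arange(frame_len)[None, :]
k = np.arange(frame_len // 2 + 1)[:, None]
dft_filters = np.exp(-2j * np.pi * k * n / frame_len)
spec = np.abs(strided_filterbank(audio, dft_filters, hop))

# "Learned" transform: same operation, but the filters would come from training.
rng = np.random.default_rng(0)
learned_like = strided_filterbank(audio, rng.standard_normal((32, frame_len)), hop)

print(spec.shape, learned_like.shape)
```

Both outputs are 2-D arrays that a 2-D CNN can consume; only the origin of the filters differs.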
CNN use-cases
Visual image recognition
Image segmentation [12]
Artistic style [4]
Music information retrieval
Anything people can do by looking at spectrograms
E.g. auto-tagging [1], chord recognition [5], instrument recognition [7], music-noise segmentation [8], onset detection [9], boundary detection [10]
+ style change? source separation? effects/de-effects?
References I
[1] Choi, K., Fazekas, G., Sandler, M.: Automatic tagging using deep convolutional neural networks. In: Proceedings of the 17th International Society for Music Information Retrieval Conference (ISMIR 2016), New York, USA (2016)
[2] Choi, K., Fazekas, G., Sandler, M.: Explaining convolutional neural networks on music classification (submitted). In: IEEE Conference on Machine Learning and Signal Processing (2016)
[3] Dieleman, S., Schrauwen, B.: End-to-end learning for music audio. In: Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. pp. 6964–6968. IEEE (2014)
References II
[4] Gatys, L.A., Ecker, A.S., Bethge, M.: A neural algorithm of artistic style. arXiv preprint arXiv:1508.06576 (2015)
[5] Humphrey, E.J., Bello, J.P.: From music audio to chord tablature: Teaching deep convolutional networks to play guitar. In: Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. pp. 6974–6978. IEEE (2014)
[6] LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), 2278–2324 (1998)
[7] Li, P., Qian, J., Wang, T.: Automatic instrument recognition in polyphonic music using convolutional neural networks. arXiv preprint arXiv:1511.05520 (2015)
References III
[8] Park, T., Lee, T.: Music-noise segmentation in spectrotemporal domain using convolutional neural networks. ISMIR late-breaking session (2015)
[9] Schluter, J., Bock, S.: Improved musical onset detection with convolutional neural networks. In: International Conference on Acoustics, Speech and Signal Processing. IEEE (2014)
[10] Ullrich, K., Schluter, J., Grill, T.: Boundary detection in music structure analysis using convolutional neural networks. In: Proceedings of the 15th International Society for Music Information Retrieval Conference (ISMIR 2014), Taipei, Taiwan (2014)
References IV
[11] Zeiler, M.D., Fergus, R.: Visualizing and understanding convolutional networks. In: Computer Vision – ECCV 2014, pp. 818–833. Springer (2014)
[12] Zheng, S., Jayasumana, S., Romera-Paredes, B., Vineet, V., Su, Z., Du, D., Huang, C., Torr, P.H.: Conditional random fields as recurrent neural networks. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 1529–1537 (2015)