Intro to Deep Learning for NeuroImaging
Andrew Doyle
@crocodoyle
McGill Centre for Integrative Neuroscience
Outline
1. GET EXCITED
2. Artificial Neural Networks
3. Backpropagation
4. Convolutional Neural Networks
5. Neuroimaging Applications
ImageNet-1000 Results
Image courtesy Aaron Courville, 2016
Generative Models
Deep Blood by Team BloodArtBrainBrush
Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge. "Image style transfer using convolutional neural networks." Computer Vision and Pattern Recognition (CVPR), 2016 IEEE Conference on. IEEE, 2016.
Generative Models
Zhang, Han, et al. "StackGAN: Text to photo-realistic image synthesis
with stacked generative adversarial networks." arXiv preprint
arXiv:1612.03242 (2016).
StackGAN
Generative Models
Zhu, Jun-Yan, et al. "Unpaired image-to-image translation using cycle-consistent adversarial networks." arXiv preprint arXiv:1703.10593 (2017).
CycleGAN
Generative Models
Paired Data Unpaired Data
Wolterink, Jelmer M., et al. "Deep MR to CT synthesis using unpaired
data." International Workshop on Simulation and Synthesis in Medical
Imaging. Springer, Cham, 2017.
Generative Models
Vue.ai
Deep Reinforcement Learning
Mnih, Volodymyr, et al. "Playing atari with deep reinforcement
learning." arXiv preprint arXiv:1312.5602 (2013).
DQN - 600 epochs
Silver, David, et al. "Mastering the game of go without human
knowledge." Nature 550.7676 (2017): 354-359.
AlphaGo defeats Lee Sedol
Deep Reinforcement Learning
Moravčík, Matej, et al. "DeepStack: Expert-level artificial intelligence in heads-up no-limit poker." arXiv preprint arXiv:1701.01724 (2017).
DeepStack
Introduction
For Deep Learning, you need:
1. Artificial Neural Network
2. Loss
3. Optimizer
4. Data
Artificial Neurons
[Diagrams: feedforward vs. recurrent connectivity]
Artificial Neurons
$o = f(x) = \sigma(\mathbf{w}^{T}\mathbf{x} + b)$
[Diagram: inputs $i_1, i_2, i_3$ weighted by $w_1, w_2, w_3$, summed with bias $b$, and passed through the nonlinearity to produce output $o$]
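The single-neuron equation above can be sketched directly in pure Python. This is an illustrative toy (the weights and bias values are arbitrary), not code from the talk:

```python
import math

def sigmoid(z):
    # logistic activation: squashes any real number into (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, w, b):
    # o = f(x) = sigma(w . x + b): weighted sum of inputs plus bias,
    # passed through the nonlinearity
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# three inputs i1, i2, i3 with illustrative weights and bias
o = neuron([1.0, 0.5, -1.0], [0.4, -0.2, 0.1], b=0.05)
print(round(o, 3))  # sigmoid(0.25), about 0.562
```

With a sigmoid activation, a single neuron computes exactly a logistic regression, which is why the deck labels it that way.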
Logistic Regression
Neural Networks
[Diagram: inputs $x_1, x_2$, hidden units $h_1, h_2$, output $y$, labelled Input, Hidden, Output; shown alongside a Support Vector Machine for comparison]
Neural Networks
[Figure: a decision tree with nodes $h_1 \ldots h_7$ mapped onto a two-hidden-layer network on $x_1, x_2$ with output $y$; a deeper tree with nodes $h_1 \ldots h_{15}$ maps onto a correspondingly larger network]
Sethi, Ishwar Krishnan. "Entropy nets: From decision trees to neural networks." Proceedings of the IEEE 78.10 (1990): 1605-1613.
Neural Networks
[Diagram: inputs $i_1, i_2$ feed units $x_1, x_2$, then hidden units $h_1, h_2$, then output $y$]

$f(x_2) = \sigma(i_2 w_{x_2,i_2} + b_{x_2})$

$f(h_2) = \sigma(w_{h_2,x_1} f(x_1) + w_{h_2,x_2} f(x_2) + b_{h_2})$
$= \sigma(w_{h_2,x_1}\,\sigma(i_1 w_{x_1,i_1} + b_{x_1}) + w_{h_2,x_2}\,\sigma(i_2 w_{x_2,i_2} + b_{x_2}) + b_{h_2})$

$f(y) = \sigma(w_{y,h_1} f(h_1) + w_{y,h_2} f(h_2) + b_y)$
$= \sigma(w_{y,h_1}\,\sigma(w_{h_1,x_1}\,\sigma(i_1 w_{x_1,i_1} + b_{x_1}) + w_{h_1,x_2}\,\sigma(i_2 w_{x_2,i_2} + b_{x_2}) + b_{h_1}) + w_{y,h_2}\,\sigma(w_{h_2,x_1}\,\sigma(i_1 w_{x_1,i_1} + b_{x_1}) + w_{h_2,x_2}\,\sigma(i_2 w_{x_2,i_2} + b_{x_2}) + b_{h_2}) + b_y)$

17 parameters: $\theta = \{w, b\}$
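The nested composition above is just a few lines of code. A sketch of the forward pass with arbitrary illustrative parameter values (the dictionaries below are not from the talk):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# one illustrative parameter setting theta = {w, b}; values are arbitrary
w_x = {"x1": 0.5, "x2": -0.3}            # input-layer weights w_{x_j,i_j}
b_x = {"x1": 0.1, "x2": 0.1}
w_h = {("h1", "x1"): 0.8, ("h1", "x2"): -0.4,
       ("h2", "x1"): 0.2, ("h2", "x2"): 0.9}
b_h = {"h1": 0.0, "h2": -0.1}
w_y = {"h1": 1.2, "h2": -0.7}
b_y = 0.05

def forward(i1, i2):
    # f(x_j) = sigma(i_j * w_{x_j,i_j} + b_{x_j})
    fx1 = sigmoid(i1 * w_x["x1"] + b_x["x1"])
    fx2 = sigmoid(i2 * w_x["x2"] + b_x["x2"])
    # f(h_k) = sigma(sum_j w_{h_k,x_j} f(x_j) + b_{h_k})
    fh1 = sigmoid(w_h[("h1", "x1")] * fx1 + w_h[("h1", "x2")] * fx2 + b_h["h1"])
    fh2 = sigmoid(w_h[("h2", "x1")] * fx1 + w_h[("h2", "x2")] * fx2 + b_h["h2"])
    # f(y) = sigma(w_{y,h1} f(h1) + w_{y,h2} f(h2) + b_y)
    return sigmoid(w_y["h1"] * fh1 + w_y["h2"] * fh2 + b_y)

print(forward(1.0, 0.0))
```

Every value stays in (0, 1) because each layer ends in a sigmoid; training is then a matter of adjusting the entries of the dictionaries.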
Backpropagation
1. Random θ initialization
Iterate:
1. Forward pass - compute the loss
2. Backward pass - update the parameters
Backpropagation
[Diagram: the network from before ($i_1, i_2 \to x_1, x_2 \to h_1, h_2 \to y$); its output $\hat{y} = f(\theta)$ should approach the target $o$]

XOR truth table:
i1 i2 | o
 0  0 | 0
 0  1 | 1
 1  0 | 1
 1  1 | 0

$J(o, \hat{y}) = \frac{1}{2}(o - \hat{y})^2$

$\nabla_\theta J(o, \hat{y}) = \left( \frac{\partial J}{\partial w_{x_1,i_1}}, \frac{\partial J}{\partial b_{x_1}}, \frac{\partial J}{\partial w_{x_2,i_2}}, \frac{\partial J}{\partial b_{x_2}}, \ldots, \frac{\partial J}{\partial w_{y,h_2}} \right)^{T}$
Backpropagation
[Plot: loss $J$ versus a weight $w$, with the slope $\frac{\partial J}{\partial w}$ at the current point; arrows mark the forward and backward passes]

Gradient descent update, where $\alpha$ is the learning rate:
$w' = w - \alpha \frac{\partial J}{\partial w}$
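The update rule is easiest to see on a one-dimensional toy problem. A minimal sketch (the quadratic loss below is illustrative, not from the talk):

```python
# Minimize J(w) = (w - 3)^2 with the update w' = w - alpha * dJ/dw.
def dJ_dw(w):
    return 2.0 * (w - 3.0)   # derivative of (w - 3)^2

w, alpha = 0.0, 0.1          # start far from the optimum; alpha = learning rate
for _ in range(100):
    w = w - alpha * dJ_dw(w)
print(round(w, 4))           # converges to the minimum at w = 3
```

Too large a learning rate makes the iterates overshoot and diverge; too small a rate makes progress painfully slow, which is why the optimizers later in the deck adapt it.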
Backpropagation
Target: $\hat{y} \approx o$. Output-layer weight, by the chain rule:
$\frac{\partial J}{\partial w_{y,h_1}} = \frac{\partial J}{\partial \hat{y}} \cdot \frac{\partial \hat{y}}{\partial w_{y,h_1}} = -(o - \hat{y})\,\hat{y}(1 - \hat{y})\,f(h_1)$
…
Backpropagation
Hidden-layer weights:
$\frac{\partial J}{\partial w_{h_1,x_1}} = \frac{\partial J}{\partial y} \cdot \frac{\partial y}{\partial h_1} \cdot \frac{\partial h_1}{\partial w_{h_1,x_1}}$
$\frac{\partial J}{\partial w_{h_2,x_2}} = \frac{\partial J}{\partial y} \cdot \frac{\partial y}{\partial h_2} \cdot \frac{\partial h_2}{\partial w_{h_2,x_2}}$
Backpropagation
Input-layer weights sum the contributions from every path to the output:
$\frac{\partial J}{\partial w_{x_1,i_1}} = \frac{\partial J}{\partial y} \cdot \frac{\partial y}{\partial h_1} \cdot \frac{\partial h_1}{\partial x_1} \cdot \frac{\partial x_1}{\partial w_{x_1,i_1}} + \frac{\partial J}{\partial y} \cdot \frac{\partial y}{\partial h_2} \cdot \frac{\partial h_2}{\partial x_1} \cdot \frac{\partial x_1}{\partial w_{x_1,i_1}}$
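The chain-rule gradients above combine into a complete trainer for the XOR task. A pure-Python sketch of the whole recipe (network, loss, optimizer, data); for brevity it uses a 2-2-1 network without the deck's extra input-side $x$ units, and a 2-2-1 sigmoid net can occasionally settle in a local minimum, so no convergence is claimed:

```python
import math, random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# XOR truth table as training data
data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
        ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]

random.seed(0)
# random theta initialization for a 2-2-1 network
W1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
b1 = [0.0, 0.0]
W2 = [random.uniform(-1, 1) for _ in range(2)]
b2 = 0.0
alpha = 0.5  # learning rate

def forward(x):
    h = [sigmoid(W1[k][0] * x[0] + W1[k][1] * x[1] + b1[k]) for k in range(2)]
    y = sigmoid(W2[0] * h[0] + W2[1] * h[1] + b2)
    return h, y

def total_loss():
    # J = 1/2 (o - y)^2 summed over the four XOR cases
    return sum(0.5 * (o - forward(x)[1]) ** 2 for x, o in data)

loss_before = total_loss()
for epoch in range(5000):
    for x, o in data:
        h, y = forward(x)
        # output delta: dJ/dy * sigma'(y), with sigma'(y) = y(1 - y)
        delta_y = (y - o) * y * (1 - y)
        # hidden deltas, backpropagated through W2
        delta_h = [delta_y * W2[k] * h[k] * (1 - h[k]) for k in range(2)]
        # gradient step w' = w - alpha * dJ/dw for every parameter
        for k in range(2):
            W2[k] -= alpha * delta_y * h[k]
            b1[k] -= alpha * delta_h[k]
            for j in range(2):
                W1[k][j] -= alpha * delta_h[k] * x[j]
        b2 -= alpha * delta_y
loss_after = total_loss()
print(round(loss_before, 4), round(loss_after, 4))
```

The analytic deltas here are exactly the chain-rule products from the slides; a finite-difference check against the loss confirms they match the numerical gradient.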
Optimizers
1. Gradient Descent: $w' = w - \alpha \frac{\partial J}{\partial w}$
2. Stochastic Gradient Descent: approximates $\frac{\partial J}{\partial w}$ on mini-batches
3. Momentum: $v' = \gamma v + \alpha \frac{\partial J}{\partial w}$, then $w' = w - v'$
4. AdaGrad/AdaDelta: per-parameter decaying learning rate
5. RMSprop: scales steps by a moving average of squared gradients
6. Adam: RMSprop + momentum
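Momentum accumulates a velocity so that consistent gradients accelerate the step. A sketch on the same illustrative 1-D quadratic used for plain gradient descent (values are arbitrary, not from the talk):

```python
# Momentum on J(w) = (w - 3)^2:  v' = gamma*v + alpha*dJ/dw ;  w' = w - v'
def dJ_dw(w):
    return 2.0 * (w - 3.0)

w, v = 0.0, 0.0
gamma, alpha = 0.9, 0.05   # momentum decay and learning rate
for _ in range(300):
    v = gamma * v + alpha * dJ_dw(w)
    w = w - v
print(round(w, 3))         # approaches the minimum at w = 3
```

With $\gamma = 0.9$ the iterate oscillates around the minimum before settling, which is the classic heavy-ball behavior the optimizer animations illustrate.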
Image courtesy Chris Olah, 2014
Convolutional Neural Networks
CNN/convnet neurons:
1. Have a receptive field
2. Share weights
3. Max pooling
[Animations: a kernel sliding over an input to produce an output feature map]
Images courtesy Vincent Dumoulin, 2016
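The three properties above can be sketched in one dimension: the same small kernel (the receptive field) slides across the input with shared weights, and max pooling keeps only the strongest activation per window. An illustrative toy, not code from the talk:

```python
def conv1d(signal, kernel):
    # weight sharing: the same kernel is applied at every valid position
    # (a sliding dot product, i.e. cross-correlation as in most DL libraries)
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def max_pool(feature_map, size=2):
    # keep only the strongest activation in each non-overlapping window
    return [max(feature_map[i:i + size])
            for i in range(0, len(feature_map) - size + 1, size)]

signal = [0, 1, 3, 1, 0, -1, -3, -1, 0]
edges = conv1d(signal, [-1, 0, 1])   # simple edge-detecting kernel
print(max_pool(edges))               # [3, -2, 0]
```

A 2-D image convolution is the same idea with a small 2-D kernel sliding over rows and columns.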
Convolutional Neural Networks
AlexNet (~90% of its parameters sit in the fully-connected layers) trained using:
1. Dropout
2. Batch Normalization
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012.
Convolutional Neural Networks
Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional
networks for large-scale image recognition." arXiv preprint
arXiv:1409.1556 (2014).
VGG16
Convolutional Neural Networks
ResNet
He, Kaiming, et al. "Deep residual learning for image
recognition." Proceedings of the IEEE conference on computer vision
and pattern recognition. 2016.
152 convolutional layers
Skip (residual) connections
GoogLeNet
Szegedy, Christian, et al. "Going deeper with
convolutions." Proceedings of the IEEE conference on computer vision
and pattern recognition. 2015.
1. Deep Supervision helps training
2. 1x1 convolutions can replace fully-connected layers
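The second point is easy to verify: a 1x1 convolution mixes channels independently at each spatial position, which is exactly a fully-connected layer applied per position. An illustrative sketch (shapes and values are made up for the example):

```python
def conv1x1(feature_maps, weights):
    # feature_maps: C_in channels, each a list of spatial values
    # weights: C_out x C_in matrix, applied identically at every position,
    # i.e. a per-position fully-connected layer over channels
    positions = len(feature_maps[0])
    return [[sum(w_in * feature_maps[c][p] for c, w_in in enumerate(w_out))
             for p in range(positions)]
            for w_out in weights]

fmaps = [[1.0, 2.0], [3.0, 4.0]]   # 2 input channels, 2 spatial positions
w = [[0.5, -0.5]]                  # 1 output channel
print(conv1x1(fmaps, w))           # [[-1.0, -1.0]]
```

Because the weights never depend on position, replacing a fully-connected head with 1x1 convolutions lets the same network run on inputs of any spatial size.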
NeuroImaging Applications
1. Alzheimerβs Prediction
2. T1w MRI Quality Control
3. MRI Tissue Segmentation
4. PET Brain Extraction
id  t   x1     x2    x3     x4     x5    x6    DX
1   0   0.10   0.25  -0.20  0.01   --    --    Healthy
1   1   --     --    -0.20  0.01   --    --    Healthy
1   2   0.21   0.14  -0.31  0.01   --    --    MCI
1   3   0.12   0.32  -0.28  0.11   --    --    MCI
2   0   -0.01  0.35  --     -0.42  0.29  0.20  MCI
2   1   0.03   0.40  --     -0.82  --    --    MCI
2   2   0.10   0.89  --     -0.21  --    --    Alzheimer's
…
(Patient 1 = id 1, Patient 2 = id 2; -- marks missing measurements)
After imputing the missing values:
id  t   x1     x2    x3     x4     x5    x6    DX
1   0   0.10   0.25  -0.20  0.01   -0.20 0.01  Healthy
1   1   0.10   0.25  -0.20  0.01   -0.20 0.01  Healthy
1   2   0.21   0.14  -0.31  0.01   -0.20 0.01  MCI
1   3   0.12   0.32  -0.28  0.11   -0.20 0.01  MCI
2   0   -0.01  0.35  -0.20  -0.42  0.29  0.20  MCI
2   1   0.03   0.40  -0.20  -0.82  0.29  0.20  MCI
2   2   0.10   0.89  -0.20  -0.21  0.29  0.20  Alzheimer's
…
Predict the next diagnosis: $P(DX_{t+\Delta t} \mid DX_t, X_t)$
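The slides do not state which imputation method produced the filled-in table; a common baseline is within-patient forward fill followed by backward fill, sketched below. Features missing at every visit would need a different strategy (e.g. a population mean). Values match patient 1 from the tables above:

```python
def impute(rows):
    # rows: one patient's visits in time order; None marks a missing value.
    # Forward-fill each feature from earlier visits, then backward-fill
    # anything still missing at the start of the series.
    n = len(rows[0])
    filled = [list(r) for r in rows]
    for j in range(n):
        last = None
        for r in filled:                 # forward fill
            if r[j] is None:
                r[j] = last
            else:
                last = r[j]
        nxt = None
        for r in reversed(filled):       # backward fill
            if r[j] is None:
                r[j] = nxt
            else:
                nxt = r[j]
    return filled

patient1 = [[0.10, 0.25, -0.20, 0.01, None, None],
            [None, None, -0.20, 0.01, None, None],
            [0.21, 0.14, -0.31, 0.01, None, None],
            [0.12, 0.32, -0.28, 0.11, None, None]]
print(impute(patient1)[1])   # t=1 visit with x1, x2 carried forward
```

Note that x5 and x6, missing at every visit here, stay None under this scheme, so a second pass with another strategy is required for them.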
[Architecture: 374-dimensional input X with Δt injected at the input; fully-connected layers of 512, 1024, 1024, 512, and 512 units; 3-class softmax output]
$P(DX_{t+\Delta t} \mid DX_t, X_t)$
93% Accuracy
Automatic QC of T1w Brain MRI
P(QC|MRI)
[Architecture: 3x3 convolutional layers with 16, 32, 32, 64, 64, 256, 256, and 256 feature maps, interleaved with 2x2 max pooling; a 128-unit fully-connected layer; 2-class softmax output]
Automatic QC of T1w Brain MRI
+/- 10 voxels
Dataset | Sensitivity | Specificity
IBIS    | 97%         | 96%
Segmentation
Kamnitsas, Konstantinos, et al. "Efficient multi-scale 3D CNN with fully
connected CRF for accurate brain lesion segmentation." Medical image
analysis 36 (2017): 61-78.
DeepMedic
Segmentation
ΓiΓ§ek, ΓzgΓΌn, et al. "3D U-Net: learning dense volumetric
segmentation from sparse annotation." International Conference on
Medical Image Computing and Computer-Assisted Intervention.
Springer International Publishing, 2016.
Dilated Convolutions
Yu F, Koltun V. Multi-scale context aggregation by dilated
convolutions. arXiv preprint arXiv:1511.07122. 2015 Nov 23.
Efficient Multi-scale
PET Brain Extraction
Funck, T. et al. "Brain tissue segmentation from multiple PET radiotracers." Poster at Montreal Artificial Intelligence in Neuroscience conference, 2017.
[Figure: predicted vs. ground-truth brain masks on images from FMZ, RCL, FDOPA, and FDG radiotracers]
Motion Estimation
Iglesias, Juan Eugenio, et al. "Retrospective head motion estimation in
structural brain MRI with 3D CNNs." International Conference on
Medical Image Computing and Computer-Assisted Intervention.
Springer, Cham, 2017.
Motion Estimation
[Grad-CAM visualizations for PASS and FAIL predictions]
Selvaraju, Ramprasaath R., et al. "Grad-CAM: Why did you say that? Visual explanations from deep networks via gradient-based localization." arXiv preprint arXiv:1610.02391 (2016).
Challenges
1. Data quantity
2. Data size
3. Data quality
4. Data expense
5. Data variability
6. Unexpected pathology
Start here
http://keras.io
http://www.deeplearningbook.org/
Autism Prediction
Heinsfeld, Anibal SΓ³lon, et al. "Identification of autism spectrum
disorder using deep learning and the ABIDE dataset." NeuroImage:
Clinical 17 (2018): 16.
Denoising autoencoders