
Page 1: Neural Networks - nmr.mgh.harvard.edu

Neural Networks

Course 22525

Koen Van Leemput
DTU HealthTech

Technical University of Denmark

Page 2: Neural Networks - nmr.mgh.harvard.edu

Course structure

Fitting functions

Registration
Segmentation

Page 3: Neural Networks - nmr.mgh.harvard.edu

Remember regression?

– Training set $\{(\mathbf{x}_n, t_n)\}_{n=1}^{N}$

Input vector: $\mathbf{x}_n = (x_{n,1}, \ldots, x_{n,p})^T$

Corresponding output: $t_n \in \mathbb{R}$

– Estimate the parameters $\mathbf{w}$ of the model $y(\mathbf{x}, \mathbf{w}) = \sum_{m=1}^{M} w_m \phi_m(\mathbf{x})$

by minimizing the cost $E(\mathbf{w}) = \sum_{n=1}^{N} \big( y(\mathbf{x}_n, \mathbf{w}) - t_n \big)^2$

Page 4: Neural Networks - nmr.mgh.harvard.edu

Remember regression?

Example: p=1 and M=5 cosines
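To make the example concrete, here is a minimal sketch of such a fit in Python, assuming basis functions $\phi_m(x) = \cos(m \pi x)$ and synthetic training data (the slide's actual frequencies and data are not shown):

```python
import numpy as np

# Minimal sketch: least-squares regression with M=5 cosine basis functions.
# The exact frequencies used on the slide are an assumption here.
def design_matrix(x, M=5):
    """Column m holds the basis function cos(m * pi * x) evaluated at x."""
    return np.stack([np.cos(m * np.pi * x) for m in range(M)], axis=1)

rng = np.random.default_rng(0)
x_train = rng.uniform(0, 1, size=50)                    # 1-D inputs (p=1)
t_train = np.sin(2 * np.pi * x_train) + 0.1 * rng.standard_normal(50)

Phi = design_matrix(x_train)                            # N x M design matrix
w, *_ = np.linalg.lstsq(Phi, t_train, rcond=None)       # minimizes sum_n (y_n - t_n)^2

x_test = np.linspace(0, 1, 100)
y_test = design_matrix(x_test) @ w                      # fitted function
```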

Page 7: Neural Networks - nmr.mgh.harvard.edu

Remember Gaussian mixture model?

Posterior using Bayes' rule:

$p(l=1 \mid x) = \frac{\pi_1 \, \mathcal{N}(x \mid \mu_1, \sigma_1^2)}{\pi_1 \, \mathcal{N}(x \mid \mu_1, \sigma_1^2) + \pi_2 \, \mathcal{N}(x \mid \mu_2, \sigma_2^2)}$

New notation/terminology:

- "Training samples" $\{(x_n, y_n)\}_{n=1}^{N}$

- "$y=1$" if $l=1$

- "$y=0$" if $l=2$

This lecture: can we get a "classifier" directly, without a model?
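For concreteness, a short sketch of the model-based route: computing the posterior $p(y=1 \mid x)$ from a two-component 1-D Gaussian mixture via Bayes' rule. All parameter values below are made up for illustration:

```python
import numpy as np
from scipy.stats import norm

# Sketch: posterior p(y=1 | x) from a two-component 1-D Gaussian mixture,
# computed with Bayes' rule. Parameter values are illustrative only.
pi = np.array([0.4, 0.6])        # mixture weights pi_1, pi_2
mu = np.array([2.0, 5.0])        # class means
sigma = np.array([1.0, 1.5])     # class standard deviations

def posterior_class1(x):
    """p(y=1 | x) = pi_1 N(x|mu_1,s_1^2) / sum_k pi_k N(x|mu_k,s_k^2)."""
    likelihoods = pi * norm.pdf(x, loc=mu, scale=sigma)
    return likelihoods[0] / likelihoods.sum()

print(posterior_class1(3.0))     # a probability between 0 and 1
```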

Page 8: Neural Networks - nmr.mgh.harvard.edu

Logistic regression

– Logistic function as a "squashing" function:

$\sigma(a) = \frac{1}{1 + \exp(-a)}$, which maps any $a \in (-\infty, \infty)$ to a value between 0 and 1

Page 9: Neural Networks - nmr.mgh.harvard.edu

Logistic regression

$p(y=1 \mid \mathbf{x}) = \sigma\!\left( \sum_{m=1}^{M} w_m \phi_m(\mathbf{x}) \right)$

– where $\sigma(a) = \frac{1}{1 + \exp(-a)}$

– Of course: $p(y=0 \mid \mathbf{x}) = 1 - p(y=1 \mid \mathbf{x})$
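A minimal sketch of this model in Python (the basis functions are left generic, since the slide does not fix them):

```python
import numpy as np

def sigmoid(a):
    """Logistic "squashing" function: maps any real a into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-a))

def predict(x, w, basis_functions):
    """p(y=1 | x) = sigma( sum_m w_m * phi_m(x) )."""
    phi = np.array([phi_m(x) for phi_m in basis_functions])
    return sigmoid(w @ phi)

# Example with illustrative basis functions {1, x, x^2}:
basis = [lambda x: 1.0, lambda x: x, lambda x: x**2]
w = np.array([-1.0, 0.5, 0.2])
print(predict(2.0, w, basis))    # a probability between 0 and 1
```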

Page 10: Neural Networks - nmr.mgh.harvard.edu

Voxel-based classifier

– Training data $\{(x_n, y_n)\}_{n=1}^{N}$ with $y_n \in \{0, 1\}$ (i.e., $y_n = 1$ if voxel $n$ belongs to class 1, $y_n = 0$ otherwise) and $x_n$ the voxel's intensity

– Estimate the parameters $\mathbf{w}$ by maximizing the likelihood function

$\prod_{n=1}^{N} p(y=1 \mid x_n)^{y_n} \, \big( 1 - p(y=1 \mid x_n) \big)^{1 - y_n}$


Page 14: Neural Networks - nmr.mgh.harvard.edu

Voxel-based classifier

– Once trained, keep the classifier

– Simply apply it to new data

Page 15: Neural Networks - nmr.mgh.harvard.edu

Optimization algorithm for training

– Maximizing the likelihood function is equivalent to minimizing the cross-entropy cost

$E(\mathbf{w}) = -\sum_{n=1}^{N} \Big( y_n \ln p(y=1 \mid x_n) + (1 - y_n) \ln \big( 1 - p(y=1 \mid x_n) \big) \Big)$

– Gradient descent: $\mathbf{w} \leftarrow \mathbf{w} - \eta \, \nabla E(\mathbf{w})$, where $\eta$ is the step size (user-specified),

with gradient $\nabla E(\mathbf{w}) = \sum_{n=1}^{N} \big( p(y=1 \mid x_n) - y_n \big) \, \boldsymbol{\phi}(x_n)$

– Stochastic gradient descent: use only a few randomly sampled training points, and approximate the gradient from those alone
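A sketch of this training procedure, assuming the design matrix Phi (basis-function values for all training points) has already been computed; the mini-batch size and step size below are illustrative:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def train_sgd(Phi, y, eta=0.1, n_steps=5000, batch_size=10, seed=0):
    """Stochastic gradient descent on the cross-entropy cost E(w).

    Phi: N x M matrix with entries phi_m(x_n); y: N binary labels;
    eta: step size (user-specified).
    """
    rng = np.random.default_rng(seed)
    N, M = Phi.shape
    w = np.zeros(M)
    for _ in range(n_steps):
        idx = rng.integers(0, N, size=batch_size)   # random training points
        residual = sigmoid(Phi[idx] @ w) - y[idx]   # prediction minus label
        grad = Phi[idx].T @ residual                # mini-batch gradient estimate
        w -= eta * grad                             # gradient descent step
    return w

# Once trained: p = sigmoid(Phi_new @ w) classifies new data.
```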

Page 16: Neural Networks - nmr.mgh.harvard.edu

More fun: patch-based classifier

– Classify 3x3 image “patches”:

intensity of the pixel to be classified + intensities of 8 neighboring pixels

– $\mathbf{x}$ is now a 9-dimensional vector ($p = 9$), but otherwise everything is the same: $p(y=1 \mid \mathbf{x}) = \sigma\!\left( \sum_{m=1}^{M} w_m \phi_m(\mathbf{x}) \right)$

– But how to choose basis functions in a 9-dimensional space?
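A sketch of how such 9-dimensional patch vectors could be extracted from an image (a plain double loop, for clarity):

```python
import numpy as np

# Sketch: turn every interior pixel of a 2-D image into a 9-dimensional
# feature vector holding its own intensity plus those of its 8 neighbors.
def extract_patches_3x3(image):
    """Return an array of shape (H-2, W-2, 9) of flattened 3x3 patches."""
    H, W = image.shape
    patches = np.empty((H - 2, W - 2, 9))
    for i in range(H - 2):
        for j in range(W - 2):
            patches[i, j] = image[i:i + 3, j:j + 3].ravel()
    return patches

# Each 9-vector is then fed to the same logistic-regression classifier as before.
```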

Page 18: Neural Networks - nmr.mgh.harvard.edu

Basis functions in high dimensions?

– Idea: remember the tensor B-spline trick?

Example: take outer products of four 1-D basis functions to "make" sixteen 2-D basis functions

[Figure: 2-D "tensor B-spline"]

– Does this work in 9-D?

No! 4^9 = 262144 basis functions!


Page 20: Neural Networks - nmr.mgh.harvard.edu

Adaptive basis functions

– Introduce extra parameters that alter the form of a limited set of basis functions

– Prototypical example: $\phi_m(\mathbf{x}) = \sigma\!\left( \sum_{i=1}^{p} v_{m,i} \, x_i + v_{m,0} \right)$, with the $v_{m,i}$ as the extra parameters

– All parameters ($\mathbf{w}$ and $\mathbf{v}$) optimized together during training (stochastic gradient descent)

Page 21: Neural Networks - nmr.mgh.harvard.edu

Adaptive basis functions (p=1)


Page 24: Neural Networks - nmr.mgh.harvard.edu

Adaptive basis functions (p=2)

Page 25: Neural Networks - nmr.mgh.harvard.edu

Feed-forward neural network

So the model is

$p(y=1 \mid \mathbf{x}) = \sigma\!\left( \sum_{m=1}^{M} w_m \phi_m(\mathbf{x}) \right)$

with basis functions $\phi_m(\mathbf{x}) = \sigma\!\left( \sum_{i=1}^{p} v_{m,i} \, x_i + v_{m,0} \right)$ and parameters $\mathbf{w}$ and $\mathbf{v}$
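A sketch of the complete model as a single forward pass, with the hidden-layer weights $v_{m,i}$ collected into a matrix V, the biases $v_{m,0}$ into a vector v0, and the output weights into w:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def forward(x, V, v0, w):
    """One-hidden-layer feed-forward network.

    x:  input vector, shape (p,)
    V:  hidden-layer weights v_{m,i}, shape (M, p)
    v0: hidden-layer biases v_{m,0}, shape (M,)
    w:  output weights, shape (M,)
    """
    phi = sigmoid(V @ x + v0)   # M adaptive basis functions ("hidden units")
    return sigmoid(w @ phi)     # p(y=1 | x)

# Shapes from the patch-based example: p=9 inputs, M=4 hidden units.
rng = np.random.default_rng(0)
p, M = 9, 4
x = rng.standard_normal(p)
print(forward(x, rng.standard_normal((M, p)), np.zeros(M), rng.standard_normal(M)))
```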

Page 26: Neural Networks - nmr.mgh.harvard.edu

Feed-forward neural network

Graphical representation of our 3x3 patch-based classifier (p=9 and M=4):

[Figure: network diagram; arrows show the flow of information from the 9 inputs through the "hidden units" to the output]

– Can insert more than one "hidden" layer ("deep learning")

Page 27: Neural Networks - nmr.mgh.harvard.edu

XXApplying

the trained classifier on new data:

Page 28: Neural Networks - nmr.mgh.harvard.edu

XX

filter

filter

filter

Applying the trained classifier on new data:

Page 29: Neural Networks - nmr.mgh.harvard.edu

Applying the trained classifier on new data:

[Figure: each hidden unit acts as a filter slid across the whole image]

Filtering operations can be implemented using convolutions

=> "convolutional neural network"
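To illustrate, a sketch of applying the trained patch classifier to an entire image with 2-D filtering, using SciPy's correlate2d (correlation is convolution with a flipped kernel; CNN libraries typically implement correlation). The weight shapes follow the p=9, M=4 example:

```python
import numpy as np
from scipy.signal import correlate2d

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def classify_image(image, V, v0, w):
    """Evaluate the 3x3 patch classifier at every pixel at once.

    V: (M, 9) hidden-unit weights; each row, reshaped to 3x3, is one filter.
    v0: (M,) hidden biases; w: (M,) output weights.
    """
    M = V.shape[0]
    # One filtered map per hidden unit: correlate2d computes, at each pixel,
    # the same dot product the classifier would compute on that 3x3 patch.
    hidden = [sigmoid(correlate2d(image, V[m].reshape(3, 3), mode='valid') + v0[m])
              for m in range(M)]
    return sigmoid(sum(w[m] * hidden[m] for m in range(M)))  # p(y=1) per pixel

rng = np.random.default_rng(0)
probs = classify_image(rng.standard_normal((64, 64)),
                       rng.standard_normal((4, 9)), np.zeros(4),
                       rng.standard_normal(4))
print(probs.shape)   # (62, 62): one probability per interior pixel
```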

Page 30: Neural Networks - nmr.mgh.harvard.edu

Neural networks = ultimate solution?

No model, only training data:

+ No domain expertise needed

+ Very easy to train and deploy

+ Super fast (GPUs)

- Training data often very hard to get in medical imaging!

- Scanning hardware/software/protocol changes routinely!