PowerPoint Presentation - Minnesota Radiological … - Bradley J Erickson, MD PhD, FSIIM, CIIP
TRANSCRIPT
20/3/2017
Bradley J Erickson, MD PhD, FSIIM, CIIP
Professor and Associate Chair, Radiology
Mayo Clinic, USA
Will a Computer Replace Me, and What Should I Do About It?
Disclosures
• Relationships with commercial interests:
– Board / Stock Owner:
• FlowSIGMA
• OneMedNet
• VoiceIT
– Research support from NVidia
– Chair, American Board of Imaging Informatics (CIIP Certification)
– Member, RSNA Radiology Informatics Committee
Learning Objectives
• Understand how new machine learning methods work and how they can be
applied to radiologic interpretation
• Understand impediments and appeals to applying machine learning to
Radiology and Medicine
1. Middle Management
2. Commodity Salespeople
3. Report Writers, Journalists, Authors & Announcers
4. Accountants & Bookkeepers
5. Medical Doctors
There is an 83% chance that workers earning <$20/hour will be replaced within 5 years.
-- Obama White House Report
“Radiologists are Easy to Replace!”
“a highly trained and specialized radiologist may now be in
greater danger of being replaced by a machine than his own
executive assistant: She does so many different things that I
don’t see a machine being able to automate everything she
does any time soon.” -- Andrew Ng, The Economist
Radiologists Replaced in Four Years??!
• “Predicting the Future – Big Data, Machine Learning, and Clinical Medicine”
New England Journal of Medicine
• “The End of Radiology? Three Threats to the Future Practice of Radiology”
Journal of the American College of Radiology
– Ezekiel Emanuel, MD PhD: “…radiologists may be replaced by computers in
the next four to five years”
What is “Machine Learning”?
• It is a part of Artificial Intelligence
• Finds patterns in data
– Patterns that reflect properties of examples (supervised)
– Patterns that group examples (unsupervised)
• (Other types of artificial intelligence include rule-based systems)
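A toy NumPy sketch of the two pattern-finding modes above (all data values are made up): a nearest-centroid rule learned from given labels (supervised), and a simple 2-means grouping that never sees the labels (unsupervised).

```python
import numpy as np

# Toy 1-D data: two clusters of "exam measurements" (hypothetical values)
data = np.array([1.0, 1.2, 0.8, 5.0, 5.3, 4.9])
labels = np.array([0, 0, 0, 1, 1, 1])          # supervised: labels are given

# Supervised: learn a pattern that reflects the labels (nearest-centroid rule)
centroids = np.array([data[labels == k].mean() for k in (0, 1)])
predict = lambda x: int(abs(x - centroids[1]) < abs(x - centroids[0]))

# Unsupervised: group examples with no labels (2-means, a few iterations)
c = np.array([data.min(), data.max()])         # initial cluster guesses
for _ in range(10):
    assign = np.abs(data[:, None] - c).argmin(axis=1)   # nearest cluster
    c = np.array([data[assign == k].mean() for k in (0, 1)])
```

Both find the same two groups here; the difference is only whether the labels were supplied.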
Machine Learning History
• Artificial Neural Networks (ANN)
– Starting point of machine learning
– Early versions didn’t work well
• Other Machine Learning Methods
– Naïve Bayes
– Support Vector Machine (SVM)
– Random Forest Classifier (RFC)
Artificial Neural Network/Perceptron
[Diagram: an input layer (X, Y, Z) feeds a hidden layer of f(Σ) nodes, which feeds an output layer labeled Tumor / Normal]
Artificial Neural Network/Perceptron
[Diagram: the same network with imaging inputs: pixel values 45, 322, 128 taken from T1 Pre, T1 Post, and FLAIR images feed the input layer; the output layer is labeled Tumor / Normal]
Artificial Neural Network/Perceptron
[Diagram: the same network showing example hidden-node values (34, 57, 418, -68, 312) computed from the pixel inputs 45, 322, 128]
Artificial Neural Network/Perceptron
[Diagram: the same network showing the final outputs: 1 for Tumor, 0 for Normal]
How ANNs Learn
• Output Node Value Computation
– Multiply each input node value by a weight
– Apply an activation function: e.g. threshold the sum of the weighted inputs
• Weight Update
– Error = expected output - computed output
– New weight = old weight + (learning rate * error * input)
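The computation and update rules above can be sketched as a minimal perceptron in NumPy; the toy data (the AND function) stands in for a real classification task.

```python
import numpy as np

# Minimal perceptron: threshold activation f(Σ) plus the update rule
#   new_weight = old_weight + learning_rate * error * input
def train_perceptron(X, y, lr=0.1, epochs=20):
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for x, target in zip(X, y):
            out = 1 if w @ x + b > 0 else 0   # threshold the weighted sum
            err = target - out                # expected - computed
            w += lr * err * x                 # weight update
            b += lr * err                     # bias updated the same way
    return w, b

# Toy linearly separable data: learn the AND function
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([0, 0, 0, 1])
w, b = train_perceptron(X, y)
```

After a few epochs the learned weights reproduce all four training labels.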
Deep Learning: Why the Hype?
[Chart: performance in the ImageNet Challenge, 2010-2016 (y-axis 70-100%): hand-coded computer vision at 74%, with deep learning entries rising past the human benchmark]
What is “Deep Learning”?
• “Deep” because it uses many layers
– ANNs typically had 3 or fewer layers
An Example CNN
Convolution → Pooling → Convolution → Pooling → Convolution → Pooling → Convolution → Pooling → Fully Connected
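A minimal NumPy sketch of that pipeline's forward pass: one convolution, one pooling step, and a fully connected layer. All sizes and weights are made-up stand-ins for learned values.

```python
import numpy as np

# Tiny CNN forward pass: Convolution -> ReLU -> MaxPool -> Fully Connected
rng = np.random.default_rng(0)
image = rng.random((8, 8))          # toy single-channel "image"
kernel = rng.random((3, 3)) - 0.5   # one 3x3 filter (stand-in for learned)

# Convolution (valid padding): slide the kernel over the image
conv = np.array([[(image[i:i+3, j:j+3] * kernel).sum()
                  for j in range(6)] for i in range(6)])

# ReLU, then 2x2 max pooling: keep the strongest response per patch
act = np.maximum(conv, 0)
pooled = act.reshape(3, 2, 3, 2).max(axis=(1, 3))   # 6x6 -> 3x3

# Fully connected layer: flatten and score two classes (Tumor / Normal)
W = rng.random((2, 9)) - 0.5
scores = W @ pooled.ravel()
label = ("Tumor", "Normal")[int(scores.argmax())]
```

Real CNNs stack many such convolution/pooling stages before the fully connected layers, but each stage is just this pattern repeated.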
Typical CNNs
Andrej Karpathy: http://karpathy.github.io/2015/10/25/selfie/
[Layer pattern: C P C P P C P C P P C P C P P, then Fully Connected (C = convolution, P = pooling)]
Deep Learning: Why Now? 3 Reasons
• Yes, lots more computation power
• Many critical theoretical advances
– without these, the power wouldn’t matter
• Substantial Investment
Theoretical Advances for Deep Learning
1. Dropout
• Many layers & connections -> Overfitting
– Randomly ‘eliminate’ weights: Dropout
[Diagram: Convolution / Regularization / Pooling layers followed by Fully Connected layers, with randomly dropped connections]
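A minimal sketch of dropout in NumPy, using the common "inverted dropout" formulation; the drop probability and layer size are illustrative.

```python
import numpy as np

# Dropout: randomly zero a fraction of activations during training so the
# network cannot come to rely on (overfit to) any single connection.
def dropout(activations, p_drop=0.5, seed=0):
    rng = np.random.default_rng(seed)
    mask = rng.random(activations.shape) >= p_drop   # keep with prob 1 - p
    # Rescale survivors so the expected activation value is unchanged
    return activations * mask / (1.0 - p_drop)

a = np.ones(1000)                 # a layer of unit activations
out = dropout(a, p_drop=0.5)      # ~half zeroed, survivors scaled to 2.0
```

At test time dropout is disabled; the rescaling during training makes that switch seamless.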
Theoretical Advances for Deep Learning
2. Freezing Layers (esp. pre-trained weights)
• Many layers -> Vanishing Gradients
– Dropout partially addresses this
– May use ‘pre-trained’ weights for the early layers and freeze them, leaving the later layers’ weights free to learn the higher-level features
Theoretical Advances for Deep Learning
3. Batch Normalization
• What should be the initial set of weights connecting nodes?
– All the same = no gradients
– Should be random values. But what range?
• BatchNorm:
– For each layer, subtract the mini-batch mean and divide by the mini-batch standard deviation
– Simple, but surprisingly effective
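The BatchNorm step above can be sketched in NumPy on toy activations (the trainable scale and shift parameters of full batch normalization are omitted for brevity).

```python
import numpy as np

# Batch normalization core step: for a mini-batch of layer activations,
# subtract the batch mean and divide by the batch standard deviation,
# computed per feature.
def batch_norm(x, eps=1e-5):
    mean = x.mean(axis=0)              # per-feature mean over the batch
    var = x.var(axis=0)                # per-feature variance over the batch
    return (x - mean) / np.sqrt(var + eps)

batch = np.array([[1.0, 100.0],
                  [2.0, 200.0],
                  [3.0, 300.0]])       # two features on very different scales
normed = batch_norm(batch)
# Each column now has mean ~0 and unit variance, whatever its raw scale
```

Normalizing each layer's inputs this way keeps gradients in a usable range regardless of how the weights were initialized.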
Novel Architectures
• Basic design is convolutions -> neural nets
• Other designs include whole-image convolutions (large-scale features) and local convolutions (local features) that are joined at the ANN layer
Types of Layers
• Convolutional layers (mostly the input layers for images)
• Pooling (often after a convolutional layer; example: MaxPool)
• Activation / regularization (e.g. Rectified Linear Unit, or ReLU)
• Residual connections (force each layer to learn; really a type of layer, not a network, just as ‘CNN’ means convolution layers are included)
Why the Excitement Now?
1. Deep Neural Network Theory
2. Exponential Compute Power Growth
Moore’s Law
Computing performance doubles approximately every 18 months
General Purpose Computing on GPUs
• Use Graphics Cards for ‘Regular’ (Parallel) Computing
• Lower precision (16-bit floating point) can actually be an advantage for training
• Cards with no video output, optimized for Deep Learning
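A small NumPy illustration of that precision trade-off: half precision stores a value in 2 bytes instead of 4, at the cost of rounding error (the value used is arbitrary).

```python
import numpy as np

# FP16 carries roughly 3 decimal digits of precision vs ~7 for FP32, but
# uses half the memory and bandwidth, so GPUs can process far more values
# per cycle; the small rounding error can act like regularizing noise.
x32 = np.float32(3.14159265)
x16 = np.float16(x32)                   # value is rounded to fit 16 bits

error = abs(float(x16) - float(x32))    # rounding introduced by FP16 storage
bytes_saved = x32.nbytes - x16.nbytes   # 4 bytes -> 2 bytes per value
```

The error is tiny relative to the value, which is why training often tolerates (or even benefits from) the reduced precision.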
Now Beating Moore’s Law By A Lot!
[Chart: log-scale performance (10 to 1,000,000) versus time (2000-2020): successive curves for CPU, GPU, FPGA, and qubit? hardware, each climbing faster than the last]
Deep Learning Myths
• “You Need Millions of Exams to Train and Use Deep Learning Methods”
– With so much model capacity, the network can simply memorize each example (overfitting)
20/3/2017
6
Deep Learning Myths
• “You Need Millions of Exams to Train and Use Deep Learning Methods”
Ways To Avoid Need For Large Data Sets
• Data Augmentation
– Create variants of the data that are different enough to help learning
– But similar enough that the teaching point is kept
– Examples: Mirror/Flip/Rotate/Contrast/Crop/Noise
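The augmentation examples listed above can be sketched with NumPy on a toy "image" (all values made up); each variant keeps the original label.

```python
import numpy as np

# Data augmentation: generate label-preserving variants of one image
# (mirror, rotate, contrast, noise) so a small data set trains like a
# larger one.
rng = np.random.default_rng(0)
image = rng.random((4, 4))              # toy grayscale "exam"

augmented = [
    np.fliplr(image),                           # mirror / flip
    np.rot90(image),                            # rotate 90 degrees
    np.clip(image * 1.2, 0.0, 1.0),             # contrast stretch
    image + rng.normal(0, 0.05, image.shape),   # additive noise
]
# One labeled exam has become five training examples with the same label
```

Cropping and other geometric transforms follow the same pattern; the only requirement is that the "teaching point" (the label) survives the transform.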
Ways To Avoid Need For Large Data Sets
• Data Augmentation
• Transfer Learning
[Diagram: Image → Conv → Conv → MaxPool → Conv → Conv → MaxPool → Conv → Conv → MaxPool → Fully Connected → Fully Connected → Fully Connected → SoftMax]
Ways To Avoid Need For Large Data Sets
• Data Augmentation
• Transfer Learning
[Diagram: the same network; the early convolutional and pooling layers are labeled “Freeze These Layers”]
Ways To Avoid Need For Large Data Sets
• Data Augmentation
• Transfer Learning
[Diagram: the same network; the early layers are labeled “Freeze These Layers” and the final fully connected / SoftMax layers are labeled “Train this”]
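The freeze-and-train idea in the diagrams can be sketched in NumPy: a "pre-trained" layer acts as a fixed feature extractor while only the final classifier is trained. The random weights here are stand-ins for real pre-trained values.

```python
import numpy as np

# Transfer learning sketch: freeze the feature extractor, train the head.
rng = np.random.default_rng(0)

W_frozen = rng.random((5, 8))           # early layers: pre-trained, frozen
W_train = np.zeros((2, 5))              # final layer: trained on the new task

def features(x):
    return W_frozen @ x                 # frozen extractor: never updated

def train_step(x, target):
    global W_train
    f = features(x)
    err = target - W_train @ f          # gradient reaches only W_train
    lr = 0.5 / (f @ f)                  # stable step size for this sketch
    W_train += lr * np.outer(err, f)

x = rng.random(8)                       # one example from the new task
target = np.array([1.0, 0.0])
before = W_frozen.copy()
for _ in range(30):
    train_step(x, target)               # only the last layer changes
```

Because only the small final layer is learned, far fewer labeled examples are needed than for training the whole network from scratch.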
Benefit of DL vs Conventional ML
• Deep Learning Finds Features and Connections vs Just Connections
“Computers Programming Computers”
[Diagram: conventional ML pairs hand-crafted feature extraction with a classifier; deep learning pairs a learned feature extractor with a classifier]
Machine Learning & Radiomics
• 159 astrocytomas with known 1p19q status, split for train/test and validation

Method                     Sens   Spec   Accuracy
SVM                        0.80   0.82   0.81
CNN-MultiRes, no augment   0.84   0.73   0.79
CNN-MultiRes, augmented    0.96   0.82   0.89

Akkus, CMIMI, 2016
Deep Learning: MGMT Methylation
• 155 patients, standard CNN vs Residual Network (100 train/test, 55 validation):
Korfiatis, CMIMI, 2016

MGMT status    PPV     Sens    Accuracy
Methylated     0.965   0.963   0.964
Unmethylated   0.961   0.920   0.918

MGMT status    PPV     Sens    Accuracy
Methylated     0.661   0.745   0.701
Unmethylated   0.651   0.682   0.667
Deep Learning: Automated Kidney Segmentation
• PKD stuff
• Arterys CAD stuff
• Brain segmentation
Why the Excitement Now?
1. Deep Neural Network Theory
2. Exponential Compute Power Growth
3. Huge investment
Adding Gas to the Fire…
[Chart: market capitalization in billions of dollars (scale 0-700) for GE, Siemens, Philips, Epic, Cerner, McKesson, Apple, Google, Facebook; source: FinViz.com, Sept 14, 2016]
Investment in Artificial Intelligence in
Healthcare: $1.5 Billion since 2013
https://www.cbinsights.com/blog/artificial-intelligence-startups-healthcare/
The Pace of Change
We always overestimate the change that will occur in the next 2 years and underestimate what will occur in the next 10.
---Bill Gates
2 Options: Will you engage or ignore?
Deep Learning in Radiology
• Can likely produce reasonable ‘prelim reports’ for high volume exams for
common findings
– Advantage: the prelim report is structured
– Changes made by the radiologist become new examples to learn from
– No change means the algorithm got it right
– Enables rapid generation of large, validated data sets
• Will promote quantitative imaging
– Implicitly (or explicitly) identifies critical structures, and compares
measures of them versus some expected range of normal
– Will find metrics that are not intuitive and may be ‘textural’
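The comparison against an expected range of normal described above can be sketched as a simple z-score check; every number here is a made-up illustrative value, not a clinical reference.

```python
# Quantitative-imaging sketch: compare a structure's measured volume with
# an expected normal range (all values below are hypothetical).
normal_mean_ml = 15.0    # hypothetical normal mean volume, mL
normal_sd_ml = 2.0       # hypothetical normal standard deviation, mL
measured_ml = 21.5       # volume the algorithm segmented and measured

# Standard score: how many SDs the measurement sits from the normal mean
z = (measured_ml - normal_mean_ml) / normal_sd_ml
flag = "outside expected range" if abs(z) > 2 else "within expected range"
```

The same pattern extends to any explicit metric the algorithm extracts, intuitive or ‘textural’.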
Replacement Versus Complementing
• Algorithms for Machine Learning are rapidly improving.
• Hardware for Machine Learning is REALLY rapidly improving
• The amount of change in 20 years is unimaginable
• Be prepared! Think of Lotus 1-2-3 and accountants
– Make sure we apply computers to tasks that humans will happily surrender
Will Computers Replace Radiologists?
• Deep Learning will likely be able to create draft reports for diagnostic images:
– 5 years: Mammo & CXR
– 10 years:
• CT: Head, Chest, Abd, Pelvis
• MR: Head, Knee, Shoulder
• US: liver, thyroid, carotids
– 15-20 years: most diagnostic imaging
• Will ‘see’ more & be more quantitative than today
• Will help to find similar cases that can help with obscure diagnosis
• Will allow radiologists to focus on patient interaction and invasive procedures
3 Drivers
• Theoretical Advances for Deep Learning
• Huge Leaps in Computing Power
• Boatloads of Money
2 Options
• Engage Those Involved in Deep Learning for Medicine
• Watch Those Involved in Deep Learning for Medicine
1 Mission
• Take Care of the Patient