
20/3/2017


Bradley J Erickson, MD PhD, FSIIM, CIIP

Professor and Associate Chair, Radiology

Mayo Clinic, USA

Will a Computer Replace Me And What Should I Do About It?

Disclosures

• Relationships with commercial interests:

– Board / Stock Owner:

• FlowSIGMA

• OneMedNet

• VoiceIT

– Research support from NVidia

– Chair, American Board of Imaging Informatics (CIIP Certification)

– Member, RSNA Radiology Informatics Committee

Learning Objectives

• Understand how new machine learning methods work and how they can be applied to radiologic interpretation

• Understand impediments and appeals to applying machine learning to Radiology and Medicine

1. Middle Management

2. Commodity Salespeople

3. Report writers, journalists, Authors & Announcers

4. Accountants & Bookkeepers

5. Medical Doctors

There is an 83% chance that workers earning <$20/hour will be replaced in less than 5 years.

-- Obama White House Report

“Radiologists are Easy to Replace!”

“a highly trained and specialized radiologist may now be in greater danger of being replaced by a machine than his own executive assistant: She does so many different things that I don’t see a machine being able to automate everything she does any time soon.” -- Andrew Ng, The Economist

Radiologists Replaced in Four Years??!

• “Predicting the Future – Big Data, Machine Learning, and Clinical Medicine,” New England Journal of Medicine

• “The End of Radiology? Three Threats to the Future Practice of Radiology,” Journal of the American College of Radiology

– Ezekiel Emanuel, MD PhD: “…radiologists may be replaced by computers in the next four to five years”


What is “Machine Learning”?

• It is a part of Artificial Intelligence

• Finds patterns in data

– Patterns that reflect properties of examples (supervised)

– Patterns that group examples (unsupervised)

• (Other types of artificial intelligence include rule-based systems)
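As a minimal illustration (not from the talk), the scikit-learn sketch below shows both kinds of pattern finding on a toy dataset; the dataset, models, and parameters are arbitrary choices of mine:

    # Minimal sketch: supervised vs. unsupervised pattern finding (assumes scikit-learn).
    from sklearn.datasets import make_blobs
    from sklearn.linear_model import LogisticRegression
    from sklearn.cluster import KMeans

    X, y = make_blobs(n_samples=200, centers=2, random_state=0)

    # Supervised: learn patterns that reflect a known property (label) of each example.
    clf = LogisticRegression().fit(X, y)
    print("predicted labels:", clf.predict(X[:5]))

    # Unsupervised: learn patterns that group examples, with no labels given.
    km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
    print("cluster assignments:", km.labels_[:5])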

Machine Learning History

• Artificial Neural Networks (ANN)

– Starting point of machine learning

– Early versions didn’t work well

• Other Machine Learning Methods

– Naïve Bayes

– Support Vector Machine (SVM)

– Random Forest Classifier (RFC)

Artificial Neural Network/Perceptron

[Figure: three input nodes (X, Y, Z) feed a hidden layer of f(Σ) nodes, which feed two output nodes labeled Tumor and Normal. Input Layer -> Hidden Layer -> Output Layer.]

Artificial Neural Network/Perceptron

[Figure: the same network with image intensities (45, 322, 128) from T1 Pre, T1 Post, and FLAIR images as the input-layer values; the outputs remain Tumor and Normal.]


Artificial Neural Network/Perceptron

[Figure: the same network showing hidden-layer values (34, 57, 418, -68, 312) computed by the f(Σ) nodes from the weighted T1 Pre / T1 Post / FLAIR inputs (45, 322, 128).]

Artificial Neural Network/Perceptron

[Figure: the same network with the output layer resolved: Tumor = 1, Normal = 0.]

How ANNs Learn

• Output Node Value Computation

– Multiply each input node value by a weight

– Apply an activation function to the inputs: e.g., threshold the sum of the weighted inputs

• Weight Update

– Error = expected output – computed output

– New weight = old weight + learning rate × error × input
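A minimal NumPy sketch of this learning loop (my own illustration, not code from the talk), using the standard perceptron form of the update, new weight = old weight + learning rate × error × input:

    import numpy as np

    def train_perceptron(X, y, lr=0.1, epochs=20):
        """Single-node ANN: threshold the sum of weighted inputs, then update weights from the error."""
        rng = np.random.default_rng(0)
        w = rng.normal(scale=0.1, size=X.shape[1])
        b = 0.0
        for _ in range(epochs):
            for xi, target in zip(X, y):
                output = 1.0 if np.dot(w, xi) + b > 0 else 0.0   # activation: threshold
                error = target - output                          # expected - computed
                w += lr * error * xi                             # weight update
                b += lr * error
        return w, b

    # Toy example: learn logical OR.
    X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
    y = np.array([0, 1, 1, 1], dtype=float)
    w, b = train_perceptron(X, y)
    print([1.0 if np.dot(w, xi) + b > 0 else 0.0 for xi in X])   # expect [0, 1, 1, 1]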

Deep Learning: Why the Hype?

[Chart: performance in the ImageNet Challenge, 2010–2016 (70–100% scale), comparing hand-coded computer vision (~74%) with deep learning and human-level performance.]

What is “Deep Learning”?

• “Deep” because it uses many layers

– Traditional ANNs typically had 3 or fewer layers

An Example CNN

[Figure: Convolution -> Pooling -> Conv -> Pooling -> Conv -> Pooling -> Conv -> Pool -> Fully Connected.]
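A minimal PyTorch sketch of such a Convolution -> Pooling -> ... -> Fully Connected stack (my own example; the layer counts, channel sizes, and the assumed single-channel 64×64 input are arbitrary):

    import torch
    import torch.nn as nn

    # Conv -> Pool blocks followed by a fully connected classifier (e.g., Tumor vs. Normal).
    model = nn.Sequential(
        nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),   # 64 -> 32
        nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 32 -> 16
        nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),  # 16 -> 8
        nn.Flatten(),
        nn.Linear(64 * 8 * 8, 128), nn.ReLU(),
        nn.Linear(128, 2),                         # two output classes
    )

    x = torch.randn(4, 1, 64, 64)                  # batch of 4 single-channel images
    print(model(x).shape)                          # torch.Size([4, 2])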


Typical CNNs

[Figure: a typical stack of convolution (C) and pooling (P) layers ending in fully connected layers: C P C P P C P C P P C P C P P Fully Connected.]

Andrej Karpathy: http://karpathy.github.io/2015/10/25/selfie/

Deep Learning: Why Now? 3 Reasons

• Yes, lots more computation power

• Many critical theoretical advances

– Without these, the power wouldn’t matter

• Substantial Investment

Theoretical Advances for Deep Learning: 1. Dropout

• Many layers & connections -> Overfitting

– Dropout: randomly ‘eliminate’ nodes (and their weights) during each training pass
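For illustration only (assuming PyTorch; not from the slides), dropout is typically placed between fully connected layers, randomly zeroes activations during training, and is switched off at evaluation time:

    import torch.nn as nn

    classifier = nn.Sequential(
        nn.Linear(64 * 8 * 8, 128), nn.ReLU(),
        nn.Dropout(p=0.5),     # randomly zero 50% of activations on each training pass
        nn.Linear(128, 2),
    )
    classifier.train()         # dropout active while learning
    classifier.eval()          # dropout disabled at inference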

Theoretical Advances for Deep Learning: 2. Freezing Layers (esp. pre-trained weights)

• Many layers -> Vanishing gradients

– Dropout partially addresses this

– May use ‘pre-trained’ weights for the early layers and freeze them, leaving the weights of the later layers free to learn the higher-level features


Theoretical Advances for Deep Learning: 3. Batch Normalization

• What should be the initial set of weights connecting nodes?

– All the same = no gradients

– They should be random values. But what range?

• BatchNorm:

– For each convolutional layer, normalize the activations over the mini-batch: subtract the mean and divide by the standard deviation

– Simple, but surprisingly effective
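A sketch of batch normalization after a convolutional layer (my example, assuming PyTorch; channel counts are arbitrary). BatchNorm2d subtracts the per-channel mini-batch mean and divides by the standard deviation, followed by a learned scale and shift:

    import torch
    import torch.nn as nn

    block = nn.Sequential(
        nn.Conv2d(16, 32, kernel_size=3, padding=1),
        nn.BatchNorm2d(32),    # per channel: (x - batch mean) / batch std, then scale and shift
        nn.ReLU(),
    )
    x = torch.randn(8, 16, 32, 32)
    print(block(x).shape)      # torch.Size([8, 32, 32, 32])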


Novel Architectures

• Basic design is convolutions -> neural nets

• Other designs include one whole-image convolution branch (large-scale features) and local convolution branches (local features) that are joined at the ANN layer, as sketched below
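A hedged sketch of such a two-branch design (my own illustration, assuming PyTorch; the class name TwoScaleNet, layer sizes, and input shapes are invented for the example): one branch convolves the whole image for large-scale features, the other convolves a local patch, and the two are joined at the fully connected (ANN) layer:

    import torch
    import torch.nn as nn

    class TwoScaleNet(nn.Module):
        def __init__(self, n_classes=2):
            super().__init__()
            # Branch 1: whole image (large-scale features).
            self.global_branch = nn.Sequential(
                nn.Conv2d(1, 8, 5, stride=2, padding=2), nn.ReLU(), nn.AdaptiveAvgPool2d(4))
            # Branch 2: local patch (local features).
            self.local_branch = nn.Sequential(
                nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4))
            # The two branches are joined at the fully connected (ANN) layer.
            self.fc = nn.Linear(2 * 8 * 4 * 4, n_classes)

        def forward(self, whole_image, patch):
            g = self.global_branch(whole_image).flatten(1)
            l = self.local_branch(patch).flatten(1)
            return self.fc(torch.cat([g, l], dim=1))

    net = TwoScaleNet()
    print(net(torch.randn(2, 1, 128, 128), torch.randn(2, 1, 32, 32)).shape)   # torch.Size([2, 2])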

Types of Layers

• Convolutional layers (mostly used on the input for images)

• Pooling (often after a convolutional layer; example: MaxPool)

• Activation / non-linearity (Rectified Linear Unit or ReLU)

• Residual connections (force each layer to learn a residual; really a kind of layer rather than a whole network, just as ‘CNN’ means convolution layers are included; see the sketch below)
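A minimal residual-block sketch (my illustration, assuming PyTorch): the skip connection adds the block's input back to its output, so the convolutional layers only have to learn the residual:

    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        def __init__(self, channels):
            super().__init__()
            self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
            self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
            self.relu = nn.ReLU()

        def forward(self, x):
            out = self.conv2(self.relu(self.conv1(x)))
            return self.relu(out + x)    # skip connection: the layers learn only the residual

    print(ResidualBlock(16)(torch.randn(1, 16, 32, 32)).shape)   # torch.Size([1, 16, 32, 32])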

Why the Excitement Now?

1. Deep Neural Network Theory

2. Exponential Compute Power Growth

Moore’s Law

Computing performance doubles approximately every 18 months

General Purpose Computing on GPUs

• Use Graphics Cards for ‘Regular’ (Parallel) Computing

• Less precision (16-bit FP) can actually be an advantage for training

• Cards with no video output and optimized for Deep Learning
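For illustration, current frameworks expose this reduced-precision training directly; a minimal sketch using PyTorch automatic mixed precision (assumes a CUDA GPU; the tiny model and random data exist only to show the pattern):

    import torch
    import torch.nn as nn

    device = "cuda"
    model = nn.Linear(100, 2).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    scaler = torch.cuda.amp.GradScaler()         # keeps 16-bit gradients numerically stable

    x = torch.randn(32, 100, device=device)
    y = torch.randint(0, 2, (32,), device=device)

    for step in range(10):
        optimizer.zero_grad()
        with torch.cuda.amp.autocast():          # forward pass runs in 16-bit where safe
            loss = nn.functional.cross_entropy(model(x), y)
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()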

Now Beating Moore’s Law By A Lot!

[Chart: compute performance on a log scale (10 to 1,000,000) versus time (“Ice Age”, 2000–2020) for CPU, GPU, FPGA, and “Qubit?” hardware.]

Deep Learning Myths

• “You Need Millions of Exams to Train and Use Deep Learning Methods”

– The worry: with so much model capacity, the network simply learns (memorizes) each example



Ways To Avoid Need For Large Data Sets

• Data Augmentation

– Create variants of the data that are different enough to help learning, yet similar enough that the teaching point is preserved

– Examples: Mirror/Flip/Rotate/Contrast/Crop/Noise (see the sketch below)
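A sketch of such an augmentation pipeline (my example, assuming torchvision; the specific transforms and parameters are illustrative):

    import torch
    from torchvision import transforms

    # Each training image is randomly mirrored, rotated, contrast-jittered, cropped, and noised.
    augment = transforms.Compose([
        transforms.RandomHorizontalFlip(p=0.5),                        # mirror / flip
        transforms.RandomRotation(degrees=10),                         # small rotations
        transforms.ColorJitter(brightness=0.2, contrast=0.2),          # contrast changes
        transforms.RandomResizedCrop(size=224, scale=(0.8, 1.0)),      # crops
        transforms.ToTensor(),
        transforms.Lambda(lambda t: t + 0.01 * torch.randn_like(t)),   # additive noise
    ])
    # augmented = augment(pil_image)   # apply to each PIL image at training time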

Ways To Avoid Need For Large Data Sets

• Data Augmentation

• Transfer Learning

[Figure: a CNN taking an Image through stacked Conv / Conv / MaxPool blocks into three Fully Connected layers and a SoftMax output. For transfer learning, the convolutional layers are marked “Freeze These Layers” and the fully connected layers are marked “Train this”.]
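A hedged sketch of this freeze-early / train-late pattern using an ImageNet-pretrained network (my illustration, assuming PyTorch and torchvision, not the speaker's code):

    import torch
    import torch.nn as nn
    from torchvision import models

    # Load a network pretrained on ImageNet, freeze its convolutional layers,
    # and train only a new fully connected head for a two-class task.
    model = models.resnet18(pretrained=True)
    for param in model.parameters():
        param.requires_grad = False                        # "Freeze These Layers"

    model.fc = nn.Linear(model.fc.in_features, 2)          # "Train this": new, trainable head
    optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)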

Benefit of DL vs Conventional ML

• Deep Learning finds features and connections, vs. just connections

• “Computers Programming Computers”

[Figure: conventional ML = hand-crafted feature extraction -> classifier; deep learning = learned feature extractor -> classifier.]


Machine Learning & Radiomics

• 159 astrocytomas with known 1p19q status, split into train/test and validation sets

                               Sens   Spec   Accuracy
  SVM                          0.80   0.82   0.81
  CNN-MultiRes, no augment     0.84   0.73   0.79
  CNN-MultiRes, augmented      0.96   0.82   0.89

Akkus, CMIMI, 2016

Deep Learning: MGMT Methylation

• 155 patients, standard CNN vs Residual Network (100 train/test, 55 validation):

  MGMT status     PPV     Sens    Accuracy
  Methylated      0.965   0.963   0.964
  Unmethylated    0.961   0.920   0.918

  MGMT status     PPV     Sens    Accuracy
  Methylated      0.661   0.745   0.701
  Unmethylated    0.651   0.682   0.667

Korfiatis, CMIMI, 2016

Deep Learning: Automated Kidney Segmentation

• PKD stuff

• Arterys CAD stuff

• Brain segmentation

Why the Excitement Now?

1. Deep Neural Network Theory

2. Exponential Compute Power Growth

3. Huge investment

Adding Gas to the Fire…

[Chart: market capitalization in billions of dollars for GE, Siemens, Philips, Epic, Cerner, McKesson, Apple, Google, and Facebook. Source: FinViz.com, Sept 14, 2016]

Investment in Artificial Intelligence in Healthcare: $1.5 Billion since 2013

https://www.cbinsights.com/blog/artificial-intelligence-startups-healthcare/


The Pace of Change

We always overestimate the change that will occur in the next 2 years and underestimate what will occur in the next 10.

-- Bill Gates

2 Options: Will you engage or ignore?

Deep Learning in Radiology

• Can likely produce reasonable ‘prelim reports’ for high-volume exams with common findings

– Advantage is that the prelim report is structured

– Radiologist changes to the draft become examples to learn from

– No change means it got it right

– Enables rapid generation of large, validated data sets

• Will promote quantitative imaging

– Implicitly (or explicitly) identifies critical structures and compares measures of them against some expected range of normal

– Will find metrics that are not intuitive and may be ‘textural’

Replacement Versus Complementing

• Algorithms for Machine Learning are rapidly improving.

• Hardware for Machine Learning is REALLY rapidly improving

• The amount of change in 20 years is unimaginable

• Be prepared! Think of Lotus 1-2-3 and accountants

– Make sure we apply computers to tasks that humans will happily surrender


Will Computers Replace Radiologists?

• Deep Learning will likely be able to create draft reports for diagnostic images:

– 5 years: Mammo & CXR

– 10 years:

• CT: Head, Chest, Abd, Pelvis

• MR: Head, Knee, Shoulder

• US: liver, thyroid, carotids

– 15-20 years: most diagnostic imaging

• Will ‘see’ more & be more quantitative than today

• Will help to find similar cases that can help with obscure diagnosis

• Will allow radiologists to focus on patient interaction and invasive procedures

3 Drivers

• Theoretical Advances for Deep Learning

• Huge Leaps in Computing Power

• Boatloads of Money


2 Options

• Engage Those Involved in Deep Learning for Medicine

• Watch Those Involved in Deep Learning for Medicine


1 Mission

• Take Care of the Patient