convolutional neural network based medical imaging ......• in the context of medical imaging, such...

ConvolutionalNeuralNetworkbasedMedicalImagingSegmentation:Recent

ProgressandChallenges

JiaxingTan

www.gc.cuny.edu

RoadMap

• Introduction• CNN based Models• Encoder-Decoder basedModels• GAN Based Models• Some Challenges• Conclusion

www.gc.cuny.edu

Introduction

• One key research topic in Medical Imaging is imagesegmentation.

• Image segmentation, or Semantic Segmentation, isa pixel level image understanding task which is toperform a pixel-by-pixel classification to decide theclass of each pixel.

1/53

www.gc.cuny.edu

Introduction

• In the context of Medical Imaging, suchsegmentation method could be utilized to solvethe problems such as nodule detection, anomalydetection and organ segmentation.

• Medical image suffers the fact of high noisy andlow quality, which makes it is much harder toperform segmentation on the medical images.

2/53

www.gc.cuny.edu

Introduction

Examples of lung CT slices with nodules in redmask, the nodules are quite small comparingto the whole image.

Findingtheneedleinthehaystack

3/53

www.gc.cuny.edu

Introduction

• Traditionally, medical image segmentation isperformed by hand-engineered feature basedclassification.

• One common attribute of such method is someempirical “magic numbers” are used forthresholding and preprocessing.

• Risk remains that those empirical values couldbe dataset specific which might impede themodel to be a general solution.

4/53

www.gc.cuny.edu

Introduction

• As the final goal of medical imaging is to buildCAD to serve the massive people, a robust andadaptive model is highly recommended.

• Convolutional Neural Network could performautomatically feature learning by itself. (Cast newlight!)

5/53

www.gc.cuny.edu

RoadMap


www.gc.cuny.edu

BasicStructureofCNN

• Convolutional Neural Networks are similar to thedeep neural networks as they are made up ofneurons that have learn-able weights and biases.

• In a tone of a more commonly seen introduction,CNN is a deep neural network, inspired by thebiology study of human cortex, constructed byfour types of layers:

• Input Layer• Convolution Layer• Sampling Layer• Fully Connected Layer

6/53

www.gc.cuny.edu

AnExampleofCNN

7/53

www.gc.cuny.edu

CNN-InputLayer

• The Input layer is in charge of reading datawith a predefined size without performingany changes to it.

• In figure below, the input layer reads in a CTscan image with size 256×256.

8/53

www.gc.cuny.edu

Convolution Layer

• 2D Convolution： An operation on two functions fand g, which produces a third function that can beinterpreted as a modified ("filtered") version of f.

• where * means convolution and · means ordinarymultiplication.

9/53

www.gc.cuny.edu

CNN- Convolution Layer

• The output of the convolution layer is k featuremaps, each generated by a convolution operationwith one kernel applied on the whole image.

• In figure below, there are 2 convolution layers.Conv1 has 32 kernels, each of size 7x7 , conv2 has64 kernels, each of size 5x5 .

10/53

www.gc.cuny.edu

Pooling Layer(2D)

Acceptsavolumeofsize W1×H1×D1Requirestwohyper-parameters:

– theirspatialextent F,– thestride S,

Producesavolumeofsize W2×H2×D2 where:– W2=(W1−F)/S+1W2=(W1−F)/S+1– H2=(H1−F)/S+1H2=(H1−F)/S+1– D2=D1

11/53

www.gc.cuny.edu

Fully-connectedlayer

Neurons in a fully connected layer have fullconnections to all activations in the previouslayer, as seen in regular Neural Networks. Theiractivations can hence be computed with amatrix multiplication followed by a bias offset.

12/53

www.gc.cuny.edu

CNN- FullyConnectedLayer

• The last layer of a CNN model is usually a fullyconnected layer which serves as an output layer. Inthe output layer, the number of neurons denotes thenumber of classes in the classification task.

13/53

www.gc.cuny.edu

SomeDeepLearningPackages

• One advantage of CNN is that several public packages areavailable.

• Instead of building your CNN from scratch, you couldtake advantage of the publicly available GPU supportpackages.

14/53

www.gc.cuny.edu

SomeDeepLearningPackages

• Caffe:http://caffe.berkeleyvision.org/• Lasagne:https://lasagne.readthedocs.io/en/latest/• TensorFlow:https://www.tensorflow.org/• Theano:http://deeplearning.net/software/theano/• Torch:http://torch.ch/• Pytorch:http://pytorch.org/• Mxnet:http://mxnet.io/

• Keras:https://keras.io/• Caffe2:https://cae2.ai/• Andthislistkeepincreasing…..• WhichoneIlikemost?Dependsonavailablecodeforthemodel…..

15/53

www.gc.cuny.edu

CNNModels

• The general idea is infer the class label of thecenter pixel(s) using its neighbors nearby.

• We classify them into three categories:• 2D CNN based• 3D CNN based• Holistic CNN based.

16/53

www.gc.cuny.edu

CNNModels

• BoundingBoxmodel:The2Dand3Dmethodsarebasedonwhethera2Dor3Dneighborhoodisconsideredforclassificationofthecenteredpixel(s).

• Asacomplementarytoboundingboxmethod,wealsointroducemethodsthatdoesnotrelyonboundingbox,namedHolistic.

17/53

www.gc.cuny.edu

2DCNNbasedlungnoduledetection

• Given a set of CT scans of a patient, which usuallycontains more than 300 slices depending on thebody size of the patient, radiologists will check thescan slice by slice to detect the nodule.

• For each scan, radiologists will observe every sub-region in it. This procedure is performed in 2D slices.

18/53

www.gc.cuny.edu


• To simulate this procedure, 2D CNN is used.• After applying some pre-processings such as lungsegmentation and noise elimination, the slice wouldbe cut into small sub-regions.

• Each region is an input into CNN and the output is adecision whether such region contains a nodule ornot.

• The combined result shows if a nodule exists in thisslice.

19/53

www.gc.cuny.edu


• Mostrecently,Nima,et.alcomparestheperformanceofseveralCNNstructuresonlungnoduledetectionwithMTANN.

Nima Tajbakhs, et al. ”Comparing two classes of end-to-end machinelearning models in lung nodule detection and classification: MTANNs vs. CNNs”, Pattern Recognition, in press.

20/53

www.gc.cuny.edu


• Whenradiologistscheckeachscan,forabetterinspection,besidesgoingthrougheachregionofthescan,theywillalsocheckthesameregionontheslicesbeforeorafterthecurrentslicetodecidewhetherthereisanoduleinside.

• Suchdetectionproceduretakesadvantageofthe3DnatureoftheCTscan,whichcouldalsobeaninspirationonthedesignofCNNbaseddetection.

21/53

www.gc.cuny.edu


• Traditionally,aCNNtakesa2Dmatrixasaninput.

• However,therearesomerecentpublicationsincomputervisionintroduce3DCNNforthetasksuchasvideoscenerecognitionor3Dobjectrecognition,whichachievepromisingresult.

22/53

www.gc.cuny.edu


• Intheareaoflungnoduledetection,duetothe3DnatureofCTscan,itwillbereasonabletoapply3DCNN.Someeffortshavealsobeenmade.

• Rushil Anirudh et.alhasapplied3DCNNonaweaklylabelledlungnoduledataset.HeusesavoxelvasainputintotheCNNtodecidewhetherthecenterpointvlocatedat(x;y;z)tobeanoduleornot.

• visdefinedas(x-w:x+w;y-w:y+w;z-h:z+h),whichmeansnotonlyneighborsofvinthesameslicebutalsotheneighborsonthepreviousandlattersliceareconsideredtomakethedecision.23/53

www.gc.cuny.edu

3DCNNbasedlungnoduledetection• Thisdesignisclosertohowradiologistperformslungnoduledetection.

• Asensitivityof80%for10falsepositivesperscanhasbeengivenontheirweaklylabelleddatasetasaresult.

Anirudh, Rushil, et al. ”Lung nodule detection using 3D convolutional neural networks trained on weakly labeled data.” SPIE Medical Imaging. International Society for Optics and Photonics, 2016.

24/53

www.gc.cuny.edu

HolisticCNNlungnoduledetection

• All the methods we have mentioned in the previoustwo sections mainly obey the pipeline so that thedetection result of a given slice is based on theresults from a group of sub-tasks which performnodule detection in each region of the slice with aslidingwindow.

• Such pipeline is not very efficient. The concern iswhether we could perform nodule detection on thewhole image to achieve the detection result withoutdividing it into sub-tasks.

25/53

www.gc.cuny.edu

YOLO

• It models detection as a regression problem.• It divides the image into an even (S × S)grid and simultaneously predicts (B)

boundingboxes, confidence in those boxes, and (C) class probabilities.• Each bounding box consists of 5 predictions: center of the box (relative to the

bounds of the grid cell), Width, height, and confidence.• These predictions are encoded as an S × S × (B ∗ 5 + C) tensor.

26/53

www.gc.cuny.edu

HolisticCNNlungnoduledetection

Gao, Mingchen, et al. ”Holistic classification of CT attenuation patterns for interstitial lung diseases via deep convolutional neural networks.” Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization (2016): 1-6.

Mingchen Gao et. al, has applied a holistic classification on lung CT scans to detect 6 different kinds of diseases. Although the task is lung disease detection instead of lung cancer, this paper casts some light on using a different pipeline to perform nodule detection.

27/53

www.gc.cuny.edu

RoadMap


www.gc.cuny.edu

Auto-encoderalikestructure

• An auto-encoder neural network is an unsupervisedlearning algorithm that applies backpropagation,setting the target values to be equal to the inputs.

• My understanding： Given an input x, the auto-encoder could pass only the information needed whilefiltering out all the noises.

Encoder Decoder

HiddenRepresentation

28/53

www.gc.cuny.edu

FCN

• Thefeaturesaremergedfromdifferentstagesintheencoderwhichvaryin coarsenessofsemanticinformation.

• Theupsampling oflearnedlowresolutionsemanticfeaturemapsisdoneusing deconvolutionswhichareinitializedwithbillinearinterpolationfilters.

• Excellentexamplefor knowledgetransferfrommodernclassifiernetworks likeVGG16,Alexnet toperformsemanticsegmentation

29/53

www.gc.cuny.edu

SegNet

• Auto-encoderAlikestructure(symmetric).• SegNet uses unpooling toupsample featuremapsin

decodertouseandkeephighfrequencydetailsintactinthesegmentation.

• Thisencoderdoesn’tusethefullyconnectedlayers(byconvolutionizing themasFCN)andhenceislightweightnetworklesserparameters.

30/53

www.gc.cuny.edu

U-Net

• U-Net simply concatenates the encoder feature maps toupsampled feature maps from the decoder at every stageto form a ladder like structure.

• The architecture by its skip concatenation connectionsallows the decoder at each stage to learn back relevantfeatures that are lost when pooled in the encoder.

31/53

www.gc.cuny.edu

Link-Net

• SimilartoU-Net• Linkeachencoderwithdecoder• EachencoderblockisaResNet Block• Reducedparameters

32/53

www.gc.cuny.edu

PSP-Net

• EmbeddifficultscenerycontextfeaturesinanFCNbasedpixelpredictionframework.

• Firstuseapretrained ResNet modelwiththedilatednetworkstrategytoextractthefeaturemap.

• Thenfusedifferentlevelfeaturesforfurtheranalysis.

33/53

www.gc.cuny.edu

• Given an input image (a)• Use CNN to get the feature map of the last convolutional layer (b)• A pyramid parsing module is applied to harvest different sub-region

representations, followed by up-sampling and concatenation layersto form the final feature representation, which carries both local andglobal context information in (c).

• Finally, the representation is fed into a convolution layer to get thefinal per-pixel prediction (d).

34/53

www.gc.cuny.edu

WhyGlobalFeatures?

Anillustrationtoshowcasetheimportanceofglobalspatialcontextforsemanticsegmentation.

35/53

www.gc.cuny.edu 36/53

www.gc.cuny.edu

RoadMap


www.gc.cuny.edu

GenerativeAdversarialNetwork (GAN)

The loss for the G network is:

The loss for the D network is

37/53

www.gc.cuny.edu

GAN BasedSegmentation

38/53

www.gc.cuny.edu

GAN BasedSegmentation

GivenadatasetofNtrainingimagesxn andacorrespondinglabelmapsyn,thelossisdefinedas:

Trainingtheadversarialmodel:

Trainingthesegmentationmodel:

39/53

www.gc.cuny.edu

RoadMap


www.gc.cuny.edu

SOMECHALLENGES

• Data Source• Data Preparation• Some Other Challenge:

– HPC support– High Cost– Multi-disciplinary Cooperation Required

40/53

www.gc.cuny.edu

DataSource

• CNN, as some other big data technologies, requires alarge enough dataset to learn the classification rules.

• Different from computer vision area, where large andclean benchmark dataset is available, limited lung noduledataset is available to the public.

• Most people have their own datasets containing differentnumbers of patients from various sources.

• Where to get data is a big challenge to perform a deeplearning based detection.

41/53

www.gc.cuny.edu

DataSource

• SPIE-AAPM-LUNGx dataset: a dataset used for a lung challengeoriginally to decide whether a nodule is benign or malignant.

• LIDC-IDRI: contains 1018 cases, the largest public database founded bythe Lung Image Database Consortium and Image Database ResourceInitiative. On the website lung nodule CT scan is available fordownload.

• ELCAP Public Lung Image Database: contains 50 lowdose thin-slicechest CT images with annotations for small nodules.

• NSCLC-Radiomics: contains 422 non-small cell lung cancer (NSCLC)patients

42/53

www.gc.cuny.edu

DataPreparation

• Themajorpurposesofthedatapreparationaretomakethetrainingdatalessconfusing,morefittoCNNandenrichdatasize.

• Tomakethedatalessconfusing,someliteraturesperformlungsegmentationtoreducenoise.

• Thenpossiblesmoothmethodscouldbeappliedtothesegmentedlungparts.

• Also,someotherunnecessaryparts,likesomelightdotsorair,couldbefilteredoutwiththresholdorothertechniques. 43/53

www.gc.cuny.edu

DataPreparation

• ForthepurposeofmodifyingthedatatobemorefitforCNN,onechallengetobementionedisthedifferencebetweenCTscanandaRGBimage.

• ForaRGBimage,itcontainsthreechannels,eachchannelhasdatarangingfrom0to255.ForaCTscan,ithasonlyonechannelwithdatarangingfrom[-1000,3000],whichismuchlarger.

• Basedonourexperiments,ifonedirectlyputsCTscanwithsuchlargerangeintoCNN,theperformancewillbelimited.

44/53

www.gc.cuny.edu

DataPreparation

• To make CT scan more similar to the image originallyprocessed by CNN in computer vision, one solution is torescale the data range of CT scan to [0, 255] .

• This could definitely cause information loss.

45/53

www.gc.cuny.edu

DataPreparation

• One idea has been raised in the literature that turn theone channel CT scan into three channels by separatingattenuations into three levels: low, normal and high.

• Then the three channeled image would be rescaled into[0, 255] . One benefit of this method is that the CTimage now is in the same format with a RGB image.

46/53

www.gc.cuny.edu

DataPreparation

• To enlarge the size of dataset to meet the need of bigdata by CNN, some methods such as image translationcould be applied to enlarge dataset.

• The generated ones are considered different from theoriginal image.

• Also, adding random noise, such as white noise, to theoriginal image, could also be a solution to enlargedataset.

47/53

www.gc.cuny.edu

DataPreparation

• One more thing is the issue of imbalanced dataset.• As nodule detection is a binary classification problem(Nodule or Non-nodule), to train a classifier, thedataset should be a balanced one, which means bothclasses have equal number of samples.

48/53

www.gc.cuny.edu

DataPreparation

• However, obviously, in a set of CT scans, the number ofslices containing nodule is much smaller than that ofslices do not contain nodules. So when preparing thetraining dataset.

• We need to balance the dataset to make the number oftwo types of samples, containing nodule or not, to beequal.

49/53

www.gc.cuny.edu

SomeOtherChanllenges

• HPC support: The training of CNN based modelrequires huge amount of calculations on huge amountof data. Even with the help of HPC can the CNN modelbe trained in a durable length of time. Nowadays,besides training CNN purely on CPU, CUDA acceleratedGPU has also been used for training as well.

• High Cost: As with the need of HPC, another challengeis cost. The support of HPC consumes large amount ofenergy and requires facilities.

50/53

www.gc.cuny.edu

SomeOtherChanllenges

• Multi-disciplinary Cooperation Required: Thedesign of a CNN based lung nodule detection systemrequires the cooperation from multiple disciplinarysuch as medical, radiology and computer science.

51/53

www.gc.cuny.edu

RoadMap


www.gc.cuny.edu

Conclusion

• We give a brief introduction on the recent progress ofusing CNN for lung nodule detection.

• A list of public packages as well as a list of public data aregiven.

• We can see that CNN has shown great potential in thearea of Medical Imaging Segmentation.

• Meanwhile, challenges still remain and researchers areworking on solving them.

52/53

www.gc.cuny.edu

Questions?

53/53

convolutional neural network based medical imaging ......• in the context of medical imaging, such...

Documents