Breast Cancer Diagnosis via Neural Network Classification
Jing Jiang
May 10, 2000
Outline
• Introduction and Motivation
• K-means, k-nearest neighbor, and maximum likelihood classification
• Back-propagation multi-layer perceptron
• Support vector machine (SVM)
• Learning vector quantization (LVQ)
• Linear programming
Introduction and Motivation
• The data file contains 30 attributes for both benign and malignant fine needle aspirates (FNAs).
• Our goals are to find a discriminating function that determines whether an unknown sample is benign or malignant, and to choose a pair of the 30 attributes to be used in diagnosis.
• Linear programming has done a good job of solving this problem.
• We expect that neural network classification algorithms can also be useful for this problem.
K-means
• First we use the k-means algorithm to find the clusters in the training data set.
• The k-means algorithm does not by itself give us a discriminating function.
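The clustering step above can be sketched as follows (a minimal k-means in Python; the toy two-attribute data, the deterministic initialization, and k=2 are illustrative assumptions, not the actual FNA features):

```python
import numpy as np

def kmeans(X, k, iters=20):
    """Plain k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of its assigned points."""
    centroids = X[:k].copy()  # simple deterministic init: first k points
    for _ in range(iters):
        # distances of every point to every centroid, shape (n_points, k)
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, labels

# two well-separated toy clusters standing in for benign/malignant samples;
# the first two rows seed the two centroids
rng = np.random.default_rng(0)
X = np.vstack([[0.0, 0.0], [5.0, 5.0],
               rng.normal(0.0, 0.3, (29, 2)),
               rng.normal(5.0, 0.3, (29, 2))])
centroids, labels = kmeans(X, k=2)
```

As the second bullet notes, k-means only yields cluster centers; it does not produce a discriminating function on its own.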
KNN and ML
• For 100 nearest neighbors we have
  Cmat = [51  1; 3  14] and C_rate = 94.20
• For 20 nearest neighbors we have
  Cmat = [51  1; 2  15] and C_rate = 95.65
• For the maximum likelihood algorithm we have
  Cmat = [52  0; 17  0] and C_rate = 75.36
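The Cmat and C_rate quantities above can be reproduced in miniature with a plain k-nearest-neighbor classifier (a Python sketch on synthetic two-attribute data; the cluster locations and the choice k=5 are illustrative assumptions):

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k):
    """Majority vote among the k nearest training points (binary labels 0/1)."""
    d = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]          # indices of the k closest points
    return (y_train[nearest].mean(axis=1) >= 0.5).astype(int)

def confusion_and_rate(y_true, y_pred):
    """2x2 confusion matrix (rows = true class) and percent classified correctly."""
    cmat = np.zeros((2, 2), dtype=int)
    for t, p in zip(y_true, y_pred):
        cmat[t, p] += 1
    return cmat, 100.0 * np.trace(cmat) / len(y_true)

# synthetic benign (label 0) and malignant (label 1) clusters
rng = np.random.default_rng(0)
Xb, Xm = rng.normal(0, 0.5, (40, 2)), rng.normal(3, 0.5, (40, 2))
X_train = np.vstack([Xb[:30], Xm[:30]])
y_train = np.array([0] * 30 + [1] * 30)
X_test = np.vstack([Xb[30:], Xm[30:]])
y_test = np.array([0] * 10 + [1] * 10)

cmat, c_rate = confusion_and_rate(y_test, knn_predict(X_train, y_train, X_test, k=5))
```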
BP-MLP
• After careful choice of network parameters, we get the same Cmat and C_rate for the 30-attribute problem and for any 2-attribute problem.
• It is interesting to note that they are the same as the result we get for the ML method.
• The low classification rate may be due to the fact that the data are not linearly separable.
Cmat = [52  0; 17  0] and C_rate = 75.36
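A minimal back-propagation MLP of the kind described here can be sketched as follows (one hidden layer of sigmoid units trained on toy two-attribute data; the network size, learning rate, and iteration count are illustrative assumptions, not the parameters used on the slide):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
# toy stand-in for two FNA attributes: class 0 near the origin, class 1 near (3, 3)
X = np.vstack([rng.normal(0, 0.5, (40, 2)), rng.normal(3, 0.5, (40, 2))])
X = (X - X.mean(axis=0)) / X.std(axis=0)             # standardize the attributes
y = np.array([0] * 40 + [1] * 40, dtype=float)

# one hidden layer of 4 sigmoid units feeding one sigmoid output
W1, b1 = rng.normal(0, 0.5, (2, 4)), np.zeros(4)
W2, b2 = rng.normal(0, 0.5, (4, 1)), np.zeros(1)
lr = 0.5
for _ in range(3000):
    H = sigmoid(X @ W1 + b1)                         # forward pass
    out = sigmoid(H @ W2 + b2)[:, 0]
    err = out - y                                    # output-layer delta (cross-entropy)
    gW2 = H.T @ err[:, None] / len(X)
    gb2 = np.array([err.mean()])
    dH = err[:, None] @ W2.T * H * (1 - H)           # back-propagated hidden delta
    gW1, gb1 = X.T @ dH / len(X), dH.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2                   # gradient-descent updates
    W1 -= lr * gW1; b1 -= lr * gb1

pred = (sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2)[:, 0] > 0.5).astype(int)
c_rate = 100.0 * (pred == y).mean()
```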
Support Vector Machine
• For attributes 1 and 23, we have 6 errors in testing.
• For attributes 14 and 28, we have 8 errors in testing.
• It takes a long time to train an SVM for the 30-attribute problem; even for 2 attributes it is time-consuming.
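The slide's SVM trainer itself is not shown; as a stand-in, a soft-margin linear SVM can be trained by sub-gradient descent on the hinge loss (a Python sketch on synthetic two-attribute data; the regularization weight and step size are illustrative assumptions):

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Sub-gradient descent on the soft-margin hinge loss; labels y in {-1, +1}."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1                       # points inside or beyond the margin
        gw = lam * w - (y[viol, None] * X[viol]).sum(axis=0) / len(X)
        gb = -y[viol].sum() / len(X)
        w -= lr * gw
        b -= lr * gb
    return w, b

# synthetic benign (-1) vs malignant (+1) clusters for two attributes
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.5, (40, 2)), rng.normal(3, 0.5, (40, 2))])
y = np.array([-1] * 40 + [1] * 40)
w, b = train_linear_svm(X, y)
errors = int((np.sign(X @ w + b) != y).sum())
```

Exact SVM training solves a quadratic program, which is why the slide reports long training times; this sub-gradient sketch trades exactness for speed.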
LVQ
• Using LVQ for attributes 1 and 23, the number of errors is 8.
• For attributes 14 and 18, we have 25 errors.
• Training is faster than for the SVM, but so far we are only able to handle the 2-attribute problem, not the 30-attribute problem.
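The LVQ1 update rule behind these numbers can be sketched as follows (the winning prototype is pulled toward a correctly classified sample and pushed away otherwise; the prototype count, learning-rate schedule, and toy data are illustrative assumptions):

```python
import numpy as np

def train_lvq1(X, y, protos_per_class=2, lr=0.1, epochs=30, seed=0):
    """LVQ1: for each sample, move the nearest prototype toward it if the
    prototype's class matches the sample's class, away from it otherwise."""
    rng = np.random.default_rng(seed)
    P, pl = [], []
    for c in np.unique(y):                       # init prototypes from class samples
        idx = rng.choice(np.where(y == c)[0], protos_per_class, replace=False)
        P.append(X[idx]); pl += [c] * protos_per_class
    P, pl = np.vstack(P).astype(float), np.array(pl)
    for e in range(epochs):
        a = lr * (1.0 - e / epochs)              # linearly decaying step size
        for i in rng.permutation(len(X)):
            win = np.argmin(np.linalg.norm(P - X[i], axis=1))
            step = a * (X[i] - P[win])
            P[win] += step if pl[win] == y[i] else -step
    return P, pl

def lvq_predict(P, pl, X):
    d = np.linalg.norm(X[:, None, :] - P[None, :, :], axis=2)
    return pl[d.argmin(axis=1)]

# synthetic two-attribute data: benign (0) and malignant (1) clusters
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (40, 2)), rng.normal(3, 0.5, (40, 2))])
y = np.array([0] * 40 + [1] * 40)
P, pl = train_lvq1(X, y)
accuracy = float((lvq_predict(P, pl, X) == y).mean())
```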
LVQ Training data and Weights
Linear Program
• The algorithm used is similar to the SVM, but simpler.
• We devise a separating plane and try to minimize the error.
• For 30 attributes we have only 3 errors.
• For 2 attributes, the best combinations give 2 errors.
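The separating-plane linear program can be sketched with scipy.optimize.linprog: minimize the average violation of the margin constraints A·w ≥ γ+1 (one class) and B·w ≤ γ−1 (the other). The toy data below are an illustrative assumption, not the FNA attributes:

```python
import numpy as np
from scipy.optimize import linprog

# class A (benign) and class B (malignant) toy points for two attributes
rng = np.random.default_rng(0)
A = rng.normal(0, 0.5, (30, 2))
B = rng.normal(4, 0.5, (30, 2))
m, k, n = len(A), len(B), A.shape[1]

# variables x = [w (n), gamma (1), yA (m), zB (k)]; minimize the mean slack
c = np.concatenate([np.zeros(n + 1), np.full(m, 1.0 / m), np.full(k, 1.0 / k)])
# A_i.w - gamma >= 1 - yA_i   ->   -A_i.w + gamma - yA_i <= -1
top = np.hstack([-A, np.ones((m, 1)), -np.eye(m), np.zeros((m, k))])
# B_j.w - gamma <= -1 + zB_j  ->    B_j.w - gamma - zB_j <= -1
bot = np.hstack([B, -np.ones((k, 1)), np.zeros((k, m)), -np.eye(k)])
res = linprog(c, A_ub=np.vstack([top, bot]), b_ub=-np.ones(m + k),
              bounds=[(None, None)] * (n + 1) + [(0, None)] * (m + k))
w, gamma = res.x[:n], res.x[n]
errors = int((A @ w - gamma <= 0).sum() + (B @ w - gamma >= 0).sum())
```

For separable data the optimal slack is zero and the plane w·x = γ classifies every point correctly; on non-separable data the objective counts (averaged) margin violations, which is what makes the method robust here.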
Linear Program
Conclusion
• We tried various neural network classification algorithms. So far the simpler linear programming approach gives the best result; more exploration needs to be done.
• BP is not very good at dealing with non-separable data.
• SVM is a good candidate, but takes a long time to train.
• LVQ is comparable with SVM.
• A question remains to be answered: why does the maximum likelihood method give the same result as BP?