Breast Cancer Diagnosis via Neural Network Classification
Jing Jiang
May 10, 2000
Outline
• Introduction and Motivation
• K-means, k-nearest neighbor, and maximum likelihood classification
• Back-propagation multi-layer perceptron (BP-MLP)
• Support vector machine (SVM)
• Learning vector quantization (LVQ)
• Linear programming
Introduction and Motivation
• The data file contains the 30 attributes of both benign and malignant fine needle aspirates (FNAs).
• Our goals are to find a discriminating function that determines whether an unknown sample is benign or malignant, and to choose a pair of the 30 attributes to be used in diagnosis.
• Linear programming has done a good job of solving this problem.
• We expect that neural network classification algorithms can also be useful for this problem.
K-means
• First we use the k-means algorithm to find the clusters of the training data set.
• The k-means algorithm doesn’t give us a discriminating function.
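The clustering step above can be sketched as follows. This is a minimal k-means in plain Python; the data points, the deterministic initialization, and the two-blob example are illustrative stand-ins, not the actual FNA attributes.

```python
# Minimal k-means sketch: alternate between assigning each point to its
# nearest centroid and recomputing each centroid as its cluster mean,
# until the assignments stop changing.

def kmeans(points, k, iters=100):
    centroids = [list(p) for p in points[:k]]   # simple deterministic init
    assign = None
    for _ in range(iters):
        # Assignment step: index of the nearest centroid (squared distance).
        new_assign = [min(range(k),
                          key=lambda j: sum((a - b) ** 2
                                            for a, b in zip(p, centroids[j])))
                      for p in points]
        if new_assign == assign:                # converged
            break
        assign = new_assign
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, assign

# Two well-separated toy blobs standing in for benign/malignant clusters.
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
        (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centroids, labels = kmeans(data, 2)
```

As the slide notes, this yields cluster centroids and memberships but no discriminating function for new samples.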
KNN and ML
• For 100 nearest neighbors we have:
• For 20 nearest neighbors we have:
• For the maximum likelihood algorithm we have:
[Figure residue: confusion matrices (Cmat) and classification rates (C_rate) for the three methods above; the numeric values were garbled in extraction and are not recoverable.]
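A k-nearest-neighbor classifier like the one used above can be sketched in a few lines. The training points, labels, and choice of k below are illustrative, not the actual FNA data.

```python
# Minimal k-NN sketch: label a query point by majority vote among its
# k closest training points (squared Euclidean distance).

def knn_predict(train, labels, x, k):
    # Indices of training points, sorted by distance to the query x.
    order = sorted(range(len(train)),
                   key=lambda i: sum((a - b) ** 2
                                     for a, b in zip(train[i], x)))
    votes = [labels[i] for i in order[:k]]
    return max(set(votes), key=votes.count)    # majority vote

train = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
         (1.0, 1.0), (1.1, 0.9), (0.9, 1.1)]
labels = ['benign', 'benign', 'benign',
          'malignant', 'malignant', 'malignant']
print(knn_predict(train, labels, (0.15, 0.1), k=3))   # → benign
```

The choice of k trades noise sensitivity (small k) against blurring of the class boundary (large k), which is why the slide reports results for both 100 and 20 neighbors.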
BP-MLP
• After careful choice of network parameters, we get the same Cmat and C_rate for the 30-attribute problem and for any 2-attribute problem.
• It is interesting to note that they are the same as the result we get with the ML method.
• The low classification rate can be due to the fact that the data is not linearly separable.
[Figure residue: Cmat and C_rate for the BP-MLP; the numeric values are not recoverable from the transcript.]
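The back-propagation training used by the BP-MLP can be sketched on XOR, a standard toy problem that, like the FNA data discussed above, is not linearly separable. The network size, learning rate, and epoch count here are illustrative assumptions.

```python
import math, random

# One-hidden-layer perceptron (2 inputs -> 2 sigmoid hidden units ->
# 1 sigmoid output) trained with plain back-propagation on XOR.
random.seed(0)
sig = lambda z: 1.0 / (1.0 + math.exp(-z))

w_h = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]  # incl. bias
w_o = [random.uniform(-1, 1) for _ in range(3)]                      # incl. bias

def forward(x):
    h = [sig(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_h]
    y = sig(w_o[0] * h[0] + w_o[1] * h[1] + w_o[2])
    return h, y

data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
sse = lambda: sum((forward(x)[1] - t) ** 2 for x, t in data)

before = sse()
lr = 0.5
for _ in range(2000):
    for x, t in data:
        h, y = forward(x)
        d_o = (y - t) * y * (1 - y)                       # output delta
        d_h = [d_o * w_o[j] * h[j] * (1 - h[j])           # hidden deltas
               for j in range(2)]
        for j in range(2):                                # gradient steps
            w_o[j] -= lr * d_o * h[j]
        w_o[2] -= lr * d_o
        for j in range(2):
            w_h[j][0] -= lr * d_h[j] * x[0]
            w_h[j][1] -= lr * d_h[j] * x[1]
            w_h[j][2] -= lr * d_h[j]
after = sse()
```

Because the sigmoid output defines a smooth, fundamentally non-linear boundary, a BP-MLP can in principle handle non-separable data, though as the slide observes it may still converge to a weak solution.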
Support Vector Machine
• For attributes 1 and 23, we have 6 errors in testing.
• For attributes 14 and 28, we have 8 errors in testing.
• It takes a long time to train an SVM for the 30-attribute problem; even the 2-attribute case is time-consuming.
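A linear SVM can be sketched by subgradient descent on the regularized hinge loss; this is a simple stand-in for the quadratic-programming solvers usually used, which is where the training time mentioned above goes. The data, labels, and hyperparameters below are illustrative, not the FNA attributes.

```python
# Linear SVM sketch: minimize  lam*|w|^2 + avg hinge(y*(w.x + b))
# by cyclic subgradient steps over the training points.

X = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
     (1.0, 1.0), (1.2, 0.9), (0.9, 1.1)]
y = [-1, -1, -1, 1, 1, 1]        # -1 = benign, +1 = malignant (toy labels)

w, b = [0.0, 0.0], 0.0
lam, lr = 0.01, 0.1              # regularization strength, step size

for _ in range(500):
    for xi, yi in zip(X, y):
        margin = yi * (w[0] * xi[0] + w[1] * xi[1] + b)
        if margin < 1:           # inside margin: hinge term is active
            w[0] += lr * (yi * xi[0] - 2 * lam * w[0])
            w[1] += lr * (yi * xi[1] - 2 * lam * w[1])
            b += lr * yi
        else:                    # only the regularizer contributes
            w[0] -= lr * 2 * lam * w[0]
            w[1] -= lr * 2 * lam * w[1]

predict = lambda xi: 1 if w[0] * xi[0] + w[1] * xi[1] + b > 0 else -1
```

The regularizer keeps the weight vector small, which is what maximizes the margin of the separating plane.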
LVQ
• Using LVQ on attributes 1 and 23, the number of errors is 8.
• For attributes 14 and 18, we have 25 errors.
• Training is faster than for the SVM, but so far we are only able to handle the 2-attribute problem, not the 30-attribute problem.
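The LVQ1 update rule behind these results is cheap, which explains the fast training noted above: each step moves only the winning codebook vector. The toy data, initial codebooks, and learning rate below are illustrative assumptions.

```python
# LVQ1 sketch: the codebook vector nearest a training point (the winner)
# is attracted toward it when their labels match, repelled otherwise.

def lvq1(train, labels, codebooks, code_labels, lr=0.1, epochs=50):
    for _ in range(epochs):
        for x, t in zip(train, labels):
            # Winner: nearest codebook vector by squared distance.
            w = min(range(len(codebooks)),
                    key=lambda i: sum((a - b) ** 2
                                      for a, b in zip(codebooks[i], x)))
            sign = 1.0 if code_labels[w] == t else -1.0   # attract / repel
            codebooks[w] = [c + sign * lr * (a - c)
                            for c, a in zip(codebooks[w], x)]
    return codebooks

train = [(0.0, 0.1), (0.2, 0.0), (1.0, 1.1), (1.1, 0.9)]
labels = ['benign', 'benign', 'malignant', 'malignant']
books = lvq1(train, labels,
             [[0.5, 0.5], [0.6, 0.6]], ['benign', 'malignant'])
```

After training, a new sample is classified by the label of its nearest codebook vector, so the cost per query is proportional to the (small) number of codebooks rather than the training-set size.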
LVQ Training data and Weights
Linear Program
• The algorithm used is similar to the SVM, but simpler.
• We devise a separating plane and try to minimize the error.
• For 30 attributes we have only 3 errors.
• For 2 attributes, the best combinations give 2 errors.
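The separating-plane objective above is the robust-linear-programming formulation of Bennett and Mangasarian: minimize the average amount by which each class violates its side of the plane w·x = γ. A real implementation hands this to an LP solver; here plain subgradient descent is used as a stand-in, and the two point sets are illustrative toy data.

```python
# LP-style objective: class A should satisfy w.x <= g - 1, class B
# should satisfy w.x >= g + 1; minimize the average violations.

A = [(0.0, 0.0), (0.1, 0.3), (0.3, 0.1)]
B = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2)]

def dot(w, x):
    return w[0] * x[0] + w[1] * x[1]

def objective(w, g):
    err_a = sum(max(0.0, dot(w, x) - g + 1) for x in A) / len(A)
    err_b = sum(max(0.0, g - dot(w, x) + 1) for x in B) / len(B)
    return err_a + err_b

w, g = [0.0, 0.0], 0.0
lr = 0.05
for _ in range(2000):
    for x in A:                         # step only for violated A points
        if dot(w, x) - g + 1 > 0:
            w[0] -= lr * x[0] / len(A); w[1] -= lr * x[1] / len(A)
            g += lr / len(A)
    for x in B:                         # step only for violated B points
        if g - dot(w, x) + 1 > 0:
            w[0] += lr * x[0] / len(B); w[1] += lr * x[1] / len(B)
            g -= lr / len(B)
```

Because each class's violations are averaged, the objective is not skewed by unbalanced class sizes, which is one reason this simple formulation works well on the FNA data.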
Conclusion
• We tried various neural network classification algorithms. So far, the simpler linear programming approach gives the best result; more exploration needs to be done.
• BP is not very good at dealing with non-separable data.
• SVM is a good candidate, but takes a long time to train.
• LVQ is comparable with SVM.
• A question remains to be answered: why does the maximum likelihood method give the same result as BP?