[ieee 2007 ieee international conference on electro/information technology - chicago, il, usa...


2007 IEEE International Conference on Electro/Information Technology, Chicago, IL, USA (2007.05.17-2007.05.20)

IEEE EIT 2007 Proceedings 428.

1-4244-0941-1/07/$25.00 ©2007 IEEE

Detection of Breast Cancer Using

Independent Component Analysis

Fadi Abu-Amara, Member, IEEE and Ikhlas Abdel-Qader, Senior Member, IEEE

Abstract - Screening mammograms remain the best method to protect women from breast cancer. To increase the value of this modality and reduce the strain on radiologists, automation of detection is a necessity. In this paper we investigate combining principal component analysis (PCA) with independent component analysis (ICA) to identify regions of suspicion (ROS) in digitized mammographic films. The experimental results show that this combination achieves an accuracy of 79% in detecting abnormalities and 71.2% in diagnosing an abnormality as benign or malignant.

I. INTRODUCTION

Breast cancer is the most common cancer among women in the US other than skin cancer, and it ranks second after lung cancer as a cause of cancer death in women [1]. The American Cancer Society estimates that about 40,970 women died of breast cancer in 2006 and that, on average, 5 women are diagnosed with breast cancer every 15 minutes. More than 2 million women in the US have been treated for breast cancer [1]. Early detection of the disease improves the treatment options.

Detecting suspicious abnormalities is a repetitive task that causes fatigue and eye strain. Of every thousand cases analyzed by a radiologist, only 3 to 4 are cancerous, so an abnormality may be overlooked. Computer-Aided Detection (CAD) systems have been developed to assist radiologists in detecting mammographic lesions that may indicate the presence of breast cancer. These systems act only as a second reader; the final decision is left to the radiologist. They have improved radiologists' accuracy in detecting breast cancer [2].

Many algorithms have been proposed in the literature; some detect masses, while others detect microcalcifications. Examples include directional filtering with Gabor wavelets [3], the median filter [4], and the iris filter [5]. Others used fractal modeling [6], the discrete wavelet transform [7], a fuzzy-genetic approach [8], and artificial intelligence techniques [9].

In this paper, we present an algorithm for ROS detection. It uses PCA for dimensionality reduction followed by ICA for feature selection. A simple but powerful distance measure, the Euclidean distance, is used to classify tissue as benign or malignant. Section II explains the PCA and ICA algorithms, and Section III presents the proposed PCA-ICA algorithm. The experimental results are discussed in Section IV, and the conclusions are presented in Section V.

F. Abu-Amara is with the Department of Electrical and Computer Engineering, Western Michigan University, Kalamazoo, MI 49008, USA (e-mail: [email protected]).

I. Abdel-Qader is with the Department of Electrical and Computer Engineering, Western Michigan University, Kalamazoo, MI 49008, USA (e-mail: [email protected]).

II. PCA AND ICA ALGORITHMS

The PCA algorithm consists of two phases. The first phase finds v orthogonal, uncorrelated vectors, and the second projects the given data set onto the subspace spanned by these v vectors [13]. Using PCA as a preprocessing step does not affect ICA performance, because the original sub-images are simply represented by a new linear combination and the higher-order relationships between the original sub-images are preserved.
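The two PCA phases can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `pca_basis` and `pca_project`, the random toy data, and the use of mean-centering as the normalization step are all assumptions made for illustration.

```python
import numpy as np

def pca_basis(data, v):
    """Phase 1: return v orthonormal directions of maximal variance.

    `data` holds one sample per row. Mean-centering is an assumption;
    the paper only says the sub-images are normalized.
    """
    centered = data - data.mean(axis=0)
    # The right singular vectors of the centered data are the principal axes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:v].T  # shape: (n_features, v)

def pca_project(data, basis):
    """Phase 2: project the centered data onto the subspace spanned by `basis`."""
    return (data - data.mean(axis=0)) @ basis

rng = np.random.default_rng(0)
data = rng.normal(size=(20, 8))   # 20 hypothetical samples with 8 features each
basis = pca_basis(data, v=3)
scores = pca_project(data, basis)  # reduced representation, shape (20, 3)
```

The columns of `basis` are mutually orthogonal, and the columns of `scores` are uncorrelated, which corresponds to the two properties named in the first phase.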

Because the information contained in mammographic images, such as the breast edge and chest skin, is mixed, ICA, which is regarded as a signal separation technique, can be used to separate it. The performance of a detection system relies on presenting all the information in the mammograms to the system. However, introducing both relevant and irrelevant information increases the complexity of the classifier's task and degrades its accuracy.

ICA is a signal processing technique that is used here as a new feature extraction technique for mammograms. It finds a linear, non-orthogonal coordinate system (the un-mixing or separating matrix) in multivariate data such that the resulting signals are as statistically independent of each other as possible. The directions of the axes are determined by the first-, second-, and higher-order statistics (cumulants and moments) of the data. ICA extracts hidden information that may lie beneath the observed regions of the mammographic image. The mammographic image (X) is regarded as a linear mixture of statistically independent source regions (S):

$$X = A \cdot S \tag{1}$$

where A is the mixing matrix.

Its coefficients describe the mixed source regions in a distinctive way. When ICA is used to estimate both the source regions and the mixing matrix, the coefficients of the mixing matrix can serve as features extracted from the normal and abnormal regions. If an observation matrix X is formed with N rows, each consisting of an extracted normal or abnormal region, and fed into an unsupervised learning algorithm, the source regions that generate the sub-images can be estimated as

$$S = W \cdot X \tag{2}$$

where W is the separating matrix.
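The relationship between Equations (1) and (2) can be seen in a toy example. In this sketch the sources and the mixing matrix are invented, and W is simply taken as the inverse of A; in the actual problem neither A nor W is known, and W has to be learned.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two hypothetical independent source signals, 500 samples each.
S = rng.uniform(-1.0, 1.0, size=(2, 500))

# Hypothetical mixing matrix (illustration only).
A = np.array([[1.0, 0.5],
              [0.3, 1.0]])

X = A @ S                 # Eq. (1): observed mixtures
W = np.linalg.inv(A)      # ideal separating matrix, known only in this toy setting
S_hat = W @ X             # Eq. (2): recovered sources
```

With the ideal W the sources are recovered exactly; an ICA algorithm aims to approach such a W using only X.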

III. THE PCA-ICA ALGORITHM

The extracted sub-images are divided into two groups. The first group is used for the training procedure and the second for the testing procedure; the main features of each group are extracted. During testing, the classifier labels each sub-image as normal or abnormal and then as benign or malignant. Fig. 1 shows the main steps of the proposed algorithm.

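Fig. 1. The proposed algorithm. (Block diagram with the labels Extracted Sub-Images, Training Procedure, Testing Procedure, Normal, Suspicious, Benign, and Malignant.)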

A. Sub-Images Generation

1) The MIAS database contains a total of 119 ROS (51 malignant and 68 benign). Two sets of abnormal sub-images, each consisting of the 119 ROS, were cropped around the center of each abnormality and scaled to 35x35 and 45x45 pixels, respectively.

2) Five different sets of normal sub-images were randomly cropped and scaled from the MIAS mammograms.

3) The 119 ROS were mixed with 119 normal sub-images and then divided into two groups, one for training and the other for testing, as shown in Table 1.
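The split in step 3 can be sketched as follows. The group sizes (60 ROS plus 59 normal for training, 59 ROS plus 60 normal for testing) come from Table 1; the function name `split_sets`, the shuffling, the seed, and the string placeholders standing in for sub-images are assumptions.

```python
import random

def split_sets(ros, normal, seed=0):
    """Split ROS and normal sub-images into training and testing groups.

    Label 1 marks an ROS and label 0 a normal sub-image.
    """
    rng = random.Random(seed)
    ros, normal = list(ros), list(normal)
    rng.shuffle(ros)
    rng.shuffle(normal)
    train = [(x, 1) for x in ros[:60]] + [(x, 0) for x in normal[:59]]
    test = [(x, 1) for x in ros[60:]] + [(x, 0) for x in normal[59:]]
    return train, test

# Placeholders standing in for the 119 ROS and 119 normal sub-images.
ros = [f"ros_{i}" for i in range(119)]
normal = [f"normal_{i}" for i in range(119)]
train, test = split_sets(ros, normal)
```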

B. Training Procedure Using PCA-ICA Algorithm

A training matrix $A_{train}$ of dimension N x M is defined, where each row contains a sub-image, N is the total number of training sub-images, and M is the dimension of each sub-image (the number of pixels in a 35x35 or 45x45 sub-image). As in the PCA algorithm, all sub-images in the $A_{train}$ matrix are normalized. PCA is used to obtain V principal components and to estimate the reduced matrix $A_R$ of dimension M x V. The covariance matrix C, of dimension N x V, is calculated as

$$C_{N \times V} = A_{train} \cdot A_R \tag{3}$$

The transpose of the reduced matrix, $A_R^T$, is then computed and used to estimate the W and S matrices in an unsupervised mode.

C. Unsupervised Learning Algorithm

To estimate the separating matrix W and the independent source regions S, W is initialized to the identity matrix, and then

$$S = W \cdot A_R^T \tag{4}$$

Once S is determined, the minimum mutual information algorithm [12] is used to estimate the non-linear function $\Phi(S)$, which approximates the marginal pdf of the output regions S in order to achieve maximal statistical independence of the source regions, using the 3rd- and 4th-order moments of the independent source regions $S_i$. The independent source regions are latent variables, meaning that they cannot be directly observed. The mixing matrix is also assumed to be unknown. All we observe is the random vector $A_R^T$, and we must estimate both A and S from it as follows:

$$m_k = E\left[(x_i - \mu)^k\right] \tag{5}$$

$$k_3 = m_3 \quad \text{and} \quad k_4 = m_4 - 3 \tag{6}$$

leading to $\Phi(S)$ as

$$\Phi(S) = f_1(k_3, k_4) \circ S^2 + f_2(k_3, k_4) \circ S^3 \tag{7}$$

$$f_1(k_3, k_4) = -\frac{1}{2} k_3 + \frac{9}{4} k_3 k_4 \tag{8}$$

and

$$f_2(k_3, k_4) = -\frac{1}{6} k_4 + \frac{3}{2} k_3^2 + \frac{3}{4} k_4^2 \tag{9}$$

where $\circ$ denotes the Hadamard product of two matrices. An estimate of $\Delta W$ is then found using $\Delta W = \eta(t)\,[I - \Phi(S)\,S^T]\,W$, and the weights are updated using $W_i(t+1) = W_i(t) + \Delta W$.
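One iteration of this learning rule can be sketched in NumPy as below. This is an illustrative reading of Equations (4) to (9), not the authors' code. Three points are assumptions: the moments are estimated per source region (per row of S), the product of Phi(S) and the transpose of S is averaged over the columns so that the update does not grow with the number of pixels, and the toy input rows are standardized so that m_4 - 3 behaves as a kurtosis. The names `phi` and `ica_step` and the toy data are hypothetical.

```python
import numpy as np

def f1(k3, k4):
    return -0.5 * k3 + 2.25 * k3 * k4                   # Eq. (8)

def f2(k3, k4):
    return -k4 / 6.0 + 1.5 * k3 ** 2 + 0.75 * k4 ** 2   # Eq. (9)

def phi(S):
    """Non-linear function of Eq. (7), evaluated row by row."""
    Z = S - S.mean(axis=1, keepdims=True)
    m3 = (Z ** 3).mean(axis=1, keepdims=True)           # Eq. (5) with k = 3
    m4 = (Z ** 4).mean(axis=1, keepdims=True)           # Eq. (5) with k = 4
    k3, k4 = m3, m4 - 3.0                               # Eq. (6)
    return f1(k3, k4) * S ** 2 + f2(k3, k4) * S ** 3

def ica_step(W, X, eta):
    """Apply one update W <- W + eta * [I - Phi(S) S^T] W."""
    S = W @ X                                           # Eq. (4)
    n = X.shape[1]
    # Assumption: average Phi(S) S^T over the n columns (batch estimate).
    dW = eta * (np.eye(W.shape[0]) - (phi(S) @ S.T) / n) @ W
    return W + dW

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(3, 200))               # toy stand-in for A_R^T
X = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

W = np.eye(3)                                           # identity initialization
for _ in range(10):
    W = ica_step(W, X, eta=0.01)
```

The learning rate and the number of iterations here are arbitrary; in practice both would need tuning, and the sketch makes no claim about convergence.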

The previous steps estimate S (of dimension V x M) and W (of dimension V x V). The reduced-dimensionality features used for the training procedure are obtained as follows. Using Equation (3), the training matrix can be reconstructed as

$$A_{rec} = C_{N \times V} \cdot A_R^T \approx A_{train} \tag{10}$$

Equation (4) gives $A_R^T = W^{-1} \cdot S$, so by using Equations (4) and (10) we have

$$A_{rec} = C_{N \times V} \cdot A_R^T = C_{N \times V} \cdot W^{-1} \cdot S \tag{11}$$

The reduced-dimensionality extracted features are therefore

$$R_{train} = C_{N \times V} \cdot W^{-1} \tag{12}$$
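The chain of Equations (3), (4), and (10) to (12) can be checked numerically on toy matrices. In the sketch below the sizes are arbitrary, `A_R` is a random orthonormal stand-in for the PCA basis, and `W` is a random invertible stand-in for a learned separating matrix; none of these values come from the paper.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, V = 10, 16, 3                               # toy sizes

A_train = rng.normal(size=(N, M))                 # one sub-image per row
A_R, _ = np.linalg.qr(rng.normal(size=(M, V)))    # stand-in M x V orthonormal basis

C = A_train @ A_R                                 # Eq. (3), N x V
W = np.eye(V) + 0.1 * rng.normal(size=(V, V))     # stand-in for a learned W
S = W @ A_R.T                                     # Eq. (4), V x M

A_rec = C @ A_R.T                                 # Eq. (10)
R_train = C @ np.linalg.inv(W)                    # Eq. (12), N x V
```

Multiplying `R_train` by `S` reproduces `A_rec`, which is the statement of Equation (11): each row of `R_train` holds the coefficients that express one training sub-image in terms of the source regions.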

D. Testing Procedure

A testing matrix $A_{test}$ of dimension N x M is defined, where each row contains a sub-image. The regions in $A_{test}$ are projected into the PCA space:

$$B_{test} = A_{test} \cdot A_R \tag{13}$$

where $B_{test}$ has dimension N x V and $A_R$ has dimension M x V. The reduced-dimensionality extracted features used for the testing procedure are

$$R_{test} = B_{test} \cdot W^{-1} \tag{14}$$

The Euclidean distance, the most commonly used distance measure, is used as the classifier to test the algorithm's performance. Each row of the estimated training matrix $R_{train}$ holds the features extracted from one training sub-image, and each row of the estimated testing matrix $R_{test}$ holds the features extracted from one tested sub-image. The Euclidean distance between the current tested sub-image and each training sub-image is computed as

$$D_e = \sqrt{\sum_{i=1}^{V} (x_i - y_i)^2} \tag{15}$$

The classifier then selects the training sub-image that has the minimum distance from the current tested sub-image and assigns its class to the tested sub-image.
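A minimum-distance classifier of this kind can be sketched in a few lines of plain Python. The function names and the two-dimensional toy feature vectors are illustrative; in the paper the rows would be the V-dimensional feature vectors of the training and testing feature matrices.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two feature vectors, as in Eq. (15)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def classify(test_row, train_rows, train_labels):
    """Return the label of the training row closest to test_row."""
    distances = [euclidean(test_row, row) for row in train_rows]
    nearest = distances.index(min(distances))
    return train_labels[nearest]

# Toy feature vectors (hypothetical values).
train_rows = [[0.0, 0.0], [3.0, 4.0]]
train_labels = ["normal", "abnormal"]
label = classify([2.5, 3.5], train_rows, train_labels)
```

Here the test vector lies much closer to the second training row, so it receives the label of that row.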

IV. EXPERIMENTAL RESULTS

Table 2 compares the accuracy of the PCA-ICA algorithm with that of the PCA and ICA algorithms. Accuracy is defined as the ratio of the number of correctly classified sub-images (Nc) to the total number of tested sub-images (N).

Table 2 shows that using ICA as a feature selection method after dimensionality reduction with PCA improves performance on all test sets compared with PCA alone. The best result over all test sets is 59.66% for PCA and 79% for PCA-ICA. Several parameters affect the accuracy of the PCA-ICA algorithm. First, the number of principal components that PCA passes to the ICA algorithm affects the overall accuracy. When a large number of principal components is selected, the extracted features have a large dimensionality, which increases the classifier complexity. When a small number is selected, the performance of the ICA algorithm degrades because the independent source regions cannot be estimated precisely. For the best test set, the best value is the 6 largest principal components. Second, the learning rate $\eta(t)$ used to compute the change in W determines the speed of convergence of $\Delta W$. Third, many methods can be used to compute the moments, and each affects the accuracy of the algorithm. Last, the way the normal and abnormal sub-images are cropped and scaled also affects the accuracy.

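TABLE 1
DIFFERENT SETS USED TO EVALUATE THE DETECTION ALGORITHM PERFORMANCE

| Set | Training ROS | Training Normal | Training Total | Testing ROS | Testing Normal | Testing Total | Size (pixels) |
|---|---|---|---|---|---|---|---|
| 1 | 60 | 59 | 119 | 59 | 60 | 119 | 35x35 |
| 2 | 60 | 59 | 119 | 59 | 60 | 119 | 35x35 |
| 3 | 60 | 59 | 119 | 59 | 60 | 119 | 45x45 |
| 4 | 60 | 59 | 119 | 59 | 60 | 119 | 45x45 |
| 5 | 60 | 59 | 119 | 59 | 60 | 119 | 45x45 |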

Table 2 also shows the experimental results of the ICA and PCA-ICA algorithms. The best result obtained with the ICA algorithm is 73%, whereas the best result obtained with the PCA-ICA algorithm is 79%. These results indicate that using PCA for dimensionality reduction improves the overall accuracy. Table 3 shows the experimental results of using the PCA-ICA algorithm as a computer-aided diagnosis system. The best result is 71.2%, where 14 of 25 malignant sub-images and 28 of 34 benign sub-images are correctly classified.
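The accuracy definition (Nc divided by N) can be applied to the best diagnosis result quoted above. The sketch also checks an assumption that is not stated in the text: that the FP and FN percentages in Table 3 are expressed relative to all 59 tested sub-images, with misclassified benign cases counted as FP and misclassified malignant cases as FN.

```python
def percent(count, total):
    """Return count / total expressed as a percentage."""
    return 100.0 * count / total

n_benign, n_malignant = 34, 25        # tested sub-images
ok_benign, ok_malignant = 28, 14      # correctly classified
n_total = n_benign + n_malignant      # 59

accuracy = percent(ok_benign + ok_malignant, n_total)
# Assumed reading of the FP and FN columns of Table 3:
fp = percent(n_benign - ok_benign, n_total)
fn = percent(n_malignant - ok_malignant, n_total)
```

Rounded to two decimals, these values agree with the first row of Table 3 (71.19%, 10.17%, and 18.64%), which supports the assumed reading of the FP and FN columns.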

V. CONCLUSIONS

The performance of the proposed PCA-ICA algorithm is compared with the performance of the PCA and ICA algorithms individually. The experimental results indicate that using ICA for feature selection after the PCA dimensionality reduction step improves on the PCA accuracy by about 32.42% and on the ICA accuracy by about 8.2%. The best results are obtained with a block size of 45x45 pixels. Future work includes investigating a fuzzy classifier in place of the Euclidean classifier and using the Hough transform as a preprocessing step for the images.
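The two improvement figures are consistent with relative gains computed from the best accuracies quoted in Section IV (59.66% for PCA, 73% for ICA, and 79% for PCA-ICA). This reading is an inference from the numbers rather than something the text states:

$$\frac{79 - 59.66}{59.66} \approx 0.3242 \;(32.42\%), \qquad \frac{79 - 73}{73} \approx 0.082 \;(8.2\%)$$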


REFERENCES

[1] American Cancer Society, "Overview: Breast Cancer 2005," http://www.cancer.org/docroot/CRI/content/CRI_2_2_1X_How_many_people_get_breast_cancer_5.asp?sitearea=.

[2] M. L. Giger, N. Karssemeijer, and S. G. Armato III, "Computer-aided diagnosis in medical imaging," IEEE Transactions on Medical Imaging, vol. 20, pp. 1205-1208, 2001.

[3] R. J. Ferrari, R. M. Rangayyan, J. E. L. Desautels, and A. F. Frere, "Analysis of Asymmetry in Mammograms via Directional Filtering With Gabor Wavelets," IEEE Transactions on Medical Imaging, vol. 20, no. 9, pp. 953-964, 2001.

[4] Shuk-Mei Lai, Xiaobo Li, and Walter F. Bischof, "On Techniques for Detecting Circumscribed Masses in Mammograms," IEEE Transactions on Medical Imaging, vol. 8, no. 4, pp. 377-386, 1989.

[5] Hidefumi Kobatake, Masayuki Murakami, Hideya Takeo, and Sigeru Nawano, "Computerized Detection of Malignant Tumors on Digital Mammograms," IEEE Transactions on Information Technology in Biomedicine, vol. 18, no. 5, pp. 369-378, 1999.

[6] L. Bocchi, G. Coppini, J. Nori, and G. Valli, "Detection of Single and Clustered Microcalcifications in Mammograms Using Fractals Models and Neural Networks," Medical Engineering & Physics, vol. 26, pp. 303-312, 2004.

[7] Lori Mann Bruce and Reza R. Adhami, "Classifying Mammographic Mass Shapes Using the Wavelet Transform Modulus-Maxima Method," IEEE Transactions on Medical Imaging, vol. 18, no. 12, pp. 1170-1177, 1999.

[8] Carlos Andres Pena-Reyes and Moshe Sipper, "A Fuzzy-Genetic Approach to Breast Cancer Diagnosis," Artificial Intelligence in Medicine, vol. 17, pp. 131-55, 1999.

[9] Lei Zheng and Andrew K. Chan, "An Artificial Intelligent Algorithm for Tumor Detection in Screening Mammogram," IEEE Transactions on Medical Imaging, vol. 20, no. 7, pp. 559-567, 2001.

[10] Brijesh Verma and John Zakos, "A Computer-Aided Diagnosis System for Digital Mammograms Based on Fuzzy Neural and Feature Extraction Techniques," IEEE Transactions on Information Technology in Biomedicine, vol. 5, no. 1, pp. 46-54, 2001.

[11] I. Christoyianni, A. Koutras, E. Dermatas, and G. Kokkinakis, "Computer aided diagnosis of breast cancer in digitized mammograms," Computerized Medical Imaging and Graphics, vol. 26, pp. 309-319, 2002.

[12] H. Yang, S. Amari, and A. Cichocki, "Information theoretic approach to blind separation of sources in non-linear mixture," 1998, pp. 291-300.

[13] Ikhlas Abdel-Qader, Lixin Shen, Christina Jacobs, Fadi Abu Amara, and Sarah Pashaie-Rad, "Unsupervised Detection of Suspicious Tissue Using Data Modeling and PCA," International Journal of Biomedical Imaging, Hindawi Publishing Corporation, vol. 2006, Article ID 57850, pp. 1-11, DOI 10.1155/IJBI/2006/57850.

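TABLE 2
FP, FN, AND TOTAL ACCURACY OF THE PCA, ICA, AND PCA-ICA ALGORITHMS

| Set | PCA: PC | PCA: FP | PCA: FN | PCA: Accuracy | ICA: FP | ICA: FN | ICA: Accuracy | PCA-ICA: PC | PCA-ICA: FP | PCA-ICA: FN | PCA-ICA: Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 6 | 20.17% | 27.73% | 52.1% | 10.08% | 40.34% | 49.58% | 20 | 24.37% | 11.76% | 63.87% |
| 2 | 5 | 21.85% | 18.49% | 59.66% | 10.08% | 40.34% | 49.58% | 21 | 20.17% | 17.64% | 62.19% |
| 3 | 5 | 23.53% | 18.49% | 57.98% | 10.08% | 40.34% | 49.58% | 5 | 22.69% | 3.36% | 73.95% |
| 4 | 5 | 31.93% | 15.97% | 52.1% | 10.08% | 40.34% | 49.58% | 6 | 11.77% | 10.92% | 77.31% |
| 5 | 5 | 47.06% | 2.52% | 50.42% | 10.08% | 40.34% | 49.58% | 6 | 15.97% | 5.03% | 79% |

PC: number of principal components; FP: false positives; FN: false negatives.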

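TABLE 3
COMPUTER-AIDED DIAGNOSIS USING PCA-ICA

| Set | Training Benign | Training Malignant | Training Total | Testing Benign | Testing Malignant | Testing Total | Size (pixels) | K | FP | FN | Total Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 34 | 26 | 60 | 34 | 25 | 59 | 35x35 | 15 | 10.17% | 18.64% | 71.19% |
| 2 | 34 | 26 | 60 | 34 | 25 | 59 | 45x45 | 14 | 0 | 32.2% | 67.8% |
| 3 | 45 | 34 | 79 | 23 | 17 | 40 | 45x45 | 11 | 5% | 35% | 60% |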