patternrecognitiontechniquesforautomatic...

Computer Methods and Programs in Biomedicine (2005) 79, 135—149

Pattern recognition techniques for automaticdetection of suspicious-looking anomalies inmammograms

Tomasz Arodza,∗, Marcin Kurdziela,1, Erik O.D. Sevreb,2 David A. Yuenb,2

a Institute of Computer Science, AGH University of Science and Technology, al. Mickiewicza 30, 30-059,Krakow, Polandb Minnesota Supercomputing Institute, University of Minnesota, Minneapolis, MN 55455, USA

Received 15 March 2004; received in revised form 24 March 2005; accepted 25 March 2005

KEYWORDSMammogram analysis;Computer-aideddiagnosis;Machine learning

Summary We have employed two pattern recognition methods used commonly forface recognition in order to analyse digital mammograms. The methods are based onnovel classification schemes, the AdaBoost and the support vector machines (SVM).A number of tests have been carried out to evaluate the accuracy of these two al-gorithms under different circumstances. Results for the AdaBoost classifier methodare promising, especially for classifying mass-type lesions. In the best case the algo-rithm achieved accuracy of 76% for all lesion types and 90% for masses only. The SVMbased algorithm did not perform as well. In order to achieve a higher accuracy forthis method, we should choose image features that are better suited for analysingdigital mammograms than the currently used ones.© 2005 Elsevier Ireland Ltd. All rights reserved.

1. Introduction

Today breast cancer is the second major killeramong American women [1]. Recently, each yeararound 40,000 women dies from breast cancerand over 200,000 develops new cases of invasive

∗ Corresponding author. Tel.: +48 12 617 3497;fax: +48 12 633 9406.

E-mail addresses: [email protected] (T. Arodz),[email protected] (M. Kurdziel), [email protected](E.O.D. Sevre), [email protected] (D.A. Yuen)1 Tel.: +48 12 617 3497; fax: +48 12 633 9406.2 Tel.: +1 612 624 9801; fax: +1 612 625 3819.

breast cancer. Apart from the invasive cancer, atleast 55,000 cases of the in situ breast cancer arealso diagnosed each year. With current treatmentoptions, the 5-year survival rate for the localisedbreast cancer can be as high as 97%. However,this rate drops to 78% in the case of a regionallyadvanced disease and to 23% for the fast growingbreast cancer. Consequently, early detection of thebreast cancer is crucial for an efficient therapy.The American Cancer Society recommends everywoman older than around 40 years to undergo an-nually mammogram examination. The sensitivity ofscreening mammography can be improved with theuse of Computer-Aided Detection (CAD) systems.

0169-2607/$ — see front matter. © 2005 Elsevier Ireland Ltd. All rights reserved.doi:10.1016/j.cmpb.2005.03.009

136 T. Arodz et al.

Fig. 1 Global structure of the breast. Different majorregions of the breast include: nipple, fatty and glandu-lar tissue and skin. Courtesy of Ben Holtzmann with helpfrom Lilli Yang.

In Ref. [2] the interpretations of around 13,000screening mammograms by two experienced radiol-ogist supported by the CAD system were comparedto the interpretations given prior to the use ofCAD. These studies revealed a relative increase ofthe detection rate from 0.32 to 0.36%. At the sametime, there were no adverse effects on the recallrate or positive predictive value for biopsy.

The global structure of the breast is presentedin the Fig. 1. The breast tissue may contain twotypes of cancer indicators commonly evaluated byCAD systems, i.e., masses and microcalcifications.Microcalcifications appear on a mammogram im-age as small, bright tissue protrusions. Accordingto Ref. [3,4] the probability of malignant process

Fig. 2 A mammographic image depicting the breast re-gion with a size 5× 5 cm. On the image a cluster of micro–calcifications is marked. Such clusters frequently arisefrom pathological processes in breast tissues and thusare considered important in breast cancer diagnosis. Thecontrast on the image has been manually adjusted for thebest visibility of the micro–calcifications. Regions with aless dense tissue have been suppressed and are depictedas black areas.

within the breast depends on the number, distribu-tion and morphology of microcalcifications. Whennot in a group, microcalcifications are irrelevantfrom a diagnostic standpoint. If at least four or fivemicro–calcifications are present and form a cluster,the probability of an abnormal process within thebreast is significant. An example of a mammogramwith one such cluster is depicted in Fig. 2. Micro–calcifications vary significantly in size, shape anddistribution. Typically, homogenous clusters con-sisting of larger, round or oval micro–calcificationsindicate a benign process. An example cluster ofthis type is pictured in Fig. 3. On the other hand,heterogeneous clusters consisting of smaller micro–calcifications, and micro–calcifications with irregu-lar shapes are associated with a high risk of cancer.A good example of this type of cluster is picturedin Fig. 4. In practice it is very difficult to decidewhether micro–calcification clusters are benign ormalignant. In a significant number of cases the clus-ters cannot be observed clearly. Moreover, it is com-mon for a cluster of micro–calcifications to revealmorphological features that cannot be clearly clas-sified as being benign or malignant.

Other important breast abnormalities shown inmammograms are the masses. They may occur inv
arious parts of breasts and have different sizes,

Pattern recognition techniques for automatic detection of suspicious-looking anomalies 137

Fig. 3 A mammographic image depicting the breast re-gion with an area of 5 cm× 5 cm. In this image a clus-ter of micro–calcifications has been marked by the whitedashed oval. These micro–calcifications are small andround. Such appearances are sometimes referred to aspunctuated calcifications and are rarely associated withcancer. A benign process is more probable. The imagecontrast has been manually adjusted for best visibilityof the micro–calcifications. Consequently, regions with aless dense tissue have been suppressed and are depictedin dark areas.

shapes and boundaries. Guideline for assessing therisk of malignancy on the basis of radiological ap-pearance are given in Ref. [4]. From a diagnosticpoint of view, the most important feature of a massis the morphology of its boundary. In particular, aclose examination of a mass border allows one toassess the probability whether the mass is a malig-nant tumour. Masses with well-defined, sharp bor-ders are usually benign. An example of such a massis depicted in Fig. 5. In case of masses with lobu-lated shapes, as in Fig. 6, the risk of malignancyincreases. However, the most suspicious are themasses with ill-defined or spiculated margins. Anexample of a spiculated mass is pictured in Fig. 7.Such an appearance of a mass may indicate a ma-lignant, infiltrating tumour.

Apart from micro–calcifications and anomalousmasses, one can also recognize two other kinds ofabnormalities in mammograms. First, the mammo-gram may reveal a structural distortion, i.e., a sit-uation in which tissue in a breast region appearsto be pulled towards its centre. Second, the leftand right breasts may appear significantly different.This situation is referred to as an asymmetric den-sity. These two types of breast abnormalities can

Fig. 4 A mammographic image depicting the breast re-gion with an area of 5 cm× 5 cm. On the image a clusterof micro–calcifications has been marked, which are highlyheterogonous in both size and shape. This is referred toas pleomorphic calcifications. Furthermore, the clusteris isolated, i.e. no micro–calcifications can be identifiedin the surrounding tissue. This two observations suggesta high risk of breast cancer. The contrast on the imagehas been manually adjusted for best visibility. Regions ofa less dense tissue has been suppressed and are depictedas a black area.

rarely be targeted by the commonly used CAD sys-tems.

2. Background

The recognition of suspicious abnormalities indigital mammograms still remains a difficult task.There are at least several reasons for this sit-uation. First, mammography provides relativelylow contrast images, especially in case of densebreasts, commonly found in young women. Second,symptoms of the presence of abnormal tissuemay remain quite subtle. For example, spiculatedmasses that may indicate a malignancy are oftendifficult to detect, especially at an early stage ofdevelopment. Important abnormality markers, themicro–calcification clusters, are easier to detect.However, in both cases one has to decide, witha significant level of uncertainty, whether thedetected lesion is benign or malignant. For thesereasons, we need robust algorithms for enhancingmammogram contrast, segmentation, detection ofmicro–calcifications and malignancy assessment.

138 T. Arodz et al.

Fig. 5 A mammogram of a left breast in the medio-lateral oblique (MLO) projection. On this image ananomalous mass has been marked. This mass is circum-scribed and has a well–defined, sharp border. Further-more, no other masses can be identified within thebreast. These findings suggest that this lesion is benign.The image has been taken from the mini-MIAS databaseof mammograms.

2.1. State-of-the-art

Various techniques have been proposed in the liter-ature for enhancing the contrast of digital mammo-grams. These include: techniques based on fractals[5], wavelet transform [6], homogeneity measures[7], and others. The segmentation of micro–calcifications has been done using e.g. morpholog-ical filters [8], multiresolutional analysis [9], andfuzzy logic [10]. Furthermore, for detecting micro–calcification and malignancy assessment, severalclassification algorithms have been used. Theseinclude: neural networks [11], nearest neighbourclassifier [12], multiple expert systems [13], andsupport vector machines [14]. A more detailed sur-vey on the techniques used in the automatic anal-ysis of digital mammograms can be found in [15].

Fig. 6 A mammogram of a right breast in the MLO pro-jection with an anomalous mass marked . The border ofthis mass is not sharp. It is to some degree lobulated.Such ill–defined or lobulated masses are of more seriousconcern than those with sharp borders, especially if thelobulations are large and there are many of them. Thisimage was obtained from the mini-MIAS database.

Recently a number of information technologi-cal initiatives have been undertaken, which aredevoted to the development of CAD systems anddigital mammography. A good example is theNational Digital Mammography Archive project1

[16,17]. This is a collaborative effort between theUniversity of Pennsylvania Medical Center, Univer-sity of Chicago Department of Radiology, Universityof North Carolina – Chapel Hill School of MedicineDepartment of Radiology – Breast Imaging, Sun-nybrook and Women’s College Health SciencesCentre of the University of Toronto, and AdvancedComputing Technologies Division of BWXT Y-12L.L.C. in Oak Ridge Tennessee. Its aim is to developa national archive for breast imaging, which isavailable over the net. Furthermore, a nationalnetwork and cyber–infrastructure devoted to digitalmammography will be created because of the datadeluge to be expected from the enormous numberof high–resolution mammograms at an annualbasis. A Digital Database for Screening Mammog-raphy has been created at the University of SouthFlorida [18]. This voluminous database provideshigh–resolution digitised mammograms for devel-oping easy–to–use cancer detection algorithms and

1 http://nscp.upenn.edu/NDMA

http://nscp.upenn.edu/NDMA


Fig. 7 A mammogram of a right breast in the MLO pro-jection on which a mass has been marked. This exampleis referred to as spiculated mass and usually indicates thepresence of a malignant, invasive tumour. Image from themini-MIAS database.

enabling the comparative analysis of detection ac-curacy. Another database of digital mammograms,i.e. mini-MIAS, is provided by The MammographicImage Analysis Society [19]. Devising an efficientsystem for handling many mammograms, with upto petabytes of accumulated data, is indeed a chal-lenging task in computer science and informationtechnology.

2.2. Motivation

In Ref. [20] we have used two algorithms for facedetection of images, i.e., AdaBoost with simplerectangular features [21] and SVM with log-polarsampling grid [22]. Our goal is to verify whetherthese algorithms are suitable for distinguishing be-tween normal and abnormal regions of breast im-ages. However, the problem of selecting the sus-

picious regions from the entire area of the breastis beyond the scope of this study. We note thatthe algorithms have been adapted directly to thisdomain, i.e., only some parameters have beenchanged but the image feature extraction methodsand the classifiers were the same as in [20].

The algorithms have been evaluated on theDDSM [18] database of mammogram images. Thisdatabase consists of a large set of breast images inboth MLO, i.e., medio-lateral and CC, i.e., cranio-caudial projections. For each patient, four imagesare present, including the above types of pro-jections for the left and right breast. From thedatabase, we have chosen the BCRP MASS 0 andBCRP CALC 0 subsets, including cases of malignantmasses and microcalcifications, respectively. Fromthese images, we have obtained 168 samples ofbreast regions with lesions and 1017 samples ofbreast regions without any lesion. These sampleswere used in the training of the classifiers and eval-uating of classification accuracy.

3. Design considerations

TtoisfTitfag

fimobt(cltqa

t

ti

he design of the microcalcification detection sys-em is based on the learning, classifier-based meth-ds [23]. The general scheme is presented belown Algorithm 1. These methods consist of severalteps. First, a set of rectangular regions is selectedrom each of the available mammogram images.he selected regions are then partitioned into train-ng and testing samples. Afterwards, a set of fea-ures is extracted from each of the samples. Theeatures from training samples are used to train anlgorithm capable of making a binary decision on aiven image features.After the classifier has been trained, the image

eatures from the testing set are used to evaluatets effectiveness. In particular, a 2× 2 confusionatrix [23] can be computed: Ai,j. The element ai,j

f this matrix represents the number of sampleselonging to class i, provided that the classifica-ion algorithm decides that it belongs to class j,i, j ∈ {1− lesion, 0− nonlesion}). Since the lesionan be treated as the positive diagnosis, whileack of lesion in the sample as negative diagnosis,he confusion matrix can lead to four valuesuantifying the behaviour of the classifier. Thesere: true positive ratio specified as the a1,1

(a1,0+a1,1),

rue negative ratio a0,0(a0,0+a0,1)

, false positive ratioa0,1

(a0,0+a0,1)and false negative ratio a1,0

(a1,0+a1,1). The

rue positive ratio is also referred to as sensitiv-ty of the classifier, while the true negative as

140 T. Arodz et al.

Algorithm 1 Detection scheme for suspicious lesions

specificity. Finally, the overall accuracy, or thesuccess rate of the classifier can be quantified as

a1,1+a0,0(a0,0+a0,1+a1,0+a1,1)

. Ideally, the confusion matrixshould have values near 1.0 on the main diagonal,i.e., for true positives and negatives and valuesnear 0 along the second diagonal, i.e., for falsepositives and negatives. These situations indi-cate that the classification algorithm is capableof properly discriminating between regions ofmammograms with lesions or no lesions.

4. System description

As already noted, two classifier-based algorithmsare presented and evaluated in this paper. Theseare the systems based on SVM and on AdaBoost clas-sifiers.

4.1. Treatment of input images

From the DDSM database we have obtained 168 im-age samples with lesions and 1017 without lesions.In order to extract the samples from the original fullbreast DDSM images, we have used the information

of the two breasts. Therefore, we captured rectan-gular regions from the non-lesion breast. To makethe non-lesion and lesion samples most informative,i.e., differing only in the presence of the lesions andnon with e.g. tissue type, we have used the sameregion within the non-cancer breast as in the cancerbreast.

Typically the ratio of non-lesion samples is muchgreater than that of lesion samples. Thus, we haveused as additional non-lesion samples the rectangu-lar regions adjacent to the lesion region from thetop, right, bottom and left, provided they do notcontain another lesion. The same procedure wascarried out for the non-lesion samples in the non-lesion breast. Afterwards we discarded regions con-taining visible artefacts, e.g. mammogram border.

The resulting 168 abnormal, lesion samples, in-cluding 91 microcalcifications and 77 masses, and1017 normal, non-lesion samples were divided intotwo sets:

– training set of 385 samples, including 297 normalsamples and 88 abnormal samples (51 microcalci-fications and 37 masses),

– test set of 800 samples, including 720 normaland 80 abnormal samples (40 calcifications and

stsIi

osiit

on the outline of the lesion. The bounding rectan-gle with margins of 100 pixel was extracted fromeach mammogram that contained a lesion. After-wards, we have estimated the centre of the lesion.Next, each rectangle was cropped to a largest possi-ble square region, with lesion in the centre. Finally,we estimated the diameters of the lesions. Thesetransformations produced a set of square regions ofdifferent sizes, each one containing a lesion in thecentre.

The above procedure allowed us to obtain thelesion samples. For gathering the non-lesion cases,we have used the following procedure. In most pa-tients, the cancerous lesion was present in only one

40 masses).

The number of lesions in the testing set was cho-en to be 10% of the total number of samples in theest set. The exact ratio of normal and abnormalamples used in training depends on the classifier.n some cases, not all non-lesion samples were usedn training.Both of the classifiers operate on a square input

f fixed width. Since the extracted regions can beignificantly larger in diameter than this expectednput, we employ the wavelet approximation of thenput samples. There are two scenarios for choosinghe level of this approximation:


Fixed scaling - each sample is downscaled, usingthe Daubechies–4 wavelet [24] a fixed number oftimes. Then, the central square of a required sizeis extracted. The diameter of the change is notused.Variable scaling - each sample is downscaled, us-ing the Daubechies–4 wavelet. However, the num-ber of successive approximations depends on thediameter of the change. It is chosen so that thediameter of the change is always smaller thanthe classifier window width but larger than a halfof it. Then the central region of the downscaledsample is extracted.

For images of normal breasts, the diameter ofthe change is not valid, since no change is detected.Therefore, in the first scenario, we use the con-stant level of wavelet approximation. In the secondscenario, the level of approximation is randomlyselected from the range of scales found in the sam-ples that contain the cancerous change. Moreover,the distribution of the scales approximates thedistribution of scales for the abnormal samples.

The AdaBoost method uses additional filtering ofinput images to increase the accuracy. As the SVMcpf

iamcofot

4ss

4Tpams

ph

f

wi

Fig. 8 A hyperplane that separates two classes of vec-tors. In dot-product space the hyperplane can be spec-ified as a set of vectors x that satisfies 〈w, x〉 + b = 0.The vectorw is perpendicular to the hyperplane, whereasscalar b

‖w‖ specifies the offset from the beginning of thecoordinate system. The hyperplane depicted on the fig-ure is the one that maximizes the margin to the separatedvectors.

fies the offset from the beginning of the coordinatesystem. An example of the maximal margin hyper-plane given by this form is depicted in Fig. 8. Inorder to allow for a construction of non-linear de-cision boundaries, the separation is performed in afeature space F, which is introduced by a nonlinearmapping ˚ of the input patterns. As the hyperplaneconstruction involves the computation of the innerproducts on the feature space, this mapping ˚ mustsatisfy:

〈˚(x1), ˚(x2)〉 = k(x1, x2) ∀x1, x2 ∈ X (2)

for some kernel function k(·, ·). The kernel func-tion represents the non-linear transformation of theoriginal feature space into the F. Here, xi ∈ X de-notes the input patterns. Maximizing the distanceof this separation is equivalent to the minimizationof the norm-squared: 12‖w‖2 [29]. However, to guar-antee that the resultant hyperplane separates theclasses, the following constraints must be satisfied:

yi · (〈w, xi〉 + b) ≥ 1− i, i ≥ 0, i = 1, . . . n

(3)

where y ∈ {−1, 1} denotes the class label corre-sdutstfi

lassifier uses responses of Gabor filters as an in-ut features, additional filtering is inappropriateor this method.Finally, we use AdaBoost or SVM for classifying

mage windows of fixed width. Therefore, we needn efficient method of moving the window on aammogram. The classifier determines whether aancerous change is present in each of the windowsr not. However, the method for conducting an ef-ective search through the image is out of scopef this paper, as we will focus on the classificationask.

.2. Detection of abnormalities usingupport vector classification and log-polarampling of the Gabor decomposition

.2.1. Support vector classificationhe support vector classification (SVC) was first pro-osed for optical character recognition in [25]. Newlgorithms were also proposed for regression esti-ation [26], novelty detection [27], operator inver-ion [28] and other problems.The SVC algorithm separates the classes of input

atterns with the maximal margin hyperplane. Theyperplane is constructed as:

(x) = 〈w, x〉 + b (1)

here: x is the feature vector, w is the vector thats perpendicular to the hyperplane, and b

‖w‖ speci-

i

ponding to the input pattern xi. These constraintso not impose a strict class separation. Instead, wetilized slack variables i to allow for the training ofhe classifier on linearly non-separable classes. Thelack variables must be penalized in the minimiza-ion term. Consequently, learning of the SVC classi-er is equivalent to solving a minimization problem

142 T. Arodz et al.

with the objective function of the form:

minw ∈ X� ∈ R

n

12‖w‖2 + C

n∑

i=1i (4)

and the constraints are given by Eq. (3). The pa-rameter C controls the penalty for misclassifyingthe training samples (see Ref. [20]). Using the La-grange multiplier technique, we can transform thisoptimization problem to a dual form:

min˛∈Rn

∑ni=1 ˛i − 1

2

n∑

i,j=1˛i˛jyiyj · k(xi, xj)

subject to :0 ≤ ˛i ≤ C

n∑

i=1˛iyi = 0

(5)

In the above formulation, the ˛ = {˛1, ˛2, . . . , ˛n}is the vector of Lagrange multipliers. Furthermore,the feature space dot-products between input pat-terns are computed by using a kernel functionk(·, ·). This is possible, as the mapping ˚ must sat-

have been proposed in [34] as a Gabor elementaryfunction and afterwards extended in [35] to two–dimensional image operators. The two–dimensionalGabor filter is defined as:

G(x, y) = 12��x�y

e−1/2(x2/�2x +y2/�2y )ei2��0x (8)

This filter is a plane sinusoidal wave modulated bya Gaussian and is sensitive to the image details thatwithin the Fourier plane corresponds to the fre-quencies near �0. The �x and �y parameters arewidths of the Gaussian function along the x- andy-axis respectively. As the wave vector of this fil-ter is parallel to the x axis, the filter is sensitive tothe vertical image details. However, for construct-ing a filter sensitive to image details with some ori-entation angle � �= 0, it is sufficient to rotate theoriginal filter from Eq. (8). In [22] the modified Ga-bor filters are employed, which are cast in log-polarcoordinates:

G(r, �) = A e−(r−r0)2/2�2r e−(�−�0)2/2�2� (9)

where:

r = log√

�2 + �2,(10)

Twt

bq(clf

pbeFgbta

4a

Ubisa

isfy Eq. (2). The Lagrange multipliers that solvesthe Eq. (5) can be used to compute the decisionfunction:

f(x) =n∑

i=1˛iyik(xi, x)+ b (6)

where:

b = yi −n∑

j=1˛jyjk(xj, xi) (7)

The solution to Eq. (5) can be found using any gen-eral purpose quadratic programming solver. Fur-thermore, dedicated heuristic methods have beendeveloped that can solve large problems efficiently[30—32].

Finally, we note that the optimization problemfrom Eq. (5) is convex [33]. Therefore, SVC has animportant advantage over neural networks, wherethe existence of many local minima makes thelearning process rather complex, which often leadsto a poor classification. Training of the supportvector classifier is also much more insensitive tooverfitting than the classic feed-forward neuralnetwork.

4.2.2. Log-polar sampling of the GabordecompositionThe Gabor filters are very useful for extractingfeature vectors from images. Originally, the filters

� = arctan�

�

he �r and �� parameters controls the radius-axisidth and angle-axis width of the Gaussian func-ion.This approach is useful in construction of filter

anks with different orientations � and central fre-uencies �0. In particular, the filters defined by Eq.9) do not overlap at low frequencies, whereas theonstruction based on Eq. (8) requires a careful se-ection of �x and �y values for filters with small �0requency.Finally, in [22] a log-polar, spatial grid has been

roposed to sample the responses of a bank of Ga-or filters. It consists of points arranged in sev-ral circles with logarithmically spaced radii (seeig. 9). For computing the feature vector for aiven image point X, the image is filtered with aank of modified Gabor filters and magnitudes ofhe responses are sampled with the grid centredt X.

.3. Boosting method for detectingbnormalities

nlike neural networks [11], the boosting method isased on the idea of combining multiple classifiersnto a single, but much more reliable, classifier. Theet of weak classifiers that contribute to the finalnswer can be simple and erroneous to some ex-


Fig. 9 A grid used to sample the responses of a bankof Gabor filters, as proposed in [22]. To sample the re-sponses for a given point p, the grid is centred in p. Af-terwards, the magnitudes of the complex responses ofthe bank filter are computed at each grid point, the finalresult is a vector, whose coordinates are the magnitudesof filter responses collected from all grid points in a pre-defined order.

tent. However, a scheme for training them has beendevised in order to have a small error in the finalclassification. For additional information concern-ing the method of boosting, the reader is urged toconsult e.g. Ref. [36] or Ref. [37].

From the many boosting algorithms we have fa-vored the AdaBoost classifier [38]. The pseudocodeof this classifier is presented below. This classifieris used also in Ref. [21]. In the training phase, eachsample vector from the training set is weighted. Ini-tially, the weights are uniform for all the vectors.Then, at each iteration, a weak classifier is trainedto minimize a weighted error for the samples. Each

iteration changes the weights values reducing themby an amount, which depends on the error of theweak classifier on the entire training set. However,this reduction is made only for the examples thatwere correctly classified by the classifier trained inthe current iteration. The weight of the weak clas-sifier within the whole ensemble is also connectedto its training error.

Assigning non–uniform, time–varying weightsto the training vectors is crucial for minimizing theerror rate of the final, aggregated classifier. Duringtraining, the ability of the classifier to classifytraining set correctly is constantly increased. Thereason is that weak classifiers used by AdaBoostare complementary. Thus the samples vectorsthat were misclassified by some weak classi-fiers are classified correctly by the other weakclassifiers.

The process of training the classifier is summa-rized in the form of Algorithm 2. In particular, ineach round t of the total T rounds, a weak classi-fier ht is trained on the training set Tr with weightsDt. The training set is formed by examples fromdomain X labelled with labels from a set C. Thetraining of the weak classifier is left to an unspeci-fiedWeakLearner algorithm, which should minimizetw

h

fiivrsttt

e Ad
Algorithm 2 Th
he training error εt of the produced weak classifierith respect to the weights Dt.Based on the error εt of the weak classifier

t, the parameters ˛t and ˇt are calculated. Therst of the parameters defines the weight of ht

n the final, combined classifier. The second pro-ides a multiplicative constant, which is used toeduce the weights {Dt+1(i)} of the correctly clas-ified examples {i}. The weights of the exampleshat were misclassified are not changed. Thus, af-er normalizing the new weights {Dt+1(i)}, the rela-ive weights of the misclassified examples from the

aBoost algorithm

144 T. Arodz et al.

training set are increased. Therefore, in the ht+1round, the WeakLearner is more focused on theseexamples. In this way we have enhanced the chancethat the classifier ht+1 will learn to classify themcorrectly.

The final, strong classifier hfin employs aweighted voting scheme over the results of theweak classifiers ht. The weights of the individualclassifiers are defined by the constants ˛t.

There are two special cases, which are treatedindividually during the algorithm execution. One isthe case of εt equal to zero. In this case, the weightsDt+1 would be equal to Dt, and ht+1 to ht. There-fore, the algorithm does not go further ahead withtraining. The second case is of εt ≥ 0.5. In this case,the theoretical constraints on ht are not satisfied,and the algorithm cannot continue with new roundsof training.

One of the most important issues in using the Ad-aBoost scheme is the choice of the weak classifierthat separates the examples into the two classes tobe discriminated. Following [21], a classifier thatselects a single feature from the entire feature vec-tor is used. The training of the weak classifier con-sists of selecting the best feature and of choosinga threshold value for this feature, which optimally

5. Status report

5.1. Evaluation of the algorithm based onthe SVM classifier

In this section we evaluate the method based on theSVM algorithm previously used for face detection in[20]. The evaluation was carried out on the set ofimages described in Section 4.1.

5.1.1. Parameters of the log-polar sampling gridand the bank of Gabor FiltersThe image feature vectors were extracted usinglog-polar sampling grid composed of 51 points.These points were arranged in 6 circles with theradii spaced logarithmically between 5 and 40points. The bank of Gabor filters used with thesampling grid consisted of 20 filters with a sizeof 85× 85 points. The filters were arranged into4 logarithmically spaced frequency channels and 5uniformly spaced orientation channels. The lowestnormalized frequency presented in the filter bankwas 1

7 , whereas the highest was12 . The orientation

channels cover the entire spectrum, i.e., from 0 to45� radian.

5Tf

––

–

Imnuesip�

fvhflFrsT

sc

separates the examples belonging one class fromexamples belonging to the other class. The selec-tion involves minimizing the weighted error for thetraining set. The feature set consists of the fea-tures, which are computed as differences of thesum of pixels intensities inside two, three or fouradjacent rectangles. These rectangles are of vari-ous sizes and positions within the image window, aslong as their contiguity is maintained. The classifieroperates on an image window of the size 24 × 24pixels.

4.4. Implementation note

The system was implemented in the Matlab ver-sion 6.5 environment. For the SVM classifier weused OSU SVM toolbox version 3.0. The AdaBoosthas been implemented by the authors in Matlabwith some functions implemented as external C li-braries. For scaling of the images, we used MatlabWavelet Toolbox. Image filtering was carried out us-ing the Matlab Image Processing Toolbox. The im-ages were stored in uncompressed, 16 bit per pixel,gray-scale TIFF files. The tests were carried out onthe Sun Microsystems SunBlade 2000 machine run-ning SunOS 5.9 operating system. As the tests didnot require any manual evaluation of images, nospecial presentation or user navigation tools weredeveloped.

.1.2. Training phase and the results of testshe SVM classifier was evaluated independently forollowing three kernel functions:

linear kernel: k(x, y) = 〈x, y〉polynomial kernel of order 4: k(x, y) = (�〈x, y〉 +�)4

Gaussian RBF kernel: k(x, y) = e−‖x−y‖/2�2

n order to select reasonable values for the SVMisclassification penalty and parameters of ker-el functions, we performed a parameter studysing training set with 1-level fixed scaling. Forach of the kernel functions, we evaluated thepecificity of the classifier on the training set us-ng the following values for the misclassificationenalty: C ∈ {1, 2, . . . , 50}. The parameters � andwere set to 1.0 in these tests. Afterwards,

or each of the kernel functions we selected thealue of misclassification penalty that yielded theighest sensitivity. These values were used in allurther tests. Using the same approach we se-ected the values for the parameters � and �.or both parameters we evaluated the followingange of values: �, � ∈ {0.1, 0.2, . . . , 10.0}. The re-ults of the parameter study are summarized inable 1.To obtain the overall classification accuracy,

ensitivity and specificity, we have validated thelassifier on the test set. The results for various


Table 1 Values for parameters of kernel func-tions and misclassification penalties used in the SVMclassifier

Kernelfunction

Parameters of thekernel function

Misclassificationpenalty

Linear – C = 9.0Polynomial � = 0.7 � = 0.5 C = 3.0Gaussian RBF � = 0.9 C = 7.0

combinations of the kernel function and waveletscaling mode are presented in Table 2.

In the tests, the highest sensitivity was achievedwhen using 2-level scaling and Gaussian RBF kernel.The highest specificity and accuracy was achievedwith variable scaling. Also in this case the GaussianRBF kernel performs best.

To evaluate the performance of the SVM classifierfor recognizing particular types of lesions, two ad-ditional tests were performed. In the first test thetraining and testing sets consisted of normal sam-ples and microcalcifications. In the second test weused training and testing set composed of normalsamples and masses. The results of these tests arepresented in Tables 3 and 4 respectively.

The SVM classifier has significantly higher sen-sitivity to microcalcifications than to masses. The

specificity and accuracy is, in general, slightlyhigher for masses. However, the differences inthese two performance measures are small. There-fore we can conclude, that the algorithm performsbetter for microcalcification than for masses.The best sensitivity to microcalcifications wasobtained when using 2-level fixed wavelet scalingwith Gaussian RBF kernel. The best specificity andoverall accuracy was obtained with 1-level scalingand linear kernel. For masses, the highest speci-ficity, sensitivity and accuracy was obtained whenusing variable scaling and Gaussian RBF kernelfunction.

5.2. Evaluation of the AdaBoost-basedclassification algorithm

In this section we evaluate the method based on theAdaBoost algorithm used for face detection in [20].The AdaBoost classifier was trained on the samedatabase as the SVM. The algorithm used central,rectangular part of the images with size equal to24 × 24 pixels.

5Bfi

Table 2 Results for SVM classifier on the testing set of 80

Scaling type Kernel function Sensitivi

Fixed - 1 Linear 52.5Fixed - 1 Polynomial 57.5Fixed - 1 Gaussian RBF 56.2

Fixed - 2 Linear 62.5

f 40

itivi

Fixed - 2 Polynomial 67.5Fixed - 2 Gaussian RBF 68.8

Variable Linear 57.5Variable Polynomial 55.0Variable Gaussian RBF 57.5

Table 3 Results for SVM classifier on the testing set o

Scaling type Kernel function Sens



Variable Linear 67.5
Variable Polynomial 65.0Variable Gaussian RBF 65.0
.2.1. Input image filteringefore we apply the wavelet scaling, the image isltered with various filters. The filters below rep-

lesion and 720 non-lesion breast samples

ty (%) Specificity (%) Accuracy (%)

74.7 72.576.1 74.275.7 73.8

75.3 74.075.0 74.275.4 74.8

82.1 80.182.5 81.083.3 82.0

microcalcifications and 720 non-lesion breast samples

ty (%) Specificity (%) Accuracy (%)

83.3 81.481.2 79.981.5 80.1

77.8 77.276.4 76.276.9 76.7

73.3 73.0
77.9 77.279.4 78.7

146 T. Arodz et al.

Table 4 Results for SVM classifier on the testing set of 40 masses and 720 non-lesion breast samples

Scaling type Kernel function Sensitivity (%) Specificity (%) Accuracy (%)

Fixed - 1 Linear 50.0 81.2 79.6Fixed - 1 Polynomial 50.0 82.5 80.8Fixed - 1 Gaussian RBF 50.0 82.1 80.4

Fixed - 2 Linear 52.5 78.5 77.1Fixed - 2 Polynomial 55.0 78.5 77.2Fixed - 2 Gaussian RBF 55.0 78.5 77.2

Variable Linear 57.5 82.8 80.1Variable Polynomial 55.0 82.5 81.0Variable Gaussian RBF 57.5 83.3 82.0

resent a group of typical basic filters used in imageprocessing. The following filters [39] are used:

No filtering - No filteringUnsharp - Unsharp contrast enhancement filter,i.e., negation of the LaplacianSobel - Sobel horizontal filterLaplacian - Laplacian filterLoG - Laplacian of Gaussian filterDilate - Greyscale dilation using disk structuringelement

5.2.2. Classifier training and results of the testsAfter filtering the wavelet scaling is done and theclassifier is trained. The number of rounds of theclassifier is set to 200. For each different filter-ing type and wavelet scaling mode, a different

classifier is trained. For each configuration, anoverall classification accuracy on the testing set,as well as the sensitivity and specificity of thetrained classifier is given in Table 5. In the test,the Fixed scaling with no filtering achieved thehighest results. The highest sensitivity is achievedfor 4-level scaling. For sensitivity, four levels ofscaling are the best. For overall classifier accu-racy, three levels of wavelet scaling yield betterresults.

To find out how each type of the lesion con-tributed to these results, we have evaluated theclassifier for a training and testing sets including ei-ther only microcalcifications and normal samples,or, in a second scenario, only masses and normalsamples. The results are given in Tables 6 and 7 re-spectively.

Table 5 Results for AdaBoost classifier in various configurations on the testing set of 80 lesion and 720 non-lesionbreast samples

Scaling type Filtering type Sensitivity (%) Specificity (%) Accuracy (%)

Fixed - 3 No filtering 73.8 77.2 76.9Fixed - 3 Unsharp 77.5 74.6 74.9

is spe

Fixed - 3 Sobel 60.0Fixed - 3 Laplacian 48.8Fixed - 3 LoG 51.2Fixed - 3 Dilate 77.5

Fixed - 4 No filtering 82.5Fixed - 4 Unsharp 80.0Fixed - 4 Sobel 70.0Fixed - 4 Laplacian 41.2Fixed - 4 LoG 53.8Fixed - 4 Dilate 75.0

Variable No filtering 68.8Variable Unsharp 66.2Variable Sobel 61.2Variable Laplacian 50.0Variable LoG 52.5Variable Dilate 72.5For fixed wavelet scaling, the level of wavelet approximation

74.0 72.662.5 61.170.7 68.875.8 76.0

73.8 74.674.2 74.874.0 73.678.2 74.578.9 76.477.9 77.6

68.5 68.572.2 71.671.9 70.965.3 63.866.5 65.167.2 67.8

cified.


Table 6 Results for AdaBoost classifier on microcalcifications only for various configurations on the testing set of40 microcalcifications and 720 non-lesion breast samples


Fixed - 3 No filtering 62.5 79.4 78.6Fixed - 3 Unsharp 65.0 73.5 73.0Fixed - 3 Sobel 62.5 65.0 64.9Fixed - 3 Laplacian 57.5 68.2 67.6Fixed - 3 LoG 55.0 61.5 61.2Fixed - 3 Dilate 70.0 76.7 76.3


Variable No filtering 60.0 54.6 54.9Variable Unsharp 40.0 58.2 57.2Variable Sobel 47.5 57.6 57.1Variable Laplacian 45.0 74.9 73.3Variable LoG 35.0 71.8 69.9Variable Dilate 65.0 54.2 54.7For fixed wavelet scaling, the level of wavelet approximation is specified.

In calcification detection, the dilation of the im-age resulted in some improvement of the results.In particular, it increased the sensitivity, as a costof slight decrease in specificity in all three scalingconfigurations.

The AdaBoost classifier obtained significantlybetter results for masses than for microcalcifica-

tions. This can be attributed to the nature of thefeatures used. The features are taken directly fromthe face recognition domain, in which the recog-nized object is of similar size and shape. This typeof object is more similar to masses than to micro-calcification clusters, which are highly non-uniformand not as well localized.

Table 7 Results for masses only for AdaBoost classifier in various configurations on the testing set of 40 massesand 720 non-lesion breast samples




Variable No filtering 92.5 88.5 88.7

is spe

Variable Unsharp 87.5Variable Sobel 70.0Variable Laplacian 72.5Variable LoG 80.0Variable Dilate 85.0For fixed wavelet scaling, the level of wavelet approximation

89.7 89.686.9 86.163.6 64.162.5 63.490.3 90.0

cified.

148 T. Arodz et al.

6. Lessons learned and future plans

In case of the SVM-based approach, our results sug-gest that the algorithm fails to identify the breastimage features that can be used to indicate thepresence of an abnormal tissue. The overall sen-sitivity to abnormal breast regions is below 70%. Aslightly better result was obtained when focussingonly on microcalcifications. Here the highest sensi-tivity was equal to 72.5%. On the other hand, thesensitivity for masses only is below 60%. The speci-ficity of the method is higher and reaches 83.3% inall three types of tests.

The most probable reason behind this behaviouris that the image feature extraction method is notsuitable for classifying mammogram images. Thelog-polar sampling grid was proposed to detect fa-cial landmarks, e.g. eyes, in face images [20]. Fea-ture extraction methods, based on Gabor filtering,are appropriate for these tasks. However, in mam-mogram image classification more localized fea-tures are necessary as indicators of abnormality.Such features would be more adequate for detec-tion of borders of masses or detection of small mi-crocalcifications.

mammograms digitised at high resolution, the clas-sification algorithms are suitable only for the finaldecision on the presence of an abnormality. How-ever, a fast algorithm is needed that would allow usto select suspicious locations within the image andto discard quickly the uninteresting regions.

In the future we should be focussing on the de-velopment of the following procedures based on alocal processing of data:

1. Fast and accurate algorithms for the selectionof suspicious regions within digitised mammo-grams,

2. Image feature extraction algorithms tailored tothe analysis of digitised mammograms,

3. Filters that can accentuate the abnormal tissuein digitised mammograms, as e.g. in [40].

Acknowledgments

We thank Professor Robert Hollebeck, University ofPennsylvania, for his encouragement and sugges-tions and Dr. T. Popiela from Department of Ra-diology, Collegium Medicum, Jagiellonian Univer-stitdM

R

Results for the AdaBoost-based approach arepromising. The classifier accuracy and sensitivityreaches 90% for masses and decreases to about 76%accuracy and sensitivity for all lesion types. This de-crease is caused by inability of the classifier to rec-ognize microcalcifications. The accuracy for onlymicrocalcification lesion type is also 78%, however,the sensitivity is only 70%. The introduction of var-ious filters does not lead to significant changes inthe recognition accuracy.

These results can be treated as the lowest val-ues we can expect from a mammogram detectionalgorithm. The tested algorithms were not origi-nally created with cancer detection in mind. Theyhad been directly applied from the face recogni-tion field, which could be considered as one of thebenchmark problems in the image processing andrecognition field. Therefore, we emphasize herethat any algorithm dedicated for the cancer detec-tion should achieve at least this degree of accuracy,for it to be considered worthwhile.

When there are many candidate regions, our re-sults cannot precisely identify the abnormal lesionsin the breast image. In this case, our algorithmwould select a significant number of normal tissuesamples. Our study also suggests that the image fea-tures used are the major limitation for any furtherincrease of the recognition accuracy.

As already noted, the problem of selecting thesuspicious regions of a mammogram was beyond thescope of this study. Due to the very large size of the

ity, Krakow (Poland) for medical consultations. Wehank Ben Holtzmann and Lilli Yang for contribut-ng to Fig. 1. This research has been supported byhe Math-Geo program of National Science Foun-ation and the Digital Technology Center of Univ.innesota.

eferences

[1] Cancer facts and figures, the American Cancer Society,2003.

[2] T. Freer, M. Ulissey, Screening Mammography withComputer-aided Detection: Prospective Study of 12,860 Pa-tients in a Community Breast Center, Radiology 220 (3)(2001) 781–786.

[3] M. LeGal, G. Chavanne, Valeur diagnostique des microcalci-fications groupees decouvertes par mammographies, Bull.Cancer (71) (1984) 57–64.

[4] A.C. of Radiology, BI-RADS: Mammography, in: Breast Imag-ing Reporting and Data System: BI-RADS Atlas. 4th ed.,American College of Radiology, Reston, Va, 2003.

[5] H. Li, K. Liu, S. Lo, Fractal modeling and segmentation forthe enhancement of microcalcifications in digital mammo-grams, IEEE Trans. Med. Imag. 16 (1997) 785–798.

[6] A. Laine, S. Schuler, J. Fan, W. Huda, Mammographic fea-ture enhancement by multiscale analysis, IEEE Trans. Med.Imag. 13 (1994) 725–740.

[7] H. Cheng,W. Jinglia, S. Xiangjuna, Microcalcification detec-tion using fuzzy logic and scale space approaches, PatternRec. 37 (2) (2004) 363–375.

[8] D. Zhao, M. Shridhar, D. Daut, Morphology on detection ofcalcifications in mammograms, Proc. ICASSP92, 1992, pp.129–132.


[9] T. Netsch, H. Peitgen, Scale-space signatures for the de-tection of clustered microcalcifications in digital mammo-grams, IEEE Trans. Med. Imag. 18 (9) (1999) 774–786.

[10] N. Pandey, Z. Salcic, J. Sivaswamy, Fuzzy logic based mi-crocalcification detection, Proceedings of the IEEE NeuralNetworks for Signal Processing Workshop X, 2000, pp. 662–671.

[11] S. Sehad, S. Desarnaud, A. Strauss, Artificial neural classi-fication of clustered microcalcifications on digitized mam-mograms, Proc. IEEE Int. Conf. Syst. Man Cybernet., 1997,pp. 4217–4222.

[12] T. Bhangale, U. Desai, U. Sharma, An unsupervised schemefor detection of microcalcifications on mammograms, Int.Conf. Image Proc. vol. 1, 2000, pp. 184–187.

[13] L. Cordella, F. Tortorella, M. Vento, Combining experts withdifferent features for classifying clustered microcalcifica-tions in mammograms, International Conference on PatternRecognition, vol. 4, 2000.

[14] I. El-Naqa, Y. Yang, M. Wernick, N. Galatsanos, R.Nishikawa, A support vector machine approach for detec-tion of microcalcifications, IEEE Trans. Med. Imag. 21 (12)(2002) 1552–1563.

[15] H. Cheng, X. Cai, X. Chen, L. Hu, X. Lou, Computer-aideddetection and classification of microcalcifications in mam-mograms: a survey, Pattern Rec. 36 (2003) 2967–2991.

[16] R. Hollebeek, NDMA: Collecting and Organizing a LargeScale Collection of Medical Records, Minnesota Supercom-puting Institute, September 19, 2003.

[17] R. Hollebeek, National digital mammography archive,bIO2003 Bioinformatics Track, Washington DC (2003).

[18] M. Heath, K. Bowyer, D.K.R. Moore, P. Kegelmeyer, The dig-

[26] V. Vapnik, S. Golowich, A.J. Smola, Support vector methodfor function approximation, regression estimation, and sig-nal processing, in: M. Mozer, M. Jordan, T. Petsche (Eds.),Advances in Neural Information Processing Systems 9, MITPress, Cambridge, MA, 1997, pp. 281–287.

[27] B. Scholkopf, J. Platt, J. Shawe-Taylor, A.J. Smola,R.C. Williamson, Estimating the support of a high-dimensional distribution, Neural Comp. 13 (2001) 1443–1471.

[28] A.J. Smola, B. Scholkopf, On a kernel–basedmethod for pat-tern recognition, regression, approximation and operatorinversion, Algorithmica 22 (1998) 211–231 technical Report1064, GMD FIRST, April 1997.

[29] S. Gunn, Support vector machines for classification and re-gression, Technical Report, Dept. of Electronics and Com-puter Science, University of Southampton, Southampton,U.K. (1998). in press.

[30] J. Platt, Fast training of support vector machines usingsequential minimal optimization, in: B. Scholkopf, C.J.Burges, A.J. Smola (Eds.), Advances in Kernel Methods —Support Vector Learning, MIT Press, Cambridge, MA, 1999,pp. 185–220.

[31] S. S. Keerthi, S. K. Shevade, C. Bhattacharyya, K. R. K.Murthy, Improvements to platt’s smo algorithm for svm clas-sifier design, Tech. rep., Dept of CSA, IISc, Bangalore, India(1999).

[32] E. Osuna, R. Freund, F. Girosi, An improved training algo-rithm for support vector machines, in: J. Principe, L. Gile,N. Morgan, E. Wilson (Eds.), Neural Networks for Signal Pro-cessing VII, IEEE, New York, 1998, pp. 276–285.

[33] C.J.C. Burges, D. Crisp, Uniqueness of the SVM solution,
ital database for screening mammography, The Proceedingsof the 5th International Workshop on Digital Mammography,Medical Physics Publishing, Madison, WI, USA, 2000.
[19] J. Suckling, J. Parker, D. Dance, S. Astley, D. Betal, N.Cerneaz, S.-L. Kok, I. Ricketts, J. Savage, E. Stamatakis,P. Taylor, The mammographic image analysis society digitalmammogram database, in: Exerpta Medica. InternationalCongress Series, no. 1069, 1994, 375–378.

[20] T. Arodz, M. Kurdziel, Face recognition from still camera im-ages, Master of Science Thesis, Computer Science, AGH Uni-versity of Science and Technology, Krakow, Poland, 2003.

[21] P. Viola, M.J. Jones, Robust real-time face detection, Int.J. Comput. Vision 57 (2) (2004) 137–154.

[22] F. Smeraldi, O. Carmona, J. Bigun, Saccadic search withGabor features applied to eye detection and real-time headtracking, Image Vision Comp. 18 (4) (2000) 323–329.

[23] R. Duda, P. Hart, D. Stork, Pattern Classification, John Wileyand Sons, 2000.

[24] I. Daubechies, Orthonormal bases of compactly supportedwavelets, Comm. Pure Appl. Math. 41 (1988) 909–996.

[25] C. Cortes, V. Vapnik, Support vector networks, MachineLearning 20 (1995) 273–297.

Advances in Neural Information Processing Systems 12, MITPress, Cambridge, MA, 1999 pp. 223–229.

[34] D. Gabor, Theory of communications, J. Int. Electr. Eng. 93(1946) 427–457.

[35] G. Granlund, In search of a general picture processing op-erator, Comp. Graph. Image Proc. 8 (1978) 155–173.

[36] R. Meir, G. Ratsch, An introduction to boosting and leverag-ing, in: S. Mendelson, A. Smola (Eds.), Advanced Lectureson Machine Learning, 2003, pp. 119–184.

[37] Y. Freund, R. Schapire, A short introduction to boosting, J.Jpn. Soc. Art. Intell. 14 (5) (1999) 771–780.

[38] Y. Freund, R.E. Schapire, A decision-theoretic generaliza-tion of on-line learning and an application to boosting, Eu-roCOLT ’95: Proceedings of the Second European Confer-ence on Computational Learning Theory, Springer-Verlag,1995, pp. 23–37.

[39] R. Gonzales, R. Woods, S. Eddins, Digital Image ProcessingUsing Matlab, Prentice Hall, 2004.

[40] T. Arodz, M. Kurdziel, T.J. Popiela, E.O.D. Sevre, D.A. Yuen,A 3D Visualization System for Computer-Aided MammogramAnalysis, Univ. Minnesota Research Rep. UMSI 2004/181,2004.

patternrecognitiontechniquesforautomatic...

Documents