
IJIPA: 3(1), 2012, pp. 37-42

DIAGNOSIS OF TUBERCULOSIS USING MATLAB BASED ARTIFICIAL NEURAL NETWORK

Chandrika V.*, Parvathi C.S., and P. Bhaskar
Department of Instrumentation Technology, Gulbarga University P.G Centre, Yeragera-584133, Raichur, Karnataka, India

E-mail: [email protected]

Abstract: This paper deals with the diagnosis of pulmonary tuberculosis using a MATLAB-based artificial neural network (ANN). The network is a three-layered (15-9-2) back-propagation architecture. Two kinds of input, shape features and symptoms, are used to train the network. The model is designed so that the presence of tuberculosis (TB) is detected and the result is displayed. The whole system is built on the MATLAB 7.0 platform. DICOM-formatted X-ray images are read and converted into matrix format, and shape features are then extracted from them. The extracted features are fed to the previously trained neural network, which detects the presence of TB from the features and symptoms of the image read. Applied to the validation sample, the model achieved accuracy, sensitivity and specificity of 71.25%, 73.68%, and 69.05%, respectively. A Graphical User Interface (GUI) has been developed to read the image, select the region of interest, process the image, plot the histogram and finally display whether or not the patient has TB. These are the main features of our model.

Keywords: Pulmonary Tuberculosis, Neural Network, Back Propagation, TB Symptoms, X-ray, Artificial Neural Network.

* E-mail: [email protected]

1. INTRODUCTION

Tuberculosis (TB) is one of the most important public health problems worldwide. There are 9 million new TB cases and nearly 2 million TB deaths each year [1]. Case-finding and the management of pulmonary tuberculosis are essential targets of tuberculosis control programs. However, pulmonary tuberculosis (PT) is becoming an increasingly serious problem, particularly in countries affected by epidemics of human immunodeficiency virus (HIV)-TB co-infection [2]. Diagnosing PT with prompt and accurate methods is a crucial step in controlling the occurrence and prevalence of TB. However, the diagnosis of PT is quite complex, and there is no unified standard at present. Over-diagnosis and missed diagnosis are frequent and remain a thorny question in the field of TB control. Some of the methods used earlier are based on distance or pair-wise distance measurement, and their performance is around 60% to 65% [3].

An artificial neural network (ANN) is a theoretical mathematical model that acts like the human brain; it is a kind of information management system based on imitating the architecture and function of the cerebral neural network [4]. An ANN offers self-learning, associative memory, a high degree of parallelism, fault tolerance and a formidable ability to handle non-linearity [5], and it can make rational judgments on complex questions from acquired knowledge and experience in handling problems. ANNs have been applied in the fields of signal processing, pattern recognition, quality synthetic evaluation, forecast analysis, etc. [6]. This study seeks to develop an ANN-based diagnostic model of TB in order to explore the feasibility of ANNs in TB diagnosis.

A GUI program is a graphical approach to executing a program in a more user-friendly way. It contains components such as push buttons, text boxes, radio buttons, pop-up menus, sliders, etc., with proper labels so that a less experienced user can easily understand them. These components help the user to understand what to do to execute the program. When a user responds to a GUI's components by pressing a push button, clicking a check box or radio button, or entering some text in a text box, the program reads the information needed for that particular event; hence GUI programs are also known as event-driven programs. MATLAB provides a tool called GUIDE (GUI Development Environment) for developing GUI programs.

The GUI approach is employed in various fields. In some systems a GUI is built to help users apply the developed system and understand its hierarchy. Acting as an intermediate medium, the GUI creates a form of communication between users and the developed object detection system.

2. SUBJECTS AND METHODS

2.1 Methodology

The block diagram of the proposed setup is shown in Figure 1. Initially the neural network is trained with shape features and symptoms from a large number of X-ray images (the database).

In Figure 1, the data, i.e. the images, are read and converted into matrix format. Shape features are then collected from these images; this stage is called feature collection. The features are then fed to the neural network for testing. In this stage the features of the test image are compared with the features the network learned during the training process. Once the symptoms have been entered, the neural network reports the result as TB or NON-TB in a window on the PC. (There is one more stage here, from the database to the neural network, in which the network is trained with symptoms and shape features.) A GUI has been developed to perform all the processes, i.e. reading the image, training the neural network, drawing the histogram and displaying the result. The GUI model is shown in Figure 2.

We compared 27 cases with a final diagnosis of TB to 28 non-TB cases, for a total of 55 patients in the modeling sample. 38 TB cases and 42 non-TB cases were used for the validation sample. The cases in the modeling sample were randomly assigned to two groups, one for training and one for testing, in a 4:1 ratio. Using the training sample in a supervised learning scheme, we obtained the discriminant function through training. We then classified samples of unknown category using this discriminant function, while the testing sample was used to examine the reliability of the discriminant function obtained from the training sample. This is how the ANN architecture and the training end points were determined. The 80 cases in the validation sample were used to evaluate the generalizability of the network. We evaluated the diagnostic performance of the network using sensitivity and specificity.
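As an illustration of how these performance measures are computed, the following minimal Python sketch derives accuracy, sensitivity and specificity from the four cells of a confusion matrix. It is not the authors' MATLAB code; the function name `diagnostic_metrics` is hypothetical, and the counts are the validation-sample counts reported in Table 2 of this paper.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return (accuracy, sensitivity, specificity) as fractions."""
    sensitivity = tp / (tp + fn)                  # share of TB cases flagged as TB
    specificity = tn / (tn + fp)                  # share of non-TB cases flagged as non-TB
    accuracy = (tp + tn) / (tp + fn + fp + tn)    # share of all cases classified correctly
    return accuracy, sensitivity, specificity


# Counts taken from Table 2: 28 TB detected, 10 TB missed,
# 13 non-TB flagged as TB, 29 non-TB correctly cleared.
acc, sens, spec = diagnostic_metrics(tp=28, fn=10, fp=13, tn=29)
print(f"accuracy={acc:.2%} sensitivity={sens:.2%} specificity={spec:.2%}")
```

With the Table 2 counts this reproduces the reported 71.25% (57/80), 73.68% (28/38) and 69.05% (29/42).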

Figure 1: Block Diagram of the Proposed Setup

Figure 2: Graphical User Interface (GUI)

Figure 3: ANN Training Network

2.2 ANN Model Design

Figure 3 shows the ANN model design.


2.2.1 The Network Type and the Layer

The Back-Propagation (BP) network is a multi-layered feed-forward network whose weights are trained on non-linear differentiable functions. The BP network is mainly used for function approximation, pattern recognition, classification and data compression. In practical applications of ANNs, 80%-90% of the models adopt the BP network or one of its variations. The BP network is also central to feed-forward networks and constitutes the most vital element of the ANN. An ANN with one hidden layer can approximate any continuous function on a closed interval. Therefore, a three-layered (including input layer) BP network can realize an arbitrary mapping from n dimensions to m dimensions, and this analysis uses a three-layered BP network with one hidden layer.

2.2.2 Input and Output Variable Choice

Training samples were analyzed using single-factor logistic regression, screening for parameters significant for TB diagnosis to serve as input variables. The parameters identified in this analysis included the shape variables and the symptoms. The network output has two classes: the first is the TB group, for which the expected output value is 1; the second is the non-TB group, for which the expected output value is 0.

2.2.3 Number of Hidden Layer Neurons

Determining the number of hidden layer neurons is a very complex issue. Because there is no strong analytical formula for calculating this value, in the past it was often determined simply from the designer's experience and repeated trials. To address this in our research, we designed a BP network whose hidden layer has a variable number of neurons, so that the best number of hidden layer neurons could be determined by comparing errors.

2.2.4 Activation Function

The activation function is central to both the neuron and the network. Besides the network architecture, the capacity and efficiency of a network in solving problems depend to a great extent on the activation function used. The sigmoid activation function amplifies its input non-linearly: it transforms a signal from an input range of −∞ to +∞ into an output range of −1 to 1, and its amplification coefficient is smaller for larger input values and bigger for smaller input values. For this reason we chose the sigmoid activation function.
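The saturating behaviour described above can be illustrated with a short Python sketch. It assumes the sigmoid in question is the hyperbolic tangent sigmoid (the 'tansig' function named later in the paper for the hidden layer), whose standard definition is 2 / (1 + exp(−2n)) − 1; the function names below are illustrative only and this is not the authors' MATLAB code.

```python
import math


def tansig(n):
    # Hyperbolic tangent sigmoid; algebraically identical to tanh(n).
    return 2.0 / (1.0 + math.exp(-2.0 * n)) - 1.0


def tansig_slope(n):
    # Derivative of tansig: 1 - tansig(n)^2. This is the local
    # "amplification coefficient": large near 0, tiny for large |n|.
    return 1.0 - tansig(n) ** 2


for n in (-8.0, -1.0, 0.0, 1.0, 8.0):
    print(f"n={n:+.1f}  output={tansig(n):+.6f}  slope={tansig_slope(n):.6f}")
```

The output always stays strictly between −1 and 1, and the slope falls from 1 at n = 0 towards 0 as the magnitude of the input grows.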

2.2.5 The Pretreatment of Clinic Data

The parameters used in diagnosis had different expression methods and dimensions, and their ranges differed significantly. If raw data were input directly into the neural network, the network would adjust its weights primarily according to the data with larger numerical values, so the error would not reflect the data with smaller numerical values. The raw data therefore had to be pretreated into a form suitable for the neural network in order to improve its learning ability and convergence. Normalizing the input data was also important because the network used the 'sigmoid' activation function and the error back-propagation learning algorithm, and normalization raises their learning ability and generalization performance. The input data of the network should lie in the interval (0, 1), as our neural network understands values between 0 and 1; accordingly, 1 and 0 were used to indicate "YES" and "NO" for the binary variables. The raw pixel data, whose values vary from 0 to 255, were likewise mapped to values that vary from 0 to 1. This is what is called normalization. The normalization widely used for quantitative data is as follows:

$$ y_i = \frac{x_i - x_{i\min}}{x_{i\max} - x_{i\min}} \qquad (1) $$

Where,

$x_i$ = raw data

$x_{i\min}$ = minimum pixel value of the raw data

$x_{i\max}$ = maximum pixel value of the raw data.

The collected data were quantified and normalized according to this principle and then used as the raw data matrix of the ANN diagnostic model.
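A minimal Python sketch of the min-max normalization in equation (1) is given below. It is an illustration rather than the authors' MATLAB code; the function name `min_max_normalize` and the handling of a constant input (all values mapped to 0) are assumptions, since the paper does not discuss that case.

```python
def min_max_normalize(values):
    """Map raw values onto [0, 1] using y = (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant input carries no contrast, so map it to 0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Example with 8-bit pixel values spanning the full 0-255 range.
pixels = [0, 64, 128, 255]
normalized = min_max_normalize(pixels)
print(normalized)
```

The smallest raw value maps to 0 and the largest to 1, so every input reaches the network on the same scale regardless of its original range.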

3. IMPLEMENTATION METHOD

The ANN was implemented with a self-written program using the Neural Network Toolbox in MATLAB 7.0.

3.1 Detecting the Lung Area

To find the lung area, and hence the shape of the lung, several stages of processing are applied to the image. By using a high-pass FFT filter with a suitable cutoff frequency, the soft tissue within the lungs can be isolated. The resulting high-frequency areas are non-uniform, so the filtered image is first dilated to a suitable degree and then eroded to separate these uniform regions. A binary image is then generated by thresholding the image. Thresholding can introduce further non-uniformity into these regions, but this can be rectified by an additional step of dilation and erosion. An approximation of the lung area after performing these steps is shown in Flowchart 1. Many of the errors present in the processed image (especially in the upper left-hand and right-hand corners) can be attributed to the high-frequency nature of the patient text on the sample radiograph.
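The thresholding step and the dilation-then-erosion clean-up described above can be sketched in plain Python as follows. This is a simplified illustration under stated assumptions, not the authors' MATLAB implementation: the high-pass FFT filter is omitted, a 3×3 square neighbourhood is assumed as the structuring element, and the helper names (`threshold`, `dilate`, `erode`, `close_regions`) and the threshold level 128 are hypothetical.

```python
def threshold(image, level):
    # Binary image: 1 where the pixel is at least `level`, else 0.
    return [[1 if px >= level else 0 for px in row] for row in image]


def _morph(binary, combine):
    # Apply `combine` (max or min) over each pixel's 3x3 neighbourhood.
    # Neighbourhoods are clipped at the image border.
    rows, cols = len(binary), len(binary[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [binary[rr][cc]
                      for rr in range(max(0, r - 1), min(rows, r + 2))
                      for cc in range(max(0, c - 1), min(cols, c + 2))]
            out[r][c] = combine(window)
    return out


def dilate(binary):
    return _morph(binary, max)   # a pixel is set if any neighbour is set


def erode(binary):
    return _morph(binary, min)   # a pixel stays set only if all neighbours are set


def close_regions(binary):
    # Dilation followed by erosion fills small gaps inside a region.
    return erode(dilate(binary))


# Synthetic 7x7 image: a bright 3x3 block with a dark one-pixel hole in it.
image = [[10] * 7 for _ in range(7)]
for r in range(2, 5):
    for c in range(2, 5):
        image[r][c] = 200
image[3][3] = 10

binary = threshold(image, 128)
closed = close_regions(binary)
for row in closed:
    print(row)
```

In this toy example the hole at the centre of the bright block is filled while the block keeps its original 3×3 extent, which is the kind of region uniformity the dilation and erosion step is meant to restore.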

3.2 Rib Suppression

The density of the ribs affects the image by changing the luminance values of the underlying textures, which can affect the detection of nodules. A method for suppressing the contrast of the ribs and clavicles may be implemented using an algorithm such as the one suggested by K. Suzuki, H. Abe, H. MacMahon and K. Doi [7]. That method obtains a representation of the bone structure in a radiograph by using a dual-energy subtraction technique. This involves a multi-kV (multi-kilovolt) radiographic analysis using two separate radiographs, each captured at a different kV rating. The generated bone structure is then used to train a classifier and suppress the ribs in a lung radiograph.

3.3 Shape Description

We discriminated true TB using shape. Matching nearest-neighbor connected pixels were grouped. To account for varying bacillary orientations and magnification, we bypassed size calibration by employing two shape descriptors that are invariant to rotation, translation, skew transformations and scale: 1) axis ratio (1 for circles, higher for line segments) and 2) eccentricity, the ratio of the distance between the elliptical foci to the major axis length (1 for line segments, 0 for circles). The typical axis ratio of 2-2.5 for a TB cavity was significantly different from the value of approximately one for non-TB objects; similarly, TB eccentricity was 0.90-0.96, whereas it was centered at zero for non-TB objects. To maximize rod-shaped object recognition, we empirically chose conservative threshold cut-offs (axis ratio > 1.25 and eccentricity > 0.65) as indicating TB. Objects below the thresholds were labeled red as 'non-TB' objects. Calculating the mean TB size µ and standard deviation σ from a broth image, we labeled all size outliers beyond µ ± 1.5σ in blue as 'possible' and all objects within µ ± 1.5σ in green as 'definite' TB objects.
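The two shape descriptors and the threshold rule can be sketched in Python as follows. The sketch assumes each object has already been fitted with an ellipse whose major and minor axis lengths are known (with major ≥ minor > 0); how that fit is obtained is outside its scope. The function names are hypothetical, and the cut-offs 1.25 and 0.65 are the ones quoted in the text.

```python
import math


def axis_ratio(major, minor):
    # 1 for a circle, growing as the object becomes more elongated.
    return major / minor


def eccentricity(major, minor):
    # Distance between the foci divided by the major axis length:
    # 0 for a circle, approaching 1 for a line segment.
    return math.sqrt(1.0 - (minor / major) ** 2)


def looks_like_tb(major, minor, ratio_cut=1.25, ecc_cut=0.65):
    # An object is flagged only when both descriptors exceed their cut-offs.
    return (axis_ratio(major, minor) > ratio_cut
            and eccentricity(major, minor) > ecc_cut)


# Hypothetical objects given as (major axis, minor axis) in pixels.
for major, minor in [(10.0, 4.5), (10.0, 10.0), (10.0, 8.5)]:
    print(major, minor,
          round(axis_ratio(major, minor), 3),
          round(eccentricity(major, minor), 3),
          looks_like_tb(major, minor))
```

Because both descriptors depend only on the ratio of the two axes, they are unaffected by uniform scaling, which is why no size calibration is needed for this step.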

3.4 Training and Testing

Flowcharts 2 and 3 explain the method of training and testing. During training, the TB and NON-TB images are read, noise is removed in the preprocessing stage, and the shape features are then calculated. These feature vectors are fed to the neural network. This is repeated for all the images, and this is how the neural network is trained.

Flowchart 1: Flowchart Showing the Classification Steps for Automatic Identification and Labeling of Tuberculosis.

Flowchart 2: Flowchart Showing the Steps for Training of Database for ANN.


During testing the same procedure is followed as in training, except that the features held in the knowledge base are compared with the features of the present image, and the result is then classified and displayed.

Flowchart 3: Flowchart Showing the Steps for Testing of Input Image

4. RESULTS AND DISCUSSIONS

4.1 Selection Results of the BP Network Architecture

We used a three-layer BP network, comprising an input layer, a hidden layer and an output layer, in accordance with the TB features. Because we found that the diagnosis of TB was influenced by 15 variables (Table 1), the number of inputs was 15. After many trials we also found that the function approximation of the BP network was best when the hidden layer had 9 nodes. Because the output variables took the form TB and NO-TB, the output layer had two nodes and the BP network framework was 15-9-2. The network was trained with 'traingdm', using the 'tansig' activation function for the hidden layer and the 'purelin' linear function for the output. In this process the target error was 0.01 and the maximum number of training epochs was 1000.

Figure 4: GUI Showing the Result of Testing Image

Figure 4 shows the result for the given X-ray image once the symptoms have been added.

Table 1: Variables and their Significance

Sl. No   Variable               Sig.
1.       Rectangularity         0.05
2.       Circularity            0.05
3.       Sphericity             0.05
4.       Convexity              0.025
5.       Convexity Perimeter    0.025
6.       Cough                  0.15
7.       Fever                  0.15
8.       Weight Loss            0.15
9.       HIV                    0.075
10.      Breathlessness         0.075
11.      Family History         0.06
12.      Past History           0.06
13.      Alcoholic              0.04
14.      Chest Pain             0.03
15.      Smoker                 0.01


4.2 The Results of Training and Fit of BP Network

The target error and the maximum number of training epochs were 0.01 and 1000, respectively (Figure 5).
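The role of the two stopping criteria can be illustrated with a short Python sketch of gradient descent with momentum, the family of methods to which 'traingdm' belongs. This is a generic textbook form of the update, not the toolbox's exact implementation and not the authors' code; the learning rate, momentum constant, function names and the toy one-neuron fitting problem are all assumptions made for illustration.

```python
def train_with_momentum(loss_and_grad, weights, lr=0.05, momentum=0.9,
                        goal=0.01, max_epochs=1000):
    """Iterate until the error reaches `goal` or `max_epochs` is exhausted."""
    velocity = [0.0] * len(weights)
    loss, grad = loss_and_grad(weights)
    epoch = 0
    while loss > goal and epoch < max_epochs:
        # Momentum: blend the previous step with the new gradient step.
        velocity = [momentum * v - lr * g for v, g in zip(velocity, grad)]
        weights = [w + v for w, v in zip(weights, velocity)]
        loss, grad = loss_and_grad(weights)
        epoch += 1
    return weights, epoch, loss


def toy_loss_and_grad(weights):
    # Mean squared error of a single linear unit y = w*x + b on four
    # synthetic points that lie on the line y = 2x + 1.
    w, b = weights
    data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
    errors = [(w * x + b) - y for x, y in data]
    n = len(data)
    loss = sum(e * e for e in errors) / n
    grad_w = 2.0 * sum(e * x for e, (x, _) in zip(errors, data)) / n
    grad_b = 2.0 * sum(errors) / n
    return loss, [grad_w, grad_b]


weights, epochs, final_loss = train_with_momentum(toy_loss_and_grad, [0.0, 0.0])
print(f"stopped after {epochs} epochs with error {final_loss:.4f}")
```

Training ends as soon as either condition is met, so a run that reaches the target error stops well before the epoch limit, while a run that cannot reach it is cut off at 1000 epochs.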

Figure 5: Semi-Logarithmic Line Graph of Training Performance

According to Table 2, the accuracy, sensitivity, and specificity of the (15-9-2) BP network diagnosis were 71.25% (57/80), 73.68% (28/38), and 69.05% (29/42), respectively.

Table 2: Diagnostic Result of Testing Samples

Diagnostic result    Status of disease     Total
                     TB        Non-TB
TB                   28        13          41
Non-TB               10        29          39
Total                38        42          80

5. CONCLUSION

Due to the complexity of TB diagnosis, there continues to be no unified standard for it. Over-diagnosis and missed diagnosis are formidable problems in the process of TB control. The cost of new diagnostic methods, such as nucleic acid amplification tests, is very high, and the effectiveness of these tests has not been confirmed in developing countries. Because ANNs deal directly with the uncertain information and artifacts found in clinical diagnosis, they can overcome the limitations of regression modeling.

Reasonable judgment, satisfactory predictions and ideal forecasts can be achieved by an ANN based on existing knowledge and experience in solving problems. The sensitivity and specificity of TB diagnosis with the (15-9-2) BP network were confirmed to be 73.68% and 69.05%, respectively. These results indicate that the validity of the diagnosis was good and that the (15-9-2) BP network could be extended to new patient data. They also indicate that the network could be used as a new diagnostic method for this complex problem.

References

[1] R.P. Tripathi, N. Tewari, N. Dwivedi, et al. "Fighting Tuberculosis: An Old Disease with New Challenges". Med Res Rev, 2005, 25(1), 93-131.

[2] R. Colebunders, WE. Bastian. "A Review of Diagnosis and Treatment of Smear-negative Pulmonary Tuberculosis". Int. J. Tuberc Lung Dis, 2000, 4, 97-107.

[3] S.A. Patil, V.R. Udupi (Textile and Engineering Institute, Ichalkaranji, India). "Chest X-ray Features Extraction for Lung Cancer Classification". JSIR, April 2010, 69, 271-277.

[4] Y.J. Wu, Y.M. Wu, L.B. Qu, et al. "Application of Artificial Neural Network in the Diagnosis of Lung Cancer". Chin J. Microbiol Immuno, 2003, 23(8), 646-649.

[5] F.E. Ahmed. "Artificial Neural Networks for Diagnosis and Survival Prediction in Colon Cancer". Molecular Cancer, 2005, 4, 29.

[6] W. Deng, P.H. Jin. "Artificial Neural Networks and Its Applications in Preventive Medicine". Chin Pub Health, 2002, 18(10), 1265-1267.

[7] K. Suzuki, H. Abe, H. MacMahon, K. Doi. "Image-Processing Technique for Suppressing Ribs in Chest Radiographs by Means of Massive Training Artificial Neural Network (MTANN)". IEEE Transactions on Medical Imaging, 2006, 25(4), 406-416.