
Research Article
Training Feedforward Neural Networks Using Symbiotic Organisms Search Algorithm

Haizhou Wu,1 Yongquan Zhou,1,2 Qifang Luo,1,2 and Mohamed Abdel Basset3

1College of Information Science and Engineering, Guangxi University for Nationalities, Nanning 530006, China
2Key Laboratory of Guangxi High Schools Complex System and Computational Intelligence, Nanning 530006, China
3Faculty of Computers and Informatics, Zagazig University, Zagazig, Egypt

Correspondence should be addressed to Yongquan Zhou; yongquanzhou@126.com

Received 25 September 2016; Accepted 16 November 2016

Academic Editor: Leonardo Franco

Copyright © 2016 Haizhou Wu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Symbiotic organisms search (SOS) is a new, robust, and powerful metaheuristic algorithm which simulates the symbiotic interaction strategies adopted by organisms to survive and propagate in the ecosystem. In the supervised learning area, it is a challenging task to present a satisfactory and efficient training algorithm for feedforward neural networks (FNNs). In this paper, SOS is employed as a new method for training FNNs. To investigate the performance of the aforementioned method, eight different datasets selected from the UCI machine learning repository are employed for the experiments, and the results are compared among seven metaheuristic algorithms. The results show that SOS performs better than the other algorithms for training FNNs in terms of convergence speed. It is also proven that an FNN trained by the method of SOS has better accuracy than most of the compared algorithms.

1. Introduction

Artificial neural networks (ANNs) [1] are mathematical models and have been widely utilized for modeling complex nonlinear processes. As one of the powerful tools, ANNs have been employed in various fields, like time series prediction [2], classification [3, 4], pattern recognition [5–7], system identification and control [8], function approximation [9], signal processing [10], and so on [11, 12].

There are different types of ANNs proposed in the literature: feedforward neural networks (FNNs) [13], the Kohonen self-organizing network [14], radial basis function (RBF) networks [15–17], recurrent neural networks [18], and spiking neural networks [19]. In fact, feedforward neural networks are the most popular neural networks in practical applications. The training process is one of the most important aspects of neural networks. In this process, the goal is to achieve the minimum of a cost function, defined as a mean squared error (MSE) or a sum of squared errors (SSE), by means of finding the best combination of connection weights and biases. In general, training algorithms can be classified into two groups: gradient-based algorithms versus stochastic search algorithms. The most widely applied gradient-based training algorithms are the backpropagation (BP) algorithm [20] and its variants [21]. However, in complex nonlinear problems, these algorithms suffer from some shortcomings, such as depending highly on the initial solution, which subsequently impacts the convergence of the algorithm, and easily getting trapped into local optima. On the other hand, stochastic search methods like metaheuristic algorithms were proposed by researchers as alternatives to gradient-based methods for training FNNs. Metaheuristic algorithms are proved to be more efficient in escaping from local minima for optimization problems.

Various metaheuristic optimization methods have been used to train FNNs. GA, inspired by Darwin's theory of evolution and natural selection [22], is one of the earliest methods for training FNNs, proposed by Montana and Davis [23]. The results indicate that GA is able to outperform BP when solving real and challenging problems. Shaw and Kinsner presented a method called chaotic simulated annealing [24, 25], which is superior in escaping from local optima for training multilayer FNNs. Zhang et al. proposed a hybrid particle swarm optimization-backpropagation algorithm for feedforward neural network training.



In their research, a heuristic way was adopted to give a transition from particle swarm search to gradient descending search [26]. In 2012, Mirjalili et al. proposed a hybrid particle swarm optimization (PSO) and gravitational search algorithm (GSA) [27] to train FNNs [28]. The results showed that PSOGSA outperforms both PSO and GSA in terms of convergence speed and avoiding local optima and has better accuracy than GSA in the training process. In 2014, a new metaheuristic algorithm called centripetal accelerated particle swarm optimization (CAPSO) was employed by Beheshti et al. to improve the accuracy in training an ANN [29]. Recently, several other metaheuristic algorithms have been applied to the research of NNs. In 2014, Pereira et al. introduced social-spider optimization (SSO) to improve the training phase of an ANN with multilayer perceptrons and validated the proposed approach in the context of Parkinson's disease recognition [30]. Uzlu et al. applied the ANN model with the teaching-learning-based optimization (TLBO) algorithm to estimate energy consumption in Turkey [31]. In 2016, Kowalski and Łukasik applied the krill herd algorithm (KHA) for learning an artificial neural network (ANN), which has been verified for the classification task [32]. In 2016, Faris et al. employed the recently proposed nature-inspired algorithm called the multiverse optimizer (MVO) for training the feedforward neural network. The comparative study demonstrates that MVO is very competitive and outperforms other training algorithms in the majority of datasets [33]. Nayak et al. proposed a firefly-based higher order neural network for data classification that maintains fast learning and avoids the exponential increase of processing units [34]. Many other metaheuristic algorithms, like ant colony optimization (ACO) [35, 36], Cuckoo Search (CS) [37], Artificial Bee Colony (ABC) [38, 39], Charged System Search (CSS) [40], Grey Wolf Optimizer (GWO) [41], Invasive Weed Optimization (IWO) [42], and Biogeography-Based Optimizer (BBO) [43], have been adopted for the research of neural networks.

In this paper, a new method of symbiotic organisms search (SOS) is used for training FNNs. Symbiotic organisms search [44], proposed by Cheng and Prayogo in 2014, is a new swarm intelligence algorithm simulating the symbiotic interaction strategies adopted by organisms to survive and propagate in the ecosystem. The algorithm has already been applied by scholars to resolve some engineering design problems. In 2016, Cheng et al. researched optimizing multiple-resources leveling in multiple projects using discrete symbiotic organisms search [45]. Eki et al. applied SOS to solve the capacitated vehicle routing problem [46]. Prasad and Mukherjee have used SOS for optimal power flow of a power system with FACTS devices [47]. Abdullahi et al. proposed SOS-based task scheduling in a cloud computing environment [48]. Verma et al. investigated SOS for congestion management in a deregulated environment [49]. The time-cost-labor utilization tradeoff problem was solved by Tran et al. using this algorithm [50]. Recently, in 2016, more and more scholars have become interested in the research of the SOS algorithm. Yu et al. applied two solution representations to transform SOS into an applicable solution approach for the capacitated vehicle routing problem and then applied a local search strategy to improve the solution quality of SOS [51].

Panda and Pani presented a hybrid SOS algorithm with an adaptive penalty function to solve multiobjective constrained optimization problems [52]. Banerjee and Chattopadhyay presented a novel modified SOS to design an improved three-dimensional turbo code [53]. Das et al. used SOS to determine the optimal size and location of distributed generation (DG) in a radial distribution network (RDN) for the reduction of network loss [54]. Dosoglu et al. utilized SOS for the economic/emission dispatch problem in power systems [55].

The structure of this paper is organized as follows. Section 2 gives a brief description of the feedforward neural network. Section 3 elaborates the symbiotic organisms search, and Section 4 describes the SOS-based trainer and how it can be used for training FNNs in detail. In Section 5, a series of comparison experiments is conducted. Our conclusion is given in Section 6.

2. Feedforward Neural Network

Among artificial neural networks, the feedforward neural network (FNN) is the simplest type, which consists of a set of processing elements called "neurons" [33]. In this network, the information moves in only one direction, forward, from the input layer, through the hidden layer, and to the output layer. There are no cycles or loops in the network. An example of a simple FNN with a single hidden layer is shown in Figure 1. As shown, each neuron computes the weighted sum of its inputs in the presence of a bias and passes this sum through an activation function (such as the sigmoid function) so that the output is obtained. This process can be expressed as (1) and (2).

$$h_j = \sum_{i=1}^{R} \mathrm{iw}_{ji}\, x_i + \mathrm{hb}_j, \quad (1)$$

where iw_ji is the weight connecting input neuron i = (1, 2, ..., R) to hidden neuron j = (1, 2, ..., N), hb_j is a bias in the hidden layer, R is the total number of neurons in the input layer, and x_i is the corresponding input data.

Here the S-shaped sigmoid function is used as the activation function, which is shown in (2):

$$f(x) = \frac{1}{1 + e^{-x}}. \quad (2)$$

Therefore, the output of a neuron in the hidden layer can be described as in (3):

$$\mathrm{ho}_j = f_j(h_j) = \frac{1}{1 + e^{-h_j}}. \quad (3)$$

In the output layer, the output of a neuron is shown in (4):

$$y_k = f_k\left(\sum_{j=1}^{N} \mathrm{hw}_{kj}\, \mathrm{ho}_j + \mathrm{ob}_k\right), \quad (4)$$

where hw_kj is the weight connecting hidden neuron j = (1, 2, ..., N) to output neuron k = (1, 2, ..., S), and ob_k is a bias in the output layer.


Figure 1: A feedforward network with one hidden layer. Inputs x_1, ..., x_R feed the hidden neurons through the weights iw; each hidden neuron has a bias hb and produces an output ho; the hidden outputs feed the output neurons y_1, ..., y_S through the weights hw, with output biases ob.

N is the total number of neurons in the hidden layer and S is the total number of neurons in the output layer.
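To make the forward pass of (1)-(4) concrete, the following is a minimal NumPy sketch (not the authors' MATLAB implementation; the sigmoid output activation is an assumption, and the array names iw, hb, hw, ob simply mirror the notation above):

```python
import numpy as np

def sigmoid(x):
    # S-shaped activation function of equation (2)
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, iw, hb, hw, ob):
    """Forward pass of a one-hidden-layer FNN, following equations (1)-(4).
    x: input vector of length R; iw: (N, R) input-to-hidden weights; hb: (N,) hidden biases;
    hw: (S, N) hidden-to-output weights; ob: (S,) output biases."""
    h = iw @ x + hb                # equation (1): weighted sum plus hidden bias
    ho = sigmoid(h)                # equation (3): hidden-layer output
    return sigmoid(hw @ ho + ob)   # equation (4): output-layer response
```

With a fixed network structure, training then reduces to choosing the entries of iw, hb, hw, and ob, which is exactly what the SOS trainer of Section 4 searches for.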

The training process is carried out to adjust the weights and biases until some error criterion is met. Above all, one problem is to select a proper training algorithm. Also, it is very complex to design the neural network, because many elements affect the performance of training, such as the number of neurons in the hidden layer, the interconnection between neurons and layers, the error function, and the activation function.

3. Symbiotic Organisms Search Algorithm

Symbiotic organisms search [44] simulates the symbiotic interaction relationships that organisms use to survive in the ecosystem. Three phases, the mutualism phase, the commensalism phase, and the parasitism phase, simulate the real-world biological interaction between two organisms in the ecosystem.

3.1. Mutualism Phase. Organisms engage in a mutualistic relationship with the goal of increasing their mutual survival advantage in the ecosystem. New candidate organisms for X_i and X_j are calculated based on the mutualistic symbiosis between organisms X_i and X_j, which is modeled in (5) and (6):

$$X_{i,\mathrm{new}} = X_i + \alpha \left(X_{\mathrm{best}} - \mathrm{Mutual\_Vector} \times \mathrm{BF}_1\right), \quad (5)$$

$$X_{j,\mathrm{new}} = X_j + \beta \left(X_{\mathrm{best}} - \mathrm{Mutual\_Vector} \times \mathrm{BF}_2\right), \quad (6)$$

$$\mathrm{Mutual\_Vector} = \frac{X_i + X_j}{2}, \quad (7)$$

where BF_1 and BF_2 are benefit factors that are determined randomly as either 1 or 2. These factors represent the partial or full level of benefit to each organism. X_best represents the organism with the highest degree of adaptation. α and β are random numbers in [0, 1]. In (7), the vector called "Mutual_Vector" represents the relationship characteristic between organisms X_i and X_j.

3.2. Commensalism Phase. One organism obtains benefit and does not impact the other in the commensalism phase. Organism X_j represents the one that neither benefits nor suffers from the relationship, and the new candidate organism of X_i is calculated according to the commensalism symbiosis between organisms X_i and X_j, which is modeled in (8):

$$X_{i,\mathrm{new}} = X_i + \delta \left(X_{\mathrm{best}} - X_j\right), \quad (8)$$

where δ represents a random number in [−1, 1], and X_best is the organism with the highest degree of adaptation.

3.3. Parasitism Phase. One organism gains benefit but actively harms the other in the parasitism phase. An artificial parasite called "Parasite_Vector" is created in the search space by duplicating organism X_i and then modifying its randomly selected dimensions using a random number. The Parasite_Vector tries to replace another organism X_j in the ecosystem. According to Darwin's evolution theory, "only the fittest organisms will prevail": if the Parasite_Vector is better, it will kill organism X_j and assume its position; else, X_j will have immunity from the parasite and the Parasite_Vector will no longer be able to live in that ecosystem.
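The three phases can be summarized in the following illustrative Python sketch of one SOS ecosystem iteration (a simplification under stated assumptions, not the authors' code: fitness is minimized, bounds lb/ub are scalars, and the parasite's mutation mask is one reasonable reading of "randomly selected dimensions"):

```python
import numpy as np

def sos_iteration(X, fitness, lb, ub, rng):
    """One ecosystem iteration of symbiotic organisms search (minimization sketch)."""
    n, dim = X.shape
    fit = np.array([fitness(x) for x in X])
    for i in range(n):
        best = X[np.argmin(fit)].copy()

        # Mutualism phase, eqs. (5)-(7): X_i and X_j both move toward X_best
        j = rng.choice([k for k in range(n) if k != i])
        mutual = (X[i] + X[j]) / 2.0
        bf1, bf2 = rng.integers(1, 3), rng.integers(1, 3)          # benefit factors: 1 or 2
        xi_new = np.clip(X[i] + rng.random(dim) * (best - mutual * bf1), lb, ub)
        xj_new = np.clip(X[j] + rng.random(dim) * (best - mutual * bf2), lb, ub)
        for k, cand in ((i, xi_new), (j, xj_new)):                 # greedy selection
            f = fitness(cand)
            if f < fit[k]:
                X[k], fit[k] = cand, f

        # Commensalism phase, eq. (8): X_i benefits from X_j
        j = rng.choice([k for k in range(n) if k != i])
        cand = np.clip(X[i] + rng.uniform(-1, 1, dim) * (best - X[j]), lb, ub)
        f = fitness(cand)
        if f < fit[i]:
            X[i], fit[i] = cand, f

        # Parasitism phase: a parasite derived from X_i tries to replace X_j
        j = rng.choice([k for k in range(n) if k != i])
        parasite = X[i].copy()
        mask = rng.random(dim) < rng.random()                      # dimensions to randomize
        parasite[mask] = rng.uniform(lb, ub, dim)[mask]
        f = fitness(parasite)
        if f < fit[j]:
            X[j], fit[j] = parasite, f
    return X, fit
```

Repeating sos_iteration over the ecosystem, with the MSE of Section 4.2 as the fitness function and the vector encoding of Section 4.3 as the organism representation, gives the trainer summarized in Figure 3.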


Figure 2: The vector of training parameters: the organism x_1, ..., x_D concatenates the input weights (iw), the hidden weights (hw), the hidden biases (hb), and the output biases (ob).

4. SOS for Training FNNs

In this paper, symbiotic organisms search is used as a new method to train FNNs. The set of weights and biases is simultaneously determined by SOS in order to minimize the overall error of an FNN, and thereby improve its corresponding accuracy, by training the network. This means that the structure of the FNN is fixed. Figure 3 shows the flowchart of the SOS training method, which starts by collecting, normalizing, and reading a dataset. Once a network has been structured for a particular application, including setting the desired number of neurons in each layer, it is ready for training.

4.1. The Feedforward Neural Network Architecture. When implementing a neural network, it is necessary to determine the structure based on the number of layers and the number of neurons in the layers. The larger the number of hidden layers and nodes, the more complex the network will be. In this work, the number of input and output neurons in the MLP network is problem-dependent, and the number of hidden nodes is computed on the basis of the Kolmogorov theorem [56]: Hidden = 2 × Input + 1. When using SOS to optimize the weights and biases in the network, the dimension of each organism is considered as D, shown in (9):

$$D = (\mathrm{Input} \times \mathrm{Hidden}) + (\mathrm{Hidden} \times \mathrm{Output}) + \mathrm{Hidden}_{\mathrm{bias}} + \mathrm{Output}_{\mathrm{bias}}, \quad (9)$$

where Input, Hidden, and Output refer to the number of input, hidden, and output neurons of the FNN, respectively. Also, Hidden_bias and Output_bias are the number of biases in the hidden and output layers.
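As a quick sanity check of (9) under the Hidden = 2 × Input + 1 rule above, a small helper (the function name is illustrative, not from the paper) could be:

```python
def organism_dimension(n_input, n_output):
    # Hidden = 2 * Input + 1 (Kolmogorov-based rule used in the paper)
    n_hidden = 2 * n_input + 1
    # D = Input*Hidden + Hidden*Output + hidden biases + output biases, eq. (9)
    return n_input * n_hidden + n_hidden * n_output + n_hidden + n_output

# Example: 4 inputs and 3 outputs give 9 hidden neurons and D = 75
print(organism_dimension(4, 3))
```

For the Iris dataset (4 inputs, 3 outputs) this yields the 9 hidden neurons listed in Table 1 and an organism of dimension D = 75.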

4.2. Fitness Function. In SOS, every organism is evaluated according to its status (fitness). This evaluation is done by passing the vector of weights and biases to the FNN; then the mean squared error (MSE) criterion is calculated based on the prediction of the neural network using the training dataset. Through continuous iterations, the optimal solution is finally achieved, which is regarded as the weights and biases of the neural network. The MSE criterion is given in (10), where y and ŷ are the actual and the estimated values based on the proposed model and R is the number of samples in the training dataset:

$$\mathrm{MSE} = \frac{1}{R} \sum_{i=1}^{R} \left(y - \hat{y}\right)^2. \quad (10)$$
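A sketch of this fitness evaluation, reusing the forward() sketch from Section 2 and a decode() helper like the one sketched under Section 4.3 below (X_train holding normalized feature rows and Y_train their one-hot targets are assumptions, as are the names):

```python
import numpy as np

def mse_fitness(organism, X_train, Y_train, R, N, S):
    """Training-set MSE of equation (10) for one organism (candidate weight/bias vector)."""
    iw, hb, hw, ob = decode(organism, R, N, S)                        # Section 4.3 sketch
    preds = np.array([forward(x, iw, hb, hw, ob) for x in X_train])   # Section 2 sketch
    return np.mean((Y_train - preds) ** 2)
```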

4.3. Encoding Strategy. According to [57], the weights and biases of FNNs for every agent in evolutionary algorithms can be encoded and represented in the form of a vector, a matrix, or binary. In this work, the vector encoding method is utilized. An example of this encoding strategy for an FNN is provided in Figure 2.

During the initialization process, X = (X_1, X_2, ..., X_N) is set on behalf of the N organisms. Each organism X_i = {iw, hw, hb, ob} (i = 1, 2, ..., N) represents a complete set of FNN weights and biases, which is converted into a single vector of real numbers.
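A minimal sketch of this vector encoding and its inverse (the concatenation order iw, hw, hb, ob follows Figure 2; the function names and argument order are illustrative):

```python
import numpy as np

def encode(iw, hw, hb, ob):
    # Flatten all weights and biases into one real-valued organism, in the order of Figure 2
    return np.concatenate([iw.ravel(), hw.ravel(), hb.ravel(), ob.ravel()])

def decode(organism, R, N, S):
    """Slice an organism back into the weight matrices and bias vectors of the FNN."""
    i1 = N * R              # end of input weights iw (N x R)
    i2 = i1 + S * N         # end of hidden weights hw (S x N)
    i3 = i2 + N             # end of hidden biases hb (N,)
    iw = organism[:i1].reshape(N, R)
    hw = organism[i1:i2].reshape(S, N)
    hb = organism[i2:i3]
    ob = organism[i3:]
    return iw, hb, hw, ob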

4.4. Criteria for Evaluating Performance. Classification is used to understand the existing data and to predict how unseen data will behave. In other words, the objective of data classification is to classify the unseen data into different classes on the basis of studying the existing data. For the classification problem, in addition to the MSE criterion, the accuracy rate was used. This rate measures the ability of the classifier to produce accurate results, and it can be computed as follows:

$$\mathrm{Accuracy} = \frac{\hat{N}}{N}, \quad (11)$$

where N̂ represents the number of objects correctly classified by the classifier and N is the number of objects in the dataset.

5. Simulation Experiments

This section presents a comprehensive analysis to investigate the efficiency of the SOS algorithm for training FNNs. As shown in Table 1, eight datasets are selected from the UCI machine learning repository [58] to evaluate the performance of SOS. Six metaheuristic algorithms, including BBO [43], CS [37], GA [23], GSA [27, 28], PSO [28], and MVO [33], are presented for a reliable comparison.

5.1. Datasets Design. The Blood dataset contains 748 instances, which were selected randomly from the donor database of the Blood Transfusion Service Center in Hsinchu City in Taiwan. As a binary classification problem, the output class variable of the dataset represents whether the person donated blood in a time period (1 stands for donating blood; 0 stands for not donating blood). The input variables are Recency: months since last donation; Frequency: total number of donations; Monetary: total blood donated in cc; and Time: months since first donation [59].


Figure 3: Flowchart of the SOS algorithm. (1) Read in datasets. (2) Ecosystem initialization: number of organisms, termination criteria, number of training and testing examples, number of input and output neurons; calculate the number of hidden neurons based on the input neurons; initialize the organisms in the ecosystem. (3) Identify the best organism X_best. (4) Mutualism phase: randomly select an organism X_j (X_j ≠ X_i); determine the mutual relationship vector (Mutual_Vector) and benefit factor (BF); modify organisms X_i and X_j based on their mutual relationship; calculate the fitness values of the modified organisms; accept the modified organisms only if they are fitter than the previous ones. (5) Commensalism phase: randomly select an organism X_j (X_j ≠ X_i); modify organism X_j with the help of X_i and calculate the fitness value of the modified organism; accept it only if it is fitter than the previous one. (6) Parasitism phase: randomly select an organism X_j (X_j ≠ X_i); create a parasite (Parasite_Vector) from organism X_i and calculate its fitness value; if the Parasite_Vector is fitter than X_j, replace X_j with the Parasite_Vector; otherwise keep X_j and remove the Parasite_Vector. Steps (3)-(6) are repeated until the termination criteria are achieved, and the optimal solution is returned.


Table 1: Description of datasets.

Dataset | Attribute | Class | Training sample | Testing sample | Input | Hidden | Output
Blood | 4 | 2 | 493 | 255 | 4 | 9 | 2
Balance Scale | 4 | 3 | 412 | 213 | 4 | 9 | 3
Haberman's Survival | 3 | 2 | 202 | 104 | 3 | 7 | 2
Liver Disorders | 6 | 2 | 227 | 118 | 6 | 13 | 2
Seeds | 7 | 3 | 139 | 71 | 7 | 15 | 3
Wine | 13 | 3 | 117 | 61 | 13 | 27 | 3
Iris | 4 | 3 | 99 | 51 | 4 | 9 | 3
Statlog (Heart) | 13 | 2 | 178 | 92 | 13 | 27 | 2

The Balance Scale dataset was generated to model psychological experiments reported by Siegler [60]. This dataset contains 625 examples, and each example is classified as having the balance scale tip to the right, tip to the left, or be balanced. The attributes are the left weight, the left distance, the right weight, and the right distance. The correct way to find the class is the greater of (left distance × left weight) and (right distance × right weight). If they are equal, it is balanced.

Haberman's Survival dataset contains cases from a study that was conducted between 1958 and 1970 at the University of Chicago's Billings Hospital on the survival of patients who had undergone surgery for breast cancer. The dataset contains 306 cases, which record two survival statuses of patients, with the age of the patient at the time of operation, the patient's year of operation, and the number of positive axillary nodes detected.

The Liver Disorders dataset was donated by BUPA Medical Research Ltd. to record the liver disorder status in terms of a binary label. The dataset includes values of 6 features measured for 345 male individuals. The first 5 features are all blood tests, which are thought to be sensitive to liver disorders that might arise from excessive alcohol consumption. These features are Mean Corpuscular Volume (MCV), alkaline phosphatase (ALKPHOS), alanine aminotransferase (SGPT), aspartate aminotransferase (SGOT), and gamma-glutamyl transpeptidase (GAMMAGT). The sixth feature is the number of alcoholic beverage drinks per day (DRINKS).

The Seeds dataset consists of 210 patterns belonging to three different varieties of wheat: Kama, Rosa, and Canadian. From each species there are 70 observations of area A, perimeter P, compactness C (C = 4πA/P²), length of kernel, width of kernel, asymmetry coefficient, and length of kernel groove.

The Wine dataset contains 178 instances recording the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wines.

The Iris dataset contains 3 species of 50 instances each, where each species refers to a type of Iris plant (setosa, versicolor, and virginica). One species is linearly separable from the other 2, and the latter are not linearly separable from each other. Each of the 3 species is classified by four attributes: sepal length, sepal width, petal length, and petal width in cm. This dataset was used by Fisher [61] in his initiation of the linear-discriminant-function technique.

The Statlog (Heart) dataset is a heart disease database containing 270 instances that consist of 13 attributes: age; sex; chest pain type (4 values); resting blood pressure; serum cholesterol in mg/dL; fasting blood sugar > 120 mg/dL; resting electrocardiographic results (values 0, 1, and 2); maximum heart rate achieved; exercise-induced angina; oldpeak = ST depression induced by exercise relative to rest; the slope of the peak exercise ST segment; number of major vessels (0-3) colored by fluoroscopy; and thal: 3 = normal, 6 = fixed defect, 7 = reversible defect.

5.2. Experimental Setup. In this section, the experiments were done using a desktop computer with a 3.30 GHz Intel(R) Core(TM) i5 processor and 4 GB of memory. The entire algorithm was programmed in MATLAB R2012a. The mentioned datasets are partitioned into 66% for training and 34% for testing [33]. All experiments are executed for 20 different runs, and each run includes 500 iterations. The population size is considered as 30, and the other control parameters of the corresponding algorithms are given below.

In CS, the possibility of eggs being detected and thrown out of the nest is p_a = 0.25.
In GA, the crossover rate is P_C = 0.5 and the mutation rate is P_c = 0.05.
In PSO, the parameters are set to C_1 = C_2 = 2, and the weight factor w decreases linearly from 0.9 to 0.5.
In BBO, the mutation probability is p_m = 0.1, the value for both max immigration (I) and max emigration (E) is 1, and the habitat modification probability is p_h = 0.8.
In MVO, the exploitation accuracy is set to p = 6, the min traveling distance rate is set to 0.2, and the max traveling distance rate is set to 1.
In GSA, α is set to 20, the gravitational constant (G_0) is set to 1, and the initial values of acceleration and mass are set to 0 for each particle.

All input features are mapped onto the interval [−1, 1] for a small scale. Here we apply min-max normalization to perform a linear transformation on the original data, as given in (12), where v′ is the normalized value of v in the range [min, max]:

$$v' = 2 \times \frac{v - \min}{\max - \min} - 1. \quad (12)$$
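Equation (12), applied per feature column, can be sketched as follows (assuming samples are stored as the rows of a 2-D array; the function name is illustrative):

```python
import numpy as np

def minmax_scale(X):
    """Map every feature column of X onto [-1, 1], as in equation (12)."""
    col_min, col_max = X.min(axis=0), X.max(axis=0)
    return 2.0 * (X - col_min) / (col_max - col_min) - 1.0
```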


Table 2: MSE results.

Dataset | Metric | SOS | MVO | GSA | PSO | BBO | CS | GA
Blood | Best | 2.95E-01 | 3.04E-01 | 3.10E-01 | 3.07E-01 | 3.00E-01 | 3.11E-01 | 3.30E-01
Blood | Worst | 3.05E-01 | 3.07E-01 | 3.33E-01 | 3.14E-01 | 3.92E-01 | 3.21E-01 | 4.17E-01
Blood | Mean | 3.01E-01 | 3.05E-01 | 3.23E-01 | 3.10E-01 | 3.18E-01 | 3.17E-01 | 3.78E-01
Blood | Std | 2.44E-03 | 7.09E-04 | 6.60E-03 | 2.47E-03 | 2.84E-02 | 2.97E-03 | 2.85E-02
Balance Scale | Best | 7.00E-02 | 8.03E-02 | 1.37E-01 | 1.40E-01 | 8.52E-02 | 1.70E-01 | 2.97E-01
Balance Scale | Worst | 1.29E-01 | 1.04E-01 | 1.66E-01 | 1.88E-01 | 1.24E-01 | 2.14E-01 | 8.01E-01
Balance Scale | Mean | 1.05E-01 | 8.66E-02 | 1.52E-01 | 1.72E-01 | 1.02E-01 | 1.89E-01 | 4.65E-01
Balance Scale | Std | 1.33E-02 | 6.06E-03 | 9.46E-03 | 1.32E-02 | 9.45E-03 | 1.18E-02 | 1.16E-01
Haberman's Survival | Best | 2.95E-01 | 3.21E-01 | 3.64E-01 | 3.43E-01 | 3.13E-01 | 3.61E-01 | 3.71E-01
Haberman's Survival | Worst | 3.37E-01 | 3.48E-01 | 4.07E-01 | 3.75E-01 | 3.51E-01 | 3.79E-01 | 4.77E-01
Haberman's Survival | Mean | 3.18E-01 | 3.31E-01 | 3.82E-01 | 3.65E-01 | 3.31E-01 | 3.70E-01 | 4.27E-01
Haberman's Survival | Std | 1.04E-02 | 6.63E-03 | 1.14E-02 | 7.84E-03 | 1.07E-02 | 5.83E-03 | 3.45E-02
Liver Disorders | Best | 3.26E-01 | 3.33E-01 | 4.17E-01 | 3.96E-01 | 3.16E-01 | 4.14E-01 | 5.27E-01
Liver Disorders | Worst | 3.86E-01 | 3.55E-01 | 4.71E-01 | 4.31E-01 | 5.87E-01 | 4.57E-01 | 6.73E-01
Liver Disorders | Mean | 3.49E-01 | 3.41E-01 | 4.44E-01 | 4.11E-01 | 3.82E-01 | 4.37E-01 | 5.95E-01
Liver Disorders | Std | 1.65E-02 | 6.17E-03 | 1.32E-02 | 1.06E-02 | 5.47E-02 | 1.12E-02 | 4.30E-02
Seeds | Best | 9.78E-04 | 1.44E-02 | 5.97E-02 | 4.49E-02 | 2.07E-03 | 7.19E-02 | 2.05E-01
Seeds | Worst | 2.26E-02 | 3.30E-02 | 8.87E-02 | 1.02E-01 | 3.32E-01 | 1.22E-01 | 7.14E-01
Seeds | Mean | 1.11E-02 | 2.21E-02 | 7.65E-02 | 8.17E-02 | 5.23E-02 | 9.80E-02 | 4.71E-01
Seeds | Std | 5.36E-03 | 6.31E-03 | 8.82E-03 | 1.53E-02 | 9.63E-02 | 1.34E-02 | 1.20E-01
Wine | Best | 6.35E-11 | 1.34E-06 | 5.87E-04 | 9.74E-03 | 5.62E-12 | 2.56E-02 | 5.60E-01
Wine | Worst | 8.55E-03 | 2.91E-05 | 1.90E-02 | 8.03E-02 | 3.42E-01 | 1.62E-01 | 8.95E-01
Wine | Mean | 4.40E-04 | 5.41E-06 | 3.21E-03 | 3.39E-02 | 5.17E-02 | 9.59E-02 | 7.03E-01
Wine | Std | 1.91E-03 | 6.23E-06 | 3.96E-03 | 1.55E-02 | 1.25E-01 | 3.48E-02 | 8.59E-02
Iris | Best | 5.10E-08 | 2.45E-02 | 4.41E-02 | 4.71E-02 | 1.70E-02 | 1.01E-02 | 1.19E-01
Iris | Worst | 2.67E-02 | 2.75E-02 | 2.31E-01 | 1.76E-01 | 4.65E-01 | 7.84E-02 | 6.51E-01
Iris | Mean | 1.42E-02 | 2.58E-02 | 6.83E-02 | 1.09E-01 | 6.91E-02 | 5.32E-02 | 3.66E-01
Iris | Std | 8.80E-03 | 8.40E-04 | 4.07E-02 | 3.55E-02 | 1.11E-01 | 1.93E-02 | 1.70E-01
Statlog (Heart) | Best | 8.98E-02 | 5.03E-02 | 1.35E-01 | 1.72E-01 | 8.03E-02 | 2.31E-01 | 4.52E-01
Statlog (Heart) | Worst | 1.26E-01 | 8.13E-02 | 1.92E-01 | 2.20E-01 | 1.65E-01 | 3.09E-01 | 6.97E-01
Statlog (Heart) | Mean | 1.09E-01 | 6.48E-02 | 1.62E-01 | 1.93E-01 | 1.27E-01 | 2.61E-01 | 5.53E-01
Statlog (Heart) | Std | 1.07E-02 | 9.11E-03 | 1.47E-02 | 1.32E-02 | 2.37E-02 | 1.86E-02 | 7.20E-02

5.3. Results and Discussion. To evaluate the performance of the proposed method, SOS, against the other six algorithms, BBO, CS, GA, GSA, PSO, and MVO, experiments are conducted using the given datasets. In this work, all datasets have been partitioned into two sets: a training set and a testing set. The training set is used to train the network in order to achieve the optimal weights and biases. The testing set is applied on unseen data to test the generalization performance of the metaheuristic algorithms on FNNs.

Table 2 shows the best values of mean squared error (MSE), the worst values of MSE, the mean of MSE, and the standard deviation for all training datasets. Inspecting the table of results, it can be seen that SOS performs best on the datasets Seeds and Iris. For the datasets Liver Disorders, Haberman's Survival, and Blood, the best, worst, and mean values are all of the same order of magnitude, while the values of MSE are smaller for SOS than for the other algorithms, which means SOS is the best choice as the training method on the three aforementioned datasets.


Figure 4: The convergence curves of algorithms (Blood): average MSE of the training sample over 500 iterations for BBO, CS, GA, MVO, GSA, PSO, and SOS.

Figure 5: The ANOVA test of algorithms for training Blood (boxplots of MSE for BBO, CS, GA, MVO, GSA, PSO, and SOS).

Moreover, SOS is ranked second for the dataset Wine and shows very competitive results compared to BBO. In the datasets Balance Scale and Statlog (Heart), the best values in the results indicate that SOS provides very close performances compared to BBO and MVO. Also, these three algorithms show improvements compared to the others.

Convergence curves for all metaheuristic algorithms are shown in Figures 4, 6, 8, 10, 12, 14, 16, and 18. The convergence curves show the average of 20 independent runs over the course of 500 iterations. The figures show that SOS has the fastest convergence speed for training all the given datasets. Figures 5, 7, 9, 11, 13, 15, 17, and 19 show the boxplots relative to 20 runs of SOS, BBO, GA, MVO, PSO, GSA, and CS. The boxplots, which are used to analyze the variability in the obtained MSE values, indicate that SOS has a better value and less spread than GA, CS, PSO, and GSA and achieves results similar to MVO and BBO.

Through 20 independent runs on the training datasets, the optimal weights and biases are achieved and then used to test the classification accuracy on the testing datasets. As depicted in Table 3, the rank is in terms of the best values in each dataset, and SOS provides the best performances on the testing datasets Blood, Seeds, and Iris.

Figure 6: The convergence curves of algorithms (Iris).

Figure 7: The ANOVA test of algorithms for training Iris.

Figure 8: The convergence curves of algorithms (Liver Disorders).


Figure 9: The ANOVA test of algorithms for training Liver Disorders.

Figure 10: The convergence curves of algorithms (Balance Scale).

Figure 11: The ANOVA test of algorithms for training Balance Scale.

Figure 12: The convergence curves of algorithms (Seeds).

Figure 13: The ANOVA test of algorithms for training Seeds.

Figure 14: The convergence curves of algorithms (Statlog (Heart)).


Figure 15: The ANOVA test of algorithms for training Statlog (Heart).

Figure 16: The convergence curves of algorithms (Haberman's Survival).

Figure 17: The ANOVA test of algorithms for training Haberman's Survival.

Figure 18: The convergence curves of algorithms (Wine).

Figure 19: The ANOVA test of algorithms for training Wine.

For the dataset Wine, the classification accuracy of SOS is 98.3607%, which indicates that only one example in the testing dataset cannot be classified correctly. It is noticeable that, although MVO has the highest classification accuracy in the datasets Balance Scale, Haberman's Survival, Liver Disorders, and Statlog (Heart), SOS also performs well in classification. However, the accuracy shown by GA is the lowest among the tested algorithms.

This comprehensive comparative study shows that the SOS algorithm is superior among the trainers compared in this paper. Training an FNN is challenging due to the large number of local solutions in this problem. On account of being simpler and more robust than competing algorithms, SOS performs well on most of the datasets, which shows how flexible this algorithm is for solving problems with diverse search spaces. Further, in order to determine whether the results achieved by the algorithms are statistically different from each other, a nonparametric statistical significance test known as Wilcoxon's rank sum test for equal medians [62, 63] was conducted between the results obtained by the algorithms: SOS versus CS, SOS versus PSO, SOS versus GA, SOS versus MVO, SOS versus GSA, and SOS versus BBO.


Table 3: Accuracy results (%).

Dataset | Metric | SOS | MVO | GSA | PSO | BBO | CS | GA
Blood | Best | 82.7451 | 81.1765 | 78.0392 | 80.3922 | 81.1765 | 78.8235 | 76.8627
Blood | Worst | 77.6471 | 80.0000 | 33.3333 | 77.6471 | 72.5490 | 74.5098 | 67.0588
Blood | Mean | 79.8039 | 80.7451 | 74.1765 | 79.2157 | 77.5294 | 76.9608 | 72.7843
Blood | Rank | 1 | 2 | 6 | 4 | 2 | 5 | 7
Balance Scale | Best | 92.0188 | 92.9577 | 91.0798 | 89.2019 | 91.5493 | 90.6103 | 80.2817
Balance Scale | Worst | 86.8545 | 89.2019 | 85.9155 | 83.5681 | 88.2629 | 82.1596 | 38.0282
Balance Scale | Mean | 90.0235 | 91.4319 | 87.7465 | 86.7136 | 90.1643 | 86.3146 | 59.7653
Balance Scale | Rank | 2 | 1 | 4 | 6 | 3 | 5 | 7
Haberman's Survival | Best | 81.7308 | 82.6923 | 81.7308 | 82.6923 | 82.6923 | 81.7308 | 78.8462
Haberman's Survival | Worst | 71.1538 | 75.9615 | 74.0385 | 76.9231 | 69.2308 | 74.0385 | 65.3846
Haberman's Survival | Mean | 76.0577 | 79.5673 | 79.1827 | 80.4808 | 77.0192 | 78.2692 | 74.5673
Haberman's Survival | Rank | 4 | 1 | 4 | 1 | 1 | 6 | 7
Liver Disorders | Best | 75.4237 | 76.2712 | 64.4068 | 72.0339 | 72.8814 | 67.7966 | 55.0847
Liver Disorders | Worst | 66.1017 | 72.0339 | 6.7797 | 59.3220 | 45.7627 | 47.4576 | 27.1186
Liver Disorders | Mean | 71.0593 | 74.1949 | 48.3051 | 66.4831 | 65.1271 | 56.1441 | 43.0508
Liver Disorders | Rank | 2 | 1 | 6 | 4 | 3 | 5 | 7
Seeds | Best | 95.7746 | 95.7746 | 94.3662 | 92.9577 | 94.3662 | 91.5493 | 67.6056
Seeds | Worst | 87.3239 | 90.1408 | 85.9155 | 78.8732 | 61.9718 | 77.4648 | 28.1690
Seeds | Mean | 91.3380 | 93.4507 | 90.3521 | 87.4648 | 87.3239 | 82.2535 | 51.1268
Seeds | Rank | 1 | 1 | 3 | 5 | 3 | 6 | 7
Wine | Best | 98.3607 | 98.3607 | 100.0000 | 100.0000 | 100.0000 | 91.8033 | 49.1803
Wine | Worst | 91.8033 | 96.7213 | 91.8033 | 88.5246 | 59.0164 | 78.6885 | 18.0328
Wine | Mean | 95.4918 | 97.9508 | 96.1475 | 96.0656 | 90.4918 | 83.3607 | 32.6230
Wine | Rank | 4 | 4 | 1 | 1 | 1 | 6 | 7
Iris | Best | 98.0392 | 98.0392 | 98.0392 | 98.0392 | 98.0392 | 98.0392 | 98.0392
Iris | Worst | 64.7059 | 98.0392 | 33.3333 | 52.9412 | 29.4118 | 52.9412 | 7.8431
Iris | Mean | 92.0588 | 98.0392 | 93.7255 | 91.4706 | 82.9412 | 85.0000 | 56.0784
Iris | Rank | 1 | 1 | 1 | 1 | 1 | 1 | 1
Statlog (Heart) | Best | 85.8696 | 88.0435 | 86.9565 | 86.9565 | 80.4348 | 83.6957 | 70.6522
Statlog (Heart) | Worst | 77.1739 | 77.1739 | 33.6957 | 77.1739 | 66.3043 | 68.4783 | 39.1304
Statlog (Heart) | Mean | 82.2283 | 82.6087 | 79.1848 | 82.8804 | 75.5435 | 77.4457 | 50.5435
Statlog (Heart) | Rank | 4 | 1 | 2 | 2 | 6 | 5 | 7

In order to draw a statistically meaningful conclusion, the tests are performed on the optimal fitness values for the training datasets, and P values are computed as shown in Table 4. The rank sum test checks the null hypothesis that the two samples come from continuous distributions with equal medians, against the alternative that they do not. Almost all values reported in Table 4 are less than 0.05 (5% significance level), which is strong evidence against the null hypothesis. Therefore, such evidence indicates that the SOS results are statistically significant and have not occurred by coincidence (i.e., due to common noise contained in the process).
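The same kind of pairwise comparison can be reproduced with SciPy's rank sum test; in the sketch below, the two input arrays are hypothetical vectors holding the 20 per-run optimal MSE values of two trainers on one dataset:

```python
from scipy.stats import ranksums

# Hypothetical arrays: 20 per-run best MSE values of SOS and of a competitor on one dataset
stat, p_value = ranksums(sos_best_mse, competitor_best_mse)
if p_value < 0.05:
    print("The medians differ at the 5% significance level: p =", p_value)
```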

5.4. Analysis of the Results. Statistically speaking, the SOS algorithm provides superior local optima avoidance and high classification accuracy in training FNNs. According to the mathematical formulation of the SOS algorithm, the first two interaction phases are devoted to exploration of the search space. This promotes exploration of the search space, which leads to finding the optimal weights and biases.


Table 4: P values produced by Wilcoxon's rank sum test for equal medians.

Dataset | SOS vs. CS | SOS vs. PSO | SOS vs. GA | SOS vs. MVO | SOS vs. GSA | SOS vs. BBO
Blood | 6.80E-08 | 6.80E-08 | 6.80E-08 | 3.07E-06 | 6.80E-08 | 4.68E-05
Balance Scale | 6.80E-08 | 6.80E-08 | 6.80E-08 | 9.75E-06 | 6.80E-08 | 2.29E-01
Haberman's Survival | 6.80E-08 | 6.80E-08 | 6.80E-08 | 1.79E-04 | 6.80E-08 | 2.56E-03
Liver Disorders | 6.80E-08 | 6.80E-08 | 6.80E-08 | 1.26E-01 | 6.80E-08 | 1.35E-03
Seeds | 6.80E-08 | 6.80E-08 | 6.80E-08 | 3.99E-06 | 6.80E-08 | 1.63E-03
Wine | 6.78E-08 | 6.80E-08 | 6.80E-08 | 1.48E-03 | 1.05E-06 | 5.98E-01
Iris | 2.96E-06 | 6.47E-08 | 6.47E-08 | 7.62E-07 | 6.47E-08 | 5.73E-05
Statlog (Heart) | 6.80E-08 | 6.80E-08 | 6.80E-08 | 6.80E-08 | 6.80E-08 | 6.04E-03

For the exploitation phase, the third interaction phase of the SOS algorithm is helpful for resolving local optima stagnation. The results of this work show that, although metaheuristic optimizations have high exploration, the problem of training an FNN needs high local optima avoidance during the whole optimization process. The results prove that SOS is very effective in training FNNs.

It is worth discussing the poor performance of GA in this subsection. The rates of crossover and mutation are two specific tuning parameters in GA that depend on empirical values for particular problems. This is the reason why GA failed to provide good results for all the datasets. In contrast, SOS uses only the two parameters of maximum evaluation number and population size, so it avoids the risk of compromised performance due to improper parameter tuning and enhances performance stability. Being easy to fall into local optima and having low efficiency in the later stage of the search are the other two reasons for the poor performance of GA. Another finding in the results is the good performance of BBO and MVO, which benefit from their mechanisms for significant abrupt movements in the search space.

The reason for the high classification rate provided by SOS is that this algorithm is equipped with three adaptive phases to smoothly balance exploration and exploitation. The first two phases are devoted to exploration and the third to exploitation, and the three phases are simple to operate, requiring only simple mathematical operations to code. In addition, SOS uses greedy selection at the end of each phase to decide whether to retain the old or the modified solution. Consequently, the search agents are always guided toward the most promising regions of the search space.

6. Conclusions

In this paper, the recently proposed SOS algorithm was employed for the first time as an FNN trainer. The high level of exploration and exploitation of this algorithm was the motivation for this study. The problem of training an FNN was first formulated for the SOS algorithm. This algorithm was then employed to optimize the weights and biases of FNNs so as to achieve high classification accuracy. The obtained results on eight datasets with different characteristics show that the proposed approach is efficient in training FNNs compared to the other training methods that have been used in the literature: CS, PSO, GA, MVO, GSA, and BBO.

The results of MSE over 20 runs show that the proposed approach performs best in terms of convergence rate and is robust, since the variances are relatively small. Furthermore, by comparing the classification accuracy on the testing datasets using the optimal weights and biases, SOS has an advantage over the other algorithms employed. In addition, the significance of the results is statistically confirmed by using Wilcoxon's rank sum test, which demonstrates that the results have not occurred by coincidence. It can be concluded that SOS is suitable for being used as a training method for FNNs.

For future work, the SOS algorithm will be extended to find the optimal number of layers, hidden nodes, and other structural parameters of FNNs. More elaborate tests on higher-dimensional problems and a larger number of datasets will be done. Other types of neural networks, such as the radial basis function (RBF) neural network, are worth further research.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Science Foundation of China under Grants nos. 61463007 and 61563008 and the Guangxi Natural Science Foundation under Grant no. 2016GXNSFAA380264.

References

[1] J. A. Anderson, An Introduction to Neural Networks, MIT Press, 1995.
[2] G. P. Zhang and M. Qi, "Neural network forecasting for seasonal and trend time series," European Journal of Operational Research, vol. 160, no. 2, pp. 501-514, 2005.
[3] C.-J. Lin, C.-H. Chen, and C.-Y. Lee, "A self-adaptive quantum radial basis function network for classification applications," in Proceedings of the IEEE International Joint Conference on Neural Networks, vol. 4, pp. 3263-3268, IEEE, Budapest, Hungary, July 2004.
[4] X.-F. Wang and D.-S. Huang, "A novel density-based clustering framework by using level set method," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 11, pp. 1515-1531, 2009.
[5] N. A. Mat Isa and W. M. F. W. Mamat, "Clustered-Hybrid Multilayer Perceptron network for pattern recognition application," Applied Soft Computing Journal, vol. 11, no. 1, pp. 1457-1466, 2011.
[6] D. S. Huang, Systematic Theory of Neural Networks for Pattern Recognition, Publishing House of Electronic Industry of China, 1996 (Chinese).
[7] Z.-Q. Zhao, D.-S. Huang, and B.-Y. Sun, "Human face recognition based on multi-features using neural networks committee," Pattern Recognition Letters, vol. 25, no. 12, pp. 1351-1358, 2004.
[8] O. Nelles, "Nonlinear system identification: from classical approaches to neural networks and fuzzy models," Applied Therapeutics, vol. 6, no. 7, pp. 21-717, 2001.
[9] B. Malakooti and Y. Q. Zhou, "Approximating polynomial functions by feedforward artificial neural networks: capacity analysis and design," Applied Mathematics and Computation, vol. 90, no. 1, pp. 27-51, 1998.
[10] A. Cochocki and R. Unbehauen, Neural Networks for Optimization and Signal Processing, John Wiley & Sons, New York, NY, USA, 1993.
[11] X.-F. Wang, D.-S. Huang, and H. Xu, "An efficient local Chan-Vese model for image segmentation," Pattern Recognition, vol. 43, no. 3, pp. 603-618, 2010.
[12] D.-S. Huang, "A constructive approach for finding arbitrary roots of polynomials by neural networks," IEEE Transactions on Neural Networks, vol. 15, no. 2, pp. 477-491, 2004.

[13] G. Bebis and M. Georgiopoulos, "Feed-forward neural networks," IEEE Potentials, vol. 13, no. 4, pp. 27-31, 1994.
[14] T. Kohonen, "The self-organizing map," Neurocomputing, vol. 21, no. 1-3, pp. 1-6, 1998.
[15] J. Park and I. W. Sandberg, "Approximation and radial-basis-function networks," Neural Computation, vol. 3, no. 2, pp. 246-257, 1993.
[16] D.-S. Huang, "Radial basis probabilistic neural networks: model and application," International Journal of Pattern Recognition and Artificial Intelligence, vol. 13, no. 7, pp. 1083-1101, 1999.
[17] D.-S. Huang and J.-X. Du, "A constructive hybrid structure optimization methodology for radial basis probabilistic neural networks," IEEE Transactions on Neural Networks, vol. 19, no. 12, pp. 2099-2115, 2008.
[18] G. Dorffner, "Neural networks for time series processing," Neural Network World, vol. 6, no. 1, pp. 447-468, 1996.
[19] S. Ghosh-Dastidar and H. Adeli, "Spiking neural networks," International Journal of Neural Systems, vol. 19, no. 4, pp. 295-308, 2009.
[20] D. E. Rumelhart, E. G. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," in Neurocomputing: Foundations of Research, vol. 5, no. 3, pp. 696-699, MIT Press, Cambridge, Mass, USA, 1988.
[21] N. Zhang, "An online gradient method with momentum for two-layer feedforward neural networks," Applied Mathematics and Computation, vol. 212, no. 2, pp. 488-498, 2009.
[22] J. H. Holland, Adaptation in Natural and Artificial Systems, MIT Press, Cambridge, Mass, USA, 1992.
[23] D. J. Montana and L. Davis, "Training feedforward neural networks using genetic algorithms," in Proceedings of the International Joint Conferences on Artificial Intelligence (IJCAI '89), vol. 89, pp. 762-767, Detroit, Mich, USA, 1989.
[24] C. R. Hwang, "Simulated annealing: theory and applications," Acta Applicandae Mathematicae, vol. 12, no. 1, pp. 108-111, 1988.
[25] D. Shaw and W. Kinsner, "Chaotic simulated annealing in multilayer feedforward networks," in Proceedings of the 1996 Canadian Conference on Electrical and Computer Engineering (CCECE '96), part 1 (of 2), pp. 265-269, University of Calgary, Alberta, Canada, May 1996.
[26] J.-R. Zhang, J. Zhang, T.-M. Lok, and M. R. Lyu, "A hybrid particle swarm optimization-back-propagation algorithm for feedforward neural network training," Applied Mathematics & Computation, vol. 185, no. 2, pp. 1026-1037, 2007.
[27] E. Rashedi, H. Nezamabadi-Pour, and S. Saryazdi, "GSA: a gravitational search algorithm," Information Sciences, vol. 179, no. 13, pp. 2232-2248, 2009.
[28] S. A. Mirjalili, S. Z. M. Hashim, and H. M. Sardroudi, "Training feedforward neural networks using hybrid particle swarm optimization and gravitational search algorithm," Applied Mathematics and Computation, vol. 218, no. 22, pp. 11125-11137, 2012.
[29] Z. Beheshti, S. M. H. Shamsuddin, E. Beheshti, and S. S. Yuhaniz, "Enhancement of artificial neural network learning using centripetal accelerated particle swarm optimization for medical diseases diagnosis," Soft Computing, vol. 18, no. 11, pp. 2253-2270, 2014.
[30] L. A. M. Pereira, D. Rodrigues, P. B. Ribeiro, J. P. Papa, and S. A. T. Weber, "Social-spider optimization-based artificial neural networks training and its applications for Parkinson's disease identification," in Proceedings of the 27th IEEE International Symposium on Computer-Based Medical Systems (CBMS '14), pp. 14-17, IEEE, New York, NY, USA, May 2014.
[31] E. Uzlu, M. Kankal, A. Akpınar, and T. Dede, "Estimates of energy consumption in Turkey using neural networks with the teaching-learning-based optimization algorithm," Energy, vol. 75, pp. 295-303, 2014.
[32] P. A. Kowalski and S. Łukasik, "Training neural networks with Krill Herd algorithm," Neural Processing Letters, vol. 44, no. 1, pp. 5-17, 2016.
[33] H. Faris, I. Aljarah, and S. Mirjalili, "Training feedforward neural networks using multi-verse optimizer for binary classification problems," Applied Intelligence, vol. 45, no. 2, pp. 322-332, 2016.
[34] J. Nayak, B. Naik, and H. S. Behera, "A novel nature inspired firefly algorithm with higher order neural network: performance analysis," Engineering Science and Technology, vol. 19, no. 1, pp. 197-211, 2016.
[35] C. Blum and K. Socha, "Training feed-forward neural networks with ant colony optimization: an application to pattern classification," in Proceedings of the 5th International Conference on Hybrid Intelligent Systems (HIS '05), pp. 233-238, IEEE Computer Society, Rio de Janeiro, Brazil, 2005.
[36] K. Socha and C. Blum, "An ant colony optimization algorithm for continuous optimization: application to feed-forward neural network training," Neural Computing and Applications, vol. 16, no. 3, pp. 235-247, 2007.
[37] E. Valian, S. Mohanna, and S. Tavakoli, "Improved cuckoo search algorithm for feedforward neural network training," International Journal of Artificial Intelligence & Applications, vol. 2, no. 3, pp. 36-43, 2011.
[38] D. Karaboga, B. Akay, and C. Ozturk, "Artificial Bee Colony (ABC) optimization algorithm for training feed-forward neural networks," in Proceedings of the International Conference on Modeling Decisions for Artificial Intelligence (MDAI '07), pp. 318-329, Springer, Kitakyushu, Japan, 2007.

[39] C. Ozturk and D. Karaboga, "Hybrid Artificial Bee Colony algorithm for neural network training," in Proceedings of the IEEE Congress of Evolutionary Computation (CEC '11), pp. 84-88, IEEE, New Orleans, La, USA, June 2011.
[40] L. A. M. Pereira, L. C. S. Afonso, J. P. Papa, et al., "Multilayer perceptron neural networks training through charged system search and its application for non-technical losses detection," in Proceedings of the 2nd IEEE Latin American Conference on Innovative Smart Grid Technologies (ISGT LA '13), pp. 1-6, IEEE, Sao Paulo, Brazil, April 2013.
[41] S. Mirjalili, "How effective is the Grey Wolf optimizer in training multi-layer perceptrons," Applied Intelligence, vol. 43, no. 1, pp. 150-161, 2015.
[42] P. Moallem and N. Razmjooy, "A multi layer perceptron neural network trained by invasive weed optimization for potato color image segmentation," Trends in Applied Sciences Research, vol. 7, no. 6, pp. 445-455, 2012.
[43] S. Mirjalili, S. M. Mirjalili, and A. Lewis, "Let a biogeography-based optimizer train your multi-layer perceptron," Information Sciences, vol. 269, pp. 188-209, 2014.
[44] M.-Y. Cheng and D. Prayogo, "Symbiotic organisms search: a new metaheuristic optimization algorithm," Computers & Structures, vol. 139, pp. 98-112, 2014.
[45] M. Cheng, D. Prayogo, and D. Tran, "Optimizing multiple-resources leveling in multiple projects using discrete symbiotic organisms search," Journal of Computing in Civil Engineering, vol. 30, no. 3, 2016.
[46] R. Eki, F. Vincent, Y. S. Budi, and A. A. N. Perwira Redi, "Symbiotic Organism Search (SOS) for solving the capacitated vehicle routing problem," World Academy of Science, Engineering and Technology, International Journal of Mechanical, Aerospace, Industrial, Mechatronic and Manufacturing Engineering, vol. 9, no. 5, pp. 807-811, 2015.
[47] D. Prasad and V. Mukherjee, "A novel symbiotic organisms search algorithm for optimal power flow of power system with FACTS devices," Engineering Science & Technology, vol. 19, no. 1, pp. 79-89, 2016.
[48] M. Abdullahi, M. A. Ngadi, and S. M. Abdulhamid, "Symbiotic organism Search optimization based task scheduling in cloud computing environment," Future Generation Computer Systems, vol. 56, pp. 640-650, 2016.
[49] S. Verma, S. Saha, and V. Mukherjee, "A novel symbiotic organisms search algorithm for congestion management in deregulated environment," Journal of Experimental and Theoretical Artificial Intelligence, pp. 1-21, 2015.
[50] D.-H. Tran, M.-Y. Cheng, and D. Prayogo, "A novel Multiple Objective Symbiotic Organisms Search (MOSOS) for time-cost-labor utilization tradeoff problem," Knowledge-Based Systems, vol. 94, pp. 132-145, 2016.
[51] V. F. Yu, A. A. Redi, C. Yang, E. Ruskartina, and B. Santosa, "Symbiotic organisms search and two solution representations for solving the capacitated vehicle routing problem," Applied Soft Computing, 2016.
[52] A. Panda and S. Pani, "A Symbiotic Organisms Search algorithm with adaptive penalty function to solve multi-objective constrained optimization problems," Applied Soft Computing, vol. 46, pp. 344-360, 2016.
[53] S. Banerjee and S. Chattopadhyay, "Power optimization of three dimensional turbo code using a novel modified symbiotic organism search (MSOS) algorithm," Wireless Personal Communications, pp. 1-28, 2016.
[54] B. Das, V. Mukherjee, and D. Das, "DG placement in radial distribution network by symbiotic organisms search algorithm for real power loss minimization," Applied Soft Computing, vol. 49, pp. 920-936, 2016.
[55] M. K. Dosoglu, U. Guvenc, S. Duman, Y. Sonmez, and H. T. Kahraman, "Symbiotic organisms search optimization algorithm for economic/emission dispatch problem in power systems," Neural Computing and Applications, 2016.
[56] R. Hecht-Nielsen, "Kolmogorov's mapping neural network existence theorem," in Proceedings of the IEEE 1st International Conference on Neural Networks, vol. 3, pp. 11-13, IEEE Press, San Diego, Calif, USA, 1987.
[57] J.-R. Zhang, J. Zhang, T.-M. Lok, and M. R. Lyu, "A hybrid particle swarm optimization-back-propagation algorithm for feedforward neural network training," Applied Mathematics and Computation, vol. 185, no. 2, pp. 1026-1037, 2007.
[58] C. L. Blake and C. J. Merz, UCI Repository of Machine Learning Databases, http://archive.ics.uci.edu/ml/datasets.html.
[59] I.-C. Yeh, K.-J. Yang, and T.-M. Ting, "Knowledge discovery on RFM model using Bernoulli sequence," Expert Systems with Applications, vol. 36, no. 3, pp. 5866-5871, 2009.
[60] R. S. Siegler, "Three aspects of cognitive development," Cognitive Psychology, vol. 8, no. 4, pp. 481-520, 1976.
[61] R. A. Fisher, "Has Mendel's work been rediscovered?" Annals of Science, vol. 1, no. 2, pp. 115-137, 1936.
[62] F. Wilcoxon, "Individual comparisons by ranking methods," Biometrics Bulletin, vol. 1, no. 6, pp. 80-83, 1945.
[63] J. Derrac, S. García, D. Molina, and F. Herrera, "A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms," Swarm and Evolutionary Computation, vol. 1, no. 1, pp. 3-18, 2011.


feedforward neural network training. In their research, a heuristic way was adopted to give a transition from particle swarm search to gradient descending search [26]. In 2012, Mirjalili et al. proposed a hybrid of particle swarm optimization (PSO) and the gravitational search algorithm (GSA) [27] to train FNNs [28]. The results showed that PSOGSA outperforms both PSO and GSA in terms of convergence speed and avoidance of local optima and has better accuracy than GSA in the training process. In 2014, a new metaheuristic algorithm called centripetal accelerated particle swarm optimization (CAPSO) was employed by Beheshti et al. to improve the accuracy of trained ANNs [29]. Recently, several other metaheuristic algorithms have been applied to neural network research. In 2014, Pereira et al. introduced social-spider optimization (SSO) to improve the training phase of ANNs with multilayer perceptrons and validated the proposed approach in the context of Parkinson's disease recognition [30]. Uzlu et al. applied the ANN model with the teaching-learning-based optimization (TLBO) algorithm to estimate energy consumption in Turkey [31]. In 2016, Kowalski and Łukasik applied the krill herd algorithm (KHA) to learning an artificial neural network (ANN), which has been verified for the classification task [32]. Also in 2016, Faris et al. employed the recently proposed nature-inspired multiverse optimizer (MVO) for training the feedforward neural network; their comparative study demonstrates that MVO is very competitive and outperforms other training algorithms on the majority of datasets [33]. Nayak et al. proposed a firefly-based higher-order neural network for data classification that maintains fast learning and avoids the exponential increase of processing units [34]. Many other metaheuristic algorithms, like ant colony optimization (ACO) [35, 36], Cuckoo Search (CS) [37], Artificial Bee Colony (ABC) [38, 39], Charged System Search (CSS) [40], Grey Wolf Optimizer (GWO) [41], Invasive Weed Optimization (IWO) [42], and the Biogeography-Based Optimizer (BBO) [43], have been adopted for neural network research.

In this paper, a new method based on symbiotic organisms search (SOS) is used for training FNNs. Symbiotic organisms search [44], proposed by Cheng and Prayogo in 2014, is a new swarm intelligence algorithm simulating the symbiotic interaction strategies adopted by organisms to survive and propagate in the ecosystem. The algorithm has already been applied by scholars to several engineering design problems. In 2016, Cheng et al. researched optimizing multiple-resources leveling in multiple projects using discrete symbiotic organisms search [45]. Eki et al. applied SOS to solve the capacitated vehicle routing problem [46]. Prasad and Mukherjee used SOS for optimal power flow of a power system with FACTS devices [47]. Abdullahi et al. proposed SOS-based task scheduling in the cloud computing environment [48]. Verma et al. investigated SOS for congestion management in a deregulated environment [49]. The time-cost-labor utilization tradeoff problem was solved by Tran et al. using this algorithm [50]. Recently, in 2016, more and more scholars have become interested in research on the SOS algorithm. Yu et al. applied two solution representations to transform SOS into an applicable solution approach for the capacitated vehicle routing problem and then applied a local search strategy to improve

the solution quality of SOS [51]. Panda and Pani presented a hybrid SOS algorithm with an adaptive penalty function to solve multiobjective constrained optimization problems [52]. Banerjee and Chattopadhyay presented a novel modified SOS to design an improved three-dimensional turbo code [53]. Das et al. used SOS to determine the optimal size and location of distributed generation (DG) in a radial distribution network (RDN) for the reduction of network loss [54]. Dosoglu et al. utilized SOS for the economic/emission dispatch problem in power systems [55].

The structure of this paper is organized as follows. Section 2 gives a brief description of the feedforward neural network. Section 3 elaborates the symbiotic organisms search, and Section 4 describes the SOS-based trainer and how it can be used for training FNNs in detail. In Section 5, a series of comparison experiments is conducted; our conclusion is given in Section 6.

2. Feedforward Neural Network

Among artificial neural networks, the feedforward neural network (FNN) is the simplest type; it consists of a set of processing elements called "neurons" [33]. In this network, the information moves in only one direction, forward, from the input layer through the hidden layer to the output layer; there are no cycles or loops in the network. An example of a simple FNN with a single hidden layer is shown in Figure 1. As shown, each neuron computes the weighted sum of its inputs in the presence of a bias and passes this sum through an activation function (such as the sigmoid function) to obtain its output. This process can be expressed by (1) and (2):

h_j = \sum_{i=1}^{R} iw_{ji} x_i + hb_j    (1)

where iw_{ji} is the weight connecting input neuron i (i = 1, 2, ..., R) to hidden neuron j (j = 1, 2, ..., N), hb_j is a bias in the hidden layer, R is the total number of neurons in the input layer, and x_i is the corresponding input datum.

Here, the S-shaped sigmoid function is used as the activation function, as shown in (2):

f(x) = \frac{1}{1 + e^{-x}}    (2)

Therefore, the output of a neuron in the hidden layer can be described as in (3):

ho_j = f_j(h_j) = \frac{1}{1 + e^{-h_j}}    (3)

In the output layer, the output of each neuron is given in (4):

y_k = f_k\left( \sum_{j=1}^{N} hw_{kj} \, ho_j + ob_k \right)    (4)

where hw_{kj} is the weight connecting hidden neuron j (j = 1, 2, ..., N) to output neuron k (k = 1, 2, ..., S), and ob_k is a bias in the output layer.


Figure 1: A feedforward network with one hidden layer (inputs x_1, ..., x_R; hidden neurons with biases hb_1, ..., hb_N and outputs ho_1, ..., ho_N; output neurons y_1, ..., y_S with biases ob_1, ..., ob_S).

Here, N is the total number of neurons in the hidden layer and S is the total number of neurons in the output layer.
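To make the forward computation in (1)-(4) concrete, the following is a minimal NumPy sketch (not taken from the paper; the function and variable names are illustrative) of a single-hidden-layer FNN evaluated on one input vector.

```python
import numpy as np

def sigmoid(x):
    # S-shaped activation of (2): f(x) = 1 / (1 + exp(-x))
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, iw, hb, hw, ob):
    """Forward pass of the single-hidden-layer FNN of (1)-(4).
    x: inputs, shape (R,); iw: (N, R); hb: (N,); hw: (S, N); ob: (S,)."""
    h = iw @ x + hb            # (1): weighted input reaching each hidden neuron
    ho = sigmoid(h)            # (3): hidden-layer outputs
    y = sigmoid(hw @ ho + ob)  # (4): output-layer values
    return y
```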

The training process is carried out to adjust the weights and biases until some error criterion is met. Above all, one problem is to select a proper training algorithm. It is also very complex to design the neural network, because many elements affect the training performance, such as the number of neurons in the hidden layer, the interconnection between neurons and layers, the error function, and the activation function.

3. Symbiotic Organisms Search Algorithm

Symbiotic organisms search [44] simulates the symbiotic interaction relationships that organisms use to survive in the ecosystem. Three phases, the mutualism phase, the commensalism phase, and the parasitism phase, simulate the real-world biological interactions between two organisms in an ecosystem.

3.1. Mutualism Phase. Organisms engage in a mutualistic relationship with the goal of increasing their mutual survival advantage in the ecosystem. New candidate organisms for X_i and X_j are calculated based on the mutualistic symbiosis between organisms X_i and X_j, which is modeled in (5) and (6):

X_{i,new} = X_i + \alpha \, (X_{best} - Mutual\_Vector \times BF_1)    (5)

X_{j,new} = X_j + \beta \, (X_{best} - Mutual\_Vector \times BF_2)    (6)

Mutual\_Vector = \frac{X_i + X_j}{2}    (7)

where BF_1 and BF_2 are benefit factors that are determined randomly as either 1 or 2; these factors represent a partial or full level of benefit to each organism. X_{best} represents the organism with the highest degree of adaptation, and \alpha and \beta are random numbers in [0, 1]. In (7), the vector called Mutual_Vector represents the relationship characteristic between organisms X_i and X_j.

3.2. Commensalism Phase. In the commensalism phase, one organism obtains benefit and does not impact the other. Organism X_j represents the one that neither benefits nor suffers from the relationship, and the new candidate organism for X_i is calculated according to the commensal symbiosis between organisms X_i and X_j, which is modeled in (8):

X_{i,new} = X_i + \delta \, (X_{best} - X_j)    (8)

where \delta represents a random number in [-1, 1] and X_{best} is again the organism with the highest degree of adaptation.

3.3. Parasitism Phase. In the parasitism phase, one organism gains benefit but actively harms the other. An artificial parasite called Parasite_Vector is created in the search space by duplicating organism X_i and then modifying its randomly selected dimensions using random numbers. Parasite_Vector tries to replace another organism X_j in the ecosystem. According to Darwin's evolution theory, "only the fittest organisms will prevail": if Parasite_Vector is fitter, it kills organism X_j and assumes its position; otherwise, X_j has immunity to the parasite and the Parasite_Vector can no longer live in that ecosystem.
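The three phases can be summarized in code. The sketch below is an illustrative NumPy rendition of one SOS generation for a minimization problem, following (5)-(8) and the parasitism rule described above with greedy selection after each phase; it is not the authors' MATLAB implementation, and names such as `sos_generation` and `fitness` are assumptions.

```python
import numpy as np

def sos_generation(pop, fitness, lower, upper):
    """One SOS generation (mutualism, commensalism, parasitism), minimizing `fitness`."""
    n, dim = pop.shape
    fit = np.array([fitness(x) for x in pop])
    for i in range(n):
        best = pop[np.argmin(fit)]                          # X_best of the current ecosystem
        # Mutualism phase, (5)-(7)
        j = np.random.choice([k for k in range(n) if k != i])
        mutual = (pop[i] + pop[j]) / 2.0                    # Mutual_Vector
        bf1, bf2 = np.random.randint(1, 3, size=2)          # benefit factors, 1 or 2
        xi_new = np.clip(pop[i] + np.random.rand(dim) * (best - mutual * bf1), lower, upper)
        xj_new = np.clip(pop[j] + np.random.rand(dim) * (best - mutual * bf2), lower, upper)
        for k, cand in ((i, xi_new), (j, xj_new)):          # greedy selection
            f = fitness(cand)
            if f < fit[k]:
                pop[k], fit[k] = cand, f
        # Commensalism phase, (8)
        j = np.random.choice([k for k in range(n) if k != i])
        cand = np.clip(pop[i] + np.random.uniform(-1, 1, dim) * (best - pop[j]), lower, upper)
        f = fitness(cand)
        if f < fit[i]:
            pop[i], fit[i] = cand, f
        # Parasitism phase: a mutated copy of X_i tries to displace a random X_j
        j = np.random.choice([k for k in range(n) if k != i])
        parasite = pop[i].copy()
        mask = np.random.rand(dim) < np.random.rand()       # randomly selected dimensions
        parasite[mask] = np.random.uniform(lower, upper, dim)[mask]
        f = fitness(parasite)
        if f < fit[j]:
            pop[j], fit[j] = parasite, f
    return pop, fit
```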


Figure 2: The vector of training parameters: the input weights (iw), hidden weights (hw), hidden biases (hb), and output biases (ob) concatenated into a single vector x_1, x_2, ..., x_D.

4. SOS for Training FNNs

In this paper, symbiotic organisms search is used as a new method to train FNNs. The set of weights and biases is simultaneously determined by SOS in order to minimize the overall error of the FNN and thereby improve its corresponding accuracy; this means that the structure of the FNN is fixed. Figure 3 shows the flowchart of the SOS training method, which starts by collecting, normalizing, and reading a dataset. Once a network has been structured for a particular application, including setting the desired number of neurons in each layer, it is ready for training.
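As a rough outline of this flow, the hypothetical sketch below strings the pieces together: it initializes an ecosystem of weight/bias vectors and repeatedly applies the SOS phases, reusing the illustrative helpers `sos_generation` (Section 3) and `mse_fitness` (Section 4.2). All names and the search range [-1, 1] are assumptions, not details given in the paper.

```python
import numpy as np

def train_fnn_sos(X_train, Y_train, R, N, S, pop_size=30, iterations=500):
    """Hypothetical outline of the SOS trainer for a fixed R-input, N-hidden, S-output FNN."""
    dim = R * N + N * S + N + S                           # organism length, cf. (9)
    pop = np.random.uniform(-1.0, 1.0, (pop_size, dim))   # ecosystem initialization
    fit = lambda v: mse_fitness(v, X_train, Y_train, (R, N, S))
    fits = np.array([fit(v) for v in pop])
    for _ in range(iterations):
        pop, fits = sos_generation(pop, fit, -1.0, 1.0)   # mutualism, commensalism, parasitism
    return pop[np.argmin(fits)]                           # best set of weights and biases
```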

4.1. The Feedforward Neural Network Architecture. When implementing a neural network, it is necessary to determine its structure in terms of the number of layers and the number of neurons in each layer. The larger the number of hidden layers and nodes, the more complex the network will be. In this work, the numbers of input and output neurons of the MLP network are problem-dependent, and the number of hidden nodes is computed on the basis of the Kolmogorov theorem [56]: Hidden = 2 × Input + 1. When using SOS to optimize the weights and biases of the network, the dimension of each organism is D, given in (9):

D = (\mathrm{Input} \times \mathrm{Hidden}) + (\mathrm{Hidden} \times \mathrm{Output}) + \mathrm{Hidden}_{bias} + \mathrm{Output}_{bias}    (9)

where Input, Hidden, and Output refer to the numbers of input, hidden, and output neurons of the FNN, respectively, and Hidden_bias and Output_bias are the numbers of biases in the hidden and output layers.
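For instance, under the rule Hidden = 2 × Input + 1 and (9), the organism length for each dataset in Table 1 can be computed as in this small sketch (Iris is used only as an example; the function name is illustrative).

```python
def organism_dimension(n_input, n_output):
    n_hidden = 2 * n_input + 1                   # Kolmogorov-based rule [56]
    # Equation (9): all weights plus all biases form one organism
    d = n_input * n_hidden + n_hidden * n_output + n_hidden + n_output
    return n_hidden, d

# Iris: 4 inputs, 3 outputs -> 9 hidden neurons and D = 36 + 27 + 9 + 3 = 75
print(organism_dimension(4, 3))
```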

4.2. Fitness Function. In SOS, every organism is evaluated according to its status (fitness). This evaluation is done by passing the vector of weights and biases to the FNN; then the mean squared error (MSE) criterion is calculated based on the prediction of the neural network over the training dataset. Through continuous iterations, the optimal solution is finally achieved, which is regarded as the weights and biases of the neural network. The MSE criterion is given in (10), where y_i and \hat{y}_i are the actual and estimated values based on the proposed model and R is the number of samples in the training dataset:

\mathrm{MSE} = \frac{1}{R} \sum_{i=1}^{R} (y_i - \hat{y}_i)^2    (10)
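A sketch of this fitness evaluation is given below, assuming target vectors (e.g., one-hot class labels) and reusing the illustrative `decode` (Section 4.3) and `forward` (Section 2) helpers; it is not the paper's code.

```python
import numpy as np

def mse_fitness(organism, X_train, Y_train, shape):
    """Equation (10): mean squared error of the FNN encoded by one organism."""
    iw, hb, hw, ob = decode(organism, *shape)   # unpack the weight/bias vector (Section 4.3)
    err = 0.0
    for x, y in zip(X_train, Y_train):          # R training samples
        y_hat = forward(x, iw, hb, hw, ob)
        err += np.sum((y - y_hat) ** 2)
    return err / len(X_train)
```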

4.3. Encoding Strategy. According to [57], the weights and biases of an FNN can be encoded and represented for every agent of an evolutionary algorithm in the form of a vector, a matrix, or binary. In this work, the vector encoding method is utilized. An example of this encoding strategy for an FNN is provided in Figure 2.

During the initialization process, X = (X_1, X_2, ..., X_N) is set on behalf of the N organisms. Each organism X_i = {iw, hw, hb, ob} (i = 1, 2, ..., N) represents a complete set of FNN weights and biases, which is converted into a single vector of real numbers.
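A possible realization of this vector encoding, following the ordering of Figure 2 (input weights, hidden weights, hidden biases, output biases), is sketched below; the helper names are illustrative assumptions.

```python
import numpy as np

def encode(iw, hw, hb, ob):
    # Flatten all parameter groups into one real-valued organism (Figure 2 ordering)
    return np.concatenate([iw.ravel(), hw.ravel(), hb, ob])

def decode(vec, R, N, S):
    # Recover the parameters of an R-input, N-hidden, S-output network,
    # returned in the order expected by the forward() sketch of Section 2
    iw = vec[:N * R].reshape(N, R)                  # input-to-hidden weights
    hw = vec[N * R:N * R + S * N].reshape(S, N)     # hidden-to-output weights
    hb = vec[N * R + S * N:N * R + S * N + N]       # hidden biases
    ob = vec[N * R + S * N + N:]                    # output biases
    return iw, hb, hw, ob
```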

4.4. Criteria for Evaluating Performance. Classification is used to understand the existing data and to predict how unseen data will behave. In other words, the objective of data classification is to assign unseen data to different classes on the basis of the studied existing data. For the classification problem, in addition to the MSE criterion, the accuracy rate is used. This rate measures the ability of the classifier to produce accurate results and can be computed as follows:

\mathrm{Accuracy} = \frac{\hat{N}}{N}    (11)

where \hat{N} represents the number of objects correctly classified by the classifier and N is the total number of objects in the dataset.
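The accuracy computation might look like the sketch below; it assumes that the predicted class of a sample is the output neuron with the largest activation (a common convention that the paper does not spell out) and reuses the illustrative `decode` and `forward` helpers.

```python
import numpy as np

def classification_accuracy(best_organism, X_test, labels, shape):
    """Equation (11): fraction of correctly classified test objects."""
    iw, hb, hw, ob = decode(best_organism, *shape)
    correct = 0
    for x, label in zip(X_test, labels):
        y = forward(x, iw, hb, hw, ob)
        if np.argmax(y) == label:     # winner-takes-all class decision (assumption)
            correct += 1
    return correct / len(labels)
```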

5. Simulation Experiments

This section presents a comprehensive analysis to investigate the efficiency of the SOS algorithm for training FNNs. As shown in Table 1, eight datasets selected from the UCI machine learning repository [58] are used to evaluate the performance of SOS, and six metaheuristic algorithms, namely, BBO [43], CS [37], GA [23], GSA [27, 28], PSO [28], and MVO [33], are included for a reliable comparison.

5.1. Datasets Design. The Blood dataset contains 748 instances, which were selected randomly from the donor database of the Blood Transfusion Service Center in Hsinchu City, Taiwan. As a binary classification problem, the output class variable represents whether the person donated blood in a given time period (1 stands for donating blood; 0 stands for not donating blood). The input variables are Recency (months since the last donation), Frequency (total number of donations), Monetary (total blood donated in cc), and Time (months since the first donation) [59].


Figure 3: Flowchart of the SOS training algorithm: (1) read in the dataset; (2) initialize the ecosystem (number of organisms, termination criteria, numbers of training and testing examples, numbers of input and output neurons, and number of hidden neurons computed from the input neurons); (3) identify the best organism X_best; (4) mutualism phase; (5) commensalism phase; (6) parasitism phase. In each phase a modified organism is accepted only if it is fitter than the previous one, and the loop repeats until the termination criteria are achieved, returning the optimal solution.


Table 1: Description of datasets.

Dataset               Attribute  Class  Training sample  Testing sample  Input  Hidden  Output
Blood                 4          2      493              255             4      9       2
Balance Scale         4          3      412              213             4      9       3
Haberman's Survival   3          2      202              104             3      7       2
Liver Disorders       6          2      227              118             6      13      2
Seeds                 7          3      139              71              7      15      3
Wine                  13         3      117              61              13     27      3
Iris                  4          3      99               51              4      9       3
Statlog (Heart)       13         2      178              92              13     27      2

The Balance Scale dataset was generated to model the psychological experiments reported by Siegler [60]. This dataset contains 625 examples, and each example is classified as having the balance scale tip to the right, tip to the left, or be balanced. The attributes are the left weight, the left distance, the right weight, and the right distance. The correct way to find the class is to compare (left distance × left weight) and (right distance × right weight); if they are equal, the scale is balanced.

The Haberman's Survival dataset contains cases from a study conducted between 1958 and 1970 at the University of Chicago's Billings Hospital on the survival of patients who had undergone surgery for breast cancer. The dataset contains 306 cases, which record two survival statuses together with the age of the patient at the time of operation, the patient's year of operation, and the number of positive axillary nodes detected.

The Liver Disorders dataset was donated by BUPA Medical Research Ltd. to record liver disorder status in terms of a binary label. The dataset includes the values of 6 features measured for 345 male individuals. The first 5 features are all blood tests thought to be sensitive to liver disorders that might arise from excessive alcohol consumption: Mean Corpuscular Volume (MCV), alkaline phosphatase (ALKPHOS), alanine aminotransferase (SGPT), aspartate aminotransferase (SGOT), and gamma-glutamyl transpeptidase (GAMMAGT). The sixth feature is the number of alcoholic beverage drinks per day (DRINKS).

The Seeds dataset consists of 210 patterns belonging to three different varieties of wheat: Kama, Rosa, and Canadian. From each species there are 70 observations of area A, perimeter P, compactness C (C = 4π·A/P²), length of kernel, width of kernel, asymmetry coefficient, and length of kernel groove.

The Wine dataset contains 178 instances recording the results of a chemical analysis of wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wine.

The Iris dataset contains 3 species of 50 instances each, where each species refers to a type of iris plant (setosa, versicolor, and virginica). One species is linearly separable from the other two, and the latter are not linearly separable from each other. Each instance is described by four attributes: sepal length, sepal width, petal length, and petal width, in cm. This dataset was used by Fisher [61] in his initiation of the linear-discriminant-function technique.

The Statlog (Heart) dataset is a heart disease database containing 270 instances that consist of 13 attributes: age; sex; chest pain type (4 values); resting blood pressure; serum cholesterol in mg/dL; fasting blood sugar > 120 mg/dL; resting electrocardiographic results (values 0, 1, and 2); maximum heart rate achieved; exercise-induced angina; oldpeak (ST depression induced by exercise relative to rest); the slope of the peak exercise ST segment; number of major vessels (0-3) colored by fluoroscopy; and thal (3 = normal, 6 = fixed defect, 7 = reversible defect).

5.2. Experimental Setup. The experiments were run on a desktop computer with a 3.30 GHz Intel(R) Core(TM) i5 processor and 4 GB of memory, and the entire algorithm was programmed in MATLAB R2012a. The mentioned datasets are partitioned into 66% for training and 34% for testing [33]. All experiments are executed for 20 different runs, and each run includes 500 iterations. The population size is set to 30, and the other control parameters of the corresponding algorithms are given below.

In CS, the probability of eggs being detected and thrown out of the nest is p_a = 0.25.

In GA, the crossover rate is P_C = 0.5 and the mutation rate is P_m = 0.05.

In PSO, the parameters are set to C_1 = C_2 = 2, and the weight factor w decreases linearly from 0.9 to 0.5.

In BBO, the mutation probability is p_m = 0.1, the value of both the maximum immigration rate (I) and the maximum emigration rate (E) is 1, and the habitat modification probability is p_h = 0.8.

In MVO, the exploitation accuracy is set to p = 6, the minimum traveling distance rate is set to 0.2, and the maximum traveling distance rate is set to 1.

In GSA, \alpha is set to 20, the gravitational constant (G_0) is set to 1, and the initial values of acceleration and mass are set to 0 for each particle.

All input features are mapped onto the interval [-1, 1] to keep their scale small. Here we apply min-max normalization to perform a linear transformation on the original data, as given in (12), where v' is the normalized value of v, whose original range is [min, max]:

v' = 2 \times \frac{v - \min}{\max - \min} - 1    (12)
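Applied feature-wise, (12) corresponds to the following small sketch (illustrative only; the function name is an assumption).

```python
import numpy as np

def minmax_scale(X):
    # Equation (12): map every feature linearly onto [-1, 1]
    mn = X.min(axis=0)
    mx = X.max(axis=0)
    return 2.0 * (X - mn) / (mx - mn) - 1.0
```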


Table 2: MSE results.

Dataset (statistic)   SOS        MVO        GSA        PSO        BBO        CS         GA
Blood
  Best                2.95E-01   3.04E-01   3.10E-01   3.07E-01   3.00E-01   3.11E-01   3.30E-01
  Worst               3.05E-01   3.07E-01   3.33E-01   3.14E-01   3.92E-01   3.21E-01   4.17E-01
  Mean                3.01E-01   3.05E-01   3.23E-01   3.10E-01   3.18E-01   3.17E-01   3.78E-01
  Std                 2.44E-03   7.09E-04   6.60E-03   2.47E-03   2.84E-02   2.97E-03   2.85E-02
Balance Scale
  Best                7.00E-02   8.03E-02   1.37E-01   1.40E-01   8.52E-02   1.70E-01   2.97E-01
  Worst               1.29E-01   1.04E-01   1.66E-01   1.88E-01   1.24E-01   2.14E-01   8.01E-01
  Mean                1.05E-01   8.66E-02   1.52E-01   1.72E-01   1.02E-01   1.89E-01   4.65E-01
  Std                 1.33E-02   6.06E-03   9.46E-03   1.32E-02   9.45E-03   1.18E-02   1.16E-01
Haberman's Survival
  Best                2.95E-01   3.21E-01   3.64E-01   3.43E-01   3.13E-01   3.61E-01   3.71E-01
  Worst               3.37E-01   3.48E-01   4.07E-01   3.75E-01   3.51E-01   3.79E-01   4.77E-01
  Mean                3.18E-01   3.31E-01   3.82E-01   3.65E-01   3.31E-01   3.70E-01   4.27E-01
  Std                 1.04E-02   6.63E-03   1.14E-02   7.84E-03   1.07E-02   5.83E-03   3.45E-02
Liver Disorders
  Best                3.26E-01   3.33E-01   4.17E-01   3.96E-01   3.16E-01   4.14E-01   5.27E-01
  Worst               3.86E-01   3.55E-01   4.71E-01   4.31E-01   5.87E-01   4.57E-01   6.73E-01
  Mean                3.49E-01   3.41E-01   4.44E-01   4.11E-01   3.82E-01   4.37E-01   5.95E-01
  Std                 1.65E-02   6.17E-03   1.32E-02   1.06E-02   5.47E-02   1.12E-02   4.30E-02
Seeds
  Best                9.78E-04   1.44E-02   5.97E-02   4.49E-02   2.07E-03   7.19E-02   2.05E-01
  Worst               2.26E-02   3.30E-02   8.87E-02   1.02E-01   3.32E-01   1.22E-01   7.14E-01
  Mean                1.11E-02   2.21E-02   7.65E-02   8.17E-02   5.23E-02   9.80E-02   4.71E-01
  Std                 5.36E-03   6.31E-03   8.82E-03   1.53E-02   9.63E-02   1.34E-02   1.20E-01
Wine
  Best                6.35E-11   1.34E-06   5.87E-04   9.74E-03   5.62E-12   2.56E-02   5.60E-01
  Worst               8.55E-03   2.91E-05   1.90E-02   8.03E-02   3.42E-01   1.62E-01   8.95E-01
  Mean                4.40E-04   5.41E-06   3.21E-03   3.39E-02   5.17E-02   9.59E-02   7.03E-01
  Std                 1.91E-03   6.23E-06   3.96E-03   1.55E-02   1.25E-01   3.48E-02   8.59E-02
Iris
  Best                5.10E-08   2.45E-02   4.41E-02   4.71E-02   1.70E-02   1.01E-02   1.19E-01
  Worst               2.67E-02   2.75E-02   2.31E-01   1.76E-01   4.65E-01   7.84E-02   6.51E-01
  Mean                1.42E-02   2.58E-02   6.83E-02   1.09E-01   6.91E-02   5.32E-02   3.66E-01
  Std                 8.80E-03   8.40E-04   4.07E-02   3.55E-02   1.11E-01   1.93E-02   1.70E-01
Statlog (Heart)
  Best                8.98E-02   5.03E-02   1.35E-01   1.72E-01   8.03E-02   2.31E-01   4.52E-01
  Worst               1.26E-01   8.13E-02   1.92E-01   2.20E-01   1.65E-01   3.09E-01   6.97E-01
  Mean                1.09E-01   6.48E-02   1.62E-01   1.93E-01   1.27E-01   2.61E-01   5.53E-01
  Std                 1.07E-02   9.11E-03   1.47E-02   1.32E-02   2.37E-02   1.86E-02   7.20E-02

5.3. Results and Discussion. To evaluate the performance of the proposed method, SOS, against the other six algorithms (BBO, CS, GA, GSA, PSO, and MVO), experiments are conducted using the given datasets. In this work, every dataset has been partitioned into two sets: a training set and a testing set. The training set is used to train the network in order to achieve the optimal weights and biases, and the testing set is applied to unseen data to assess the generalization performance of the metaheuristic algorithms on FNNs.

Table 2 shows the best values, the worst values, the means, and the standard deviations of the mean squared error (MSE) for all training datasets. Inspecting the results, it can be seen that SOS performs best on the Seeds and Iris datasets. For the Liver Disorders, Haberman's Survival, and Blood datasets, the best, worst, and mean values are all of the same order of magnitude, while the MSE values of SOS are smaller than those of the other algorithms, which means SOS is the best choice as the training method on these three datasets.


Figure 4: The convergence curves of the algorithms (Blood).

Figure 5: The ANOVA test of the algorithms for training Blood.

Moreover, SOS is ranked second for the Wine dataset and shows very competitive results compared to BBO. On the Balance Scale and Statlog (Heart) datasets, the best values indicate that SOS provides performance very close to that of BBO and MVO; these three algorithms also show improvements compared to the others.

The convergence curves of all metaheuristic algorithms are shown in Figures 4, 6, 8, 10, 12, 14, 16, and 18. The curves show the average of 20 independent runs over the course of 500 iterations, and they indicate that SOS has the fastest convergence speed for training on all the given datasets. Figures 5, 7, 9, 11, 13, 15, 17, and 19 show the boxplots relative to the 20 runs of SOS, BBO, GA, MVO, PSO, GSA, and CS. The boxplots, which are used to analyze the variability of the obtained MSE values, indicate that SOS achieves lower MSE values with less spread than GA, CS, PSO, and GSA and achieves results similar to those of MVO and BBO.

Through the 20 independent runs on the training datasets, the optimal weights and biases are obtained and then used to test the classification accuracy on the testing datasets. As depicted in Table 3, the algorithms are ranked according to the best accuracy value obtained on each dataset.

Figure 6: The convergence curves of the algorithms (Iris).

Figure 7: The ANOVA test of the algorithms for training Iris.

Figure 8: The convergence curves of the algorithms (Liver Disorders).


Figure 9: The ANOVA test of the algorithms for training Liver Disorders.

Figure 10: The convergence curves of the algorithms (Balance Scale).

Figure 11: The ANOVA test of the algorithms for training Balance Scale.

Figure 12: The convergence curves of the algorithms (Seeds).

Figure 13: The ANOVA test of the algorithms for training Seeds.

Figure 14: The convergence curves of the algorithms (Statlog (Heart)).


Figure 15: The ANOVA test of the algorithms for training Statlog (Heart).

Figure 16: The convergence curves of the algorithms (Haberman's Survival).

Figure 17: The ANOVA test of the algorithms for training Haberman's Survival.

Figure 18: The convergence curves of the algorithms (Wine).

Figure 19: The ANOVA test of the algorithms for training Wine.

SOS provides the best performance on the testing datasets Blood, Seeds, and Iris. For the Wine dataset, the classification accuracy of SOS is 98.3607%, which indicates that only one example in the testing dataset is not classified correctly. It is noticeable that, although MVO has the highest classification accuracy on the Balance Scale, Haberman's Survival, Liver Disorders, and Statlog (Heart) datasets, SOS also performs well in classification, whereas the accuracy achieved by GA is the lowest among the tested algorithms.

This comprehensive comparative study shows that the SOS algorithm is superior among the trainers compared in this paper. Training an FNN is challenging because of the large number of local solutions involved. Being simpler and more robust than the competing algorithms, SOS performs well on most of the datasets, which shows how flexible this algorithm is for solving problems with diverse search spaces. Furthermore, in order to determine whether the results achieved by the algorithms are statistically different from each other, a nonparametric statistical significance test, Wilcoxon's rank sum test for equal medians [62, 63], was conducted between the results obtained by the algorithms: SOS versus CS, SOS versus PSO, SOS versus GA, SOS versus MVO, SOS versus GSA, and SOS versus BBO.


Table 3: Accuracy results (%).

Dataset (statistic)   SOS       MVO       GSA       PSO       BBO       CS        GA
Blood
  Best                82.7451   81.1765   78.0392   80.3922   81.1765   78.8235   76.8627
  Worst               77.6471   80.0000   33.3333   77.6471   72.5490   74.5098   67.0588
  Mean                79.8039   80.7451   74.1765   79.2157   77.5294   76.9608   72.7843
  Rank                1         2         6         4         2         5         7
Balance Scale
  Best                92.0188   92.9577   91.0798   89.2019   91.5493   90.6103   80.2817
  Worst               86.8545   89.2019   85.9155   83.5681   88.2629   82.1596   38.0282
  Mean                90.0235   91.4319   87.7465   86.7136   90.1643   86.3146   59.7653
  Rank                2         1         4         6         3         5         7
Haberman's Survival
  Best                81.7308   82.6923   81.7308   82.6923   82.6923   81.7308   78.8462
  Worst               71.1538   75.9615   74.0385   76.9231   69.2308   74.0385   65.3846
  Mean                76.0577   79.5673   79.1827   80.4808   77.0192   78.2692   74.5673
  Rank                4         1         4         1         1         6         7
Liver Disorders
  Best                75.4237   76.2712   64.4068   72.0339   72.8814   67.7966   55.0847
  Worst               66.1017   72.0339   6.7797    59.3220   45.7627   47.4576   27.1186
  Mean                71.0593   74.1949   48.3051   66.4831   65.1271   56.1441   43.0508
  Rank                2         1         6         4         3         5         7
Seeds
  Best                95.7746   95.7746   94.3662   92.9577   94.3662   91.5493   67.6056
  Worst               87.3239   90.1408   85.9155   78.8732   61.9718   77.4648   28.1690
  Mean                91.3380   93.4507   90.3521   87.4648   87.3239   82.2535   51.1268
  Rank                1         1         3         5         3         6         7
Wine
  Best                98.3607   98.3607   100.0000  100.0000  100.0000  91.8033   49.1803
  Worst               91.8033   96.7213   91.8033   88.5246   59.0164   78.6885   18.0328
  Mean                95.4918   97.9508   96.1475   96.0656   90.4918   83.3607   32.6230
  Rank                4         4         1         1         1         6         7
Iris
  Best                98.0392   98.0392   98.0392   98.0392   98.0392   98.0392   98.0392
  Worst               64.7059   98.0392   33.3333   52.9412   29.4118   52.9412   7.8431
  Mean                92.0588   98.0392   93.7255   91.4706   82.9412   85.0000   56.0784
  Rank                1         1         1         1         1         1         1
Statlog (Heart)
  Best                85.8696   88.0435   86.9565   86.9565   80.4348   83.6957   70.6522
  Worst               77.1739   77.1739   33.6957   77.1739   66.3043   68.4783   39.1304
  Mean                82.2283   82.6087   79.1848   82.8804   75.5435   77.4457   50.5435
  Rank                4         1         2         2         6         5         7

In order to draw a statistically meaningful conclusion, the tests are performed on the optimal fitness values of the training datasets, and the resulting P values are reported in Table 4. The rank sum test evaluates the null hypothesis that the two samples come from continuous distributions with equal medians, against the alternative that they do not. Almost all values reported in Table 4 are less than 0.05 (the 5% significance level), which is strong evidence against the null hypothesis. Therefore, such evidence indicates that the SOS results are statistically significant and have not occurred by coincidence (i.e., due to common noise contained in the process).
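Such a pairwise comparison can be reproduced with any implementation of Wilcoxon's rank sum test; SciPy provides one, as in the short sketch below (the arrays stand for the 20-run MSE samples of two trainers and are not the paper's data).

```python
import numpy as np
from scipy.stats import ranksums

def compare_runs(mse_sos, mse_other, alpha=0.05):
    """Rank sum test on two samples of per-run MSE values (cf. Table 4)."""
    stat, p = ranksums(np.asarray(mse_sos), np.asarray(mse_other))
    return p, p < alpha   # p below alpha: reject equal medians at the 5% level
```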

5.4. Analysis of the Results. Statistically speaking, the SOS algorithm provides superior local optima avoidance and high classification accuracy when training FNNs. According to the mathematical formulation of SOS, the first two interaction phases are devoted to exploration of the search space, which promotes the discovery of the optimal weights and biases.


Table 4: P values produced by Wilcoxon's rank sum test for equal medians.

Dataset               SOS vs. CS   SOS vs. PSO   SOS vs. GA   SOS vs. MVO   SOS vs. GSA   SOS vs. BBO
Blood                 6.80E-08     6.80E-08      6.80E-08     3.07E-06      6.80E-08      4.68E-05
Balance Scale         6.80E-08     6.80E-08      6.80E-08     9.75E-06      6.80E-08      2.29E-01
Haberman's Survival   6.80E-08     6.80E-08      6.80E-08     1.79E-04      6.80E-08      2.56E-03
Liver Disorders       6.80E-08     6.80E-08      6.80E-08     1.26E-01      6.80E-08      1.35E-03
Seeds                 6.80E-08     6.80E-08      6.80E-08     3.99E-06      6.80E-08      1.63E-03
Wine                  6.78E-08     6.80E-08      6.80E-08     1.48E-03      1.05E-06      5.98E-01
Iris                  2.96E-06     6.47E-08      6.47E-08     7.62E-07      6.47E-08      5.73E-05
Statlog (Heart)       6.80E-08     6.80E-08      6.80E-08     6.80E-08      6.80E-08      6.04E-03

For exploitation, the third interaction phase of the SOS algorithm is helpful for resolving local optima stagnation. The results of this work show that, although metaheuristic optimizers have high exploration ability, the problem of training an FNN needs high local optima avoidance throughout the whole optimization process. The results prove that SOS is very effective in training FNNs.

It is worth discussing the poor performance of GA in this subsection. The crossover and mutation rates are two tuning parameters of GA whose proper values depend on empirical experience for each particular problem; this is one reason why GA failed to provide good results on all the datasets. In contrast, SOS uses only the two parameters of maximum evaluation number and population size, so it avoids the risk of compromised performance due to improper parameter tuning and enhances performance stability. A tendency to fall into local optima and low efficiency in the later stages of the search are two further reasons for the poor performance of GA. Another finding in the results is the good performance of BBO and MVO, which benefit from their mechanisms for significant abrupt movements in the search space.

The reason for the high classification rate provided by SOS is that this algorithm is equipped with three adaptive phases that smoothly balance exploration and exploitation: the first two phases are devoted to exploration and the last to exploitation, and the three phases are simple to operate, requiring only basic mathematical operations to code. In addition, SOS uses greedy selection at the end of each phase to decide whether to retain the old or the modified solution. Consequently, the search agents are always guided toward the most promising regions of the search space.

6. Conclusions

In this paper, the recently proposed SOS algorithm was employed for the first time as an FNN trainer. The high level of exploration and exploitation of this algorithm was the motivation for this study. The problem of training an FNN was first formulated for the SOS algorithm, which was then employed to optimize the weights and biases of FNNs so as to achieve high classification accuracy. The results obtained on eight datasets with different characteristics show that the proposed approach is efficient for training FNNs compared with other training methods that have been used in the literature: CS, PSO, GA, MVO, GSA, and BBO. The MSE results over 20 runs show that the proposed approach performs best in terms of convergence rate and is robust, since the variances are relatively small. Furthermore, comparison of the classification accuracy obtained on the testing datasets with the optimal weights and biases shows that SOS has an advantage over the other algorithms employed. In addition, the significance of the results is statistically confirmed by Wilcoxon's rank sum test, which demonstrates that the results have not occurred by coincidence. It can be concluded that SOS is suitable for use as a training method for FNNs.

In future work, the SOS algorithm will be extended to find the optimal number of layers, hidden nodes, and other structural parameters of FNNs. More elaborate tests on higher-dimensional problems and a larger number of datasets will be carried out. Other types of neural networks, such as the radial basis function (RBF) neural network, are also worth further research.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Science Foundation of China under Grants nos. 61463007 and 61563008 and the Guangxi Natural Science Foundation under Grant no. 2016GXNSFAA380264.

References

[1] J. A. Anderson, An Introduction to Neural Networks, MIT Press, 1995.
[2] G. P. Zhang and M. Qi, "Neural network forecasting for seasonal and trend time series," European Journal of Operational Research, vol. 160, no. 2, pp. 501–514, 2005.
[3] C.-J. Lin, C.-H. Chen, and C.-Y. Lee, "A self-adaptive quantum radial basis function network for classification applications," in Proceedings of the IEEE International Joint Conference on Neural Networks, vol. 4, pp. 3263–3268, IEEE, Budapest, Hungary, July 2004.
[4] X.-F. Wang and D.-S. Huang, "A novel density-based clustering framework by using level set method," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 11, pp. 1515–1531, 2009.
[5] N. A. Mat Isa and W. M. F. W. Mamat, "Clustered-Hybrid Multilayer Perceptron network for pattern recognition application," Applied Soft Computing Journal, vol. 11, no. 1, pp. 1457–1466, 2011.
[6] D. S. Huang, Systematic Theory of Neural Networks for Pattern Recognition, Publishing House of Electronic Industry of China, 1996 (Chinese).
[7] Z.-Q. Zhao, D.-S. Huang, and B.-Y. Sun, "Human face recognition based on multi-features using neural networks committee," Pattern Recognition Letters, vol. 25, no. 12, pp. 1351–1358, 2004.
[8] O. Nelles, "Nonlinear system identification: from classical approaches to neural networks and fuzzy models," Applied Therapeutics, vol. 6, no. 7, pp. 21–717, 2001.
[9] B. Malakooti and Y. Q. Zhou, "Approximating polynomial functions by feedforward artificial neural networks: capacity analysis and design," Applied Mathematics and Computation, vol. 90, no. 1, pp. 27–51, 1998.
[10] A. Cochocki and R. Unbehauen, Neural Networks for Optimization and Signal Processing, John Wiley & Sons, New York, NY, USA, 1993.
[11] X.-F. Wang, D.-S. Huang, and H. Xu, "An efficient local Chan-Vese model for image segmentation," Pattern Recognition, vol. 43, no. 3, pp. 603–618, 2010.
[12] D.-S. Huang, "A constructive approach for finding arbitrary roots of polynomials by neural networks," IEEE Transactions on Neural Networks, vol. 15, no. 2, pp. 477–491, 2004.
[13] G. Bebis and M. Georgiopoulos, "Feed-forward neural networks," IEEE Potentials, vol. 13, no. 4, pp. 27–31, 1994.
[14] T. Kohonen, "The self-organizing map," Neurocomputing, vol. 21, no. 1–3, pp. 1–6, 1998.
[15] J. Park and I. W. Sandberg, "Approximation and radial-basis-function networks," Neural Computation, vol. 3, no. 2, pp. 246–257, 1993.
[16] D.-S. Huang, "Radial basis probabilistic neural networks: model and application," International Journal of Pattern Recognition and Artificial Intelligence, vol. 13, no. 7, pp. 1083–1101, 1999.
[17] D.-S. Huang and J.-X. Du, "A constructive hybrid structure optimization methodology for radial basis probabilistic neural networks," IEEE Transactions on Neural Networks, vol. 19, no. 12, pp. 2099–2115, 2008.
[18] G. Dorffner, "Neural networks for time series processing," Neural Network World, vol. 6, no. 1, pp. 447–468, 1996.
[19] S. Ghosh-Dastidar and H. Adeli, "Spiking neural networks," International Journal of Neural Systems, vol. 19, no. 4, pp. 295–308, 2009.
[20] D. E. Rumelhart, E. G. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," in Neurocomputing: Foundations of Research, vol. 5, no. 3, pp. 696–699, MIT Press, Cambridge, Mass, USA, 1988.
[21] N. Zhang, "An online gradient method with momentum for two-layer feedforward neural networks," Applied Mathematics and Computation, vol. 212, no. 2, pp. 488–498, 2009.
[22] J. H. Holland, Adaptation in Natural and Artificial Systems, MIT Press, Cambridge, Mass, USA, 1992.
[23] D. J. Montana and L. Davis, "Training feedforward neural networks using genetic algorithms," in Proceedings of the International Joint Conferences on Artificial Intelligence (IJCAI '89), vol. 89, pp. 762–767, Detroit, Mich, USA, 1989.
[24] C. R. Hwang, "Simulated annealing: theory and applications," Acta Applicandae Mathematicae, vol. 12, no. 1, pp. 108–111, 1988.
[25] D. Shaw and W. Kinsner, "Chaotic simulated annealing in multilayer feedforward networks," in Proceedings of the 1996 Canadian Conference on Electrical and Computer Engineering (CCECE '96), part 1 (of 2), pp. 265–269, University of Calgary, Alberta, Canada, May 1996.
[26] J.-R. Zhang, J. Zhang, T.-M. Lok, and M. R. Lyu, "A hybrid particle swarm optimization-back-propagation algorithm for feedforward neural network training," Applied Mathematics & Computation, vol. 185, no. 2, pp. 1026–1037, 2007.
[27] E. Rashedi, H. Nezamabadi-Pour, and S. Saryazdi, "GSA: a gravitational search algorithm," Information Sciences, vol. 179, no. 13, pp. 2232–2248, 2009.
[28] S. A. Mirjalili, S. Z. M. Hashim, and H. M. Sardroudi, "Training feedforward neural networks using hybrid particle swarm optimization and gravitational search algorithm," Applied Mathematics and Computation, vol. 218, no. 22, pp. 11125–11137, 2012.
[29] Z. Beheshti, S. M. H. Shamsuddin, E. Beheshti, and S. S. Yuhaniz, "Enhancement of artificial neural network learning using centripetal accelerated particle swarm optimization for medical diseases diagnosis," Soft Computing, vol. 18, no. 11, pp. 2253–2270, 2014.
[30] L. A. M. Pereira, D. Rodrigues, P. B. Ribeiro, J. P. Papa, and S. A. T. Weber, "Social-spider optimization-based artificial neural networks training and its applications for Parkinson's disease identification," in Proceedings of the 27th IEEE International Symposium on Computer-Based Medical Systems (CBMS '14), pp. 14–17, IEEE, New York, NY, USA, May 2014.
[31] E. Uzlu, M. Kankal, A. Akpınar, and T. Dede, "Estimates of energy consumption in Turkey using neural networks with the teaching-learning-based optimization algorithm," Energy, vol. 75, pp. 295–303, 2014.
[32] P. A. Kowalski and S. Łukasik, "Training neural networks with Krill Herd algorithm," Neural Processing Letters, vol. 44, no. 1, pp. 5–17, 2016.
[33] H. Faris, I. Aljarah, and S. Mirjalili, "Training feedforward neural networks using multi-verse optimizer for binary classification problems," Applied Intelligence, vol. 45, no. 2, pp. 322–332, 2016.
[34] J. Nayak, B. Naik, and H. S. Behera, "A novel nature inspired firefly algorithm with higher order neural network: performance analysis," Engineering Science and Technology, vol. 19, no. 1, pp. 197–211, 2016.
[35] C. Blum and K. Socha, "Training feed-forward neural networks with ant colony optimization: an application to pattern classification," in Proceedings of the 5th International Conference on Hybrid Intelligent Systems (HIS '05), pp. 233–238, IEEE Computer Society, Rio de Janeiro, Brazil, 2005.
[36] K. Socha and C. Blum, "An ant colony optimization algorithm for continuous optimization: application to feed-forward neural network training," Neural Computing and Applications, vol. 16, no. 3, pp. 235–247, 2007.
[37] E. Valian, S. Mohanna, and S. Tavakoli, "Improved cuckoo search algorithm for feedforward neural network training," International Journal of Artificial Intelligence & Applications, vol. 2, no. 3, pp. 36–43, 2011.
[38] D. Karaboga, B. Akay, and C. Ozturk, "Artificial Bee Colony (ABC) optimization algorithm for training feed-forward neural networks," in Proceedings of the International Conference on Modeling Decisions for Artificial Intelligence (MDAI '07), pp. 318–329, Springer, Kitakyushu, Japan, 2007.
[39] C. Ozturk and D. Karaboga, "Hybrid Artificial Bee Colony algorithm for neural network training," in Proceedings of the IEEE Congress of Evolutionary Computation (CEC '11), pp. 84–88, IEEE, New Orleans, La, USA, June 2011.
[40] L. A. M. Pereira, L. C. S. Afonso, J. P. Papa et al., "Multilayer perceptron neural networks training through charged system search and its application for non-technical losses detection," in Proceedings of the 2nd IEEE Latin American Conference on Innovative Smart Grid Technologies (ISGT LA '13), pp. 1–6, IEEE, Sao Paulo, Brazil, April 2013.
[41] S. Mirjalili, "How effective is the Grey Wolf optimizer in training multi-layer perceptrons," Applied Intelligence, vol. 43, no. 1, pp. 150–161, 2015.
[42] P. Moallem and N. Razmjooy, "A multi layer perceptron neural network trained by invasive weed optimization for potato color image segmentation," Trends in Applied Sciences Research, vol. 7, no. 6, pp. 445–455, 2012.
[43] S. Mirjalili, S. M. Mirjalili, and A. Lewis, "Let a biogeography-based optimizer train your multi-layer perceptron," Information Sciences: An International Journal, vol. 269, pp. 188–209, 2014.
[44] M.-Y. Cheng and D. Prayogo, "Symbiotic organisms search: a new metaheuristic optimization algorithm," Computers & Structures, vol. 139, pp. 98–112, 2014.
[45] M. Cheng, D. Prayogo, and D. Tran, "Optimizing multiple-resources leveling in multiple projects using discrete symbiotic organisms search," Journal of Computing in Civil Engineering, vol. 30, no. 3, 2016.
[46] R. Eki, F. Vincent, Y. S. Budi, and A. A. N. Perwira Redi, "Symbiotic Organism Search (SOS) for solving the capacitated vehicle routing problem," World Academy of Science, Engineering and Technology, International Journal of Mechanical, Aerospace, Industrial, Mechatronic and Manufacturing Engineering, vol. 9, no. 5, pp. 807–811, 2015.
[47] D. Prasad and V. Mukherjee, "A novel symbiotic organisms search algorithm for optimal power flow of power system with FACTS devices," Engineering Science & Technology, vol. 19, no. 1, pp. 79–89, 2016.
[48] M. Abdullahi, M. A. Ngadi, and S. M. Abdulhamid, "Symbiotic organism Search optimization based task scheduling in cloud computing environment," Future Generation Computer Systems, vol. 56, pp. 640–650, 2016.
[49] S. Verma, S. Saha, and V. Mukherjee, "A novel symbiotic organisms search algorithm for congestion management in deregulated environment," Journal of Experimental and Theoretical Artificial Intelligence, pp. 1–21, 2015.
[50] D.-H. Tran, M.-Y. Cheng, and D. Prayogo, "A novel Multiple Objective Symbiotic Organisms Search (MOSOS) for time-cost-labor utilization tradeoff problem," Knowledge-Based Systems, vol. 94, pp. 132–145, 2016.
[51] V. F. Yu, A. A. Redi, C. Yang, E. Ruskartina, and B. Santosa, "Symbiotic organisms search and two solution representations for solving the capacitated vehicle routing problem," Applied Soft Computing, 2016.
[52] A. Panda and S. Pani, "A Symbiotic Organisms Search algorithm with adaptive penalty function to solve multi-objective constrained optimization problems," Applied Soft Computing, vol. 46, pp. 344–360, 2016.
[53] S. Banerjee and S. Chattopadhyay, "Power optimization of three dimensional turbo code using a novel modified symbiotic organism search (MSOS) algorithm," Wireless Personal Communications, pp. 1–28, 2016.
[54] B. Das, V. Mukherjee, and D. Das, "DG placement in radial distribution network by symbiotic organisms search algorithm for real power loss minimization," Applied Soft Computing, vol. 49, pp. 920–936, 2016.
[55] M. K. Dosoglu, U. Guvenc, S. Duman, Y. Sonmez, and H. T. Kahraman, "Symbiotic organisms search optimization algorithm for economic/emission dispatch problem in power systems," Neural Computing and Applications, 2016.
[56] R. Hecht-Nielsen, "Kolmogorov's mapping neural network existence theorem," in Proceedings of the IEEE 1st International Conference on Neural Networks, vol. 3, pp. 11–13, IEEE Press, San Diego, Calif, USA, 1987.
[57] J.-R. Zhang, J. Zhang, T.-M. Lok, and M. R. Lyu, "A hybrid particle swarm optimization-back-propagation algorithm for feedforward neural network training," Applied Mathematics and Computation, vol. 185, no. 2, pp. 1026–1037, 2007.
[58] C. L. Blake and C. J. Merz, UCI Repository of Machine Learning Databases, http://archive.ics.uci.edu/ml/datasets.html.
[59] I.-C. Yeh, K.-J. Yang, and T.-M. Ting, "Knowledge discovery on RFM model using Bernoulli sequence," Expert Systems with Applications, vol. 36, no. 3, pp. 5866–5871, 2009.
[60] R. S. Siegler, "Three aspects of cognitive development," Cognitive Psychology, vol. 8, no. 4, pp. 481–520, 1976.
[61] R. A. Fisher, "Has Mendel's work been rediscovered," Annals of Science, vol. 1, no. 2, pp. 115–137, 1936.
[62] F. Wilcoxon, "Individual comparisons by ranking methods," Biometrics Bulletin, vol. 1, no. 6, pp. 80–83, 1945.
[63] J. Derrac, S. García, D. Molina, and F. Herrera, "A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms," Swarm and Evolutionary Computation, vol. 1, no. 1, pp. 3–18, 2011.


Computational Intelligence and Neuroscience 3

Input layer Output layerHidden layer

x1

x2

xR

iw11

iw12 iw1R

iwN1iwN2

iwNR

hb1

ho1

hb2

hb3

hbN

hoN

hw11

hw12

hw13

hw1N

hwS1

hwS3

hwSN

ob1

obS

y1

yS

Figure 1 A feedforward network with one hidden layer

N is the total number of neurons in the hidden layer, and S is the total number of neurons in the output layer.

The training process is carried out to adjust the weights and biases until some error criterion is met. Above all, one problem is to select a proper training algorithm. It is also complex to design the neural network, because many elements affect the training performance, such as the number of neurons in the hidden layer, the interconnections between neurons and layers, the error function, and the activation function.
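To fix the notation of Figure 1, a minimal Python/NumPy sketch of the forward pass is given below. The choice of tanh and logistic transfer functions is an assumption, since this excerpt does not specify the activation functions, and all names are illustrative.

```python
import numpy as np

def forward(x, iw, hb, hw, ob):
    """Forward pass of the one-hidden-layer FNN of Figure 1.
    x: (R,) inputs; iw: (N, R) input weights; hb: (N,) hidden biases;
    hw: (S, N) hidden weights; ob: (S,) output biases."""
    h = np.tanh(iw @ x + hb)                    # hidden-layer outputs ho_1..ho_N (assumed transfer)
    y = 1.0 / (1.0 + np.exp(-(hw @ h + ob)))    # network outputs y_1..y_S (assumed transfer)
    return y
```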

3. Symbiotic Organisms Search Algorithm

Symbiotic organisms search [44] simulates the symbiotic interaction relationships that organisms use to survive in the ecosystem. Three phases, the mutualism phase, the commensalism phase, and the parasitism phase, simulate the real-world biological interactions between two organisms in the ecosystem.

3.1. Mutualism Phase. Organisms engage in a mutualistic relationship with the goal of increasing their mutual survival advantage in the ecosystem. New candidate organisms for X_i and X_j are calculated based on the mutualistic symbiosis between organisms X_i and X_j, which is modeled in (5) and (6):

$$X_{i\,\mathrm{new}} = X_i + \alpha\,\bigl(X_{\mathrm{best}} - \mathrm{Mutual\_Vector} \times \mathrm{BF}_1\bigr), \qquad (5)$$

$$X_{j\,\mathrm{new}} = X_j + \beta\,\bigl(X_{\mathrm{best}} - \mathrm{Mutual\_Vector} \times \mathrm{BF}_2\bigr), \qquad (6)$$

$$\mathrm{Mutual\_Vector} = \frac{X_i + X_j}{2}, \qquad (7)$$

where BF_1 and BF_2 are benefit factors that are determined randomly as either 1 or 2. These factors represent a partial or full level of benefit to each organism. X_best represents the organism with the highest degree of adaptation, and α and β are random numbers in [0, 1]. In (7), the vector called "Mutual_Vector" represents the relationship characteristic between organisms X_i and X_j.

3.2. Commensalism Phase. In the commensalism phase, one organism obtains benefit and does not impact the other. Organism X_j represents the one that neither benefits nor suffers from the relationship, and the new candidate organism of X_i is calculated according to the commensalism symbiosis between organisms X_i and X_j, which is modeled in (8):

$$X_{i\,\mathrm{new}} = X_i + \delta\,\bigl(X_{\mathrm{best}} - X_j\bigr), \qquad (8)$$

where δ represents a random number in [−1, 1], and X_best is the organism with the highest degree of adaptation.

3.3. Parasitism Phase. One organism gains benefit but actively harms the other in the parasitism phase. An artificial parasite called the "Parasite_Vector" is created in the search space by duplicating organism X_i and then modifying its randomly selected dimensions using random numbers. The Parasite_Vector tries to replace another organism X_j in the ecosystem. According to Darwin's theory of evolution, "only the fittest organisms will prevail": if the Parasite_Vector is fitter, it kills organism X_j and assumes its position; otherwise, X_j has immunity from the parasite, and the Parasite_Vector can no longer live in that ecosystem.
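For concreteness, the following is a minimal Python sketch of the three phases applied to a generic minimization problem (the authors' experiments were implemented in MATLAB). The scalar random coefficients, the 50% dimension-replacement rate in the parasitism phase, and all function and variable names are illustrative assumptions, since this excerpt does not fix those details.

```python
import numpy as np

def sos_minimize(fitness, dim, lb, ub, n_org=30, iters=500, seed=None):
    """Minimal SOS sketch: minimize `fitness` over [lb, ub]^dim with n_org organisms."""
    rng = np.random.default_rng(seed)
    eco = rng.uniform(lb, ub, size=(n_org, dim))            # ecosystem initialization
    fit = np.array([fitness(x) for x in eco])
    for _ in range(iters):
        for i in range(n_org):
            best = eco[np.argmin(fit)]
            # Mutualism phase, Eqs. (5)-(7)
            j = rng.choice([k for k in range(n_org) if k != i])
            mutual = (eco[i] + eco[j]) / 2.0
            bf1, bf2 = rng.integers(1, 3), rng.integers(1, 3)   # benefit factors: 1 or 2
            xi = np.clip(eco[i] + rng.random(dim) * (best - mutual * bf1), lb, ub)
            xj = np.clip(eco[j] + rng.random(dim) * (best - mutual * bf2), lb, ub)
            fi, fj = fitness(xi), fitness(xj)
            if fi < fit[i]:                                     # greedy selection
                eco[i], fit[i] = xi, fi
            if fj < fit[j]:
                eco[j], fit[j] = xj, fj
            # Commensalism phase, Eq. (8)
            j = rng.choice([k for k in range(n_org) if k != i])
            best = eco[np.argmin(fit)]
            xi = np.clip(eco[i] + rng.uniform(-1, 1, dim) * (best - eco[j]), lb, ub)
            fi = fitness(xi)
            if fi < fit[i]:
                eco[i], fit[i] = xi, fi
            # Parasitism phase
            j = rng.choice([k for k in range(n_org) if k != i])
            parasite = eco[i].copy()
            mask = rng.random(dim) < 0.5                        # dimensions to overwrite (assumed rate)
            parasite[mask] = rng.uniform(lb, ub, dim)[mask]
            fp = fitness(parasite)
            if fp < fit[j]:                                     # parasite replaces X_j only if fitter
                eco[j], fit[j] = parasite, fp
    return eco[np.argmin(fit)], fit.min()
```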


Figure 2: The vector of training parameters: the input weights (iw), hidden weights (hw), hidden biases (hb), and output biases (ob) concatenated into one real-valued vector x_1, ..., x_D.

4. SOS for Training FNNs

In this paper, symbiotic organisms search is used as a new method to train FNNs. The set of weights and biases is determined by SOS so as to minimize the overall error of the FNN, and its corresponding accuracy, by training the network. This means that the structure of the FNN is fixed. Figure 3 shows the flowchart of the SOS training method, which starts by collecting, normalizing, and reading a dataset. Once a network has been structured for a particular application, including setting the desired number of neurons in each layer, it is ready for training.

4.1. The Feedforward Neural Network Architecture. When implementing a neural network, it is necessary to determine its structure based on the number of layers and the number of neurons in the layers. The larger the number of hidden layers and nodes, the more complex the network will be. In this work, the numbers of input and output neurons of the MLP network are problem-dependent, and the number of hidden nodes is computed on the basis of the Kolmogorov theorem [56]: Hidden = 2 × Input + 1. When using SOS to optimize the weights and biases of the network, the dimension of each organism is D, as given in (9):

$$D = (\mathrm{Input} \times \mathrm{Hidden}) + (\mathrm{Hidden} \times \mathrm{Output}) + \mathrm{Hidden}_{\mathrm{bias}} + \mathrm{Output}_{\mathrm{bias}}, \qquad (9)$$

where Input, Hidden, and Output refer to the numbers of input, hidden, and output neurons of the FNN, respectively. Also, Hidden_bias and Output_bias are the numbers of biases in the hidden and output layers.
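As a concrete check, with the Iris configuration from Table 1 (4 inputs, 9 hidden neurons, and 3 outputs), equation (9) gives

$$D = (4 \times 9) + (9 \times 3) + 9 + 3 = 75.$$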

4.2. Fitness Function. In SOS, every organism is evaluated according to its status (fitness). This evaluation is done by passing the vector of weights and biases to the FNN; then the mean squared error (MSE) criterion is calculated based on the prediction of the neural network on the training dataset. Through continuous iterations the optimal solution is finally achieved, which is regarded as the weights and biases of the neural network. The MSE criterion is given in (10), where y and ŷ are the actual and the estimated values based on the proposed model and R is the number of samples in the training dataset:

$$\mathrm{MSE} = \frac{1}{R}\sum_{i=1}^{R}\bigl(y_i - \hat{y}_i\bigr)^2. \qquad (10)$$

4.3. Encoding Strategy. According to [57], the weights and biases of FNNs for every agent in evolutionary algorithms can be encoded and represented in the form of a vector, a matrix, or binary. In this work, the vector encoding method is utilized. An example of this encoding strategy for an FNN is shown in Figure 2.

During the initialization process, X = (X_1, X_2, ..., X_N) is set on behalf of the N organisms. Each organism X_i = {iw, hw, hb, ob} (i = 1, 2, ..., N) represents a complete set of FNN weights and biases, which is converted into a single vector of real numbers.
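A minimal sketch of this encoding and of the fitness evaluation is given below, assuming the vector layout of Figure 2 (input weights, hidden weights, hidden biases, output biases) and one-hot class targets; the function names, the layout details, and the transfer functions are illustrative assumptions.

```python
import numpy as np

def decode(vec, n_in, n_hidden, n_out):
    """Split a flat organism vector into (iw, hw, hb, ob) following the layout of Figure 2."""
    p = 0
    iw = vec[p:p + n_hidden * n_in].reshape(n_hidden, n_in); p += n_hidden * n_in
    hw = vec[p:p + n_out * n_hidden].reshape(n_out, n_hidden); p += n_out * n_hidden
    hb = vec[p:p + n_hidden]; p += n_hidden
    ob = vec[p:p + n_out]
    return iw, hw, hb, ob

def mse_fitness(vec, X, Y, n_in, n_hidden, n_out):
    """Fitness of one organism: MSE of the decoded FNN over the training set, Eq. (10).
    X is (R, n_in); Y is (R, n_out) one-hot targets (an assumption)."""
    iw, hw, hb, ob = decode(vec, n_in, n_hidden, n_out)
    H = np.tanh(X @ iw.T + hb)                    # hidden layer (assumed transfer)
    P = 1.0 / (1.0 + np.exp(-(H @ hw.T + ob)))    # output layer (assumed transfer)
    return np.mean((Y - P) ** 2)
```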

4.4. Criteria for Evaluating Performance. Classification is used to understand the existing data and to predict how unseen data will behave. In other words, the objective of data classification is to assign unseen data to different classes on the basis of studying the existing data. For the classification problem, in addition to the MSE criterion, the accuracy rate was used. This rate measures the ability of the classifier to produce accurate results and can be computed as follows:

$$\mathrm{Accuracy} = \frac{\widehat{N}}{N}, \qquad (11)$$

where N̂ represents the number of objects correctly classified by the classifier and N is the number of objects in the dataset.
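As a small illustration of (11), the sketch below assumes that the predicted class is the index of the largest network output (the decision rule is not stated in this excerpt):

```python
import numpy as np

def classification_accuracy(outputs, labels):
    """Eq. (11): fraction of correctly classified objects.
    outputs: (num_samples, S) network outputs; labels: (num_samples,) true class indices.
    Multiply by 100 to obtain the percentages reported in Table 3."""
    predicted = np.argmax(outputs, axis=1)
    return np.mean(predicted == np.asarray(labels))
```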

5. Simulation Experiments

This section presents a comprehensive analysis to investigate the efficiency of the SOS algorithm for training FNNs. As shown in Table 1, eight datasets are selected from the UCI machine learning repository [58] to evaluate the performance of SOS, and six metaheuristic algorithms, including BBO [43], CS [37], GA [23], GSA [27, 28], PSO [28], and MVO [33], are used for a reliable comparison.

5.1. Datasets Design. The Blood dataset contains 748 instances, which were selected randomly from the donor database of the Blood Transfusion Service Center in Hsinchu City, Taiwan. As a binary classification problem, the output class variable of the dataset represents whether the person donated blood in a given time period (1 stands for donating blood; 0 stands for not donating blood). The input variables are Recency (months since last donation), Frequency (total number of donations), Monetary (total blood donated in cc), and Time (months since first donation) [59].


Figure 3: Flowchart of the SOS algorithm: (1) read in datasets; (2) ecosystem initialization (number of organisms, termination criteria, numbers of training and testing examples, numbers of input and output neurons, with the number of hidden neurons computed from the input neurons, and the initial organisms in the ecosystem); (3) identify the best organism X_best; (4) mutualism phase; (5) commensalism phase; (6) parasitism phase. In each phase a modified organism (or the parasite vector) is accepted only if it is fitter than the previous one, and the loop repeats until the termination criteria are achieved, after which the optimal solution is returned.


Table 1: Description of datasets.

Dataset               Attributes  Classes  Training samples  Testing samples  Input  Hidden  Output
Blood                 4           2        493               255              4      9       2
Balance Scale         4           3        412               213              4      9       3
Haberman's Survival   3           2        202               104              3      7       2
Liver Disorders       6           2        227               118              6      13      2
Seeds                 7           3        139               71               7      15      3
Wine                  13          3        117               61               13     27      3
Iris                  4           3        99                51               4      9       3
Statlog (Heart)       13          2        178               92               13     27      2

The Balance Scale dataset was generated to model psychological experiments reported by Siegler [60]. This dataset contains 625 examples, and each example is classified as having the balance scale tip to the right, tip to the left, or being balanced. The attributes are the left weight, the left distance, the right weight, and the right distance. The correct way to find the class is to compare (left distance × left weight) and (right distance × right weight); the greater product determines the class, and if they are equal, the scale is balanced.

Haberman's Survival dataset contains cases from a study that was conducted between 1958 and 1970 at the University of Chicago's Billings Hospital on the survival of patients who had undergone surgery for breast cancer. The dataset contains 306 cases, which record two survival statuses together with the age of the patient at the time of operation, the patient's year of operation, and the number of positive axillary nodes detected.

The Liver Disorders dataset was donated by BUPA Medical Research Ltd. to record liver disorder status in terms of a binary label. The dataset includes values of 6 features measured for 345 male individuals. The first 5 features are all blood tests which are thought to be sensitive to liver disorders that might arise from excessive alcohol consumption. These features are Mean Corpuscular Volume (MCV), alkaline phosphatase (ALKPHOS), alanine aminotransferase (SGPT), aspartate aminotransferase (SGOT), and gamma-glutamyl transpeptidase (GAMMAGT). The sixth feature is the number of alcoholic beverage drinks per day (DRINKS).

The Seeds dataset consists of 210 patterns belonging to three different varieties of wheat: Kama, Rosa, and Canadian. From each variety there are 70 observations of area A, perimeter P, compactness C (C = 4πA/P²), length of kernel, width of kernel, asymmetry coefficient, and length of kernel groove.

The Wine dataset contains 178 instances recording the results of a chemical analysis of wines grown in the same region of Italy but derived from three different cultivars. The analysis determined the quantities of 13 constituents found in each of the three types of wine.

The Iris dataset contains 3 species of 50 instances each, where each species refers to a type of iris plant (setosa, versicolor, and virginica). One species is linearly separable from the other two, and the latter are not linearly separable from each other. Each of the 3 species is described by four attributes: sepal length, sepal width, petal length, and petal width in cm. This dataset was used by Fisher [61] in his initiation of the linear-discriminant-function technique.

The Statlog (Heart) dataset is a heart disease database containing 270 instances that consist of 13 attributes: age; sex; chest pain type (4 values); resting blood pressure; serum cholesterol in mg/dL; fasting blood sugar > 120 mg/dL; resting electrocardiographic results (values 0, 1, and 2); maximum heart rate achieved; exercise-induced angina; oldpeak (ST depression induced by exercise relative to rest); the slope of the peak exercise ST segment; number of major vessels (0–3) colored by fluoroscopy; and thal (3 = normal, 6 = fixed defect, 7 = reversible defect).

5.2. Experimental Setup. The experiments were done using a desktop computer with a 3.30 GHz Intel(R) Core(TM) i5 processor and 4 GB of memory. The entire algorithm was programmed in MATLAB R2012a. The mentioned datasets are partitioned into 66% for training and 34% for testing [33]. All experiments are executed for 20 different runs, and each run includes 500 iterations. The population size is set to 30, and the other control parameters of the corresponding algorithms are given below.

In CS, the probability of eggs being detected and thrown out of the nest is p_a = 0.25.

In GA, the crossover rate is P_c = 0.5 and the mutation rate is P_m = 0.05.

In PSO, the parameters are set to C_1 = C_2 = 2, and the inertia weight w decreases linearly from 0.9 to 0.5.

In BBO, the mutation probability is p_m = 0.1, the value of both the maximum immigration rate (I) and the maximum emigration rate (E) is 1, and the habitat modification probability is p_h = 0.8.

In MVO, the exploitation accuracy is set to p = 6, the minimum traveling distance rate is set to 0.2, and the maximum traveling distance rate is set to 1.

In GSA, α is set to 20, the gravitational constant (G_0) is set to 1, and the initial values of acceleration and mass are set to 0 for each particle.
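For reference, these experimental settings can be gathered into a single configuration structure; the sketch below simply mirrors the values listed above (the key names are illustrative and are not taken from the authors' code).

```python
# Control parameters of the compared trainers (values as listed in Section 5.2).
CONTROL_PARAMETERS = {
    "runs": 20, "iterations": 500, "population_size": 30,
    "CS":  {"pa": 0.25},
    "GA":  {"crossover_rate": 0.5, "mutation_rate": 0.05},
    "PSO": {"c1": 2, "c2": 2, "w_start": 0.9, "w_end": 0.5},
    "BBO": {"mutation_prob": 0.1, "max_immigration": 1, "max_emigration": 1,
            "habitat_modification_prob": 0.8},
    "MVO": {"exploitation_accuracy": 6, "min_tdr": 0.2, "max_tdr": 1.0},
    "GSA": {"alpha": 20, "G0": 1, "initial_acceleration": 0, "initial_mass": 0},
}
```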

All input features are mapped onto the interval [−1, 1] to keep them on a small scale. Here we apply min–max normalization to perform a linear transformation on the original data, as given in (12), where v′ is the normalized value of v, whose original range is [min, max]:

$$v' = 2 \times \frac{v - \min}{\max - \min} - 1. \qquad (12)$$
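A minimal sketch of (12) applied column-wise is shown below, assuming samples are rows; whether the minima and maxima are recomputed on the test split is not stated in this excerpt.

```python
import numpy as np

def minmax_scale(X):
    """Map each feature column of X onto [-1, 1] using Eq. (12)."""
    xmin, xmax = X.min(axis=0), X.max(axis=0)
    return 2.0 * (X - xmin) / (xmax - xmin) - 1.0
```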


Table 2: MSE results.

Dataset               Stat    SOS        MVO        GSA        PSO        BBO        CS         GA
Blood                 Best    2.95E-01   3.04E-01   3.10E-01   3.07E-01   3.00E-01   3.11E-01   3.30E-01
                      Worst   3.05E-01   3.07E-01   3.33E-01   3.14E-01   3.92E-01   3.21E-01   4.17E-01
                      Mean    3.01E-01   3.05E-01   3.23E-01   3.10E-01   3.18E-01   3.17E-01   3.78E-01
                      Std     2.44E-03   7.09E-04   6.60E-03   2.47E-03   2.84E-02   2.97E-03   2.85E-02
Balance Scale         Best    7.00E-02   8.03E-02   1.37E-01   1.40E-01   8.52E-02   1.70E-01   2.97E-01
                      Worst   1.29E-01   1.04E-01   1.66E-01   1.88E-01   1.24E-01   2.14E-01   8.01E-01
                      Mean    1.05E-01   8.66E-02   1.52E-01   1.72E-01   1.02E-01   1.89E-01   4.65E-01
                      Std     1.33E-02   6.06E-03   9.46E-03   1.32E-02   9.45E-03   1.18E-02   1.16E-01
Haberman's Survival   Best    2.95E-01   3.21E-01   3.64E-01   3.43E-01   3.13E-01   3.61E-01   3.71E-01
                      Worst   3.37E-01   3.48E-01   4.07E-01   3.75E-01   3.51E-01   3.79E-01   4.77E-01
                      Mean    3.18E-01   3.31E-01   3.82E-01   3.65E-01   3.31E-01   3.70E-01   4.27E-01
                      Std     1.04E-02   6.63E-03   1.14E-02   7.84E-03   1.07E-02   5.83E-03   3.45E-02
Liver Disorders       Best    3.26E-01   3.33E-01   4.17E-01   3.96E-01   3.16E-01   4.14E-01   5.27E-01
                      Worst   3.86E-01   3.55E-01   4.71E-01   4.31E-01   5.87E-01   4.57E-01   6.73E-01
                      Mean    3.49E-01   3.41E-01   4.44E-01   4.11E-01   3.82E-01   4.37E-01   5.95E-01
                      Std     1.65E-02   6.17E-03   1.32E-02   1.06E-02   5.47E-02   1.12E-02   4.30E-02
Seeds                 Best    9.78E-04   1.44E-02   5.97E-02   4.49E-02   2.07E-03   7.19E-02   2.05E-01
                      Worst   2.26E-02   3.30E-02   8.87E-02   1.02E-01   3.32E-01   1.22E-01   7.14E-01
                      Mean    1.11E-02   2.21E-02   7.65E-02   8.17E-02   5.23E-02   9.80E-02   4.71E-01
                      Std     5.36E-03   6.31E-03   8.82E-03   1.53E-02   9.63E-02   1.34E-02   1.20E-01
Wine                  Best    6.35E-11   1.34E-06   5.87E-04   9.74E-03   5.62E-12   2.56E-02   5.60E-01
                      Worst   8.55E-03   2.91E-05   1.90E-02   8.03E-02   3.42E-01   1.62E-01   8.95E-01
                      Mean    4.40E-04   5.41E-06   3.21E-03   3.39E-02   5.17E-02   9.59E-02   7.03E-01
                      Std     1.91E-03   6.23E-06   3.96E-03   1.55E-02   1.25E-01   3.48E-02   8.59E-02
Iris                  Best    5.10E-08   2.45E-02   4.41E-02   4.71E-02   1.70E-02   1.01E-02   1.19E-01
                      Worst   2.67E-02   2.75E-02   2.31E-01   1.76E-01   4.65E-01   7.84E-02   6.51E-01
                      Mean    1.42E-02   2.58E-02   6.83E-02   1.09E-01   6.91E-02   5.32E-02   3.66E-01
                      Std     8.80E-03   8.40E-04   4.07E-02   3.55E-02   1.11E-01   1.93E-02   1.70E-01
Statlog (Heart)       Best    8.98E-02   5.03E-02   1.35E-01   1.72E-01   8.03E-02   2.31E-01   4.52E-01
                      Worst   1.26E-01   8.13E-02   1.92E-01   2.20E-01   1.65E-01   3.09E-01   6.97E-01
                      Mean    1.09E-01   6.48E-02   1.62E-01   1.93E-01   1.27E-01   2.61E-01   5.53E-01
                      Std     1.07E-02   9.11E-03   1.47E-02   1.32E-02   2.37E-02   1.86E-02   7.20E-02

5.3. Results and Discussion. To evaluate the performance of the proposed SOS-based method against the other six algorithms (BBO, CS, GA, GSA, PSO, and MVO), experiments are conducted using the given datasets. In this work, all datasets have been partitioned into two sets: a training set and a testing set. The training set is used to train the network in order to obtain the optimal weights and biases. The testing set is applied to unseen data to test the generalization performance of the metaheuristic algorithms on FNNs.

Table 2 shows the best values of the mean squared error (MSE), the worst values of MSE, the mean of MSE, and the standard deviation for all training datasets. Inspecting the results, it can be seen that SOS performs best on the Seeds and Iris datasets. For the Liver Disorders, Haberman's Survival, and Blood datasets, the best, worst, and mean values are all of the same order of magnitude, while the MSE values of SOS are smaller than those of the other algorithms, which means SOS is the best choice as the training method on these three datasets. Moreover, SOS is ranked second for the Wine dataset and shows very competitive results compared to BBO. On the Balance Scale and Statlog (Heart) datasets, the best values indicate that SOS provides very close performance compared to BBO and MVO; these three algorithms also show improvements over the others.

Figure 4: The convergence curves of the algorithms (Blood).

Figure 5: The ANOVA test of the algorithms for training Blood.

Convergence curves for all metaheuristic algorithms are shown in Figures 4, 6, 8, 10, 12, 14, 16, and 18. The convergence curves show the average of 20 independent runs over the course of 500 iterations. The figures show that SOS has the fastest convergence speed for training all the given datasets. Figures 5, 7, 9, 11, 13, 15, 17, and 19 show the boxplots relative to 20 runs of SOS, BBO, GA, MVO, PSO, GSA, and CS. The boxplots, which are used to analyze the variability of the obtained MSE values, indicate that SOS yields lower and more tightly clustered MSE values than GA, CS, PSO, and GSA and achieves results similar to those of MVO and BBO.

Through 20 independent runs on the training datasets, the optimal weights and biases are achieved and then used to test the classification accuracy on the testing datasets.

Figure 6: The convergence curves of the algorithms (Iris).

Figure 7: The ANOVA test of the algorithms for training Iris.

Figure 8: The convergence curves of the algorithms (Liver Disorders).

Figure 9: The ANOVA test of the algorithms for training Liver Disorders.

Figure 10: The convergence curves of the algorithms (Balance Scale).

Figure 11: The ANOVA test of the algorithms for training Balance Scale.

Figure 12: The convergence curves of the algorithms (Seeds).

Figure 13: The ANOVA test of the algorithms for training Seeds.

Figure 14: The convergence curves of the algorithms (Statlog (Heart)).

Figure 15: The ANOVA test of the algorithms for training Statlog (Heart).

Figure 16: The convergence curves of the algorithms (Haberman's Survival).

Figure 17: The ANOVA test of the algorithms for training Haberman's Survival.

Figure 18: The convergence curves of the algorithms (Wine).

Figure 19: The ANOVA test of the algorithms for training Wine.

As depicted in Table 3, the rank is based on the best values in each dataset, and SOS provides the best performance on the testing datasets Blood, Seeds, and Iris. For the Wine dataset, the classification accuracy of SOS is 98.3607%, which indicates that only one example in the testing dataset cannot be classified correctly. It is noticeable that, although MVO has the highest classification accuracy on the Balance Scale, Haberman's Survival, Liver Disorders, and Statlog (Heart) datasets, SOS also performs well in classification. However, the accuracy obtained by GA is the lowest among the tested algorithms.

This comprehensive comparative study shows that the SOS algorithm is superior among the trainers compared in this paper. Training an FNN is challenging due to the large number of local solutions in this problem. On account of being simpler and more robust than competing algorithms, SOS performs well on most of the datasets, which shows how flexible this algorithm is for solving problems with diverse search spaces. Further, in order to determine whether the results achieved by the algorithms are statistically different from each other, a nonparametric statistical significance proof known as Wilcoxon's rank sum test for equal medians [62, 63] was conducted between the results obtained by the algorithms: SOS versus CS, SOS versus PSO, SOS versus GA, SOS versus MVO, SOS versus GSA, and SOS versus BBO.


Table 3: Accuracy results (%).

Dataset               Stat    SOS       MVO       GSA       PSO       BBO       CS        GA
Blood                 Best    82.7451   81.1765   78.0392   80.3922   81.1765   78.8235   76.8627
                      Worst   77.6471   80.0000   33.3333   77.6471   72.5490   74.5098   67.0588
                      Mean    79.8039   80.7451   74.1765   79.2157   77.5294   76.9608   72.7843
                      Rank    1         2         6         4         2         5         7
Balance Scale         Best    92.0188   92.9577   91.0798   89.2019   91.5493   90.6103   80.2817
                      Worst   86.8545   89.2019   85.9155   83.5681   88.2629   82.1596   38.0282
                      Mean    90.0235   91.4319   87.7465   86.7136   90.1643   86.3146   59.7653
                      Rank    2         1         4         6         3         5         7
Haberman's Survival   Best    81.7308   82.6923   81.7308   82.6923   82.6923   81.7308   78.8462
                      Worst   71.1538   75.9615   74.0385   76.9231   69.2308   74.0385   65.3846
                      Mean    76.0577   79.5673   79.1827   80.4808   77.0192   78.2692   74.5673
                      Rank    4         1         4         1         1         6         7
Liver Disorders       Best    75.4237   76.2712   64.4068   72.0339   72.8814   67.7966   55.0847
                      Worst   66.1017   72.0339   6.7797    59.3220   45.7627   47.4576   27.1186
                      Mean    71.0593   74.1949   48.3051   66.4831   65.1271   56.1441   43.0508
                      Rank    2         1         6         4         3         5         7
Seeds                 Best    95.7746   95.7746   94.3662   92.9577   94.3662   91.5493   67.6056
                      Worst   87.3239   90.1408   85.9155   78.8732   61.9718   77.4648   28.1690
                      Mean    91.3380   93.4507   90.3521   87.4648   87.3239   82.2535   51.1268
                      Rank    1         1         3         5         3         6         7
Wine                  Best    98.3607   98.3607   100.0000  100.0000  100.0000  91.8033   49.1803
                      Worst   91.8033   96.7213   91.8033   88.5246   59.0164   78.6885   18.0328
                      Mean    95.4918   97.9508   96.1475   96.0656   90.4918   83.3607   32.6230
                      Rank    4         4         1         1         1         6         7
Iris                  Best    98.0392   98.0392   98.0392   98.0392   98.0392   98.0392   98.0392
                      Worst   64.7059   98.0392   33.3333   52.9412   29.4118   52.9412   7.8431
                      Mean    92.0588   98.0392   93.7255   91.4706   82.9412   85.0000   56.0784
                      Rank    1         1         1         1         1         1         1
Statlog (Heart)       Best    85.8696   88.0435   86.9565   86.9565   80.4348   83.6957   70.6522
                      Worst   77.1739   77.1739   33.6957   77.1739   66.3043   68.4783   39.1304
                      Mean    82.2283   82.6087   79.1848   82.8804   75.5435   77.4457   50.5435
                      Rank    4         1         2         2         6         5         7

In order to draw a statistically meaningful conclusion, the tests are performed on the optimal fitness values for the training datasets, and the resulting P values are reported in Table 4. The rank sum test evaluates the null hypothesis that the two samples come from continuous distributions with equal medians, against the alternative that they do not. Almost all values reported in Table 4 are less than 0.05 (5% significance level), which is strong evidence against the null hypothesis. Therefore, such evidence indicates that the SOS results are statistically significant and have not occurred by coincidence (i.e., due to common noise contained in the process).
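P values of this kind can be reproduced with a standard rank sum implementation; a minimal sketch using SciPy's ranksums is shown below (variable names are illustrative).

```python
from scipy.stats import ranksums

def rank_sum_p_value(sos_fitness, other_fitness):
    """Wilcoxon rank sum test for equal medians between two sets of per-run results
    (e.g., 20 optimal MSE values of SOS versus another trainer).
    A p-value below 0.05 rejects the equal-medians null hypothesis."""
    statistic, p_value = ranksums(sos_fitness, other_fitness)
    return p_value
```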

5.4. Analysis of the Results. Statistically speaking, the SOS algorithm provides superior local optima avoidance and high classification accuracy when training FNNs. According to the mathematical formulation of the SOS algorithm, the first two interaction phases are devoted to exploration of the search space. This promotes exploration of the search space, which leads to finding the optimal weights and biases.


Table 4: P values produced by Wilcoxon's rank sum test for equal medians (SOS versus each algorithm).

Dataset               CS         PSO        GA         MVO        GSA        BBO
Blood                 6.80E-08   6.80E-08   6.80E-08   3.07E-06   6.80E-08   4.68E-05
Balance Scale         6.80E-08   6.80E-08   6.80E-08   9.75E-06   6.80E-08   2.29E-01
Haberman's Survival   6.80E-08   6.80E-08   6.80E-08   1.79E-04   6.80E-08   2.56E-03
Liver Disorders       6.80E-08   6.80E-08   6.80E-08   1.26E-01   6.80E-08   1.35E-03
Seeds                 6.80E-08   6.80E-08   6.80E-08   3.99E-06   6.80E-08   1.63E-03
Wine                  6.78E-08   6.80E-08   6.80E-08   1.48E-03   1.05E-06   5.98E-01
Iris                  2.96E-06   6.47E-08   6.47E-08   7.62E-07   6.47E-08   5.73E-05
Statlog (Heart)       6.80E-08   6.80E-08   6.80E-08   6.80E-08   6.80E-08   6.04E-03

For the exploitation phase, the third interaction phase of the SOS algorithm is helpful for resolving local optima stagnation. The results of this work show that, although metaheuristic optimizers have high exploration, the problem of training an FNN needs high local optima avoidance throughout the whole optimization process. The results prove that SOS is very effective in training FNNs.

It is worth discussing the poor performance of GA in this subsection. The rates of crossover and mutation are two specific tuning parameters in GA that depend on empirical values for particular problems. This is the reason why GA failed to provide good results across all the datasets. In contrast, SOS uses only the two parameters of maximum evaluation number and population size, so it avoids the risk of compromised performance due to improper parameter tuning and enhances performance stability. A tendency to fall into local optima and low efficiency in the later stage of the search are the other two reasons for the poor performance of GA. Another finding in the results is the good performance of BBO and MVO, which benefit from mechanisms that allow significant abrupt movements in the search space.

The reason for the high classification rate provided by SOS is that this algorithm is equipped with three adaptive phases that smoothly balance exploration and exploitation: the first two phases are devoted to exploration and the third to exploitation. The three phases are simple to operate, requiring only simple mathematical operations to code. In addition, SOS uses greedy selection at the end of each phase to decide whether to retain the old or the modified solution. Consequently, the search agents are always guided toward the most promising regions of the search space.

6. Conclusions

In this paper, the recently proposed SOS algorithm was employed for the first time as an FNN trainer. The high levels of exploration and exploitation of this algorithm were the motivation for this study. The problem of training an FNN was first formulated for the SOS algorithm, which was then employed to optimize the weights and biases of FNNs so as to achieve high classification accuracy. The results obtained on eight datasets with different characteristics show that the proposed approach is efficient for training FNNs compared with the other training methods that have been used in the literature: CS, PSO, GA, MVO, GSA, and BBO. The MSE results over 20 runs show that the proposed approach performs best in terms of convergence rate and is robust, since the variances are relatively small. Furthermore, comparing the classification accuracy on the testing datasets using the optimal weights and biases shows that SOS has an advantage over the other algorithms employed. In addition, the significance of the results is statistically confirmed by using Wilcoxon's rank sum test, which demonstrates that the results have not occurred by coincidence. It can be concluded that SOS is suitable for use as a training method for FNNs.

For future work, the SOS algorithm will be extended to find the optimal number of layers, hidden nodes, and other structural parameters of FNNs. More elaborate tests on higher-dimensional problems and a larger number of datasets will be done. Other types of neural networks, such as the radial basis function (RBF) neural network, are also worth further research.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Science Foundation of China under Grants nos. 61463007 and 61563008 and by the Guangxi Natural Science Foundation under Grant no. 2016GXNSFAA380264.

References

[1] J. A. Anderson, An Introduction to Neural Networks, MIT Press, 1995.

[2] G. P. Zhang and M. Qi, "Neural network forecasting for seasonal and trend time series," European Journal of Operational Research, vol. 160, no. 2, pp. 501–514, 2005.

[3] C.-J. Lin, C.-H. Chen, and C.-Y. Lee, "A self-adaptive quantum radial basis function network for classification applications," in Proceedings of the IEEE International Joint Conference on Neural Networks, vol. 4, pp. 3263–3268, IEEE, Budapest, Hungary, July 2004.

[4] X.-F. Wang and D.-S. Huang, "A novel density-based clustering framework by using level set method," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 11, pp. 1515–1531, 2009.

[5] N. A. Mat Isa and W. M. F. W. Mamat, "Clustered-Hybrid Multilayer Perceptron network for pattern recognition application," Applied Soft Computing Journal, vol. 11, no. 1, pp. 1457–1466, 2011.

[6] D. S. Huang, Systematic Theory of Neural Networks for Pattern Recognition, Publishing House of Electronic Industry of China, 1996 (Chinese).

[7] Z.-Q. Zhao, D.-S. Huang, and B.-Y. Sun, "Human face recognition based on multi-features using neural networks committee," Pattern Recognition Letters, vol. 25, no. 12, pp. 1351–1358, 2004.

[8] O. Nelles, "Nonlinear system identification: from classical approaches to neural networks and fuzzy models," Applied Therapeutics, vol. 6, no. 7, pp. 21–717, 2001.

[9] B. Malakooti and Y. Q. Zhou, "Approximating polynomial functions by feedforward artificial neural networks: capacity analysis and design," Applied Mathematics and Computation, vol. 90, no. 1, pp. 27–51, 1998.

[10] A. Cochocki and R. Unbehauen, Neural Networks for Optimization and Signal Processing, John Wiley & Sons, New York, NY, USA, 1993.

[11] X.-F. Wang, D.-S. Huang, and H. Xu, "An efficient local Chan-Vese model for image segmentation," Pattern Recognition, vol. 43, no. 3, pp. 603–618, 2010.

[12] D.-S. Huang, "A constructive approach for finding arbitrary roots of polynomials by neural networks," IEEE Transactions on Neural Networks, vol. 15, no. 2, pp. 477–491, 2004.

[13] G. Bebis and M. Georgiopoulos, "Feed-forward neural networks," IEEE Potentials, vol. 13, no. 4, pp. 27–31, 1994.

[14] T. Kohonen, "The self-organizing map," Neurocomputing, vol. 21, no. 1–3, pp. 1–6, 1998.

[15] J. Park and I. W. Sandberg, "Approximation and radial-basis-function networks," Neural Computation, vol. 3, no. 2, pp. 246–257, 1993.

[16] D.-S. Huang, "Radial basis probabilistic neural networks: model and application," International Journal of Pattern Recognition and Artificial Intelligence, vol. 13, no. 7, pp. 1083–1101, 1999.

[17] D.-S. Huang and J.-X. Du, "A constructive hybrid structure optimization methodology for radial basis probabilistic neural networks," IEEE Transactions on Neural Networks, vol. 19, no. 12, pp. 2099–2115, 2008.

[18] G. Dorffner, "Neural networks for time series processing," Neural Network World, vol. 6, no. 1, pp. 447–468, 1996.

[19] S. Ghosh-Dastidar and H. Adeli, "Spiking neural networks," International Journal of Neural Systems, vol. 19, no. 4, pp. 295–308, 2009.

[20] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," in Neurocomputing: Foundations of Research, vol. 5, no. 3, pp. 696–699, MIT Press, Cambridge, Mass, USA, 1988.

[21] N. Zhang, "An online gradient method with momentum for two-layer feedforward neural networks," Applied Mathematics and Computation, vol. 212, no. 2, pp. 488–498, 2009.

[22] J. H. Holland, Adaptation in Natural and Artificial Systems, MIT Press, Cambridge, Mass, USA, 1992.

[23] D. J. Montana and L. Davis, "Training feedforward neural networks using genetic algorithms," in Proceedings of the International Joint Conferences on Artificial Intelligence (IJCAI '89), vol. 89, pp. 762–767, Detroit, Mich, USA, 1989.

[24] C. R. Hwang, "Simulated annealing: theory and applications," Acta Applicandae Mathematicae, vol. 12, no. 1, pp. 108–111, 1988.

[25] D. Shaw and W. Kinsner, "Chaotic simulated annealing in multilayer feedforward networks," in Proceedings of the 1996 Canadian Conference on Electrical and Computer Engineering (CCECE '96), part 1 (of 2), pp. 265–269, University of Calgary, Alberta, Canada, May 1996.

[26] J.-R. Zhang, J. Zhang, T.-M. Lok, and M. R. Lyu, "A hybrid particle swarm optimization-back-propagation algorithm for feedforward neural network training," Applied Mathematics & Computation, vol. 185, no. 2, pp. 1026–1037, 2007.

[27] E. Rashedi, H. Nezamabadi-Pour, and S. Saryazdi, "GSA: a gravitational search algorithm," Information Sciences, vol. 179, no. 13, pp. 2232–2248, 2009.

[28] S. A. Mirjalili, S. Z. M. Hashim, and H. M. Sardroudi, "Training feedforward neural networks using hybrid particle swarm optimization and gravitational search algorithm," Applied Mathematics and Computation, vol. 218, no. 22, pp. 11125–11137, 2012.

[29] Z. Beheshti, S. M. H. Shamsuddin, E. Beheshti, and S. S. Yuhaniz, "Enhancement of artificial neural network learning using centripetal accelerated particle swarm optimization for medical diseases diagnosis," Soft Computing, vol. 18, no. 11, pp. 2253–2270, 2014.

[30] L. A. M. Pereira, D. Rodrigues, P. B. Ribeiro, J. P. Papa, and S. A. T. Weber, "Social-spider optimization-based artificial neural networks training and its applications for Parkinson's disease identification," in Proceedings of the 27th IEEE International Symposium on Computer-Based Medical Systems (CBMS '14), pp. 14–17, IEEE, New York, NY, USA, May 2014.

[31] E. Uzlu, M. Kankal, A. Akpınar, and T. Dede, "Estimates of energy consumption in Turkey using neural networks with the teaching-learning-based optimization algorithm," Energy, vol. 75, pp. 295–303, 2014.

[32] P. A. Kowalski and S. Łukasik, "Training neural networks with Krill Herd algorithm," Neural Processing Letters, vol. 44, no. 1, pp. 5–17, 2016.

[33] H. Faris, I. Aljarah, and S. Mirjalili, "Training feedforward neural networks using multi-verse optimizer for binary classification problems," Applied Intelligence, vol. 45, no. 2, pp. 322–332, 2016.

[34] J. Nayak, B. Naik, and H. S. Behera, "A novel nature inspired firefly algorithm with higher order neural network: performance analysis," Engineering Science and Technology, vol. 19, no. 1, pp. 197–211, 2016.

[35] C. Blum and K. Socha, "Training feed-forward neural networks with ant colony optimization: an application to pattern classification," in Proceedings of the 5th International Conference on Hybrid Intelligent Systems (HIS '05), pp. 233–238, IEEE Computer Society, Rio de Janeiro, Brazil, 2005.

[36] K. Socha and C. Blum, "An ant colony optimization algorithm for continuous optimization: application to feed-forward neural network training," Neural Computing and Applications, vol. 16, no. 3, pp. 235–247, 2007.

[37] E. Valian, S. Mohanna, and S. Tavakoli, "Improved cuckoo search algorithm for feedforward neural network training," International Journal of Artificial Intelligence & Applications, vol. 2, no. 3, pp. 36–43, 2011.

[38] D. Karaboga, B. Akay, and C. Ozturk, "Artificial Bee Colony (ABC) optimization algorithm for training feed-forward neural networks," in Proceedings of the International Conference on Modeling Decisions for Artificial Intelligence (MDAI '07), pp. 318–329, Springer, Kitakyushu, Japan, 2007.

[39] C. Ozturk and D. Karaboga, "Hybrid Artificial Bee Colony algorithm for neural network training," in Proceedings of the IEEE Congress of Evolutionary Computation (CEC '11), pp. 84–88, IEEE, New Orleans, La, USA, June 2011.

[40] L. A. M. Pereira, L. C. S. Afonso, J. P. Papa, et al., "Multilayer perceptron neural networks training through charged system search and its application for non-technical losses detection," in Proceedings of the 2nd IEEE Latin American Conference on Innovative Smart Grid Technologies (ISGT LA '13), pp. 1–6, IEEE, Sao Paulo, Brazil, April 2013.

[41] S. Mirjalili, "How effective is the Grey Wolf optimizer in training multi-layer perceptrons," Applied Intelligence, vol. 43, no. 1, pp. 150–161, 2015.

[42] P. Moallem and N. Razmjooy, "A multi layer perceptron neural network trained by invasive weed optimization for potato color image segmentation," Trends in Applied Sciences Research, vol. 7, no. 6, pp. 445–455, 2012.

[43] S. Mirjalili, S. M. Mirjalili, and A. Lewis, "Let a biogeography-based optimizer train your multi-layer perceptron," Information Sciences: An International Journal, vol. 269, pp. 188–209, 2014.

[44] M.-Y. Cheng and D. Prayogo, "Symbiotic organisms search: a new metaheuristic optimization algorithm," Computers & Structures, vol. 139, pp. 98–112, 2014.

[45] M. Cheng, D. Prayogo, and D. Tran, "Optimizing multiple-resources leveling in multiple projects using discrete symbiotic organisms search," Journal of Computing in Civil Engineering, vol. 30, no. 3, 2016.

[46] R. Eki, F. Vincent, Y. S. Budi, and A. A. N. Perwira Redi, "Symbiotic Organism Search (SOS) for solving the capacitated vehicle routing problem," World Academy of Science, Engineering and Technology, International Journal of Mechanical, Aerospace, Industrial, Mechatronic and Manufacturing Engineering, vol. 9, no. 5, pp. 807–811, 2015.

[47] D. Prasad and V. Mukherjee, "A novel symbiotic organisms search algorithm for optimal power flow of power system with FACTS devices," Engineering Science & Technology, vol. 19, no. 1, pp. 79–89, 2016.

[48] M. Abdullahi, M. A. Ngadi, and S. M. Abdulhamid, "Symbiotic organism Search optimization based task scheduling in cloud computing environment," Future Generation Computer Systems, vol. 56, pp. 640–650, 2016.

[49] S. Verma, S. Saha, and V. Mukherjee, "A novel symbiotic organisms search algorithm for congestion management in deregulated environment," Journal of Experimental and Theoretical Artificial Intelligence, pp. 1–21, 2015.

[50] D.-H. Tran, M.-Y. Cheng, and D. Prayogo, "A novel Multiple Objective Symbiotic Organisms Search (MOSOS) for time-cost-labor utilization tradeoff problem," Knowledge-Based Systems, vol. 94, pp. 132–145, 2016.

[51] V. F. Yu, A. A. Redi, C. Yang, E. Ruskartina, and B. Santosa, "Symbiotic organisms search and two solution representations for solving the capacitated vehicle routing problem," Applied Soft Computing, 2016.

[52] A. Panda and S. Pani, "A Symbiotic Organisms Search algorithm with adaptive penalty function to solve multi-objective constrained optimization problems," Applied Soft Computing, vol. 46, pp. 344–360, 2016.

[53] S. Banerjee and S. Chattopadhyay, "Power optimization of three dimensional turbo code using a novel modified symbiotic organism search (MSOS) algorithm," Wireless Personal Communications, pp. 1–28, 2016.

[54] B. Das, V. Mukherjee, and D. Das, "DG placement in radial distribution network by symbiotic organisms search algorithm for real power loss minimization," Applied Soft Computing, vol. 49, pp. 920–936, 2016.

[55] M. K. Dosoglu, U. Guvenc, S. Duman, Y. Sonmez, and H. T. Kahraman, "Symbiotic organisms search optimization algorithm for economic/emission dispatch problem in power systems," Neural Computing and Applications, 2016.

[56] R. Hecht-Nielsen, "Kolmogorov's mapping neural network existence theorem," in Proceedings of the IEEE 1st International Conference on Neural Networks, vol. 3, pp. 11–13, IEEE Press, San Diego, Calif, USA, 1987.

[57] J.-R. Zhang, J. Zhang, T.-M. Lok, and M. R. Lyu, "A hybrid particle swarm optimization-back-propagation algorithm for feedforward neural network training," Applied Mathematics and Computation, vol. 185, no. 2, pp. 1026–1037, 2007.

[58] C. L. Blake and C. J. Merz, UCI Repository of Machine Learning Databases, http://archive.ics.uci.edu/ml/datasets.html.

[59] I.-C. Yeh, K.-J. Yang, and T.-M. Ting, "Knowledge discovery on RFM model using Bernoulli sequence," Expert Systems with Applications, vol. 36, no. 3, pp. 5866–5871, 2009.

[60] R. S. Siegler, "Three aspects of cognitive development," Cognitive Psychology, vol. 8, no. 4, pp. 481–520, 1976.

[61] R. A. Fisher, "Has Mendel's work been rediscovered?" Annals of Science, vol. 1, no. 2, pp. 115–137, 1936.

[62] F. Wilcoxon, "Individual comparisons by ranking methods," Biometrics Bulletin, vol. 1, no. 6, pp. 80–83, 1945.

[63] J. Derrac, S. García, D. Molina, and F. Herrera, "A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms," Swarm and Evolutionary Computation, vol. 1, no. 1, pp. 3–18, 2011.


Page 4: Research Article Training Feedforward Neural Networks Using …downloads.hindawi.com/journals/cin/2016/9063065.pdf · 2019-07-30 · Research Article Training Feedforward Neural Networks

4 Computational Intelligence and Neuroscience

x1 middot middot middot x(NlowastR) x(NlowastR+1) middot middot middot x(NlowastR+SlowastR) x(NlowastR+SlowastR+1) middot middot middot x(NlowastR+SlowastR+N) x(NlowastR+SlowastR+N+1) middot middot middot x(NlowastR+SlowastR+N+S)

Input weights (iw) Hidden weights (hw) Hidden biases (hb) Output biases (ob)

Figure 2 The vector of training parameters

4 SOS for Train FNNs

In this paper symbiotic organisms search is used as a newmethod to train FNNs The set of weights and bias issimultaneously determined by SOS in order to minimize theoverall error of one FNN and its corresponding accuracy bytraining the network This means that the structure of theFNN is fixed Figure 3 shows the flowchart of trainingmethodSOS which is started by collecting normalizing and readinga dataset Once a network has been structured for a particularapplication including setting the desired number of neuronsin each layer it is ready for training

41 The Feedforward Neural Networks Architecture Whenimplementing a neural network it is necessary to determinethe structure based on the number of layers and the numberof neurons in the layers The larger the number of hiddenlayers and nodes the more complex the network will be Inthis work the number of input and output neurons in MLPnetwork is problem-dependent and the number of hiddennodes is computed on the basis of Kolmogorov theorem [56]Hidden = 2 times Input + 1 When using SOS to optimize theweights and bias in network the dimension of each organismis considered as119863 shown in

119863 = (Input timesHidden) + (Hidden timesOutput)+Hiddenbias +Outputbias (9)

where Input Hidden and Output refer to the number ofinput hidden and output neurons of FNN respectively AlsoHiddenbias and Outputbias are the number of biases in hiddenand output layers

42 Fitness Function In SOS every organism is evaluatedaccording to its status (fitness) This evaluation is done bypassing the vector of weights and biases to FNNs then themean squared error (MSE) criterion is calculated based on theprediction of the neural network using the training datasetThrough continuous iterations the optimal solution is finallyachieved which is regarded as the weights and biases of aneural network The MSE criterion is given in (10) where119910 and are the actual and the estimated values based onproposed model and 119877 is the number of samples in thetraining dataset

MSE = 1119877119877sum119894=1

(119910 minus )2 (10)

43 Encoding Strategy According to [57] the weights andbiases of FNNs for every agent in evolutionary algorithms canbe encoded and represented in the form of vector matrix orbinary In this work the vector encoding method is utilizedAn example of this encoding strategy for FNN is provided asshown in Figure 2

During the initialization process 119883 = (1198831 1198832 119883119873)is set on behalf of the N organisms Each organism 119883119894 =iw hw hb ob (119894 = 1 2 119873) represents complete set ofFNN weights and biases which is converted into a singlevector of real number

44 Criteria for Evaluating Performance Classification isused to understand the existing data and to predict howunseen data will behave In other words the objective ofdata classification is to classify the unseen data in differentclasses on the basis of studying the existing data For theclassification problem in addition toMSE criterion accuracyrate was usedThis ratemeasures the ability of the classifier byproducing accurate results which can be computed as follows

Accuracy = 119873 (11)

where represents the number of correctly classified objectsby the classifier and119873 is the number of objects in the dataset

5 Simulation Experiments

This section presents a comprehensive analysis to investigatethe efficiency of the SOS algorithm for training FNNsAs shown in Table 1 eight datasets are selected from UCImachine learning repository [58] to evaluate the performanceof SOS And six metaheuristic algorithms including BBO[43] CS [37] GA [23] GSA [27 28] PSO [28] and MVO[33] are presented for a reliable comparison

51 Datasets Design The Blood dataset contains 748 instan-ces which were selected randomly from the donor databaseof Blood Transfusion Service Center in Hsinchu City inTaiwan As a binary classification problem the output classvariable of the dataset represents whether the person donatedblood in a time period (1 stands for donating blood 0 standsfor not donating blood) And the input variables are Recencymonths since last donation Frequency total number ofdonation Monetary total blood donated in cc and Timemonths since first donation [59]

Computational Intelligence and Neuroscience 5

(1) Read in datasets

calculate fitness values of the modified organisms

Are the modified organismsfitter than the previous

Keep the previous organisms Accept the modified organisms

Is the modified organism fitter than theprevious

Keep the previous organism Accept the modified organism

Are termination criteriaachieved

Optimal solution

Yes

No

No

No

Yes

Yes

Yes

No

(4) Mutualism phase

(5) Commensalism phase

(6) Parasitism phase

(2) Ecosystem initialization

Number of organisms termination criteria number of trainingand testing examples number of input and output neurons and

calculate the number of hidden neurons based on inputneurons initial organisms in ecosystem

(3) Identify best organism Xbest

Randomly select one organism Xj where Xj Xi

Determine mutual relationship vector (mutual_vector) and benefit factor (BF)

Modify organisms Xi and Xj based on their mutual relationship

Randomly select one organism Xj where Xj Xi

Modify organism Xj with the help of Xi and calculate fitness value of the modified organism

Randomly select one organism Xj where Xj Xi

Create a parasite (parasite_vector) from organism Xi and calculate its fitness value

Keep Xj and remove parasite_vector Replace Xj with parasite_vector

fitter than XjIs parasite vector

Figure 3 Flowchart of SOS algorithm

6 Computational Intelligence and Neuroscience

Table 1 Description of datasets

Dataset Attribute Class Training sample Testing sample Input Hidden OutputBlood 4 2 493 255 4 9 2Balance Scale 4 3 412 213 4 9 3Habermanrsquos Survival 3 2 202 104 3 7 2Liver Disorders 6 2 227 118 6 13 2Seeds 7 3 139 71 7 15 3Wine 13 3 117 61 13 27 3Iris 4 3 99 51 4 9 3Statlog (Heart) 13 2 178 92 13 27 2

The Balance Scale dataset is generated to model psycho-logical experiments reported by Siegler [60] This datasetcontains 625 examples and each example is classified ashaving the balance scale tip to the right and tip to the leftor being balanced The attributes are the left weight the leftdistance the right weight and the right distance The correctway to find the class is the greater of (left distance lowast leftweight) and (right distance lowast right weight) If they are equalit is balanced

Habermanrsquos Survival dataset contains cases from a studythat was conducted between 1958 and 1970 at the Universityof Chicagorsquos Billings Hospital on the survival of patients whohad undergone surgery for breast cancerThedataset contains306 cases which record two survival status patients with ageof patient at time of operation patientrsquos year of operation andnumber of positive axillary nodes detected

The Liver Disorders dataset was donated by BUPAMedi-cal Research Ltd to record the liver disorder status in termsof a binary label The dataset includes values of 6 featuresmeasured for 345 male individuals The first 5 features areall blood tests which are thought to be sensitive to liverdisorders that might arise from excessive alcohol consump-tion These features are Mean Corpuscular Volume (MCV)alkaline phosphatase (ALKPHOS) alanine aminotransferase(SGPT) aspartate aminotransferase (SGOT) and gamma-glutamyl transpeptidase (GAMMAGT) The sixth feature isthe number of alcoholic beverage drinks per day (DRINKS)

The Seeds dataset consists of 210 patterns belonging tothree different varieties of wheat Kama Rosa and CanadianFrom each species there are 70 observations for area 119860perimeter 119875 compactness 119862 (119862 = 4 lowast 119901119894 lowast 1198601198752) length ofkernel width of kernel asymmetry coefficient and length ofkernel groove

The Wine dataset contains 178 instances recording theresults of a chemical analysis of wines grown in the sameregion in Italy but derived from three different cultivars Theanalysis determined the quantities of 13 constituents found ineach of the three types of wines

The Iris dataset contains 3 species of 50 instances each, where each species refers to a type of Iris plant (setosa, versicolor, and virginica). One species is linearly separable from the other 2, and the latter are not linearly separable from each other. Each of the 3 species is described by four attributes: sepal length, sepal width, petal length, and petal width, in cm. This dataset was used by Fisher [61] in his initiation of the linear-discriminant-function technique.

The Statlog (Heart) dataset is a heart disease database containing 270 instances that consist of 13 attributes: age, sex, chest pain type (4 values), resting blood pressure, serum cholesterol in mg/dL, fasting blood sugar > 120 mg/dL, resting electrocardiographic results (values 0, 1, and 2), maximum heart rate achieved, exercise-induced angina, oldpeak (ST depression induced by exercise relative to rest), the slope of the peak exercise ST segment, number of major vessels (0-3) colored by fluoroscopy, and thal (3 = normal, 6 = fixed defect, 7 = reversible defect).

5.2. Experimental Setup. The experiments were done using a desktop computer with a 3.30 GHz Intel(R) Core(TM) i5 processor and 4 GB of memory. The entire algorithm was programmed in MATLAB R2012a. The mentioned datasets are partitioned into 66% for training and 34% for testing [33]. All experiments are executed for 20 different runs, and each run includes 500 iterations. The population size is considered as 30, and the other control parameters of the corresponding algorithms are given below; a brief sketch of this experimental loop follows the parameter list.

In CS, the probability of eggs being detected and thrown out of the nest is pa = 0.25.

In GA, the crossover rate is set to 0.5 and the mutation rate to 0.05.

In PSO, the parameters are set to C1 = C2 = 2, and the weight factor w is decreased linearly from 0.9 to 0.5.

In BBO, the mutation probability is pm = 0.1, the value for both max immigration (I) and max emigration (E) is 1, and the habitat modification probability is ph = 0.8.

In MVO, the exploitation accuracy is set to p = 6, the min traveling distance rate is set to 0.2, and the max traveling distance rate is set to 1.

In GSA, α is set to 20, the gravitational constant (G0) is set to 1, and the initial values of acceleration and mass are set to 0 for each particle.
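As a concrete illustration of the protocol above (66%/34% split, 20 independent runs, 500 iterations, population size 30), the following sketch shows one way such an experiment loop could be organized. The trainer callable `train_fnn_with_sos` is a hypothetical placeholder, not code from the paper, and reshuffling the split in every run is an assumption made here.

```python
import numpy as np

def run_experiments(X, y, train_fnn_with_sos, n_runs=20, n_iters=500,
                    pop_size=30, seed=0):
    """Repeat the Section 5.2 protocol and collect (final MSE, test accuracy).

    train_fnn_with_sos is a hypothetical callable returning
    (mse_curve, predict_fn) for one training run.
    """
    rng = np.random.default_rng(seed)
    results = []
    for _ in range(n_runs):
        idx = rng.permutation(len(X))              # per-run reshuffle (assumption)
        cut = int(round(0.66 * len(X)))            # 66% training / 34% testing
        train_idx, test_idx = idx[:cut], idx[cut:]
        mse_curve, predict_fn = train_fnn_with_sos(
            X[train_idx], y[train_idx], n_iters=n_iters, pop_size=pop_size)
        test_acc = np.mean(predict_fn(X[test_idx]) == y[test_idx])
        results.append((mse_curve[-1], test_acc))  # final training MSE, test accuracy
    return results
```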

All input features are mapped onto the interval [−1, 1] for a small scale. Here we apply min-max normalization to perform a linear transformation on the original data, as given in (12), where v′ is the normalized value of v, whose original range is [min, max]:

v′ = 2 × (v − min) / (max − min) − 1.  (12)
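A minimal sketch of the normalization in (12), assuming min and max are taken per feature column over the data being scaled:

```python
import numpy as np

def minmax_to_unit_interval(X):
    """Map each feature column of X linearly onto [-1, 1] as in (12)."""
    col_min = X.min(axis=0)
    col_max = X.max(axis=0)
    return 2 * (X - col_min) / (col_max - col_min) - 1

X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 40.0]])
print(minmax_to_unit_interval(X))
# First column becomes [-1, 0, 1]; second column becomes [-1, -1/3, 1].
```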


Table 2: MSE results.

Dataset              Stat    SOS       MVO       GSA       PSO       BBO       CS        GA
Blood                Best    2.95E-01  3.04E-01  3.10E-01  3.07E-01  3.00E-01  3.11E-01  3.30E-01
                     Worst   3.05E-01  3.07E-01  3.33E-01  3.14E-01  3.92E-01  3.21E-01  4.17E-01
                     Mean    3.01E-01  3.05E-01  3.23E-01  3.10E-01  3.18E-01  3.17E-01  3.78E-01
                     Std     2.44E-03  7.09E-04  6.60E-03  2.47E-03  2.84E-02  2.97E-03  2.85E-02
Balance Scale        Best    7.00E-02  8.03E-02  1.37E-01  1.40E-01  8.52E-02  1.70E-01  2.97E-01
                     Worst   1.29E-01  1.04E-01  1.66E-01  1.88E-01  1.24E-01  2.14E-01  8.01E-01
                     Mean    1.05E-01  8.66E-02  1.52E-01  1.72E-01  1.02E-01  1.89E-01  4.65E-01
                     Std     1.33E-02  6.06E-03  9.46E-03  1.32E-02  9.45E-03  1.18E-02  1.16E-01
Haberman's Survival  Best    2.95E-01  3.21E-01  3.64E-01  3.43E-01  3.13E-01  3.61E-01  3.71E-01
                     Worst   3.37E-01  3.48E-01  4.07E-01  3.75E-01  3.51E-01  3.79E-01  4.77E-01
                     Mean    3.18E-01  3.31E-01  3.82E-01  3.65E-01  3.31E-01  3.70E-01  4.27E-01
                     Std     1.04E-02  6.63E-03  1.14E-02  7.84E-03  1.07E-02  5.83E-03  3.45E-02
Liver Disorders      Best    3.26E-01  3.33E-01  4.17E-01  3.96E-01  3.16E-01  4.14E-01  5.27E-01
                     Worst   3.86E-01  3.55E-01  4.71E-01  4.31E-01  5.87E-01  4.57E-01  6.73E-01
                     Mean    3.49E-01  3.41E-01  4.44E-01  4.11E-01  3.82E-01  4.37E-01  5.95E-01
                     Std     1.65E-02  6.17E-03  1.32E-02  1.06E-02  5.47E-02  1.12E-02  4.30E-02
Seeds                Best    9.78E-04  1.44E-02  5.97E-02  4.49E-02  2.07E-03  7.19E-02  2.05E-01
                     Worst   2.26E-02  3.30E-02  8.87E-02  1.02E-01  3.32E-01  1.22E-01  7.14E-01
                     Mean    1.11E-02  2.21E-02  7.65E-02  8.17E-02  5.23E-02  9.80E-02  4.71E-01
                     Std     5.36E-03  6.31E-03  8.82E-03  1.53E-02  9.63E-02  1.34E-02  1.20E-01
Wine                 Best    6.35E-11  1.34E-06  5.87E-04  9.74E-03  5.62E-12  2.56E-02  5.60E-01
                     Worst   8.55E-03  2.91E-05  1.90E-02  8.03E-02  3.42E-01  1.62E-01  8.95E-01
                     Mean    4.40E-04  5.41E-06  3.21E-03  3.39E-02  5.17E-02  9.59E-02  7.03E-01
                     Std     1.91E-03  6.23E-06  3.96E-03  1.55E-02  1.25E-01  3.48E-02  8.59E-02
Iris                 Best    5.10E-08  2.45E-02  4.41E-02  4.71E-02  1.70E-02  1.01E-02  1.19E-01
                     Worst   2.67E-02  2.75E-02  2.31E-01  1.76E-01  4.65E-01  7.84E-02  6.51E-01
                     Mean    1.42E-02  2.58E-02  6.83E-02  1.09E-01  6.91E-02  5.32E-02  3.66E-01
                     Std     8.80E-03  8.40E-04  4.07E-02  3.55E-02  1.11E-01  1.93E-02  1.70E-01
Statlog (Heart)      Best    8.98E-02  5.03E-02  1.35E-01  1.72E-01  8.03E-02  2.31E-01  4.52E-01
                     Worst   1.26E-01  8.13E-02  1.92E-01  2.20E-01  1.65E-01  3.09E-01  6.97E-01
                     Mean    1.09E-01  6.48E-02  1.62E-01  1.93E-01  1.27E-01  2.61E-01  5.53E-01
                     Std     1.07E-02  9.11E-03  1.47E-02  1.32E-02  2.37E-02  1.86E-02  7.20E-02

5.3. Results and Discussion. To evaluate the performance of the proposed SOS method against the other six algorithms (BBO, CS, GA, GSA, PSO, and MVO), experiments are conducted using the given datasets. In this work, all datasets have been partitioned into two sets: a training set and a testing set. The training set is used to train the network in order to achieve the optimal weights and biases. The testing set is applied on unseen data to test the generalization performance of the metaheuristic algorithms on FNNs.
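The fitness of each organism is the MSE produced on the training set once the organism's vector is decoded into the network's weights and biases. The sketch below shows how such an evaluation might look; the flat-vector layout and the sigmoid activation are assumptions made for illustration, not a restatement of the paper's exact formulation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mse_fitness(organism, X, Y, n_in, n_hid, n_out):
    """Decode a flat weight/bias vector and return the MSE on (X, Y).

    Layout assumed: input-hidden weights, hidden biases,
    hidden-output weights, output biases (an illustrative choice).
    """
    i = 0
    W1 = organism[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
    b1 = organism[i:i + n_hid];                              i += n_hid
    W2 = organism[i:i + n_hid * n_out].reshape(n_hid, n_out); i += n_hid * n_out
    b2 = organism[i:i + n_out]
    H = sigmoid(X @ W1 + b1)          # hidden-layer activations
    O = sigmoid(H @ W2 + b2)          # network outputs
    return np.mean((O - Y) ** 2)      # mean squared error over samples and outputs
```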

Table 2 shows the best values of mean squared error (MSE), the worst values of MSE, the mean of MSE, and the standard deviation for all training datasets. Inspecting the table of results, it can be seen that SOS performs best on the Seeds and Iris datasets. For the Liver Disorders, Haberman's Survival, and Blood datasets, the best, worst, and mean values are all of the same order of magnitude, while the MSE values of SOS are smaller than those of the other algorithms, which means SOS is the best choice as the training method on these three datasets.


Figure 4: The convergence curves of algorithms (Blood).

Figure 5: The ANOVA test of algorithms for training Blood.

Moreover, SOS is ranked second for the Wine dataset and shows very competitive results compared to BBO. For the Balance Scale and Statlog (Heart) datasets, the best values in the results indicate that SOS provides very close performance compared to BBO and MVO; these three algorithms also show improvements compared to the others.

Convergence curves for all metaheuristic algorithms are shown in Figures 4, 6, 8, 10, 12, 14, 16, and 18. The convergence curves show the average of 20 independent runs over the course of 500 iterations. The figures show that SOS has the fastest convergence speed for training all the given datasets. Figures 5, 7, 9, 11, 13, 15, 17, and 19 show the boxplots relative to 20 runs of SOS, BBO, GA, MVO, PSO, GSA, and CS. The boxplots, which are used to analyze the variability in the obtained MSE values, indicate that the SOS boxplots have lower values and less spread than those of GA, CS, PSO, and GSA and achieve results similar to those of MVO and BBO.

Through 20 independent runs on the training datasets, the optimal weights and biases are achieved and then used to test the classification accuracy on the testing datasets. As depicted in Table 3, the rank is determined by the best value in each dataset.

Figure 6: The convergence curves of algorithms (Iris).

Figure 7: The ANOVA test of algorithms for training Iris.

Figure 8: The convergence curves of algorithms (Liver Disorders).


Figure 9: The ANOVA test of algorithms for training Liver Disorders.

Figure 10: The convergence curves of algorithms (Balance Scale).

Figure 11: The ANOVA test of algorithms for training Balance Scale.

Figure 12: The convergence curves of algorithms (Seeds).

Figure 13: The ANOVA test of algorithms for training Seeds.

Figure 14: The convergence curves of algorithms (Statlog (Heart)).


Figure 15: The ANOVA test of algorithms for training Statlog (Heart).

Figure 16: The convergence curves of algorithms (Haberman's Survival).

Figure 17: The ANOVA test of algorithms for training Haberman's Survival.

Figure 18: The convergence curves of algorithms (Wine).

Figure 19: The ANOVA test of algorithms for training Wine.

SOS provides the best performance on the testing datasets Blood, Seeds, and Iris. For the Wine dataset, the classification accuracy of SOS is 98.3607%, which indicates that only one example in the testing dataset cannot be classified correctly. It is noticeable that, although MVO has the highest classification accuracy on the Balance Scale, Haberman's Survival, Liver Disorders, and Statlog (Heart) datasets, SOS also performs well in classification. However, the accuracy achieved by GA is the lowest among the tested algorithms.
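A small sketch of how testing accuracy can be computed from the trained network's outputs; the winner-take-all decoding (largest output neuron wins) is an assumption made here for illustration.

```python
import numpy as np

def classification_accuracy(outputs, labels):
    """Winner-take-all accuracy: the output neuron with the largest
    activation determines the predicted class (an illustrative choice)."""
    predicted = np.argmax(outputs, axis=1)
    return np.mean(predicted == labels)

# Wine has 61 testing samples (Table 1); one misclassified example gives
# 60/61 = 98.3607%, the SOS accuracy reported in Table 3.
print(60 / 61)   # 0.98360...
```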

This comprehensive comparative study shows that the SOS algorithm is superior among the trainers compared in this paper. Training an FNN is challenging due to the large number of local solutions in this problem. On account of being simpler and more robust than the competing algorithms, SOS performs well on most of the datasets, which shows how flexible this algorithm is for solving problems with diverse search spaces. Further, in order to determine whether the results achieved by the algorithms are statistically different from each other, a nonparametric statistical significance proof known as Wilcoxon's rank sum test for equal medians [62, 63] was conducted between the results obtained by the algorithms: SOS versus CS, SOS versus PSO, SOS versus GA, SOS versus MVO, SOS versus GSA, and SOS versus BBO.


Table 3: Accuracy results (%).

Dataset              Stat    SOS       MVO       GSA       PSO       BBO       CS        GA
Blood                Best    82.7451   81.1765   78.0392   80.3922   81.1765   78.8235   76.8627
                     Worst   77.6471   80.0000   33.3333   77.6471   72.5490   74.5098   67.0588
                     Mean    79.8039   80.7451   74.1765   79.2157   77.5294   76.9608   72.7843
                     Rank    1         2         6         4         2         5         7
Balance Scale        Best    92.0188   92.9577   91.0798   89.2019   91.5493   90.6103   80.2817
                     Worst   86.8545   89.2019   85.9155   83.5681   88.2629   82.1596   38.0282
                     Mean    90.0235   91.4319   87.7465   86.7136   90.1643   86.3146   59.7653
                     Rank    2         1         4         6         3         5         7
Haberman's Survival  Best    81.7308   82.6923   81.7308   82.6923   82.6923   81.7308   78.8462
                     Worst   71.1538   75.9615   74.0385   76.9231   69.2308   74.0385   65.3846
                     Mean    76.0577   79.5673   79.1827   80.4808   77.0192   78.2692   74.5673
                     Rank    4         1         4         1         1         6         7
Liver Disorders      Best    75.4237   76.2712   64.4068   72.0339   72.8814   67.7966   55.0847
                     Worst   66.1017   72.0339   6.7797    59.3220   45.7627   47.4576   27.1186
                     Mean    71.0593   74.1949   48.3051   66.4831   65.1271   56.1441   43.0508
                     Rank    2         1         6         4         3         5         7
Seeds                Best    95.7746   95.7746   94.3662   92.9577   94.3662   91.5493   67.6056
                     Worst   87.3239   90.1408   85.9155   78.8732   61.9718   77.4648   28.1690
                     Mean    91.3380   93.4507   90.3521   87.4648   87.3239   82.2535   51.1268
                     Rank    1         1         3         5         3         6         7
Wine                 Best    98.3607   98.3607   100.0000  100.0000  100.0000  91.8033   49.1803
                     Worst   91.8033   96.7213   91.8033   88.5246   59.0164   78.6885   18.0328
                     Mean    95.4918   97.9508   96.1475   96.0656   90.4918   83.3607   32.6230
                     Rank    4         4         1         1         1         6         7
Iris                 Best    98.0392   98.0392   98.0392   98.0392   98.0392   98.0392   98.0392
                     Worst   64.7059   98.0392   33.3333   52.9412   29.4118   52.9412   7.8431
                     Mean    92.0588   98.0392   93.7255   91.4706   82.9412   85.0000   56.0784
                     Rank    1         1         1         1         1         1         1
Statlog (Heart)      Best    85.8696   88.0435   86.9565   86.9565   80.4348   83.6957   70.6522
                     Worst   77.1739   77.1739   33.6957   77.1739   66.3043   68.4783   39.1304
                     Mean    82.2283   82.6087   79.1848   82.8804   75.5435   77.4457   50.5435
                     Rank    4         1         2         2         6         5         7

In order to draw a statistically meaningful conclusion, the tests are performed on the optimal fitness values for the training datasets, and the P values are computed as shown in Table 4. The rank sum test tests the null hypothesis that the two samples come from continuous distributions with equal medians, against the alternative that they do not. Almost all values reported in Table 4 are less than 0.05 (5% significance level), which is strong evidence against the null hypothesis. Therefore, such evidence indicates that the SOS results are statistically significant and have not occurred by coincidence (i.e., due to common noise contained in the process).
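A minimal sketch of how such P values can be obtained with a rank-sum test. SciPy's ranksums is used here as one common implementation; the per-run MSE values below are purely hypothetical and only illustrate the comparison of two trainers on one dataset.

```python
from scipy.stats import ranksums

# Hypothetical per-run best training MSE values for two trainers on one dataset.
sos_mse = [0.295, 0.298, 0.301, 0.299, 0.303]
gsa_mse = [0.310, 0.315, 0.322, 0.318, 0.331]

stat, p_value = ranksums(sos_mse, gsa_mse)
print(p_value)  # p < 0.05 would reject the hypothesis of equal medians
```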

5.4. Analysis of the Results. Statistically speaking, the SOS algorithm provides superior local optima avoidance and high classification accuracy in training FNNs. According to the mathematical formulation of the SOS algorithm, the first two interaction phases are devoted to exploration of the search space.


Table 4: P values produced by Wilcoxon's rank sum test for equal medians.

                     SOS versus
Dataset              CS        PSO       GA        MVO       GSA       BBO
Blood                6.80E-08  6.80E-08  6.80E-08  3.07E-06  6.80E-08  4.68E-05
Balance Scale        6.80E-08  6.80E-08  6.80E-08  9.75E-06  6.80E-08  2.29E-01
Haberman's Survival  6.80E-08  6.80E-08  6.80E-08  1.79E-04  6.80E-08  2.56E-03
Liver Disorders      6.80E-08  6.80E-08  6.80E-08  1.26E-01  6.80E-08  1.35E-03
Seeds                6.80E-08  6.80E-08  6.80E-08  3.99E-06  6.80E-08  1.63E-03
Wine                 6.78E-08  6.80E-08  6.80E-08  1.48E-03  1.05E-06  5.98E-01
Iris                 2.96E-06  6.47E-08  6.47E-08  7.62E-07  6.47E-08  5.73E-05
Statlog (Heart)      6.80E-08  6.80E-08  6.80E-08  6.80E-08  6.80E-08  6.04E-03

This promotes exploration of the search space, which leads to finding the optimal weights and biases. For the exploitation phase, the third interaction phase of the SOS algorithm is helpful for resolving local optima stagnation. The results of this work show that, although metaheuristic optimizations have high exploration, the problem of training an FNN needs high local optima avoidance during the whole optimization process. The results prove that SOS is very effective in training FNNs.

It is worth discussing the poor performance of GA in this subsection. The crossover and mutation rates are two problem-specific tuning parameters in GA that depend on empirical values for particular problems; this is the reason why GA failed to provide good results for all the datasets. In contrast, SOS uses only the two parameters of maximum evaluation number and population size, so it avoids the risk of compromised performance due to improper parameter tuning and enhances performance stability. A tendency to fall into local optima and low efficiency in the later stage of the search are the other two reasons for the poor performance of GA. Another finding in the results is the good performance of BBO and MVO, which benefit from mechanisms allowing significant abrupt movements in the search space.

The reason for the high classification rate provided by SOS is that this algorithm is equipped with three adaptive phases that smoothly balance exploration and exploitation: the first two phases are devoted to exploration and the last to exploitation. The three phases are simple to operate, requiring only simple mathematical operations to code. In addition, SOS uses greedy selection at the end of each phase to decide whether to retain the old or the modified solution. Consequently, search agents are always guided toward the most promising regions of the search space.
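To make the three phases concrete, the sketch below follows the standard SOS formulation of Cheng and Prayogo [44]: a mutual vector and benefit factors for mutualism, a step toward the best organism for commensalism, a randomly mutated parasite vector for parasitism, and greedy selection after each phase. Details such as bounds handling via clipping are illustrative assumptions, not a transcription of the paper's code.

```python
import numpy as np

def sos_generation(eco, fit, fitness, lb, ub, rng):
    """One SOS generation over ecosystem `eco` (pop_size x dim), minimizing `fitness`."""
    pop, dim = eco.shape
    for i in range(pop):
        best = eco[np.argmin(fit)]

        # (4) Mutualism: Xi and Xj both move toward the best organism.
        j = rng.choice([k for k in range(pop) if k != i])
        mutual = (eco[i] + eco[j]) / 2.0
        bf1, bf2 = rng.integers(1, 3), rng.integers(1, 3)   # benefit factors, 1 or 2
        for idx, cand in ((i, eco[i] + rng.random(dim) * (best - mutual * bf1)),
                          (j, eco[j] + rng.random(dim) * (best - mutual * bf2))):
            cand = np.clip(cand, lb, ub)
            f = fitness(cand)
            if f < fit[idx]:                                 # greedy selection
                eco[idx], fit[idx] = cand, f

        # (5) Commensalism: Xi benefits from Xj; Xj is unaffected.
        j = rng.choice([k for k in range(pop) if k != i])
        cand = np.clip(eco[i] + rng.uniform(-1, 1, dim) * (best - eco[j]), lb, ub)
        f = fitness(cand)
        if f < fit[i]:
            eco[i], fit[i] = cand, f

        # (6) Parasitism: a mutated copy of Xi competes directly with Xj.
        j = rng.choice([k for k in range(pop) if k != i])
        parasite = eco[i].copy()
        mask = rng.random(dim) < rng.random()                # random subset of dimensions
        parasite[mask] = rng.uniform(lb, ub, mask.sum())
        f = fitness(parasite)
        if f < fit[j]:
            eco[j], fit[j] = parasite, f
    return eco, fit
```

When used as an FNN trainer, `fitness` would be the training-set MSE of the decoded network (as sketched in Section 5.3), and the loop above is repeated until the termination criteria of Figure 3 are met.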

6. Conclusions

In this paper, the recently proposed SOS algorithm was employed for the first time as an FNN trainer. The high level of exploration and exploitation of this algorithm was the motivation for this study. The problem of training an FNN was first formulated for the SOS algorithm, which was then employed to optimize the weights and biases of FNNs so as to obtain high classification accuracy. The obtained results on eight datasets with different characteristics show that the proposed approach is efficient for training FNNs compared to other training methods that have been used in the literature: CS, PSO, GA, MVO, GSA, and BBO. The MSE results over 20 runs show that the proposed approach performs best in terms of convergence rate and is robust, since the variances are relatively small. Furthermore, by comparing the classification accuracy on the testing datasets using the optimal weights and biases, SOS has an advantage over the other algorithms employed. In addition, the significance of the results is statistically confirmed by using Wilcoxon's rank sum test, which demonstrates that the results have not occurred by coincidence. It can be concluded that SOS is suitable for being used as a training method for FNNs.

For future work, the SOS algorithm will be extended to find the optimal number of layers, hidden nodes, and other structural parameters of FNNs. More elaborate tests on higher-dimensional problems and a larger number of datasets will be done. Other types of neural networks, such as the radial basis function (RBF) neural network, are also worth further research.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Science Foundation of China under Grants nos. 61463007 and 61563008 and the Guangxi Natural Science Foundation under Grant no. 2016GXNSFAA380264.

References

[1] J. A. Anderson, An Introduction to Neural Networks, MIT Press, 1995.
[2] G. P. Zhang and M. Qi, "Neural network forecasting for seasonal and trend time series," European Journal of Operational Research, vol. 160, no. 2, pp. 501-514, 2005.
[3] C.-J. Lin, C.-H. Chen, and C.-Y. Lee, "A self-adaptive quantum radial basis function network for classification applications," in Proceedings of the IEEE International Joint Conference on Neural Networks, vol. 4, pp. 3263-3268, IEEE, Budapest, Hungary, July 2004.


[4] X.-F. Wang and D.-S. Huang, "A novel density-based clustering framework by using level set method," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 11, pp. 1515-1531, 2009.
[5] N. A. Mat Isa and W. M. F. W. Mamat, "Clustered-Hybrid Multilayer Perceptron network for pattern recognition application," Applied Soft Computing Journal, vol. 11, no. 1, pp. 1457-1466, 2011.
[6] D. S. Huang, Systematic Theory of Neural Networks for Pattern Recognition, Publishing House of Electronic Industry of China, 1996 (Chinese).
[7] Z.-Q. Zhao, D.-S. Huang, and B.-Y. Sun, "Human face recognition based on multi-features using neural networks committee," Pattern Recognition Letters, vol. 25, no. 12, pp. 1351-1358, 2004.
[8] O. Nelles, "Nonlinear system identification from classical approaches to neural networks and fuzzy models," Applied Therapeutics, vol. 6, no. 7, pp. 21-717, 2001.
[9] B. Malakooti and Y. Q. Zhou, "Approximating polynomial functions by feedforward artificial neural networks: capacity analysis and design," Applied Mathematics and Computation, vol. 90, no. 1, pp. 27-51, 1998.
[10] A. Cochocki and R. Unbehauen, Neural Networks for Optimization and Signal Processing, John Wiley & Sons, New York, NY, USA, 1993.
[11] X.-F. Wang, D.-S. Huang, and H. Xu, "An efficient local Chan-Vese model for image segmentation," Pattern Recognition, vol. 43, no. 3, pp. 603-618, 2010.
[12] D.-S. Huang, "A constructive approach for finding arbitrary roots of polynomials by neural networks," IEEE Transactions on Neural Networks, vol. 15, no. 2, pp. 477-491, 2004.
[13] G. Bebis and M. Georgiopoulos, "Feed-forward neural networks," IEEE Potentials, vol. 13, no. 4, pp. 27-31, 1994.
[14] T. Kohonen, "The self-organizing map," Neurocomputing, vol. 21, no. 1-3, pp. 1-6, 1998.
[15] J. Park and I. W. Sandberg, "Approximation and radial-basis-function networks," Neural Computation, vol. 3, no. 2, pp. 246-257, 1993.
[16] D.-S. Huang, "Radial basis probabilistic neural networks: model and application," International Journal of Pattern Recognition and Artificial Intelligence, vol. 13, no. 7, pp. 1083-1101, 1999.
[17] D.-S. Huang and J.-X. Du, "A constructive hybrid structure optimization methodology for radial basis probabilistic neural networks," IEEE Transactions on Neural Networks, vol. 19, no. 12, pp. 2099-2115, 2008.
[18] G. Dorffner, "Neural networks for time series processing," Neural Network World, vol. 6, no. 1, pp. 447-468, 1996.
[19] S. Ghosh-Dastidar and H. Adeli, "Spiking neural networks," International Journal of Neural Systems, vol. 19, no. 4, pp. 295-308, 2009.
[20] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," in Neurocomputing: Foundations of Research, vol. 5, no. 3, pp. 696-699, MIT Press, Cambridge, Mass, USA, 1988.
[21] N. Zhang, "An online gradient method with momentum for two-layer feedforward neural networks," Applied Mathematics and Computation, vol. 212, no. 2, pp. 488-498, 2009.
[22] J. H. Holland, Adaptation in Natural and Artificial Systems, MIT Press, Cambridge, Mass, USA, 1992.
[23] D. J. Montana and L. Davis, "Training feedforward neural networks using genetic algorithms," in Proceedings of the International Joint Conferences on Artificial Intelligence (IJCAI '89), vol. 89, pp. 762-767, Detroit, Mich, USA, 1989.
[24] C. R. Hwang, "Simulated annealing: theory and applications," Acta Applicandae Mathematicae, vol. 12, no. 1, pp. 108-111, 1988.
[25] D. Shaw and W. Kinsner, "Chaotic simulated annealing in multilayer feedforward networks," in Proceedings of the 1996 Canadian Conference on Electrical and Computer Engineering (CCECE '96), part 1 (of 2), pp. 265-269, University of Calgary, Alberta, Canada, May 1996.
[26] J.-R. Zhang, J. Zhang, T.-M. Lok, and M. R. Lyu, "A hybrid particle swarm optimization-back-propagation algorithm for feedforward neural network training," Applied Mathematics & Computation, vol. 185, no. 2, pp. 1026-1037, 2007.
[27] E. Rashedi, H. Nezamabadi-Pour, and S. Saryazdi, "GSA: a gravitational search algorithm," Information Sciences, vol. 179, no. 13, pp. 2232-2248, 2009.
[28] S. A. Mirjalili, S. Z. M. Hashim, and H. M. Sardroudi, "Training feedforward neural networks using hybrid particle swarm optimization and gravitational search algorithm," Applied Mathematics and Computation, vol. 218, no. 22, pp. 11125-11137, 2012.
[29] Z. Beheshti, S. M. H. Shamsuddin, E. Beheshti, and S. S. Yuhaniz, "Enhancement of artificial neural network learning using centripetal accelerated particle swarm optimization for medical diseases diagnosis," Soft Computing, vol. 18, no. 11, pp. 2253-2270, 2014.
[30] L. A. M. Pereira, D. Rodrigues, P. B. Ribeiro, J. P. Papa, and S. A. T. Weber, "Social-spider optimization-based artificial neural networks training and its applications for Parkinson's disease identification," in Proceedings of the 27th IEEE International Symposium on Computer-Based Medical Systems (CBMS '14), pp. 14-17, IEEE, New York, NY, USA, May 2014.
[31] E. Uzlu, M. Kankal, A. Akpınar, and T. Dede, "Estimates of energy consumption in Turkey using neural networks with the teaching-learning-based optimization algorithm," Energy, vol. 75, pp. 295-303, 2014.
[32] P. A. Kowalski and S. Łukasik, "Training neural networks with Krill Herd algorithm," Neural Processing Letters, vol. 44, no. 1, pp. 5-17, 2016.
[33] H. Faris, I. Aljarah, and S. Mirjalili, "Training feedforward neural networks using multi-verse optimizer for binary classification problems," Applied Intelligence, vol. 45, no. 2, pp. 322-332, 2016.
[34] J. Nayak, B. Naik, and H. S. Behera, "A novel nature inspired firefly algorithm with higher order neural network: performance analysis," Engineering Science and Technology, vol. 19, no. 1, pp. 197-211, 2016.
[35] C. Blum and K. Socha, "Training feed-forward neural networks with ant colony optimization: an application to pattern classification," in Proceedings of the 5th International Conference on Hybrid Intelligent Systems (HIS '05), pp. 233-238, IEEE Computer Society, Rio de Janeiro, Brazil, 2005.
[36] K. Socha and C. Blum, "An ant colony optimization algorithm for continuous optimization: application to feed-forward neural network training," Neural Computing and Applications, vol. 16, no. 3, pp. 235-247, 2007.
[37] E. Valian, S. Mohanna, and S. Tavakoli, "Improved cuckoo search algorithm for feedforward neural network training," International Journal of Artificial Intelligence & Applications, vol. 2, no. 3, pp. 36-43, 2011.
[38] D. Karaboga, B. Akay, and C. Ozturk, "Artificial Bee Colony (ABC) optimization algorithm for training feed-forward neural networks," in Proceedings of the International Conference on Modeling Decisions for Artificial Intelligence (MDAI '07), pp. 318-329, Springer, Kitakyushu, Japan, 2007.


[39] C. Ozturk and D. Karaboga, "Hybrid Artificial Bee Colony algorithm for neural network training," in Proceedings of the IEEE Congress of Evolutionary Computation (CEC '11), pp. 84-88, IEEE, New Orleans, La, USA, June 2011.
[40] L. A. M. Pereira, L. C. S. Afonso, J. P. Papa et al., "Multilayer perceptron neural networks training through charged system search and its application for non-technical losses detection," in Proceedings of the 2nd IEEE Latin American Conference on Innovative Smart Grid Technologies (ISGT LA '13), pp. 1-6, IEEE, Sao Paulo, Brazil, April 2013.
[41] S. Mirjalili, "How effective is the Grey Wolf optimizer in training multi-layer perceptrons," Applied Intelligence, vol. 43, no. 1, pp. 150-161, 2015.
[42] P. Moallem and N. Razmjooy, "A multi layer perceptron neural network trained by invasive weed optimization for potato color image segmentation," Trends in Applied Sciences Research, vol. 7, no. 6, pp. 445-455, 2012.
[43] S. Mirjalili, S. M. Mirjalili, and A. Lewis, "Let a biogeography-based optimizer train your multi-layer perceptron," Information Sciences, vol. 269, pp. 188-209, 2014.
[44] M.-Y. Cheng and D. Prayogo, "Symbiotic organisms search: a new metaheuristic optimization algorithm," Computers & Structures, vol. 139, pp. 98-112, 2014.
[45] M. Cheng, D. Prayogo, and D. Tran, "Optimizing multiple-resources leveling in multiple projects using discrete symbiotic organisms search," Journal of Computing in Civil Engineering, vol. 30, no. 3, 2016.
[46] R. Eki, F. Vincent, Y. S. Budi, and A. A. N. Perwira Redi, "Symbiotic Organism Search (SOS) for solving the capacitated vehicle routing problem," World Academy of Science, Engineering and Technology, International Journal of Mechanical, Aerospace, Industrial, Mechatronic and Manufacturing Engineering, vol. 9, no. 5, pp. 807-811, 2015.
[47] D. Prasad and V. Mukherjee, "A novel symbiotic organisms search algorithm for optimal power flow of power system with FACTS devices," Engineering Science & Technology, vol. 19, no. 1, pp. 79-89, 2016.
[48] M. Abdullahi, M. A. Ngadi, and S. M. Abdulhamid, "Symbiotic organism Search optimization based task scheduling in cloud computing environment," Future Generation Computer Systems, vol. 56, pp. 640-650, 2016.
[49] S. Verma, S. Saha, and V. Mukherjee, "A novel symbiotic organisms search algorithm for congestion management in deregulated environment," Journal of Experimental and Theoretical Artificial Intelligence, pp. 1-21, 2015.
[50] D.-H. Tran, M.-Y. Cheng, and D. Prayogo, "A novel Multiple Objective Symbiotic Organisms Search (MOSOS) for time-cost-labor utilization tradeoff problem," Knowledge-Based Systems, vol. 94, pp. 132-145, 2016.
[51] V. F. Yu, A. A. Redi, C. Yang, E. Ruskartina, and B. Santosa, "Symbiotic organisms search and two solution representations for solving the capacitated vehicle routing problem," Applied Soft Computing, 2016.
[52] A. Panda and S. Pani, "A Symbiotic Organisms Search algorithm with adaptive penalty function to solve multi-objective constrained optimization problems," Applied Soft Computing, vol. 46, pp. 344-360, 2016.
[53] S. Banerjee and S. Chattopadhyay, "Power optimization of three dimensional turbo code using a novel modified symbiotic organism search (MSOS) algorithm," Wireless Personal Communications, pp. 1-28, 2016.
[54] B. Das, V. Mukherjee, and D. Das, "DG placement in radial distribution network by symbiotic organisms search algorithm for real power loss minimization," Applied Soft Computing, vol. 49, pp. 920-936, 2016.
[55] M. K. Dosoglu, U. Guvenc, S. Duman, Y. Sonmez, and H. T. Kahraman, "Symbiotic organisms search optimization algorithm for economic/emission dispatch problem in power systems," Neural Computing and Applications, 2016.
[56] R. Hecht-Nielsen, "Kolmogorov's mapping neural network existence theorem," in Proceedings of the IEEE 1st International Conference on Neural Networks, vol. 3, pp. 11-13, IEEE Press, San Diego, Calif, USA, 1987.
[57] J.-R. Zhang, J. Zhang, T.-M. Lok, and M. R. Lyu, "A hybrid particle swarm optimization-back-propagation algorithm for feedforward neural network training," Applied Mathematics and Computation, vol. 185, no. 2, pp. 1026-1037, 2007.
[58] C. L. Blake and C. J. Merz, UCI Repository of Machine Learning Databases, http://archive.ics.uci.edu/ml/datasets.html.
[59] I.-C. Yeh, K.-J. Yang, and T.-M. Ting, "Knowledge discovery on RFM model using Bernoulli sequence," Expert Systems with Applications, vol. 36, no. 3, pp. 5866-5871, 2009.
[60] R. S. Siegler, "Three aspects of cognitive development," Cognitive Psychology, vol. 8, no. 4, pp. 481-520, 1976.
[61] R. A. Fisher, "Has Mendel's work been rediscovered?," Annals of Science, vol. 1, no. 2, pp. 115-137, 1936.
[62] F. Wilcoxon, "Individual comparisons by ranking methods," Biometrics Bulletin, vol. 1, no. 6, pp. 80-83, 1945.
[63] J. Derrac, S. García, D. Molina, and F. Herrera, "A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms," Swarm and Evolutionary Computation, vol. 1, no. 1, pp. 3-18, 2011.

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 5: Research Article Training Feedforward Neural Networks Using …downloads.hindawi.com/journals/cin/2016/9063065.pdf · 2019-07-30 · Research Article Training Feedforward Neural Networks

Computational Intelligence and Neuroscience 5

(1) Read in datasets

calculate fitness values of the modified organisms

Are the modified organismsfitter than the previous

Keep the previous organisms Accept the modified organisms

Is the modified organism fitter than theprevious

Keep the previous organism Accept the modified organism

Are termination criteriaachieved

Optimal solution

Yes

No

No

No

Yes

Yes

Yes

No

(4) Mutualism phase

(5) Commensalism phase

(6) Parasitism phase

(2) Ecosystem initialization

Number of organisms termination criteria number of trainingand testing examples number of input and output neurons and

calculate the number of hidden neurons based on inputneurons initial organisms in ecosystem

(3) Identify best organism Xbest

Randomly select one organism Xj where Xj Xi

Determine mutual relationship vector (mutual_vector) and benefit factor (BF)

Modify organisms Xi and Xj based on their mutual relationship

Randomly select one organism Xj where Xj Xi

Modify organism Xj with the help of Xi and calculate fitness value of the modified organism

Randomly select one organism Xj where Xj Xi

Create a parasite (parasite_vector) from organism Xi and calculate its fitness value

Keep Xj and remove parasite_vector Replace Xj with parasite_vector

fitter than XjIs parasite vector

Figure 3 Flowchart of SOS algorithm

6 Computational Intelligence and Neuroscience

Table 1 Description of datasets

Dataset Attribute Class Training sample Testing sample Input Hidden OutputBlood 4 2 493 255 4 9 2Balance Scale 4 3 412 213 4 9 3Habermanrsquos Survival 3 2 202 104 3 7 2Liver Disorders 6 2 227 118 6 13 2Seeds 7 3 139 71 7 15 3Wine 13 3 117 61 13 27 3Iris 4 3 99 51 4 9 3Statlog (Heart) 13 2 178 92 13 27 2

The Balance Scale dataset is generated to model psycho-logical experiments reported by Siegler [60] This datasetcontains 625 examples and each example is classified ashaving the balance scale tip to the right and tip to the leftor being balanced The attributes are the left weight the leftdistance the right weight and the right distance The correctway to find the class is the greater of (left distance lowast leftweight) and (right distance lowast right weight) If they are equalit is balanced

Habermanrsquos Survival dataset contains cases from a studythat was conducted between 1958 and 1970 at the Universityof Chicagorsquos Billings Hospital on the survival of patients whohad undergone surgery for breast cancerThedataset contains306 cases which record two survival status patients with ageof patient at time of operation patientrsquos year of operation andnumber of positive axillary nodes detected

The Liver Disorders dataset was donated by BUPAMedi-cal Research Ltd to record the liver disorder status in termsof a binary label The dataset includes values of 6 featuresmeasured for 345 male individuals The first 5 features areall blood tests which are thought to be sensitive to liverdisorders that might arise from excessive alcohol consump-tion These features are Mean Corpuscular Volume (MCV)alkaline phosphatase (ALKPHOS) alanine aminotransferase(SGPT) aspartate aminotransferase (SGOT) and gamma-glutamyl transpeptidase (GAMMAGT) The sixth feature isthe number of alcoholic beverage drinks per day (DRINKS)

The Seeds dataset consists of 210 patterns belonging tothree different varieties of wheat Kama Rosa and CanadianFrom each species there are 70 observations for area 119860perimeter 119875 compactness 119862 (119862 = 4 lowast 119901119894 lowast 1198601198752) length ofkernel width of kernel asymmetry coefficient and length ofkernel groove

The Wine dataset contains 178 instances recording theresults of a chemical analysis of wines grown in the sameregion in Italy but derived from three different cultivars Theanalysis determined the quantities of 13 constituents found ineach of the three types of wines

The Iris dataset contains 3 species of 50 instances eachwhere each species refers to a type of Iris plant (setosaversicolor and virginica) One species is linearly separablefrom the other 2 and the latter are not linearly separablefrom each other Each of the 3 species is classified by threeattributes sepal length sepal width petal length and petalwidth in cm This dataset was used by Fisher [61] in hisinitiation of the linear-discriminate-function technique

The Statlog (Heart) dataset is a heart disease databasecontaining 270 instances that consist of 13 attributes agesex chest pain type (4 values) resting blood pressure serumcholesterol inmgdL fasting blood sugargt 120mgdL restingelectrocardiographic results (values 0 1 and 2) maximumheart rate achieved exercise induced angina oldpeak = STdepression induced by exercise relative to rest the slope ofthe peak exercise ST segment number of major vessels (0ndash3)colored by fluoroscopy and thal 3 = normal 6 = fixed defect7 = reversible defect

52 Experimental Setup In this section the experimentswere done using a desktop computer with a 330GHz Intel(R)Core(TM) i5 processor 4GB of memory The entire algo-rithm was programmed inMATLAB R2012aThementioneddatasets are partitioned into 66 for training and 34 fortesting [33] All experiments are executed for 20 differentruns and each run includes 500 iterations The populationsize is considered as 30 and other control parameters of thecorresponding algorithms are given below

In CS the possibility of eggs being detected andthrown out of the nest is 119901119886 = 025In GA crossover rate 119875119862 = 05 mutate rate 119875119888 = 005In PSO the parameters are set to 1198621 = 1198622 = 2 weightfactor 119908 decreased linearly from 09 to 05

In BBO mutation probability 119901119898 = 01 the value forboth max immigration (119868) and max emigration (119864) is1 and the habitat modification probability 119901ℎ = 08In MVO exploitation accuracy is set to 119901 = 6 themin traveling distance rate is set to 02 and the maxtraveling distance rate is set to 1

In GSA 120572 is set to 20 the gravitational constant (1198660)is set to 1 and initial values of acceleration and massare set to 0 for each particle

All input features are mapped onto the interval of [minus1 1]for a small scale Here we apply min-max normalization toperform a linear transformation on the original data as givenin (12) where V1015840 is the normalized value of V in the range[minmax]

V1015840 = 2 lowast V minusminmaxminusmin

minus 1 (12)

Computational Intelligence and Neuroscience 7

Table 2 MSE results

Dataset AlgorithmSOS MVO GSA PSO BBO CS GA

BloodBest 295119864 minus 01 304119864 minus 01 310119864 minus 01 307119864 minus 01 300119864 minus 01 311119864 minus 01 330119864 minus 01Worst 305119864 minus 01 307119864 minus 01 333119864 minus 01 314119864 minus 01 392119864 minus 01 321119864 minus 01 417119864 minus 01Mean 301119864 minus 01 305119864 minus 01 323119864 minus 01 310119864 minus 01 318119864 minus 01 317119864 minus 01 378119864 minus 01Std 244119864 minus 03 709119864 minus 04 660119864 minus 03 247119864 minus 03 284119864 minus 02 297119864 minus 03 285119864 minus 02

Balance ScaleBest 700119864 minus 02 803119864 minus 02 137119864 minus 01 140119864 minus 01 852119864 minus 02 170119864 minus 01 297119864 minus 01Worst 129119864 minus 01 104119864 minus 01 166119864 minus 01 188119864 minus 01 124119864 minus 01 214119864 minus 01 801119864 minus 01Mean 105119864 minus 01 866119864 minus 02 152119864 minus 01 172119864 minus 01 102119864 minus 01 189119864 minus 01 465119864 minus 01Std 133119864 minus 02 606119864 minus 03 946119864 minus 03 132119864 minus 02 945119864 minus 03 118119864 minus 02 116119864 minus 01

Habermanrsquos SurvivalBest 295119864 minus 01 321119864 minus 01 364119864 minus 01 343119864 minus 01 313119864 minus 01 361119864 minus 01 371119864 minus 01Worst 337119864 minus 01 348119864 minus 01 407119864 minus 01 375119864 minus 01 351119864 minus 01 379119864 minus 01 477119864 minus 01Mean 318119864 minus 01 331119864 minus 01 382119864 minus 01 365119864 minus 01 331119864 minus 01 370119864 minus 01 427119864 minus 01Std 104119864 minus 02 663119864 minus 03 114119864 minus 02 784119864 minus 03 107119864 minus 02 583119864 minus 03 345119864 minus 02

Liver DisordersBest 326119864 minus 01 333119864 minus 01 417119864 minus 01 396119864 minus 01 316119864 minus 01 414119864 minus 01 527119864 minus 01Worst 386119864 minus 01 355119864 minus 01 471119864 minus 01 431119864 minus 01 587119864 minus 01 457119864 minus 01 673119864 minus 01Mean 349119864 minus 01 341119864 minus 01 444119864 minus 01 411119864 minus 01 382119864 minus 01 437119864 minus 01 595119864 minus 01Std 165119864 minus 02 617119864 minus 03 132119864 minus 02 106119864 minus 02 547119864 minus 02 112119864 minus 02 430119864 minus 02

SeedsBest 978119864 minus 04 144119864 minus 02 597119864 minus 02 449119864 minus 02 207119864 minus 03 719119864 minus 02 205119864 minus 01Worst 226119864 minus 02 330119864 minus 02 887119864 minus 02 102119864 minus 01 332119864 minus 01 122119864 minus 01 714119864 minus 01Mean 111119864 minus 02 221119864 minus 02 765119864 minus 02 817119864 minus 02 523119864 minus 02 980119864 minus 02 471119864 minus 01Std 536119864 minus 03 631119864 minus 03 882119864 minus 03 153119864 minus 02 963119864 minus 02 134119864 minus 02 120119864 minus 01

WineBest 635119864 minus 11 134119864 minus 06 587119864 minus 04 974119864 minus 03 562119864 minus 12 256119864 minus 02 560119864 minus 01Worst 855119864 minus 03 291119864 minus 05 190119864 minus 02 803119864 minus 02 342119864 minus 01 162119864 minus 01 895119864 minus 01Mean 440119864 minus 04 541119864 minus 06 321119864 minus 03 339119864 minus 02 517119864 minus 02 959119864 minus 02 703119864 minus 01Std 191119864 minus 03 623119864 minus 06 396119864 minus 03 155119864 minus 02 125119864 minus 01 348119864 minus 02 859119864 minus 02

IrisBest 510119864 minus 08 245119864 minus 02 441119864 minus 02 471119864 minus 02 170119864 minus 02 101119864 minus 02 119119864 minus 01Worst 267119864 minus 02 275119864 minus 02 231119864 minus 01 176119864 minus 01 465119864 minus 01 784119864 minus 02 651119864 minus 01Mean 142119864 minus 02 258119864 minus 02 683119864 minus 02 109119864 minus 01 691119864 minus 02 532119864 minus 02 366119864 minus 01Std 880119864 minus 03 840119864 minus 04 407119864 minus 02 355119864 minus 02 111119864 minus 01 193119864 minus 02 170119864 minus 01

Statlog (Heart)Best 898119864 minus 02 503119864 minus 02 135119864 minus 01 172119864 minus 01 803119864 minus 02 231119864 minus 01 452119864 minus 01Worst 126119864 minus 01 813119864 minus 02 192119864 minus 01 220119864 minus 01 165119864 minus 01 309119864 minus 01 697119864 minus 01Mean 109119864 minus 01 648119864 minus 02 162119864 minus 01 193119864 minus 01 127119864 minus 01 261119864 minus 01 553119864 minus 01Std 107119864 minus 02 911119864 minus 03 147119864 minus 02 132119864 minus 02 237119864 minus 02 186119864 minus 02 720119864 minus 02

53 Results and Discussion To evaluate the performance ofthe proposed method SOS with other six algorithms BBOCS GA GSA PSO and MVO experiments are conductedusing the given datasets In this work all datasets have beenpartitioned into two sets training set and testing set Thetraining set is used to train the network in order to achieve theoptimal weights and biasThe testing set is applied on unseendata to test the generalization performance of metaheuristicalgorithms on FNNs

Table 2 shows the best values of mean squared error(MSE) the worst values of MSE the mean of MSE andthe standard deviation for all training datasets Inspectingthe table of results it can be seen that SOS performs bestin datasets Seeds and Iris For datasets Liver DisordersHabermanrsquos Survival and Blood the best values worst valuesand mean values are all in the same order of magnitudesWhile the values of MSE are smaller in SOS than the otheralgorithms which means SOS is the best choice as the

8 Computational Intelligence and Neuroscience

50 100 150 200 250 300 350 400 450 5000Iterations

035

04

045

05

055

06

065

07

Aver

age M

SE o

f tra

inin

g sa

mpl

e

BBOCSGAMVO

GSAPSOSOS

Figure 4 The convergence curves of algorithms (Blood)

03

032

034

036

038

04

042

MSE

CS GA MVO GSA PSO SOSBBOAlgorithms

Figure 5 The ANOVA test of algorithms for training Blood

trainingmethod on the three aforementioned datasetsMore-over it is ranked second for the dataset Wine and shows verycompetitive results compared to BBO In datasets BalanceScale and Statlog (Heart) the best values in results indicatethat SOS provides very close performances compared to BBOand MVO Also the three algorithms show improvementscompared to the others

Convergence curves for all metaheuristic algorithms areshown in Figures 4 6 8 10 12 14 16 and 18The convergencecurves show the average of 20 independent runs over thecourse of 500 iterations The figures show that SOS has thefastest convergence speed for training all the given datasetsFigures 5 7 9 11 13 15 17 and 19 show the boxplots relativeto 20 runs of SOS BBO GA MVO PSO GSA and CSThe boxplots which are used to analyze the variability ingetting MSE values indicate that SOS has greater value andless height than those of SOS GA CS PSO and GSA andachieves the similar results to MVO and BBO

Through 20 independent runs on the training datasetsthe optimal weights and biases are achieved and then usedto test the classification accuracy on the testing datasets Asdepicted in Table 3 the rank is in terms of the best values

50 100 150 200 250 300 350 400 450 5000Iterations

0

02

04

06

08

1

12

14

16

18

Aver

age M

SE o

f tra

inin

g sa

mpl

e

BBOCSGAMVO

GSAPSOSOS

Figure 6 The convergence curves of algorithms (Iris)

CS GA MVO GSA PSO SOSBBOAlgorithms

0

01

02

03

04

05

06

MSE

Figure 7 The ANOVA test of algorithms for training Iris

03504

04505

05506

06507

07508

Aver

age M

SE o

f tra

inin

g sa

mpl

e

50 100 150 200 250 300 350 400 450 5000Iterations

BBOCSGAMVO

GSAPSOSOS

Figure 8 The convergence curves of algorithms (Liver Disorders)

Computational Intelligence and Neuroscience 9

03

035

04

045

05

055

06

065

MSE

CS GA MVO GSA PSO SOSBBOAlgorithms

Figure 9 The ANOVA test of algorithms for training Liver Disor-ders

1

50 100 150 200 250 300 350 400 450 5000Iterations

0

02

04

06

08

12

14

16

18

Aver

age M

SE o

f tra

inin

g sa

mpl

e

BBOCSGAMVO

GSAPSOSOS

Figure 10 The convergence curves of algorithms (Balance Scale)

01

02

03

04

05

06

07

08

MSE

CS GA MVO GSA PSO SOSBBOAlgorithms

Figure 11TheANOVA test of algorithms for training Balance Scale

0

02

04

06

08

1

12

14

16

18

Aver

age M

SE o

f tra

inin

g sa

mpl

e

50 100 150 200 250 300 350 400 450 5000Iterations

BBOCSGAMVO

GSAPSOSOS

Figure 12 The convergence curves of algorithms (Seeds)

0

01

02

03

04

05

06

07

MSE

CS GA MVO GSA PSO SOSBBOAlgorithms

Figure 13 The ANOVA test of algorithms for training Seeds

0

01

02

03

04

05

06

07

08

09

Aver

age M

SE o

f tra

inin

g sa

mpl

e

50 100 150 200 250 300 350 400 450 5000Iterations

BBOCSGAMVO

GSAPSOSOS

Figure 14 The convergence curves of algorithms (Statlog (Heart))

10 Computational Intelligence and Neuroscience

01

02

03

04

05

06

07

MSE

CS GA MVO GSA PSO SOSBBOAlgorithms

Figure 15 The ANOVA test of algorithms for training Statlog(Heart)

Aver

age M

SE o

f tra

inin

g sa

mpl

e

035

04

045

05

055

06

065

07

50 100 150 200 250 300 350 400 450 5000Iterations

BBOCSGAMVO

GSAPSOSOS

Figure 16 The convergence curves of algorithms (HabermanrsquosSurvival)

CS GA MVO GSA PSO SOSBBOAlgorithms

03032034036038

04042044046048

MSE

Figure 17 The ANOVA test of algorithms for training HabermanrsquosSurvival

002040608

112141618

2

Aver

age M

SE o

f tra

inin

g sa

mpl

e

50 100 150 200 250 300 350 400 450 5000Iterations

BBOCSGAMVO

GSAPSOSOS

Figure 18 The convergence curves of algorithms (Wine)

0010203040506070809

MSE

CS GA MVO GSA PSO SOSBBOAlgorithms

Figure 19 The ANOVA test of algorithms for training Wine

in each dataset and SOS provides the best performances ontesting datasets Blood Seeds and Iris For dataset Wine theclassification accuracy of SOS is 983607 which indicatesthat only one example in testing dataset cannot be classifiedcorrectly It is noticeable that though MVO has the highestclassification accuracy in datasets Balance Scale HabermanrsquosSurvival Liver Disorders and Statlog (Heart) SOS alsoperforms well in classification However the accuracy shownin GA is the lowest among the tested algorithms

This comprehensive comparative study shows that theSOS algorithm is superior among the compared trainers inthis paper It is a challenge for training FNN due to the largenumber of local solutions in solving this problemOn accountof being simpler andmore robust than competing algorithmsSOS performs well in most of the datasets which shows howflexible this algorithm is for solving problems with diversesearch space Further in order to determine whether theresults achieved by the algorithms are statistically differentfrom each other a nonparametric statistical significanceproof known as Wilcoxonrsquos rank sum test for equal medians[62 63] was conducted between the results obtained by thealgorithms SOS versus CS SOS versus PSO SOS versus GASOS versus MVO SOS versus GSA and SOS versus BBO

Computational Intelligence and Neuroscience 11

Table 3 Accuracy results

Dataset AlgorithmSOS MVO GSA PSO BBO CS GA

BloodBest 827451 811765 780392 803922 811765 788235 768627Worst 776471 800000 333333 776471 725490 745098 670588Mean 798039 807451 741765 792157 775294 769608 727843Rank 1 2 6 4 2 5 7

Balance ScaleBest 920188 929577 910798 892019 915493 906103 802817Worst 868545 892019 859155 835681 882629 821596 380282Mean 900235 914319 877465 867136 901643 863146 597653Rank 2 1 4 6 3 5 7

Habermanrsquos SurvivalBest 817308 826923 817308 826923 826923 817308 788462Worst 711538 759615 740385 769231 692308 740385 653846Mean 760577 795673 791827 804808 770192 782692 745673Rank 4 1 4 1 1 6 7

Liver DisordersBest 754237 762712 644068 720339 728814 677966 550847Worst 661017 720339 67797 593220 457627 474576 271186Mean 710593 741949 483051 664831 651271 561441 430508Rank 2 1 6 4 3 5 7

SeedsBest 957746 957746 943662 929577 943662 915493 676056Worst 873239 901408 859155 788732 619718 774648 281690Mean 913380 934507 903521 874648 873239 822535 511268Rank 1 1 3 5 3 6 7

WineBest 983607 983607 1000000 1000000 1000000 918033 491803Worst 918033 967213 918033 885246 590164 786885 180328Mean 954918 979508 961475 960656 904918 833607 326230Rank 4 4 1 1 1 6 7

IrisBest 980392 980392 980392 980392 980392 980392 980392Worst 647059 980392 333333 529412 294118 529412 78431Mean 920588 980392 937255 914706 829412 850000 560784Rank 1 1 1 1 1 1 1

Statlog (Heart)Best 858696 880435 869565 869565 804348 836957 706522Worst 771739 771739 336957 771739 663043 684783 391304Mean 822283 826087 791848 828804 755435 774457 505435Rank 4 1 2 2 6 5 7

In order to draw a statistically meaningful conclusion testsare performed on the optimal fitness for training datasetsand P values are computed as shown in Table 4 Rank sumtests the null hypothesis that the two datasets are samplesfrom continuous distributions with equal medians againstthe alternative that they are not Almost all values reportedin Table 4 are less than 005 (5 significant level) which isstrong evidence against the null hypothesis Therefore suchevidence indicates that SOS results are statistically significant

and that it has not occurred by coincidence (ie due tocommon noise contained in the process)

54 Analysis of the Results Statistically speaking the SOSalgorithm provides superior local avoidance and the highclassification accuracy in training FNNs According to themathematical formulation of the SOS algorithm the firsttwo interaction phases are devoted to exploration of thesearch space This promotes exploration of the search space

12 Computational Intelligence and Neuroscience

Table 4 P values produced by Wilcoxonrsquos rank sum test for equal medians

Dataset SOS versusCS PSO GA MVO GSA BBO

Blood 680119864 minus 08 680119864 minus 08 680119864 minus 08 307119864 minus 06 680119864 minus 08 468119864 minus 05Balance Scale 680119864 minus 08 680119864 minus 08 680119864 minus 08 975119864 minus 06 680119864 minus 08 229119864 minus 01Habermanrsquos Survival 680119864 minus 08 680119864 minus 08 680119864 minus 08 179119864 minus 04 680119864 minus 08 256119864 minus 03Liver Disorders 680119864 minus 08 680119864 minus 08 680119864 minus 08 126119864 minus 01 680119864 minus 08 135119864 minus 03Seeds 680119864 minus 08 680119864 minus 08 680119864 minus 08 399119864 minus 06 680119864 minus 08 163119864 minus 03Wine 678119864 minus 08 680119864 minus 08 680119864 minus 08 148119864 minus 03 105119864 minus 06 598119864 minus 01Iris 296119864 minus 06 647119864 minus 08 647119864 minus 08 762119864 minus 07 647119864 minus 08 573119864 minus 05Statlog (Heart) 680119864 minus 08 680119864 minus 08 680119864 minus 08 680119864 minus 08 680119864 minus 08 604119864 minus 03

that leads to finding the optimal weights and biases Forthe exploitation phase the third interaction phase of SOSalgorithm is helpful for resolving local optima stagnationThe results of this work show that although metaheuristicoptimizations have high exploration the problem of trainingan FNN needs high local optima avoidance during the wholeoptimization process The results prove that the SOS is veryeffective in training FNNs

It is worth discussing the poor performance of GA in this subsection. The crossover and mutation rates are two problem-specific tuning parameters in GA whose values are set empirically for each particular problem, and this is one reason why GA failed to provide good results on all the datasets. In contrast, SOS uses only the two common parameters of maximum evaluation number and population size, so it avoids the risk of compromised performance due to improper parameter tuning and enhances performance stability. A tendency to fall into local optima and low efficiency in the later stage of the search are two further reasons for the poor performance of GA. Another finding in the results is the good performance of BBO and MVO, which benefit from mechanisms that allow significant abrupt movements in the search space.

The reason for the high classification rate provided by SOS is that the algorithm is equipped with three adaptive phases that smoothly balance exploration and exploitation: the first two phases are devoted to exploration and the third to exploitation. All three phases require only simple mathematical operations and are easy to code. In addition, SOS applies greedy selection at the end of each phase to decide whether the old or the modified solution is retained. Consequently, the search agents are always guided toward the most promising regions of the search space.
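To make the three phases and the greedy selection concrete, the sketch below outlines one possible implementation of an SOS trainer over a flattened vector of FNN weights and biases, where fitness(x) is assumed to return the training MSE of candidate x. The update rules follow the standard mutualism, commensalism, and parasitism formulation of Cheng and Prayogo [44]; the search bounds and the random mutation mask used in the parasitism phase are illustrative choices rather than the exact settings of this paper, while the default population size of 30 and 500 iterations match the experimental setup described in Section 5.2.

import numpy as np

def sos_train(fitness, dim, pop_size=30, max_iter=500, lb=-1.0, ub=1.0, seed=0):
    # Minimal sketch of the three SOS interaction phases with greedy selection.
    rng = np.random.default_rng(seed)
    eco = rng.uniform(lb, ub, size=(pop_size, dim))   # ecosystem of candidate weight vectors
    fit = np.array([fitness(x) for x in eco])         # training MSE of each organism

    for _ in range(max_iter):
        for i in range(pop_size):
            best = eco[np.argmin(fit)]

            # Mutualism: organisms i and j both move toward the best organism.
            j = rng.choice([k for k in range(pop_size) if k != i])
            mutual = (eco[i] + eco[j]) / 2.0
            bf1, bf2 = rng.integers(1, 3, size=2)     # benefit factors in {1, 2}
            cand_i = np.clip(eco[i] + rng.random(dim) * (best - mutual * bf1), lb, ub)
            cand_j = np.clip(eco[j] + rng.random(dim) * (best - mutual * bf2), lb, ub)
            for idx, cand in ((i, cand_i), (j, cand_j)):   # greedy selection
                f = fitness(cand)
                if f < fit[idx]:
                    eco[idx], fit[idx] = cand, f

            # Commensalism: organism i benefits from j, which is left unchanged.
            j = rng.choice([k for k in range(pop_size) if k != i])
            cand = np.clip(eco[i] + rng.uniform(-1, 1, dim) * (best - eco[j]), lb, ub)
            f = fitness(cand)
            if f < fit[i]:
                eco[i], fit[i] = cand, f

            # Parasitism: a mutated copy of i competes with a random organism j.
            j = rng.choice([k for k in range(pop_size) if k != i])
            parasite = eco[i].copy()
            mask = rng.random(dim) < rng.random()
            parasite[mask] = rng.uniform(lb, ub, size=int(mask.sum()))
            f = fitness(parasite)
            if f < fit[j]:
                eco[j], fit[j] = parasite, f

    return eco[np.argmin(fit)], float(fit.min())

Here fitness would map the candidate vector to the mean squared error of the FNN on the training set, so the vector returned by sos_train supplies the weights and biases used for the accuracy evaluation sketched above.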

6. Conclusions

In this paper, the recently proposed SOS algorithm was employed for the first time as an FNN trainer. The high level of exploration and exploitation of this algorithm was the motivation for this study. The problem of training an FNN was first formulated for the SOS algorithm, which was then employed to optimize the weights and biases of FNNs so as to achieve high classification accuracy. The results obtained on eight datasets with different characteristics show that the proposed approach is efficient for training FNNs compared with other training methods used in the literature: CS, PSO, GA, MVO, GSA, and BBO. The MSE results over 20 runs show that the proposed approach performs best in terms of convergence rate and is robust, since the variances are relatively small. Furthermore, comparing the classification accuracy on the testing datasets using the optimal weights and biases shows that SOS has an advantage over the other algorithms employed. In addition, the significance of the results is statistically confirmed using Wilcoxon's rank sum test, which demonstrates that the results have not occurred by coincidence. It can be concluded that SOS is suitable for use as a training method for FNNs.

For future work, the SOS algorithm will be extended to find the optimal number of layers, hidden nodes, and other structural parameters of FNNs. More elaborate tests on higher-dimensional problems and a larger number of datasets will also be carried out. Other types of neural networks, such as the radial basis function (RBF) neural network, are worth further research as well.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Science Foundation of China under Grants no. 61463007 and 61563008, and by the Guangxi Natural Science Foundation under Grant no. 2016GXNSFAA380264.

References

[1] J. A. Anderson, An Introduction to Neural Networks, MIT Press, 1995.
[2] G. P. Zhang and M. Qi, "Neural network forecasting for seasonal and trend time series," European Journal of Operational Research, vol. 160, no. 2, pp. 501–514, 2005.
[3] C.-J. Lin, C.-H. Chen, and C.-Y. Lee, "A self-adaptive quantum radial basis function network for classification applications," in Proceedings of the IEEE International Joint Conference on Neural Networks, vol. 4, pp. 3263–3268, IEEE, Budapest, Hungary, July 2004.
[4] X.-F. Wang and D.-S. Huang, "A novel density-based clustering framework by using level set method," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 11, pp. 1515–1531, 2009.
[5] N. A. Mat Isa and W. M. F. W. Mamat, "Clustered-Hybrid Multilayer Perceptron network for pattern recognition application," Applied Soft Computing Journal, vol. 11, no. 1, pp. 1457–1466, 2011.
[6] D. S. Huang, Systematic Theory of Neural Networks for Pattern Recognition, Publishing House of Electronic Industry of China, 1996 (Chinese).
[7] Z.-Q. Zhao, D.-S. Huang, and B.-Y. Sun, "Human face recognition based on multi-features using neural networks committee," Pattern Recognition Letters, vol. 25, no. 12, pp. 1351–1358, 2004.
[8] O. Nelles, "Nonlinear system identification: from classical approaches to neural networks and fuzzy models," Applied Therapeutics, vol. 6, no. 7, pp. 21–717, 2001.
[9] B. Malakooti and Y. Q. Zhou, "Approximating polynomial functions by feedforward artificial neural networks: capacity analysis and design," Applied Mathematics and Computation, vol. 90, no. 1, pp. 27–51, 1998.
[10] A. Cochocki and R. Unbehauen, Neural Networks for Optimization and Signal Processing, John Wiley & Sons, New York, NY, USA, 1993.
[11] X.-F. Wang, D.-S. Huang, and H. Xu, "An efficient local Chan-Vese model for image segmentation," Pattern Recognition, vol. 43, no. 3, pp. 603–618, 2010.
[12] D.-S. Huang, "A constructive approach for finding arbitrary roots of polynomials by neural networks," IEEE Transactions on Neural Networks, vol. 15, no. 2, pp. 477–491, 2004.
[13] G. Bebis and M. Georgiopoulos, "Feed-forward neural networks," IEEE Potentials, vol. 13, no. 4, pp. 27–31, 1994.
[14] T. Kohonen, "The self-organizing map," Neurocomputing, vol. 21, no. 1–3, pp. 1–6, 1998.
[15] J. Park and I. W. Sandberg, "Approximation and radial-basis-function networks," Neural Computation, vol. 3, no. 2, pp. 246–257, 1993.
[16] D.-S. Huang, "Radial basis probabilistic neural networks: model and application," International Journal of Pattern Recognition and Artificial Intelligence, vol. 13, no. 7, pp. 1083–1101, 1999.
[17] D.-S. Huang and J.-X. Du, "A constructive hybrid structure optimization methodology for radial basis probabilistic neural networks," IEEE Transactions on Neural Networks, vol. 19, no. 12, pp. 2099–2115, 2008.
[18] G. Dorffner, "Neural networks for time series processing," Neural Network World, vol. 6, no. 1, pp. 447–468, 1996.
[19] S. Ghosh-Dastidar and H. Adeli, "Spiking neural networks," International Journal of Neural Systems, vol. 19, no. 4, pp. 295–308, 2009.
[20] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," in Neurocomputing: Foundations of Research, vol. 5, no. 3, pp. 696–699, MIT Press, Cambridge, Mass, USA, 1988.
[21] N. Zhang, "An online gradient method with momentum for two-layer feedforward neural networks," Applied Mathematics and Computation, vol. 212, no. 2, pp. 488–498, 2009.
[22] J. H. Holland, Adaptation in Natural and Artificial Systems, MIT Press, Cambridge, Mass, USA, 1992.
[23] D. J. Montana and L. Davis, "Training feedforward neural networks using genetic algorithms," in Proceedings of the International Joint Conferences on Artificial Intelligence (IJCAI '89), vol. 89, pp. 762–767, Detroit, Mich, USA, 1989.
[24] C. R. Hwang, "Simulated annealing: theory and applications," Acta Applicandae Mathematicae, vol. 12, no. 1, pp. 108–111, 1988.
[25] D. Shaw and W. Kinsner, "Chaotic simulated annealing in multilayer feedforward networks," in Proceedings of the 1996 Canadian Conference on Electrical and Computer Engineering (CCECE '96), part 1 (of 2), pp. 265–269, University of Calgary, Alberta, Canada, May 1996.
[26] J.-R. Zhang, J. Zhang, T.-M. Lok, and M. R. Lyu, "A hybrid particle swarm optimization-back-propagation algorithm for feedforward neural network training," Applied Mathematics & Computation, vol. 185, no. 2, pp. 1026–1037, 2007.
[27] E. Rashedi, H. Nezamabadi-Pour, and S. Saryazdi, "GSA: a gravitational search algorithm," Information Sciences, vol. 179, no. 13, pp. 2232–2248, 2009.
[28] S. A. Mirjalili, S. Z. M. Hashim, and H. M. Sardroudi, "Training feedforward neural networks using hybrid particle swarm optimization and gravitational search algorithm," Applied Mathematics and Computation, vol. 218, no. 22, pp. 11125–11137, 2012.
[29] Z. Beheshti, S. M. H. Shamsuddin, E. Beheshti, and S. S. Yuhaniz, "Enhancement of artificial neural network learning using centripetal accelerated particle swarm optimization for medical diseases diagnosis," Soft Computing, vol. 18, no. 11, pp. 2253–2270, 2014.
[30] L. A. M. Pereira, D. Rodrigues, P. B. Ribeiro, J. P. Papa, and S. A. T. Weber, "Social-spider optimization-based artificial neural networks training and its applications for Parkinson's disease identification," in Proceedings of the 27th IEEE International Symposium on Computer-Based Medical Systems (CBMS '14), pp. 14–17, IEEE, New York, NY, USA, May 2014.
[31] E. Uzlu, M. Kankal, A. Akpınar, and T. Dede, "Estimates of energy consumption in Turkey using neural networks with the teaching-learning-based optimization algorithm," Energy, vol. 75, pp. 295–303, 2014.
[32] P. A. Kowalski and S. Łukasik, "Training neural networks with Krill Herd algorithm," Neural Processing Letters, vol. 44, no. 1, pp. 5–17, 2016.
[33] H. Faris, I. Aljarah, and S. Mirjalili, "Training feedforward neural networks using multi-verse optimizer for binary classification problems," Applied Intelligence, vol. 45, no. 2, pp. 322–332, 2016.
[34] J. Nayak, B. Naik, and H. S. Behera, "A novel nature inspired firefly algorithm with higher order neural network: performance analysis," Engineering Science and Technology, vol. 19, no. 1, pp. 197–211, 2016.
[35] C. Blum and K. Socha, "Training feed-forward neural networks with ant colony optimization: an application to pattern classification," in Proceedings of the 5th International Conference on Hybrid Intelligent Systems (HIS '05), pp. 233–238, IEEE Computer Society, Rio de Janeiro, Brazil, 2005.
[36] K. Socha and C. Blum, "An ant colony optimization algorithm for continuous optimization: application to feed-forward neural network training," Neural Computing and Applications, vol. 16, no. 3, pp. 235–247, 2007.
[37] E. Valian, S. Mohanna, and S. Tavakoli, "Improved cuckoo search algorithm for feedforward neural network training," International Journal of Artificial Intelligence & Applications, vol. 2, no. 3, pp. 36–43, 2011.
[38] D. Karaboga, B. Akay, and C. Ozturk, "Artificial Bee Colony (ABC) optimization algorithm for training feed-forward neural networks," in Proceedings of the International Conference on Modeling Decisions for Artificial Intelligence (MDAI '07), pp. 318–329, Springer, Kitakyushu, Japan, 2007.
[39] C. Ozturk and D. Karaboga, "Hybrid Artificial Bee Colony algorithm for neural network training," in Proceedings of the IEEE Congress of Evolutionary Computation (CEC '11), pp. 84–88, IEEE, New Orleans, La, USA, June 2011.
[40] L. A. M. Pereira, L. C. S. Afonso, J. P. Papa, et al., "Multilayer perceptron neural networks training through charged system search and its application for non-technical losses detection," in Proceedings of the 2nd IEEE Latin American Conference on Innovative Smart Grid Technologies (ISGT LA '13), pp. 1–6, IEEE, Sao Paulo, Brazil, April 2013.
[41] S. Mirjalili, "How effective is the Grey Wolf optimizer in training multi-layer perceptrons," Applied Intelligence, vol. 43, no. 1, pp. 150–161, 2015.
[42] P. Moallem and N. Razmjooy, "A multi layer perceptron neural network trained by invasive weed optimization for potato color image segmentation," Trends in Applied Sciences Research, vol. 7, no. 6, pp. 445–455, 2012.
[43] S. Mirjalili, S. M. Mirjalili, and A. Lewis, "Let a biogeography-based optimizer train your multi-layer perceptron," Information Sciences: An International Journal, vol. 269, pp. 188–209, 2014.
[44] M.-Y. Cheng and D. Prayogo, "Symbiotic organisms search: a new metaheuristic optimization algorithm," Computers & Structures, vol. 139, pp. 98–112, 2014.
[45] M. Cheng, D. Prayogo, and D. Tran, "Optimizing multiple-resources leveling in multiple projects using discrete symbiotic organisms search," Journal of Computing in Civil Engineering, vol. 30, no. 3, 2016.
[46] R. Eki, F. Vincent, Y. S. Budi, and A. A. N. Perwira Redi, "Symbiotic Organism Search (SOS) for solving the capacitated vehicle routing problem," World Academy of Science, Engineering and Technology, International Journal of Mechanical, Aerospace, Industrial, Mechatronic and Manufacturing Engineering, vol. 9, no. 5, pp. 807–811, 2015.
[47] D. Prasad and V. Mukherjee, "A novel symbiotic organisms search algorithm for optimal power flow of power system with FACTS devices," Engineering Science & Technology, vol. 19, no. 1, pp. 79–89, 2016.
[48] M. Abdullahi, M. A. Ngadi, and S. M. Abdulhamid, "Symbiotic organism Search optimization based task scheduling in cloud computing environment," Future Generation Computer Systems, vol. 56, pp. 640–650, 2016.
[49] S. Verma, S. Saha, and V. Mukherjee, "A novel symbiotic organisms search algorithm for congestion management in deregulated environment," Journal of Experimental and Theoretical Artificial Intelligence, pp. 1–21, 2015.
[50] D.-H. Tran, M.-Y. Cheng, and D. Prayogo, "A novel Multiple Objective Symbiotic Organisms Search (MOSOS) for time-cost-labor utilization tradeoff problem," Knowledge-Based Systems, vol. 94, pp. 132–145, 2016.
[51] V. F. Yu, A. A. Redi, C. Yang, E. Ruskartina, and B. Santosa, "Symbiotic organisms search and two solution representations for solving the capacitated vehicle routing problem," Applied Soft Computing, 2016.
[52] A. Panda and S. Pani, "A Symbiotic Organisms Search algorithm with adaptive penalty function to solve multi-objective constrained optimization problems," Applied Soft Computing, vol. 46, pp. 344–360, 2016.
[53] S. Banerjee and S. Chattopadhyay, "Power optimization of three dimensional turbo code using a novel modified symbiotic organism search (MSOS) algorithm," Wireless Personal Communications, pp. 1–28, 2016.
[54] B. Das, V. Mukherjee, and D. Das, "DG placement in radial distribution network by symbiotic organisms search algorithm for real power loss minimization," Applied Soft Computing, vol. 49, pp. 920–936, 2016.
[55] M. K. Dosoglu, U. Guvenc, S. Duman, Y. Sonmez, and H. T. Kahraman, "Symbiotic organisms search optimization algorithm for economic/emission dispatch problem in power systems," Neural Computing and Applications, 2016.
[56] R. Hecht-Nielsen, "Kolmogorov's mapping neural network existence theorem," in Proceedings of the IEEE 1st International Conference on Neural Networks, vol. 3, pp. 11–13, IEEE Press, San Diego, Calif, USA, 1987.
[57] J.-R. Zhang, J. Zhang, T.-M. Lok, and M. R. Lyu, "A hybrid particle swarm optimization-back-propagation algorithm for feedforward neural network training," Applied Mathematics and Computation, vol. 185, no. 2, pp. 1026–1037, 2007.
[58] C. L. Blake and C. J. Merz, UCI Repository of Machine Learning Databases, http://archive.ics.uci.edu/ml/datasets.html.
[59] I.-C. Yeh, K.-J. Yang, and T.-M. Ting, "Knowledge discovery on RFM model using Bernoulli sequence," Expert Systems with Applications, vol. 36, no. 3, pp. 5866–5871, 2009.
[60] R. S. Siegler, "Three aspects of cognitive development," Cognitive Psychology, vol. 8, no. 4, pp. 481–520, 1976.
[61] R. A. Fisher, "Has Mendel's work been rediscovered?," Annals of Science, vol. 1, no. 2, pp. 115–137, 1936.
[62] F. Wilcoxon, "Individual comparisons by ranking methods," Biometrics Bulletin, vol. 1, no. 6, pp. 80–83, 1945.
[63] J. Derrac, S. García, D. Molina, and F. Herrera, "A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms," Swarm and Evolutionary Computation, vol. 1, no. 1, pp. 3–18, 2011.



Table 3 Accuracy results

Dataset AlgorithmSOS MVO GSA PSO BBO CS GA

BloodBest 827451 811765 780392 803922 811765 788235 768627Worst 776471 800000 333333 776471 725490 745098 670588Mean 798039 807451 741765 792157 775294 769608 727843Rank 1 2 6 4 2 5 7

Balance ScaleBest 920188 929577 910798 892019 915493 906103 802817Worst 868545 892019 859155 835681 882629 821596 380282Mean 900235 914319 877465 867136 901643 863146 597653Rank 2 1 4 6 3 5 7

Habermanrsquos SurvivalBest 817308 826923 817308 826923 826923 817308 788462Worst 711538 759615 740385 769231 692308 740385 653846Mean 760577 795673 791827 804808 770192 782692 745673Rank 4 1 4 1 1 6 7

Liver DisordersBest 754237 762712 644068 720339 728814 677966 550847Worst 661017 720339 67797 593220 457627 474576 271186Mean 710593 741949 483051 664831 651271 561441 430508Rank 2 1 6 4 3 5 7

SeedsBest 957746 957746 943662 929577 943662 915493 676056Worst 873239 901408 859155 788732 619718 774648 281690Mean 913380 934507 903521 874648 873239 822535 511268Rank 1 1 3 5 3 6 7

WineBest 983607 983607 1000000 1000000 1000000 918033 491803Worst 918033 967213 918033 885246 590164 786885 180328Mean 954918 979508 961475 960656 904918 833607 326230Rank 4 4 1 1 1 6 7

IrisBest 980392 980392 980392 980392 980392 980392 980392Worst 647059 980392 333333 529412 294118 529412 78431Mean 920588 980392 937255 914706 829412 850000 560784Rank 1 1 1 1 1 1 1

Statlog (Heart)Best 858696 880435 869565 869565 804348 836957 706522Worst 771739 771739 336957 771739 663043 684783 391304Mean 822283 826087 791848 828804 755435 774457 505435Rank 4 1 2 2 6 5 7

In order to draw a statistically meaningful conclusion testsare performed on the optimal fitness for training datasetsand P values are computed as shown in Table 4 Rank sumtests the null hypothesis that the two datasets are samplesfrom continuous distributions with equal medians againstthe alternative that they are not Almost all values reportedin Table 4 are less than 005 (5 significant level) which isstrong evidence against the null hypothesis Therefore suchevidence indicates that SOS results are statistically significant

and that it has not occurred by coincidence (ie due tocommon noise contained in the process)

54 Analysis of the Results Statistically speaking the SOSalgorithm provides superior local avoidance and the highclassification accuracy in training FNNs According to themathematical formulation of the SOS algorithm the firsttwo interaction phases are devoted to exploration of thesearch space This promotes exploration of the search space

12 Computational Intelligence and Neuroscience

Table 4 P values produced by Wilcoxonrsquos rank sum test for equal medians

Dataset SOS versusCS PSO GA MVO GSA BBO

Blood 680119864 minus 08 680119864 minus 08 680119864 minus 08 307119864 minus 06 680119864 minus 08 468119864 minus 05Balance Scale 680119864 minus 08 680119864 minus 08 680119864 minus 08 975119864 minus 06 680119864 minus 08 229119864 minus 01Habermanrsquos Survival 680119864 minus 08 680119864 minus 08 680119864 minus 08 179119864 minus 04 680119864 minus 08 256119864 minus 03Liver Disorders 680119864 minus 08 680119864 minus 08 680119864 minus 08 126119864 minus 01 680119864 minus 08 135119864 minus 03Seeds 680119864 minus 08 680119864 minus 08 680119864 minus 08 399119864 minus 06 680119864 minus 08 163119864 minus 03Wine 678119864 minus 08 680119864 minus 08 680119864 minus 08 148119864 minus 03 105119864 minus 06 598119864 minus 01Iris 296119864 minus 06 647119864 minus 08 647119864 minus 08 762119864 minus 07 647119864 minus 08 573119864 minus 05Statlog (Heart) 680119864 minus 08 680119864 minus 08 680119864 minus 08 680119864 minus 08 680119864 minus 08 604119864 minus 03

that leads to finding the optimal weights and biases Forthe exploitation phase the third interaction phase of SOSalgorithm is helpful for resolving local optima stagnationThe results of this work show that although metaheuristicoptimizations have high exploration the problem of trainingan FNN needs high local optima avoidance during the wholeoptimization process The results prove that the SOS is veryeffective in training FNNs

It is worth discussing the poor performance of GA inthis subsection The rate of crossover and mutation are twospecific tuning parameters inGA dependent on the empiricalvalue for particular problems This is the reason why GAfailed to provide good results for all the datasets In thecontrast SOS uses only the two parameters of maximumevaluation number and population size so it avoids the riskof compromised performance due to improper parametertuning and enhances performance stability Easy to fall intolocal optimal and low efficiency in the latter of search periodare the other two reasons for the poor performance of GAAnother finding in the results is the good performances ofBBO and MVO which are benefit from the mechanism forsignificant abrupt movements in the search space

The reason for the high classification rate provided bySOS is that this algorithm is equipped with adaptive threephases to smoothly balance exploration and exploitationThefirst two phases are devoted to exploration and the rest toexploitation And the three phases are simple to operate withonly simple mathematical operations to code In additionSOS uses greedy selection at the end of each phase to selectwhether to retain the old ormodified solution Consequentlythere are always guiding search agents to the most promisingregions of the search space

6 Conclusions

In this paper the recently proposed SOS algorithm wasemployed for the first time as a FNN trainer The high levelof exploration and exploitation of this algorithm were themotivation for this studyThe problem of training a FNNwasfirst formulated for the SOS algorithm This algorithm wasthen employed to optimize the weights and biases of FNNsso as to get high classification accuracy The obtained resultsof eight datasets with different characteristic show that theproposed approach is efficient to train FNNs compared to

other training methods that have been used in the literaturesCS PSO GA MVO GSA and BBO The results of MSEover 20 runs show that the proposed approach performsbest in terms of convergence rate and is robust since thevariances are relatively small Furthermore by comparingthe classification accuracy of the testing datasets using theoptimal weights and biases SOS has advantage over the otheralgorithms employed In addition the significance of theresults is statistically confirmed by usingWilcoxonrsquos rank sumtest which demonstrates that the results have not occurred bycoincidence It can be concluded that SOS is suitable for beingused as a training method for FNNs

For future work the SOS algorithm will be extendedto find the optimal number of layers hidden nodes andother structural parameters of FNNs More elaborate tests onhigher dimensional problems and large number of datasetswill be done Other types of neural networks such as radialbasis function (RBF) neural network are worth furtherresearch

Competing Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work is supported by National Science Foundationof China under Grants no 61463007 and 61563008 andthe Guangxi Natural Science Foundation under Grant no2016GXNSFAA380264

References

[1] J A Anderson An Introduction to Neural Networks MIT Press1995

[2] G P Zhang and M Qi ldquoNeural network forecasting for sea-sonal and trend time seriesrdquo European Journal of OperationalResearch vol 160 no 2 pp 501ndash514 2005

[3] C-J Lin C-H Chen and C-Y Lee ldquoA self-adaptive quantumradial basis function network for classification applicationsrdquo inProceedings of the IEEE International Joint Conference on NeuralNetworks vol 4 pp 3263ndash3268 IEEE Budapest Hungary July2004

Computational Intelligence and Neuroscience 13

[4] X-F Wang and D-S Huang ldquoA novel density-based clusteringframework by using level set methodrdquo IEEE Transactions onKnowledge and Data Engineering vol 21 no 11 pp 1515ndash15312009

[5] N A Mat Isa and W M F W Mamat ldquoClustered-Hybrid Mul-tilayer Perceptron network for pattern recognition applicationrdquoApplied SoftComputing Journal vol 11 no 1 pp 1457ndash1466 2011

[6] D S Huang Systematic Theory of Neural Networks for PatternRecognition Publishing House of Electronic Industry of China1996 (Chinese)

[7] Z-Q Zhao D-S Huang and B-Y Sun ldquoHuman face recogni-tion based onmulti-features using neural networks committeerdquoPattern Recognition Letters vol 25 no 12 pp 1351ndash1358 2004

[8] O Nelles ldquoNonlinear system identification from classicalapproaches to neural networks and fuzzy modelsrdquo AppliedTherapeutics vol 6 no 7 pp 21ndash717 2001

[9] B Malakooti and Y Q Zhou ldquoApproximating polynomialfunctions by feedforward artificial neural networks capacityanalysis and designrdquo Applied Mathematics and Computationvol 90 no 1 pp 27ndash51 1998

[10] A Cochocki and R UnbehauenNeural Networks for Optimiza-tion and Signal Processing John Wiley amp Sons New York NYUSA 1993

[11] X-F Wang D-S Huang and H Xu ldquoAn efficient local Chan-Vese model for image segmentationrdquo Pattern Recognition vol43 no 3 pp 603ndash618 2010

[12] D-S Huang ldquoA constructive approach for finding arbitraryroots of polynomials by neural networksrdquo IEEE Transactions onNeural Networks vol 15 no 2 pp 477ndash491 2004

[13] G Bebis and M Georgiopoulos ldquoFeed-forward neural net-worksrdquo IEEE Potentials vol 13 no 4 pp 27ndash31 1994

[14] T Kohonen ldquoThe self-organizing maprdquo Neurocomputing vol21 no 1ndash3 pp 1ndash6 1998

[15] J Park and I W Sandberg ldquoApproximation and radial-basis-function networksrdquo Neural Computation vol 3 no 2 pp 246ndash257 1993

[16] D-S Huang ldquoRadial basis probabilistic neural networksmodeland applicationrdquo International Journal of Pattern Recognitionand Artificial Intelligence vol 13 no 7 pp 1083ndash1101 1999

[17] D-S Huang and J-X Du ldquoA constructive hybrid structureoptimization methodology for radial basis probabilistic neuralnetworksrdquo IEEE Transactions onNeural Networks vol 19 no 12pp 2099ndash2115 2008

[18] G Dorffner ldquoNeural networks for time series processingrdquoNeural Network World vol 6 no 1 pp 447ndash468 1996

[19] S Ghosh-Dastidar and H Adeli ldquoSpiking neural networksrdquoInternational Journal of Neural Systems vol 19 no 4 pp 295ndash308 2009

[20] D E Rumelhart E G Hinton and R J Williams ldquoLearningrepresentations by back-propagating errorsrdquo in Neurocomput-ing Foundations of Research vol 5 no 3 pp 696ndash699 MITPress Cambridge Mass USA 1988

[21] N Zhang ldquoAn online gradient method with momentum fortwo-layer feedforward neural networksrdquo Applied Mathematicsand Computation vol 212 no 2 pp 488ndash498 2009

[22] J H HollandAdaptation in Natural and Artificial Systems MITPress Cambridge Mass USA 1992

[23] D J Montana and L Davis ldquoTraining feedforward neuralnetworks using genetic algorithmsrdquo in Proceedings of the Inter-national Joint Conferences on Artificial Intelligence (IJCAI rsquo89)vol 89 pp 762ndash767 Detroit Mich USA 1989

[24] C R Hwang ldquoSimulated annealing theory and applicationsrdquoActa Applicandae Mathematicae vol 12 no 1 pp 108ndash111 1988

[25] D Shaw andWKinsner ldquoChaotic simulated annealing inmulti-layer feedforward networksrdquo in Proceedings of the 1996Canadian Conference on Electrical and Computer Engineering(CCECErsquo96) part 1 (of 2) pp 265ndash269 University of CalgaryAlberta Canada May 1996

[26] J-R Zhang J Zhang T-M Lok and M R Lyu ldquoA hybrid par-ticle swarm optimization-back-propagation algorithm for feed-forward neural network trainingrdquo Applied Mathematics ampComputation vol 185 no 2 pp 1026ndash1037 2007

[27] E Rashedi H Nezamabadi-Pour and S Saryazdi ldquoGSA agravitational search algorithmrdquo Information Sciences vol 179no 13 pp 2232ndash2248 2009

[28] S A Mirjalili S Z M Hashim and HM Sardroudi ldquoTrainingfeedforward neural networks using hybrid particle swarm opti-mization and gravitational search algorithmrdquo Applied Mathe-matics and Computation vol 218 no 22 pp 11125ndash11137 2012

[29] Z Beheshti S M H Shamsuddin E Beheshti and S SYuhaniz ldquoEnhancement of artificial neural network learningusing centripetal accelerated particle swarm optimization formedical diseases diagnosisrdquo Soft Computing vol 18 no 11 pp2253ndash2270 2014

[30] L A M Pereira D Rodrigues P B Ribeiro J P Papa and SA T Weber ldquoSocial-spider optimization-based artificial neuralnetworks training and its applications for Parkinsonrsquos diseaseidentificationrdquo in Proceedings of the 27th IEEE InternationalSymposiumonComputer-BasedMedical Systems (CBMS rsquo14) pp14ndash17 IEEE New York NY USA May 2014

[31] E Uzlu M Kankal A Akpınar and T Dede ldquoEstimates ofenergy consumption in Turkey using neural networks with theteaching-learning-based optimization algorithmrdquo Energy vol75 pp 295ndash303 2014

[32] P A Kowalski and S Łukasik ldquoTraining neural networks withKrill Herd algorithmrdquo Neural Processing Letters vol 44 no 1pp 5ndash17 2016

[33] H Faris I Aljarah and S Mirjalili ldquoTraining feedforwardneural networks using multi-verse optimizer for binary classifi-cation problemsrdquoApplied Intelligence vol 45 no 2 pp 322ndash3322016

[34] J Nayak BNaik andH S Behera ldquoA novel nature inspired fire-fly algorithm with higher order neural network performanceanalysisrdquo Engineering Science and Technology vol 19 no 1 pp197ndash211 2016

[35] C Blum and K Socha ldquoTraining feed-forward neural networkswith ant colony optimization an application to pattern clas-sificationrdquo in Proceedings of the 5th International Conferenceon Hybrid Intelligent Systems (HIS rsquo05) pp 233ndash238 IEEEComputer Society Rio de Janeiro Brazil 2005

[36] K Socha and C Blum ldquoAn ant colony optimization algorithmfor continuous optimization application to feed-forward neuralnetwork trainingrdquo Neural Computing and Applications vol 16no 3 pp 235ndash247 2007

[37] E Valian S Mohanna and S Tavakoli ldquoImproved cuckoosearch algorithm for feedforward neural network trainingrdquoInternational Journal of Artificial IntelligenceampApplications vol2 no 3 pp 36ndash43 2011

[38] D Karaboga B Akay and C Ozturk ldquoArtificial Bee Colony(ABC) optimization algorithm for training feed-forward neuralnetworksrdquo in Proceedings of the International Conference onModeling Decisions for Artificial Intelligence (MDAI rsquo07) pp318ndash329 Springer Kitakyushu Japan 2007

14 Computational Intelligence and Neuroscience

[39] C Ozturk andD Karaboga ldquoHybrid Artificial Bee Colony algo-rithm for neural network trainingrdquo in Proceedings of the IEEECongress of Evolutionary Computation (CEC rsquo11) pp 84ndash88IEEE New Orleans La USA June 2011

[40] L A M Pereira L C S Afonso J P Papa et al ldquoMultilayerperceptron neural networks training through charged systemsearch and its application for non-technical losses detectionrdquoin Proceedings of the 2nd IEEE Latin American Conference onInnovative Smart Grid Technologies (ISGT LA rsquo13) pp 1ndash6 IEEESao Paulo Brazil April 2013

[41] S Mirjalili ldquoHow effective is the Grey Wolf optimizer in train-ing multi-layer perceptronsrdquo Applied Intelligence vol 43 no 1pp 150ndash161 2015

[42] P Moallem and N Razmjooy ldquoA multi layer perceptron neuralnetwork trained by invasive weed optimization for potato colorimage segmentationrdquo Trends in Applied Sciences Research vol7 no 6 pp 445ndash455 2012

[43] S Mirjalili S M Mirjalili and A Lewis ldquoLet a biogeography-based optimizer train yourmulti-layer perceptronrdquo InformationSciences An International Journal vol 269 pp 188ndash209 2014

[44] M-Y Cheng and D Prayogo ldquoSymbiotic organisms searcha new metaheuristic optimization algorithmrdquo Computers ampStructures vol 139 pp 98ndash112 2014

[45] M Cheng D Prayogo and D Tran ldquoOptimizing multiple-resources leveling in multiple projects using discrete symbioticorganisms searchrdquo Journal of Computing in Civil Engineeringvol 30 no 3 2016

[46] R Eki F Vincent Y S Budi and A A N Perwira Redi ldquoSymbi-otic Organism Search (SOS) for solving the capacitated vehiclerouting problemrdquo World Academy of Science Engineering andTechnology International Journal of Mechanical AerospaceIndustrial Mechatronic and Manufacturing Engineering vol 9no 5 pp 807ndash811 2015

[47] D Prasad and V Mukherjee ldquoA novel symbiotic organismssearch algorithm for optimal power flow of power system withFACTS devicesrdquo Engineering Science amp Technology vol 19 no1 pp 79ndash89 2016

[48] M Abdullahi M A Ngadi and S M Abdulhamid ldquoSymbioticorganism Search optimization based task scheduling in cloudcomputing environmentrdquo Future Generation Computer Systemsvol 56 pp 640ndash650 2016

[49] S Verma S Saha and V Mukherjee ldquoA novel symbioticorganisms search algorithm for congestion management inderegulated environmentrdquo Journal of Experimental and Theo-retical Artificial Intelligence pp 1ndash21 2015

[50] D-H Tran M-Y Cheng and D Prayogo ldquoA novel MultipleObjective Symbiotic Organisms Search (MOSOS) for time-cost-labor utilization tradeoff problemrdquo Knowledge-Based Sys-tems vol 94 pp 132ndash145 2016

[51] V F Yu A A Redi C Yang E Ruskartina and B SantosaldquoSymbiotic organisms search and two solution representationsfor solving the capacitated vehicle routing problemrdquo AppliedSoft Computing 2016

[52] A Panda and S Pani ldquoA SymbioticOrganisms Search algorithmwith adaptive penalty function to solve multi-objective con-strained optimization problemsrdquo Applied Soft Computing vol46 pp 344ndash360 2016

[53] S Banerjee and S Chattopadhyay ldquoPower optimization ofthree dimensional turbo code using a novel modified symbioticorganism search (MSOS) algorithmrdquoWireless Personal Commu-nications pp 1ndash28 2016

[54] B Das V Mukherjee and D Das ldquoDG placement in radial dis-tribution network by symbiotic organisms search algorithm forreal power loss minimizationrdquo Applied Soft Computing vol 49pp 920ndash936 2016

[55] M K Dosoglu U Guvenc S Duman Y Sonmez andH T Kahraman ldquoSymbiotic organisms search optimizationalgorithm for economicemission dispatch problem in powersystemsrdquo Neural Computing and Applications 2016

[56] R Hecht-Nielsen ldquoKolmogorovrsquos mapping neural networkexistence theoremrdquo in Proceedings of the IEEE 1st InternationalConference on Neural Networks vol 3 pp 11ndash13 IEEE Press SanDiego Calif USA 1987

[57] J-R Zhang J Zhang T-M Lok and M R Lyu ldquoA hybridparticle swarm optimization-back-propagation algorithm forfeedforward neural network trainingrdquoAppliedMathematics andComputation vol 185 no 2 pp 1026ndash1037 2007

[58] C L Blake andC JMerz UCI Repository ofMachine LearningDatabases httparchiveicsuciedumldatasetshtml

[59] I-C Yeh K-J Yang and T-M Ting ldquoKnowledge discoveryon RFM model using Bernoulli sequencerdquo Expert Systems withApplications vol 36 no 3 pp 5866ndash5871 2009

[60] R S Siegler ldquoThree aspects of cognitive developmentrdquo Cogni-tive Psychology vol 8 no 4 pp 481ndash520 1976

[61] R A Fisher ldquoHas Mendelrsquos work been rediscoveredrdquo Annals ofScience vol 1 no 2 pp 115ndash137 1936

[62] F Wilcoxon ldquoIndividual comparisons by ranking methodsrdquoBiometrics Bulletin vol 1 no 6 pp 80ndash83 1945

[63] J Derrac S Garcıa D Molina and F Herrera ldquoA practicaltutorial on the use of nonparametric statistical tests as amethodology for comparing evolutionary and swarm intelli-gence algorithmsrdquo Swarm and Evolutionary Computation vol1 no 1 pp 3ndash18 2011

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 7: Research Article Training Feedforward Neural Networks Using …downloads.hindawi.com/journals/cin/2016/9063065.pdf · 2019-07-30 · Research Article Training Feedforward Neural Networks

Computational Intelligence and Neuroscience 7

Table 2: MSE results.

Dataset | Statistic | SOS | MVO | GSA | PSO | BBO | CS | GA
Blood | Best | 2.95E-01 | 3.04E-01 | 3.10E-01 | 3.07E-01 | 3.00E-01 | 3.11E-01 | 3.30E-01
Blood | Worst | 3.05E-01 | 3.07E-01 | 3.33E-01 | 3.14E-01 | 3.92E-01 | 3.21E-01 | 4.17E-01
Blood | Mean | 3.01E-01 | 3.05E-01 | 3.23E-01 | 3.10E-01 | 3.18E-01 | 3.17E-01 | 3.78E-01
Blood | Std | 2.44E-03 | 7.09E-04 | 6.60E-03 | 2.47E-03 | 2.84E-02 | 2.97E-03 | 2.85E-02
Balance Scale | Best | 7.00E-02 | 8.03E-02 | 1.37E-01 | 1.40E-01 | 8.52E-02 | 1.70E-01 | 2.97E-01
Balance Scale | Worst | 1.29E-01 | 1.04E-01 | 1.66E-01 | 1.88E-01 | 1.24E-01 | 2.14E-01 | 8.01E-01
Balance Scale | Mean | 1.05E-01 | 8.66E-02 | 1.52E-01 | 1.72E-01 | 1.02E-01 | 1.89E-01 | 4.65E-01
Balance Scale | Std | 1.33E-02 | 6.06E-03 | 9.46E-03 | 1.32E-02 | 9.45E-03 | 1.18E-02 | 1.16E-01
Haberman's Survival | Best | 2.95E-01 | 3.21E-01 | 3.64E-01 | 3.43E-01 | 3.13E-01 | 3.61E-01 | 3.71E-01
Haberman's Survival | Worst | 3.37E-01 | 3.48E-01 | 4.07E-01 | 3.75E-01 | 3.51E-01 | 3.79E-01 | 4.77E-01
Haberman's Survival | Mean | 3.18E-01 | 3.31E-01 | 3.82E-01 | 3.65E-01 | 3.31E-01 | 3.70E-01 | 4.27E-01
Haberman's Survival | Std | 1.04E-02 | 6.63E-03 | 1.14E-02 | 7.84E-03 | 1.07E-02 | 5.83E-03 | 3.45E-02
Liver Disorders | Best | 3.26E-01 | 3.33E-01 | 4.17E-01 | 3.96E-01 | 3.16E-01 | 4.14E-01 | 5.27E-01
Liver Disorders | Worst | 3.86E-01 | 3.55E-01 | 4.71E-01 | 4.31E-01 | 5.87E-01 | 4.57E-01 | 6.73E-01
Liver Disorders | Mean | 3.49E-01 | 3.41E-01 | 4.44E-01 | 4.11E-01 | 3.82E-01 | 4.37E-01 | 5.95E-01
Liver Disorders | Std | 1.65E-02 | 6.17E-03 | 1.32E-02 | 1.06E-02 | 5.47E-02 | 1.12E-02 | 4.30E-02
Seeds | Best | 9.78E-04 | 1.44E-02 | 5.97E-02 | 4.49E-02 | 2.07E-03 | 7.19E-02 | 2.05E-01
Seeds | Worst | 2.26E-02 | 3.30E-02 | 8.87E-02 | 1.02E-01 | 3.32E-01 | 1.22E-01 | 7.14E-01
Seeds | Mean | 1.11E-02 | 2.21E-02 | 7.65E-02 | 8.17E-02 | 5.23E-02 | 9.80E-02 | 4.71E-01
Seeds | Std | 5.36E-03 | 6.31E-03 | 8.82E-03 | 1.53E-02 | 9.63E-02 | 1.34E-02 | 1.20E-01
Wine | Best | 6.35E-11 | 1.34E-06 | 5.87E-04 | 9.74E-03 | 5.62E-12 | 2.56E-02 | 5.60E-01
Wine | Worst | 8.55E-03 | 2.91E-05 | 1.90E-02 | 8.03E-02 | 3.42E-01 | 1.62E-01 | 8.95E-01
Wine | Mean | 4.40E-04 | 5.41E-06 | 3.21E-03 | 3.39E-02 | 5.17E-02 | 9.59E-02 | 7.03E-01
Wine | Std | 1.91E-03 | 6.23E-06 | 3.96E-03 | 1.55E-02 | 1.25E-01 | 3.48E-02 | 8.59E-02
Iris | Best | 5.10E-08 | 2.45E-02 | 4.41E-02 | 4.71E-02 | 1.70E-02 | 1.01E-02 | 1.19E-01
Iris | Worst | 2.67E-02 | 2.75E-02 | 2.31E-01 | 1.76E-01 | 4.65E-01 | 7.84E-02 | 6.51E-01
Iris | Mean | 1.42E-02 | 2.58E-02 | 6.83E-02 | 1.09E-01 | 6.91E-02 | 5.32E-02 | 3.66E-01
Iris | Std | 8.80E-03 | 8.40E-04 | 4.07E-02 | 3.55E-02 | 1.11E-01 | 1.93E-02 | 1.70E-01
Statlog (Heart) | Best | 8.98E-02 | 5.03E-02 | 1.35E-01 | 1.72E-01 | 8.03E-02 | 2.31E-01 | 4.52E-01
Statlog (Heart) | Worst | 1.26E-01 | 8.13E-02 | 1.92E-01 | 2.20E-01 | 1.65E-01 | 3.09E-01 | 6.97E-01
Statlog (Heart) | Mean | 1.09E-01 | 6.48E-02 | 1.62E-01 | 1.93E-01 | 1.27E-01 | 2.61E-01 | 5.53E-01
Statlog (Heart) | Std | 1.07E-02 | 9.11E-03 | 1.47E-02 | 1.32E-02 | 2.37E-02 | 1.86E-02 | 7.20E-02

5.3. Results and Discussion. To evaluate the performance of the proposed method, SOS is compared with six other algorithms, BBO, CS, GA, GSA, PSO, and MVO, in experiments conducted on the given datasets. In this work, each dataset has been partitioned into two sets: a training set and a testing set. The training set is used to train the network in order to obtain the optimal weights and biases. The testing set is applied to unseen data in order to assess the generalization performance of the metaheuristic algorithms on FNNs.
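To make the objective concrete, the sketch below evaluates the training MSE of a single-hidden-layer FNN for one candidate vector of weights and biases, which is the quantity each metaheuristic minimizes here. The vector layout, the sigmoid activations, and the helper names are illustrative assumptions, not the exact implementation used in these experiments.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mse_fitness(candidate, X, Y, n_hidden):
    """Decode a flat candidate vector into FNN weights/biases and return the training MSE.

    X: (n_samples, n_inputs) training inputs; Y: (n_samples, n_outputs) targets.
    The assumed layout of `candidate` is: input-hidden weights, hidden biases,
    hidden-output weights, output biases.
    """
    n_in, n_out = X.shape[1], Y.shape[1]
    idx = 0
    W1 = candidate[idx:idx + n_in * n_hidden].reshape(n_in, n_hidden); idx += n_in * n_hidden
    b1 = candidate[idx:idx + n_hidden]; idx += n_hidden
    W2 = candidate[idx:idx + n_hidden * n_out].reshape(n_hidden, n_out); idx += n_hidden * n_out
    b2 = candidate[idx:idx + n_out]
    H = sigmoid(X @ W1 + b1)          # hidden-layer activations
    Y_hat = sigmoid(H @ W2 + b2)      # network outputs
    return np.mean((Y - Y_hat) ** 2)  # mean squared error over the training samples
```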

Table 2 shows the best, worst, and mean values of the mean squared error (MSE), together with the standard deviation, for all training datasets. Inspecting the results, it can be seen that SOS performs best on the Seeds and Iris datasets. For the Liver Disorders, Haberman's Survival, and Blood datasets, the best, worst, and mean values are all of the same order of magnitude, while the MSE values of SOS are smaller than those of the other algorithms, which makes SOS the best choice as the training method on these three datasets.

Figure 4: The convergence curves of algorithms (Blood).

Figure 5: The ANOVA test of algorithms for training Blood.

Moreover, SOS is ranked second for the Wine dataset and shows very competitive results compared to BBO. For the Balance Scale and Statlog (Heart) datasets, the best values indicate that SOS provides performances very close to those of BBO and MVO; these three algorithms also show improvements compared to the others.

Convergence curves for all metaheuristic algorithms are shown in Figures 4, 6, 8, 10, 12, 14, 16, and 18. The curves show the average of 20 independent runs over the course of 500 iterations. The figures show that SOS has the fastest convergence speed when training all the given datasets. Figures 5, 7, 9, 11, 13, 15, 17, and 19 show the boxplots over 20 runs of SOS, BBO, GA, MVO, PSO, GSA, and CS. The boxplots, which are used to analyze the variability of the obtained MSE values, indicate that SOS attains lower MSE values with less spread than GA, CS, PSO, and GSA, and achieves results similar to those of MVO and BBO.
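As a rough illustration of how the reported curves and statistics can be produced from the recorded runs, the following sketch averages per-iteration MSE histories and summarizes the final values; the array shape and the placeholder data are assumptions for illustration only.

```python
import numpy as np

# mse_history holds the best-so-far training MSE at every iteration of every
# independent run: shape (n_runs, n_iterations) = (20, 500) in this study.
# Random, monotonically decreasing placeholder data stands in for real histories.
mse_history = np.sort(np.random.rand(20, 500), axis=1)[:, ::-1]

avg_curve = mse_history.mean(axis=0)   # per-iteration average, as plotted in Figures 4-18
final = mse_history[:, -1]             # MSE reached by each run after 500 iterations
print(f"Best {final.min():.2e}, Worst {final.max():.2e}, "
      f"Mean {final.mean():.2e}, Std {final.std():.2e}")  # Table 2-style statistics
```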

Through 20 independent runs on the training datasets, the optimal weights and biases are obtained and then used to test the classification accuracy on the testing datasets. As depicted in Table 3, the algorithms are ranked according to the best accuracy values for each dataset.

Figure 6: The convergence curves of algorithms (Iris).

Figure 7: The ANOVA test of algorithms for training Iris.

Figure 8: The convergence curves of algorithms (Liver Disorders).

Figure 9: The ANOVA test of algorithms for training Liver Disorders.

Figure 10: The convergence curves of algorithms (Balance Scale).

Figure 11: The ANOVA test of algorithms for training Balance Scale.

Figure 12: The convergence curves of algorithms (Seeds).

Figure 13: The ANOVA test of algorithms for training Seeds.

Figure 14: The convergence curves of algorithms (Statlog (Heart)).

Figure 15: The ANOVA test of algorithms for training Statlog (Heart).

Figure 16: The convergence curves of algorithms (Haberman's Survival).

Figure 17: The ANOVA test of algorithms for training Haberman's Survival.

Figure 18: The convergence curves of algorithms (Wine).

Figure 19: The ANOVA test of algorithms for training Wine.

SOS provides the best performance on the testing datasets Blood, Seeds, and Iris. For the Wine dataset, the classification accuracy of SOS is 98.3607%, which indicates that only one example in the testing dataset is misclassified. It is noticeable that, although MVO has the highest classification accuracy on the Balance Scale, Haberman's Survival, Liver Disorders, and Statlog (Heart) datasets, SOS also performs well in classification. However, the accuracy of GA is the lowest among the tested algorithms.
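A minimal sketch of how such testing accuracy can be computed from a trained network is given below, assuming class targets encoded one output neuron per class, decoded by argmax, and an externally supplied forward-pass function (an assumption, not the paper's exact code).

```python
import numpy as np

def classification_accuracy(predict, X_test, y_test):
    """predict: function mapping inputs to network outputs (one score per class).

    y_test: integer class labels. The predicted class is taken as the output
    neuron with the largest activation (argmax decoding).
    """
    outputs = predict(X_test)                    # (n_samples, n_classes) activations
    predicted = np.argmax(outputs, axis=1)       # winning output neuron per sample
    return 100.0 * np.mean(predicted == y_test)  # accuracy in percent, as in Table 3
```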

This comprehensive comparative study shows that the SOS algorithm is superior among the trainers compared in this paper. Training an FNN is challenging because of the large number of local solutions in this problem. On account of being simpler and more robust than the competing algorithms, SOS performs well on most of the datasets, which shows how flexible this algorithm is for solving problems with diverse search spaces. Further, in order to determine whether the results achieved by the algorithms are statistically different from each other, a nonparametric statistical significance test, Wilcoxon's rank sum test for equal medians [62, 63], was conducted between the results obtained by the algorithms: SOS versus CS, SOS versus PSO, SOS versus GA, SOS versus MVO, SOS versus GSA, and SOS versus BBO.

Table 3: Accuracy results (%).

Dataset | Statistic | SOS | MVO | GSA | PSO | BBO | CS | GA
Blood | Best | 82.7451 | 81.1765 | 78.0392 | 80.3922 | 81.1765 | 78.8235 | 76.8627
Blood | Worst | 77.6471 | 80.0000 | 33.3333 | 77.6471 | 72.5490 | 74.5098 | 67.0588
Blood | Mean | 79.8039 | 80.7451 | 74.1765 | 79.2157 | 77.5294 | 76.9608 | 72.7843
Blood | Rank | 1 | 2 | 6 | 4 | 2 | 5 | 7
Balance Scale | Best | 92.0188 | 92.9577 | 91.0798 | 89.2019 | 91.5493 | 90.6103 | 80.2817
Balance Scale | Worst | 86.8545 | 89.2019 | 85.9155 | 83.5681 | 88.2629 | 82.1596 | 38.0282
Balance Scale | Mean | 90.0235 | 91.4319 | 87.7465 | 86.7136 | 90.1643 | 86.3146 | 59.7653
Balance Scale | Rank | 2 | 1 | 4 | 6 | 3 | 5 | 7
Haberman's Survival | Best | 81.7308 | 82.6923 | 81.7308 | 82.6923 | 82.6923 | 81.7308 | 78.8462
Haberman's Survival | Worst | 71.1538 | 75.9615 | 74.0385 | 76.9231 | 69.2308 | 74.0385 | 65.3846
Haberman's Survival | Mean | 76.0577 | 79.5673 | 79.1827 | 80.4808 | 77.0192 | 78.2692 | 74.5673
Haberman's Survival | Rank | 4 | 1 | 4 | 1 | 1 | 6 | 7
Liver Disorders | Best | 75.4237 | 76.2712 | 64.4068 | 72.0339 | 72.8814 | 67.7966 | 55.0847
Liver Disorders | Worst | 66.1017 | 72.0339 | 6.7797 | 59.3220 | 45.7627 | 47.4576 | 27.1186
Liver Disorders | Mean | 71.0593 | 74.1949 | 48.3051 | 66.4831 | 65.1271 | 56.1441 | 43.0508
Liver Disorders | Rank | 2 | 1 | 6 | 4 | 3 | 5 | 7
Seeds | Best | 95.7746 | 95.7746 | 94.3662 | 92.9577 | 94.3662 | 91.5493 | 67.6056
Seeds | Worst | 87.3239 | 90.1408 | 85.9155 | 78.8732 | 61.9718 | 77.4648 | 28.1690
Seeds | Mean | 91.3380 | 93.4507 | 90.3521 | 87.4648 | 87.3239 | 82.2535 | 51.1268
Seeds | Rank | 1 | 1 | 3 | 5 | 3 | 6 | 7
Wine | Best | 98.3607 | 98.3607 | 100.0000 | 100.0000 | 100.0000 | 91.8033 | 49.1803
Wine | Worst | 91.8033 | 96.7213 | 91.8033 | 88.5246 | 59.0164 | 78.6885 | 18.0328
Wine | Mean | 95.4918 | 97.9508 | 96.1475 | 96.0656 | 90.4918 | 83.3607 | 32.6230
Wine | Rank | 4 | 4 | 1 | 1 | 1 | 6 | 7
Iris | Best | 98.0392 | 98.0392 | 98.0392 | 98.0392 | 98.0392 | 98.0392 | 98.0392
Iris | Worst | 64.7059 | 98.0392 | 33.3333 | 52.9412 | 29.4118 | 52.9412 | 7.8431
Iris | Mean | 92.0588 | 98.0392 | 93.7255 | 91.4706 | 82.9412 | 85.0000 | 56.0784
Iris | Rank | 1 | 1 | 1 | 1 | 1 | 1 | 1
Statlog (Heart) | Best | 85.8696 | 88.0435 | 86.9565 | 86.9565 | 80.4348 | 83.6957 | 70.6522
Statlog (Heart) | Worst | 77.1739 | 77.1739 | 33.6957 | 77.1739 | 66.3043 | 68.4783 | 39.1304
Statlog (Heart) | Mean | 82.2283 | 82.6087 | 79.1848 | 82.8804 | 75.5435 | 77.4457 | 50.5435
Statlog (Heart) | Rank | 4 | 1 | 2 | 2 | 6 | 5 | 7

In order to draw a statistically meaningful conclusion, the tests are performed on the optimal fitness values obtained on the training datasets, and the resulting P values are reported in Table 4. The rank sum test evaluates the null hypothesis that two samples are drawn from continuous distributions with equal medians against the alternative that they are not. Almost all values reported in Table 4 are less than 0.05 (5% significance level), which is strong evidence against the null hypothesis. Such evidence therefore indicates that the SOS results are statistically significant and have not occurred by coincidence (i.e., due to common noise contained in the process).
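The same test is available in standard statistics libraries; the sketch below applies SciPy's two-sided rank sum test to two illustrative vectors of per-run best MSE values (the numbers shown are placeholders, not results from this study).

```python
from scipy.stats import ranksums

# Assumed inputs: the best training MSE obtained in each independent run
# for SOS and for one competitor (here GA); values below are illustrative only.
sos_mse = [0.295, 0.298, 0.301, 0.299, 0.305]
ga_mse = [0.330, 0.352, 0.378, 0.401, 0.417]

stat, p_value = ranksums(sos_mse, ga_mse)  # Wilcoxon rank sum test for equal medians
print(f"statistic = {stat:.3f}, p = {p_value:.3e}")
# p < 0.05 would reject the hypothesis of equal medians, as in Table 4.
```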

5.4. Analysis of the Results. Statistically speaking, the SOS algorithm provides superior local optima avoidance and high classification accuracy in training FNNs. According to the mathematical formulation of the SOS algorithm, the first two interaction phases are devoted to exploration of the search space.

Table 4: P values produced by Wilcoxon's rank sum test for equal medians.

Dataset | SOS vs. CS | SOS vs. PSO | SOS vs. GA | SOS vs. MVO | SOS vs. GSA | SOS vs. BBO
Blood | 6.80E-08 | 6.80E-08 | 6.80E-08 | 3.07E-06 | 6.80E-08 | 4.68E-05
Balance Scale | 6.80E-08 | 6.80E-08 | 6.80E-08 | 9.75E-06 | 6.80E-08 | 2.29E-01
Haberman's Survival | 6.80E-08 | 6.80E-08 | 6.80E-08 | 1.79E-04 | 6.80E-08 | 2.56E-03
Liver Disorders | 6.80E-08 | 6.80E-08 | 6.80E-08 | 1.26E-01 | 6.80E-08 | 1.35E-03
Seeds | 6.80E-08 | 6.80E-08 | 6.80E-08 | 3.99E-06 | 6.80E-08 | 1.63E-03
Wine | 6.78E-08 | 6.80E-08 | 6.80E-08 | 1.48E-03 | 1.05E-06 | 5.98E-01
Iris | 2.96E-06 | 6.47E-08 | 6.47E-08 | 7.62E-07 | 6.47E-08 | 5.73E-05
Statlog (Heart) | 6.80E-08 | 6.80E-08 | 6.80E-08 | 6.80E-08 | 6.80E-08 | 6.04E-03

This promotion of exploration leads to finding the optimal weights and biases. As for exploitation, the third interaction phase of the SOS algorithm is helpful for resolving local optima stagnation. The results of this work show that, although metaheuristic optimizations have high exploration, the problem of training an FNN needs high local optima avoidance throughout the whole optimization process. The results prove that SOS is very effective in training FNNs.

It is worth discussing the poor performance of GA in this subsection. The crossover and mutation rates are two tuning parameters specific to GA that depend on empirical values for particular problems; this is the reason why GA failed to provide good results on all the datasets. In contrast, SOS uses only two parameters, the maximum evaluation number and the population size, so it avoids the risk of compromised performance due to improper parameter tuning and enhances performance stability. A tendency to fall into local optima and low efficiency in the later stages of the search are two further reasons for the poor performance of GA. Another finding in the results is the good performance of BBO and MVO, which benefit from mechanisms allowing significant abrupt movements in the search space.

The reason for the high classification rate provided by SOS is that this algorithm is equipped with three adaptive phases that smoothly balance exploration and exploitation: the first two phases are devoted to exploration and the third to exploitation. The three phases are simple to operate, requiring only simple mathematical operations to code. In addition, SOS uses greedy selection at the end of each phase to decide whether to retain the old or the modified solution. Consequently, search agents are always guided toward the most promising regions of the search space. A brief sketch of these phases is given below.
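The sketch below paraphrases the standard SOS update rules of Cheng and Prayogo [44] for a minimization problem; the population handling, scalar bounds, and greedy-selection details are simplified assumptions rather than the exact code used in this work.

```python
import numpy as np

rng = np.random.default_rng()

def sos_iteration(pop, fitness, f, lb, ub):
    """One pass of the three SOS phases over the population (minimization).

    pop: (n, d) array of organisms; fitness: (n,) objective values;
    f: objective function; lb, ub: scalar lower/upper search bounds.
    Greedy selection is applied after every phase.
    """
    n, d = pop.shape
    for i in range(n):
        best = pop[np.argmin(fitness)]

        # Mutualism phase: organisms i and j both move toward the best organism.
        j = rng.choice([k for k in range(n) if k != i])
        mutual = (pop[i] + pop[j]) / 2.0
        bf1, bf2 = rng.integers(1, 3), rng.integers(1, 3)   # benefit factors in {1, 2}
        new_i = np.clip(pop[i] + rng.random(d) * (best - mutual * bf1), lb, ub)
        new_j = np.clip(pop[j] + rng.random(d) * (best - mutual * bf2), lb, ub)
        for idx, cand in ((i, new_i), (j, new_j)):
            fc = f(cand)
            if fc < fitness[idx]:                           # greedy selection
                pop[idx], fitness[idx] = cand, fc

        # Commensalism phase: organism i benefits from j, which is unaffected.
        j = rng.choice([k for k in range(n) if k != i])
        cand = np.clip(pop[i] + rng.uniform(-1, 1, d) * (best - pop[j]), lb, ub)
        if (fc := f(cand)) < fitness[i]:
            pop[i], fitness[i] = cand, fc

        # Parasitism phase: a mutated copy of i competes with a random host j.
        j = rng.choice([k for k in range(n) if k != i])
        parasite = pop[i].copy()
        dims = rng.random(d) < rng.random()                 # random subset of dimensions
        parasite[dims] = lb + rng.random(np.count_nonzero(dims)) * (ub - lb)
        if (fc := f(parasite)) < fitness[j]:
            pop[j], fitness[j] = parasite, fc
    return pop, fitness
```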

6. Conclusions

In this paper, the recently proposed SOS algorithm was employed for the first time as an FNN trainer. The high level of exploration and exploitation of this algorithm was the motivation for this study. The problem of training an FNN was first formulated for the SOS algorithm, which was then employed to optimize the weights and biases of FNNs so as to achieve high classification accuracy. The results obtained on eight datasets with different characteristics show that the proposed approach is efficient for training FNNs compared with the other training methods used in the literature: CS, PSO, GA, MVO, GSA, and BBO. The MSE results over 20 runs show that the proposed approach performs best in terms of convergence rate and is robust, since the variances are relatively small. Furthermore, comparison of the classification accuracy on the testing datasets using the optimal weights and biases shows that SOS has an advantage over the other algorithms employed. In addition, the significance of the results is statistically confirmed by Wilcoxon's rank sum test, which demonstrates that the results have not occurred by coincidence. It can be concluded that SOS is suitable for use as a training method for FNNs.

For future work, the SOS algorithm will be extended to find the optimal number of layers, hidden nodes, and other structural parameters of FNNs. More elaborate tests on higher-dimensional problems and a larger number of datasets will be carried out. Other types of neural networks, such as the radial basis function (RBF) neural network, are also worth further research.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Science Foundation of China under Grants no. 61463007 and 61563008, and by the Guangxi Natural Science Foundation under Grant no. 2016GXNSFAA380264.

References

[1] J. A. Anderson, An Introduction to Neural Networks, MIT Press, 1995.

[2] G. P. Zhang and M. Qi, "Neural network forecasting for seasonal and trend time series," European Journal of Operational Research, vol. 160, no. 2, pp. 501–514, 2005.

[3] C.-J. Lin, C.-H. Chen, and C.-Y. Lee, "A self-adaptive quantum radial basis function network for classification applications," in Proceedings of the IEEE International Joint Conference on Neural Networks, vol. 4, pp. 3263–3268, IEEE, Budapest, Hungary, July 2004.

[4] X.-F. Wang and D.-S. Huang, "A novel density-based clustering framework by using level set method," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 11, pp. 1515–1531, 2009.

[5] N. A. Mat Isa and W. M. F. W. Mamat, "Clustered-Hybrid Multilayer Perceptron network for pattern recognition application," Applied Soft Computing Journal, vol. 11, no. 1, pp. 1457–1466, 2011.

[6] D. S. Huang, Systematic Theory of Neural Networks for Pattern Recognition, Publishing House of Electronic Industry of China, 1996 (Chinese).

[7] Z.-Q. Zhao, D.-S. Huang, and B.-Y. Sun, "Human face recognition based on multi-features using neural networks committee," Pattern Recognition Letters, vol. 25, no. 12, pp. 1351–1358, 2004.

[8] O. Nelles, "Nonlinear system identification: from classical approaches to neural networks and fuzzy models," Applied Therapeutics, vol. 6, no. 7, pp. 21–717, 2001.

[9] B. Malakooti and Y. Q. Zhou, "Approximating polynomial functions by feedforward artificial neural networks: capacity analysis and design," Applied Mathematics and Computation, vol. 90, no. 1, pp. 27–51, 1998.

[10] A. Cochocki and R. Unbehauen, Neural Networks for Optimization and Signal Processing, John Wiley & Sons, New York, NY, USA, 1993.

[11] X.-F. Wang, D.-S. Huang, and H. Xu, "An efficient local Chan-Vese model for image segmentation," Pattern Recognition, vol. 43, no. 3, pp. 603–618, 2010.

[12] D.-S. Huang, "A constructive approach for finding arbitrary roots of polynomials by neural networks," IEEE Transactions on Neural Networks, vol. 15, no. 2, pp. 477–491, 2004.

[13] G. Bebis and M. Georgiopoulos, "Feed-forward neural networks," IEEE Potentials, vol. 13, no. 4, pp. 27–31, 1994.

[14] T. Kohonen, "The self-organizing map," Neurocomputing, vol. 21, no. 1–3, pp. 1–6, 1998.

[15] J. Park and I. W. Sandberg, "Approximation and radial-basis-function networks," Neural Computation, vol. 3, no. 2, pp. 246–257, 1993.

[16] D.-S. Huang, "Radial basis probabilistic neural networks: model and application," International Journal of Pattern Recognition and Artificial Intelligence, vol. 13, no. 7, pp. 1083–1101, 1999.

[17] D.-S. Huang and J.-X. Du, "A constructive hybrid structure optimization methodology for radial basis probabilistic neural networks," IEEE Transactions on Neural Networks, vol. 19, no. 12, pp. 2099–2115, 2008.

[18] G. Dorffner, "Neural networks for time series processing," Neural Network World, vol. 6, no. 1, pp. 447–468, 1996.

[19] S. Ghosh-Dastidar and H. Adeli, "Spiking neural networks," International Journal of Neural Systems, vol. 19, no. 4, pp. 295–308, 2009.

[20] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," in Neurocomputing: Foundations of Research, vol. 5, no. 3, pp. 696–699, MIT Press, Cambridge, Mass, USA, 1988.

[21] N. Zhang, "An online gradient method with momentum for two-layer feedforward neural networks," Applied Mathematics and Computation, vol. 212, no. 2, pp. 488–498, 2009.

[22] J. H. Holland, Adaptation in Natural and Artificial Systems, MIT Press, Cambridge, Mass, USA, 1992.

[23] D. J. Montana and L. Davis, "Training feedforward neural networks using genetic algorithms," in Proceedings of the International Joint Conferences on Artificial Intelligence (IJCAI '89), vol. 89, pp. 762–767, Detroit, Mich, USA, 1989.

[24] C. R. Hwang, "Simulated annealing: theory and applications," Acta Applicandae Mathematicae, vol. 12, no. 1, pp. 108–111, 1988.

[25] D. Shaw and W. Kinsner, "Chaotic simulated annealing in multilayer feedforward networks," in Proceedings of the 1996 Canadian Conference on Electrical and Computer Engineering (CCECE '96), part 1 (of 2), pp. 265–269, University of Calgary, Alberta, Canada, May 1996.

[26] J.-R. Zhang, J. Zhang, T.-M. Lok, and M. R. Lyu, "A hybrid particle swarm optimization-back-propagation algorithm for feedforward neural network training," Applied Mathematics & Computation, vol. 185, no. 2, pp. 1026–1037, 2007.

[27] E. Rashedi, H. Nezamabadi-Pour, and S. Saryazdi, "GSA: a gravitational search algorithm," Information Sciences, vol. 179, no. 13, pp. 2232–2248, 2009.

[28] S. A. Mirjalili, S. Z. M. Hashim, and H. M. Sardroudi, "Training feedforward neural networks using hybrid particle swarm optimization and gravitational search algorithm," Applied Mathematics and Computation, vol. 218, no. 22, pp. 11125–11137, 2012.

[29] Z. Beheshti, S. M. H. Shamsuddin, E. Beheshti, and S. S. Yuhaniz, "Enhancement of artificial neural network learning using centripetal accelerated particle swarm optimization for medical diseases diagnosis," Soft Computing, vol. 18, no. 11, pp. 2253–2270, 2014.

[30] L. A. M. Pereira, D. Rodrigues, P. B. Ribeiro, J. P. Papa, and S. A. T. Weber, "Social-spider optimization-based artificial neural networks training and its applications for Parkinson's disease identification," in Proceedings of the 27th IEEE International Symposium on Computer-Based Medical Systems (CBMS '14), pp. 14–17, IEEE, New York, NY, USA, May 2014.

[31] E. Uzlu, M. Kankal, A. Akpınar, and T. Dede, "Estimates of energy consumption in Turkey using neural networks with the teaching-learning-based optimization algorithm," Energy, vol. 75, pp. 295–303, 2014.

[32] P. A. Kowalski and S. Łukasik, "Training neural networks with Krill Herd algorithm," Neural Processing Letters, vol. 44, no. 1, pp. 5–17, 2016.

[33] H. Faris, I. Aljarah, and S. Mirjalili, "Training feedforward neural networks using multi-verse optimizer for binary classification problems," Applied Intelligence, vol. 45, no. 2, pp. 322–332, 2016.

[34] J. Nayak, B. Naik, and H. S. Behera, "A novel nature inspired firefly algorithm with higher order neural network: performance analysis," Engineering Science and Technology, vol. 19, no. 1, pp. 197–211, 2016.

[35] C. Blum and K. Socha, "Training feed-forward neural networks with ant colony optimization: an application to pattern classification," in Proceedings of the 5th International Conference on Hybrid Intelligent Systems (HIS '05), pp. 233–238, IEEE Computer Society, Rio de Janeiro, Brazil, 2005.

[36] K. Socha and C. Blum, "An ant colony optimization algorithm for continuous optimization: application to feed-forward neural network training," Neural Computing and Applications, vol. 16, no. 3, pp. 235–247, 2007.

[37] E. Valian, S. Mohanna, and S. Tavakoli, "Improved cuckoo search algorithm for feedforward neural network training," International Journal of Artificial Intelligence & Applications, vol. 2, no. 3, pp. 36–43, 2011.

[38] D. Karaboga, B. Akay, and C. Ozturk, "Artificial Bee Colony (ABC) optimization algorithm for training feed-forward neural networks," in Proceedings of the International Conference on Modeling Decisions for Artificial Intelligence (MDAI '07), pp. 318–329, Springer, Kitakyushu, Japan, 2007.

[39] C. Ozturk and D. Karaboga, "Hybrid Artificial Bee Colony algorithm for neural network training," in Proceedings of the IEEE Congress of Evolutionary Computation (CEC '11), pp. 84–88, IEEE, New Orleans, La, USA, June 2011.

[40] L. A. M. Pereira, L. C. S. Afonso, J. P. Papa et al., "Multilayer perceptron neural networks training through charged system search and its application for non-technical losses detection," in Proceedings of the 2nd IEEE Latin American Conference on Innovative Smart Grid Technologies (ISGT LA '13), pp. 1–6, IEEE, Sao Paulo, Brazil, April 2013.

[41] S. Mirjalili, "How effective is the Grey Wolf optimizer in training multi-layer perceptrons," Applied Intelligence, vol. 43, no. 1, pp. 150–161, 2015.

[42] P. Moallem and N. Razmjooy, "A multi layer perceptron neural network trained by invasive weed optimization for potato color image segmentation," Trends in Applied Sciences Research, vol. 7, no. 6, pp. 445–455, 2012.

[43] S. Mirjalili, S. M. Mirjalili, and A. Lewis, "Let a biogeography-based optimizer train your multi-layer perceptron," Information Sciences, vol. 269, pp. 188–209, 2014.

[44] M.-Y. Cheng and D. Prayogo, "Symbiotic organisms search: a new metaheuristic optimization algorithm," Computers & Structures, vol. 139, pp. 98–112, 2014.

[45] M. Cheng, D. Prayogo, and D. Tran, "Optimizing multiple-resources leveling in multiple projects using discrete symbiotic organisms search," Journal of Computing in Civil Engineering, vol. 30, no. 3, 2016.

[46] R. Eki, F. Vincent, Y. S. Budi, and A. A. N. Perwira Redi, "Symbiotic Organism Search (SOS) for solving the capacitated vehicle routing problem," World Academy of Science, Engineering and Technology, International Journal of Mechanical, Aerospace, Industrial, Mechatronic and Manufacturing Engineering, vol. 9, no. 5, pp. 807–811, 2015.

[47] D. Prasad and V. Mukherjee, "A novel symbiotic organisms search algorithm for optimal power flow of power system with FACTS devices," Engineering Science & Technology, vol. 19, no. 1, pp. 79–89, 2016.

[48] M. Abdullahi, M. A. Ngadi, and S. M. Abdulhamid, "Symbiotic organism Search optimization based task scheduling in cloud computing environment," Future Generation Computer Systems, vol. 56, pp. 640–650, 2016.

[49] S. Verma, S. Saha, and V. Mukherjee, "A novel symbiotic organisms search algorithm for congestion management in deregulated environment," Journal of Experimental and Theoretical Artificial Intelligence, pp. 1–21, 2015.

[50] D.-H. Tran, M.-Y. Cheng, and D. Prayogo, "A novel Multiple Objective Symbiotic Organisms Search (MOSOS) for time-cost-labor utilization tradeoff problem," Knowledge-Based Systems, vol. 94, pp. 132–145, 2016.

[51] V. F. Yu, A. A. Redi, C. Yang, E. Ruskartina, and B. Santosa, "Symbiotic organisms search and two solution representations for solving the capacitated vehicle routing problem," Applied Soft Computing, 2016.

[52] A. Panda and S. Pani, "A Symbiotic Organisms Search algorithm with adaptive penalty function to solve multi-objective constrained optimization problems," Applied Soft Computing, vol. 46, pp. 344–360, 2016.

[53] S. Banerjee and S. Chattopadhyay, "Power optimization of three dimensional turbo code using a novel modified symbiotic organism search (MSOS) algorithm," Wireless Personal Communications, pp. 1–28, 2016.

[54] B. Das, V. Mukherjee, and D. Das, "DG placement in radial distribution network by symbiotic organisms search algorithm for real power loss minimization," Applied Soft Computing, vol. 49, pp. 920–936, 2016.

[55] M. K. Dosoglu, U. Guvenc, S. Duman, Y. Sonmez, and H. T. Kahraman, "Symbiotic organisms search optimization algorithm for economic/emission dispatch problem in power systems," Neural Computing and Applications, 2016.

[56] R. Hecht-Nielsen, "Kolmogorov's mapping neural network existence theorem," in Proceedings of the IEEE 1st International Conference on Neural Networks, vol. 3, pp. 11–13, IEEE Press, San Diego, Calif, USA, 1987.

[57] J.-R. Zhang, J. Zhang, T.-M. Lok, and M. R. Lyu, "A hybrid particle swarm optimization-back-propagation algorithm for feedforward neural network training," Applied Mathematics and Computation, vol. 185, no. 2, pp. 1026–1037, 2007.

[58] C. L. Blake and C. J. Merz, UCI Repository of Machine Learning Databases, http://archive.ics.uci.edu/ml/datasets.html.

[59] I.-C. Yeh, K.-J. Yang, and T.-M. Ting, "Knowledge discovery on RFM model using Bernoulli sequence," Expert Systems with Applications, vol. 36, no. 3, pp. 5866–5871, 2009.

[60] R. S. Siegler, "Three aspects of cognitive development," Cognitive Psychology, vol. 8, no. 4, pp. 481–520, 1976.

[61] R. A. Fisher, "Has Mendel's work been rediscovered?" Annals of Science, vol. 1, no. 2, pp. 115–137, 1936.

[62] F. Wilcoxon, "Individual comparisons by ranking methods," Biometrics Bulletin, vol. 1, no. 6, pp. 80–83, 1945.

[63] J. Derrac, S. García, D. Molina, and F. Herrera, "A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms," Swarm and Evolutionary Computation, vol. 1, no. 1, pp. 3–18, 2011.

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 8: Research Article Training Feedforward Neural Networks Using …downloads.hindawi.com/journals/cin/2016/9063065.pdf · 2019-07-30 · Research Article Training Feedforward Neural Networks

8 Computational Intelligence and Neuroscience

50 100 150 200 250 300 350 400 450 5000Iterations

035

04

045

05

055

06

065

07

Aver

age M

SE o

f tra

inin

g sa

mpl

e

BBOCSGAMVO

GSAPSOSOS

Figure 4 The convergence curves of algorithms (Blood)

03

032

034

036

038

04

042

MSE

CS GA MVO GSA PSO SOSBBOAlgorithms

Figure 5 The ANOVA test of algorithms for training Blood

trainingmethod on the three aforementioned datasetsMore-over it is ranked second for the dataset Wine and shows verycompetitive results compared to BBO In datasets BalanceScale and Statlog (Heart) the best values in results indicatethat SOS provides very close performances compared to BBOand MVO Also the three algorithms show improvementscompared to the others

Convergence curves for all metaheuristic algorithms areshown in Figures 4 6 8 10 12 14 16 and 18The convergencecurves show the average of 20 independent runs over thecourse of 500 iterations The figures show that SOS has thefastest convergence speed for training all the given datasetsFigures 5 7 9 11 13 15 17 and 19 show the boxplots relativeto 20 runs of SOS BBO GA MVO PSO GSA and CSThe boxplots which are used to analyze the variability ingetting MSE values indicate that SOS has greater value andless height than those of SOS GA CS PSO and GSA andachieves the similar results to MVO and BBO

Through 20 independent runs on the training datasetsthe optimal weights and biases are achieved and then usedto test the classification accuracy on the testing datasets Asdepicted in Table 3 the rank is in terms of the best values

50 100 150 200 250 300 350 400 450 5000Iterations

0

02

04

06

08

1

12

14

16

18

Aver

age M

SE o

f tra

inin

g sa

mpl

e

BBOCSGAMVO

GSAPSOSOS

Figure 6 The convergence curves of algorithms (Iris)

CS GA MVO GSA PSO SOSBBOAlgorithms

0

01

02

03

04

05

06

MSE

Figure 7 The ANOVA test of algorithms for training Iris

03504

04505

05506

06507

07508

Aver

age M

SE o

f tra

inin

g sa

mpl

e

50 100 150 200 250 300 350 400 450 5000Iterations

BBOCSGAMVO

GSAPSOSOS

Figure 8 The convergence curves of algorithms (Liver Disorders)

Computational Intelligence and Neuroscience 9

03

035

04

045

05

055

06

065

MSE

CS GA MVO GSA PSO SOSBBOAlgorithms

Figure 9 The ANOVA test of algorithms for training Liver Disor-ders

1

50 100 150 200 250 300 350 400 450 5000Iterations

0

02

04

06

08

12

14

16

18

Aver

age M

SE o

f tra

inin

g sa

mpl

e

BBOCSGAMVO

GSAPSOSOS

Figure 10 The convergence curves of algorithms (Balance Scale)

01

02

03

04

05

06

07

08

MSE

CS GA MVO GSA PSO SOSBBOAlgorithms

Figure 11TheANOVA test of algorithms for training Balance Scale

0

02

04

06

08

1

12

14

16

18

Aver

age M

SE o

f tra

inin

g sa

mpl

e

50 100 150 200 250 300 350 400 450 5000Iterations

BBOCSGAMVO

GSAPSOSOS

Figure 12 The convergence curves of algorithms (Seeds)

0

01

02

03

04

05

06

07

MSE

CS GA MVO GSA PSO SOSBBOAlgorithms

Figure 13 The ANOVA test of algorithms for training Seeds

0

01

02

03

04

05

06

07

08

09

Aver

age M

SE o

f tra

inin

g sa

mpl

e

50 100 150 200 250 300 350 400 450 5000Iterations

BBOCSGAMVO

GSAPSOSOS

Figure 14 The convergence curves of algorithms (Statlog (Heart))

10 Computational Intelligence and Neuroscience

01

02

03

04

05

06

07

MSE

CS GA MVO GSA PSO SOSBBOAlgorithms

Figure 15 The ANOVA test of algorithms for training Statlog(Heart)

Aver

age M

SE o

f tra

inin

g sa

mpl

e

035

04

045

05

055

06

065

07

50 100 150 200 250 300 350 400 450 5000Iterations

BBOCSGAMVO

GSAPSOSOS

Figure 16 The convergence curves of algorithms (HabermanrsquosSurvival)

CS GA MVO GSA PSO SOSBBOAlgorithms

03032034036038

04042044046048

MSE

Figure 17 The ANOVA test of algorithms for training HabermanrsquosSurvival

002040608

112141618

2

Aver

age M

SE o

f tra

inin

g sa

mpl

e

50 100 150 200 250 300 350 400 450 5000Iterations

BBOCSGAMVO

GSAPSOSOS

Figure 18 The convergence curves of algorithms (Wine)

0010203040506070809

MSE

CS GA MVO GSA PSO SOSBBOAlgorithms

Figure 19 The ANOVA test of algorithms for training Wine

in each dataset and SOS provides the best performances ontesting datasets Blood Seeds and Iris For dataset Wine theclassification accuracy of SOS is 983607 which indicatesthat only one example in testing dataset cannot be classifiedcorrectly It is noticeable that though MVO has the highestclassification accuracy in datasets Balance Scale HabermanrsquosSurvival Liver Disorders and Statlog (Heart) SOS alsoperforms well in classification However the accuracy shownin GA is the lowest among the tested algorithms

This comprehensive comparative study shows that theSOS algorithm is superior among the compared trainers inthis paper It is a challenge for training FNN due to the largenumber of local solutions in solving this problemOn accountof being simpler andmore robust than competing algorithmsSOS performs well in most of the datasets which shows howflexible this algorithm is for solving problems with diversesearch space Further in order to determine whether theresults achieved by the algorithms are statistically differentfrom each other a nonparametric statistical significanceproof known as Wilcoxonrsquos rank sum test for equal medians[62 63] was conducted between the results obtained by thealgorithms SOS versus CS SOS versus PSO SOS versus GASOS versus MVO SOS versus GSA and SOS versus BBO

Computational Intelligence and Neuroscience 11

Table 3 Accuracy results

Dataset AlgorithmSOS MVO GSA PSO BBO CS GA

BloodBest 827451 811765 780392 803922 811765 788235 768627Worst 776471 800000 333333 776471 725490 745098 670588Mean 798039 807451 741765 792157 775294 769608 727843Rank 1 2 6 4 2 5 7

Balance ScaleBest 920188 929577 910798 892019 915493 906103 802817Worst 868545 892019 859155 835681 882629 821596 380282Mean 900235 914319 877465 867136 901643 863146 597653Rank 2 1 4 6 3 5 7

Habermanrsquos SurvivalBest 817308 826923 817308 826923 826923 817308 788462Worst 711538 759615 740385 769231 692308 740385 653846Mean 760577 795673 791827 804808 770192 782692 745673Rank 4 1 4 1 1 6 7

Liver DisordersBest 754237 762712 644068 720339 728814 677966 550847Worst 661017 720339 67797 593220 457627 474576 271186Mean 710593 741949 483051 664831 651271 561441 430508Rank 2 1 6 4 3 5 7

SeedsBest 957746 957746 943662 929577 943662 915493 676056Worst 873239 901408 859155 788732 619718 774648 281690Mean 913380 934507 903521 874648 873239 822535 511268Rank 1 1 3 5 3 6 7

WineBest 983607 983607 1000000 1000000 1000000 918033 491803Worst 918033 967213 918033 885246 590164 786885 180328Mean 954918 979508 961475 960656 904918 833607 326230Rank 4 4 1 1 1 6 7

IrisBest 980392 980392 980392 980392 980392 980392 980392Worst 647059 980392 333333 529412 294118 529412 78431Mean 920588 980392 937255 914706 829412 850000 560784Rank 1 1 1 1 1 1 1

Statlog (Heart)Best 858696 880435 869565 869565 804348 836957 706522Worst 771739 771739 336957 771739 663043 684783 391304Mean 822283 826087 791848 828804 755435 774457 505435Rank 4 1 2 2 6 5 7

In order to draw a statistically meaningful conclusion testsare performed on the optimal fitness for training datasetsand P values are computed as shown in Table 4 Rank sumtests the null hypothesis that the two datasets are samplesfrom continuous distributions with equal medians againstthe alternative that they are not Almost all values reportedin Table 4 are less than 005 (5 significant level) which isstrong evidence against the null hypothesis Therefore suchevidence indicates that SOS results are statistically significant

and that it has not occurred by coincidence (ie due tocommon noise contained in the process)

54 Analysis of the Results Statistically speaking the SOSalgorithm provides superior local avoidance and the highclassification accuracy in training FNNs According to themathematical formulation of the SOS algorithm the firsttwo interaction phases are devoted to exploration of thesearch space This promotes exploration of the search space

12 Computational Intelligence and Neuroscience

Table 4 P values produced by Wilcoxonrsquos rank sum test for equal medians

Dataset SOS versusCS PSO GA MVO GSA BBO

Blood 680119864 minus 08 680119864 minus 08 680119864 minus 08 307119864 minus 06 680119864 minus 08 468119864 minus 05Balance Scale 680119864 minus 08 680119864 minus 08 680119864 minus 08 975119864 minus 06 680119864 minus 08 229119864 minus 01Habermanrsquos Survival 680119864 minus 08 680119864 minus 08 680119864 minus 08 179119864 minus 04 680119864 minus 08 256119864 minus 03Liver Disorders 680119864 minus 08 680119864 minus 08 680119864 minus 08 126119864 minus 01 680119864 minus 08 135119864 minus 03Seeds 680119864 minus 08 680119864 minus 08 680119864 minus 08 399119864 minus 06 680119864 minus 08 163119864 minus 03Wine 678119864 minus 08 680119864 minus 08 680119864 minus 08 148119864 minus 03 105119864 minus 06 598119864 minus 01Iris 296119864 minus 06 647119864 minus 08 647119864 minus 08 762119864 minus 07 647119864 minus 08 573119864 minus 05Statlog (Heart) 680119864 minus 08 680119864 minus 08 680119864 minus 08 680119864 minus 08 680119864 minus 08 604119864 minus 03

that leads to finding the optimal weights and biases Forthe exploitation phase the third interaction phase of SOSalgorithm is helpful for resolving local optima stagnationThe results of this work show that although metaheuristicoptimizations have high exploration the problem of trainingan FNN needs high local optima avoidance during the wholeoptimization process The results prove that the SOS is veryeffective in training FNNs

It is worth discussing the poor performance of GA inthis subsection The rate of crossover and mutation are twospecific tuning parameters inGA dependent on the empiricalvalue for particular problems This is the reason why GAfailed to provide good results for all the datasets In thecontrast SOS uses only the two parameters of maximumevaluation number and population size so it avoids the riskof compromised performance due to improper parametertuning and enhances performance stability Easy to fall intolocal optimal and low efficiency in the latter of search periodare the other two reasons for the poor performance of GAAnother finding in the results is the good performances ofBBO and MVO which are benefit from the mechanism forsignificant abrupt movements in the search space

The reason for the high classification rate provided bySOS is that this algorithm is equipped with adaptive threephases to smoothly balance exploration and exploitationThefirst two phases are devoted to exploration and the rest toexploitation And the three phases are simple to operate withonly simple mathematical operations to code In additionSOS uses greedy selection at the end of each phase to selectwhether to retain the old ormodified solution Consequentlythere are always guiding search agents to the most promisingregions of the search space

6 Conclusions

In this paper the recently proposed SOS algorithm wasemployed for the first time as a FNN trainer The high levelof exploration and exploitation of this algorithm were themotivation for this studyThe problem of training a FNNwasfirst formulated for the SOS algorithm This algorithm wasthen employed to optimize the weights and biases of FNNsso as to get high classification accuracy The obtained resultsof eight datasets with different characteristic show that theproposed approach is efficient to train FNNs compared to

other training methods that have been used in the literaturesCS PSO GA MVO GSA and BBO The results of MSEover 20 runs show that the proposed approach performsbest in terms of convergence rate and is robust since thevariances are relatively small Furthermore by comparingthe classification accuracy of the testing datasets using theoptimal weights and biases SOS has advantage over the otheralgorithms employed In addition the significance of theresults is statistically confirmed by usingWilcoxonrsquos rank sumtest which demonstrates that the results have not occurred bycoincidence It can be concluded that SOS is suitable for beingused as a training method for FNNs

For future work the SOS algorithm will be extendedto find the optimal number of layers hidden nodes andother structural parameters of FNNs More elaborate tests onhigher dimensional problems and large number of datasetswill be done Other types of neural networks such as radialbasis function (RBF) neural network are worth furtherresearch

Competing Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work is supported by National Science Foundationof China under Grants no 61463007 and 61563008 andthe Guangxi Natural Science Foundation under Grant no2016GXNSFAA380264

References

[1] J A Anderson An Introduction to Neural Networks MIT Press1995

[2] G P Zhang and M Qi ldquoNeural network forecasting for sea-sonal and trend time seriesrdquo European Journal of OperationalResearch vol 160 no 2 pp 501ndash514 2005

[3] C-J Lin C-H Chen and C-Y Lee ldquoA self-adaptive quantumradial basis function network for classification applicationsrdquo inProceedings of the IEEE International Joint Conference on NeuralNetworks vol 4 pp 3263ndash3268 IEEE Budapest Hungary July2004

Computational Intelligence and Neuroscience 13

[4] X-F Wang and D-S Huang ldquoA novel density-based clusteringframework by using level set methodrdquo IEEE Transactions onKnowledge and Data Engineering vol 21 no 11 pp 1515ndash15312009

[5] N A Mat Isa and W M F W Mamat ldquoClustered-Hybrid Mul-tilayer Perceptron network for pattern recognition applicationrdquoApplied SoftComputing Journal vol 11 no 1 pp 1457ndash1466 2011

[6] D S Huang Systematic Theory of Neural Networks for PatternRecognition Publishing House of Electronic Industry of China1996 (Chinese)

[7] Z-Q Zhao D-S Huang and B-Y Sun ldquoHuman face recogni-tion based onmulti-features using neural networks committeerdquoPattern Recognition Letters vol 25 no 12 pp 1351ndash1358 2004

[8] O Nelles ldquoNonlinear system identification from classicalapproaches to neural networks and fuzzy modelsrdquo AppliedTherapeutics vol 6 no 7 pp 21ndash717 2001

[9] B Malakooti and Y Q Zhou ldquoApproximating polynomialfunctions by feedforward artificial neural networks capacityanalysis and designrdquo Applied Mathematics and Computationvol 90 no 1 pp 27ndash51 1998

[10] A Cochocki and R UnbehauenNeural Networks for Optimiza-tion and Signal Processing John Wiley amp Sons New York NYUSA 1993

[11] X-F Wang D-S Huang and H Xu ldquoAn efficient local Chan-Vese model for image segmentationrdquo Pattern Recognition vol43 no 3 pp 603ndash618 2010

[12] D-S Huang ldquoA constructive approach for finding arbitraryroots of polynomials by neural networksrdquo IEEE Transactions onNeural Networks vol 15 no 2 pp 477ndash491 2004

[13] G Bebis and M Georgiopoulos ldquoFeed-forward neural net-worksrdquo IEEE Potentials vol 13 no 4 pp 27ndash31 1994

[14] T Kohonen ldquoThe self-organizing maprdquo Neurocomputing vol21 no 1ndash3 pp 1ndash6 1998

[15] J Park and I W Sandberg ldquoApproximation and radial-basis-function networksrdquo Neural Computation vol 3 no 2 pp 246ndash257 1993

[16] D-S Huang ldquoRadial basis probabilistic neural networksmodeland applicationrdquo International Journal of Pattern Recognitionand Artificial Intelligence vol 13 no 7 pp 1083ndash1101 1999

[17] D-S Huang and J-X Du ldquoA constructive hybrid structureoptimization methodology for radial basis probabilistic neuralnetworksrdquo IEEE Transactions onNeural Networks vol 19 no 12pp 2099ndash2115 2008

[18] G Dorffner ldquoNeural networks for time series processingrdquoNeural Network World vol 6 no 1 pp 447ndash468 1996

[19] S Ghosh-Dastidar and H Adeli ldquoSpiking neural networksrdquoInternational Journal of Neural Systems vol 19 no 4 pp 295ndash308 2009

[20] D E Rumelhart E G Hinton and R J Williams ldquoLearningrepresentations by back-propagating errorsrdquo in Neurocomput-ing Foundations of Research vol 5 no 3 pp 696ndash699 MITPress Cambridge Mass USA 1988

[21] N Zhang ldquoAn online gradient method with momentum fortwo-layer feedforward neural networksrdquo Applied Mathematicsand Computation vol 212 no 2 pp 488ndash498 2009

[22] J H HollandAdaptation in Natural and Artificial Systems MITPress Cambridge Mass USA 1992

[23] D J Montana and L Davis ldquoTraining feedforward neuralnetworks using genetic algorithmsrdquo in Proceedings of the Inter-national Joint Conferences on Artificial Intelligence (IJCAI rsquo89)vol 89 pp 762ndash767 Detroit Mich USA 1989

[24] C R Hwang ldquoSimulated annealing theory and applicationsrdquoActa Applicandae Mathematicae vol 12 no 1 pp 108ndash111 1988

[25] D Shaw andWKinsner ldquoChaotic simulated annealing inmulti-layer feedforward networksrdquo in Proceedings of the 1996Canadian Conference on Electrical and Computer Engineering(CCECErsquo96) part 1 (of 2) pp 265ndash269 University of CalgaryAlberta Canada May 1996

[26] J-R Zhang J Zhang T-M Lok and M R Lyu ldquoA hybrid par-ticle swarm optimization-back-propagation algorithm for feed-forward neural network trainingrdquo Applied Mathematics ampComputation vol 185 no 2 pp 1026ndash1037 2007

[27] E Rashedi H Nezamabadi-Pour and S Saryazdi ldquoGSA agravitational search algorithmrdquo Information Sciences vol 179no 13 pp 2232ndash2248 2009

[28] S A Mirjalili S Z M Hashim and HM Sardroudi ldquoTrainingfeedforward neural networks using hybrid particle swarm opti-mization and gravitational search algorithmrdquo Applied Mathe-matics and Computation vol 218 no 22 pp 11125ndash11137 2012

[29] Z Beheshti S M H Shamsuddin E Beheshti and S SYuhaniz ldquoEnhancement of artificial neural network learningusing centripetal accelerated particle swarm optimization formedical diseases diagnosisrdquo Soft Computing vol 18 no 11 pp2253ndash2270 2014

[30] L A M Pereira D Rodrigues P B Ribeiro J P Papa and SA T Weber ldquoSocial-spider optimization-based artificial neuralnetworks training and its applications for Parkinsonrsquos diseaseidentificationrdquo in Proceedings of the 27th IEEE InternationalSymposiumonComputer-BasedMedical Systems (CBMS rsquo14) pp14ndash17 IEEE New York NY USA May 2014

[31] E Uzlu M Kankal A Akpınar and T Dede ldquoEstimates ofenergy consumption in Turkey using neural networks with theteaching-learning-based optimization algorithmrdquo Energy vol75 pp 295ndash303 2014

[32] P A Kowalski and S Łukasik ldquoTraining neural networks withKrill Herd algorithmrdquo Neural Processing Letters vol 44 no 1pp 5ndash17 2016

[33] H Faris I Aljarah and S Mirjalili ldquoTraining feedforwardneural networks using multi-verse optimizer for binary classifi-cation problemsrdquoApplied Intelligence vol 45 no 2 pp 322ndash3322016

[34] J Nayak BNaik andH S Behera ldquoA novel nature inspired fire-fly algorithm with higher order neural network performanceanalysisrdquo Engineering Science and Technology vol 19 no 1 pp197ndash211 2016

[35] C Blum and K Socha ldquoTraining feed-forward neural networkswith ant colony optimization an application to pattern clas-sificationrdquo in Proceedings of the 5th International Conferenceon Hybrid Intelligent Systems (HIS rsquo05) pp 233ndash238 IEEEComputer Society Rio de Janeiro Brazil 2005

[36] K Socha and C Blum ldquoAn ant colony optimization algorithmfor continuous optimization application to feed-forward neuralnetwork trainingrdquo Neural Computing and Applications vol 16no 3 pp 235ndash247 2007

[37] E Valian S Mohanna and S Tavakoli ldquoImproved cuckoosearch algorithm for feedforward neural network trainingrdquoInternational Journal of Artificial IntelligenceampApplications vol2 no 3 pp 36ndash43 2011

[38] D Karaboga B Akay and C Ozturk ldquoArtificial Bee Colony(ABC) optimization algorithm for training feed-forward neuralnetworksrdquo in Proceedings of the International Conference onModeling Decisions for Artificial Intelligence (MDAI rsquo07) pp318ndash329 Springer Kitakyushu Japan 2007

14 Computational Intelligence and Neuroscience

[39] C Ozturk andD Karaboga ldquoHybrid Artificial Bee Colony algo-rithm for neural network trainingrdquo in Proceedings of the IEEECongress of Evolutionary Computation (CEC rsquo11) pp 84ndash88IEEE New Orleans La USA June 2011

[40] L A M Pereira L C S Afonso J P Papa et al ldquoMultilayerperceptron neural networks training through charged systemsearch and its application for non-technical losses detectionrdquoin Proceedings of the 2nd IEEE Latin American Conference onInnovative Smart Grid Technologies (ISGT LA rsquo13) pp 1ndash6 IEEESao Paulo Brazil April 2013

[41] S Mirjalili ldquoHow effective is the Grey Wolf optimizer in train-ing multi-layer perceptronsrdquo Applied Intelligence vol 43 no 1pp 150ndash161 2015

[42] P Moallem and N Razmjooy ldquoA multi layer perceptron neuralnetwork trained by invasive weed optimization for potato colorimage segmentationrdquo Trends in Applied Sciences Research vol7 no 6 pp 445ndash455 2012

[43] S Mirjalili S M Mirjalili and A Lewis ldquoLet a biogeography-based optimizer train yourmulti-layer perceptronrdquo InformationSciences An International Journal vol 269 pp 188ndash209 2014

[44] M-Y Cheng and D Prayogo ldquoSymbiotic organisms searcha new metaheuristic optimization algorithmrdquo Computers ampStructures vol 139 pp 98ndash112 2014

[45] M Cheng D Prayogo and D Tran ldquoOptimizing multiple-resources leveling in multiple projects using discrete symbioticorganisms searchrdquo Journal of Computing in Civil Engineeringvol 30 no 3 2016

[46] R Eki F Vincent Y S Budi and A A N Perwira Redi ldquoSymbi-otic Organism Search (SOS) for solving the capacitated vehiclerouting problemrdquo World Academy of Science Engineering andTechnology International Journal of Mechanical AerospaceIndustrial Mechatronic and Manufacturing Engineering vol 9no 5 pp 807ndash811 2015

[47] D Prasad and V Mukherjee ldquoA novel symbiotic organismssearch algorithm for optimal power flow of power system withFACTS devicesrdquo Engineering Science amp Technology vol 19 no1 pp 79ndash89 2016

[48] M Abdullahi M A Ngadi and S M Abdulhamid ldquoSymbioticorganism Search optimization based task scheduling in cloudcomputing environmentrdquo Future Generation Computer Systemsvol 56 pp 640ndash650 2016

[49] S Verma S Saha and V Mukherjee ldquoA novel symbioticorganisms search algorithm for congestion management inderegulated environmentrdquo Journal of Experimental and Theo-retical Artificial Intelligence pp 1ndash21 2015

[50] D-H Tran M-Y Cheng and D Prayogo ldquoA novel MultipleObjective Symbiotic Organisms Search (MOSOS) for time-cost-labor utilization tradeoff problemrdquo Knowledge-Based Sys-tems vol 94 pp 132ndash145 2016

[51] V F Yu A A Redi C Yang E Ruskartina and B SantosaldquoSymbiotic organisms search and two solution representationsfor solving the capacitated vehicle routing problemrdquo AppliedSoft Computing 2016

[52] A Panda and S Pani ldquoA SymbioticOrganisms Search algorithmwith adaptive penalty function to solve multi-objective con-strained optimization problemsrdquo Applied Soft Computing vol46 pp 344ndash360 2016

[53] S Banerjee and S Chattopadhyay ldquoPower optimization ofthree dimensional turbo code using a novel modified symbioticorganism search (MSOS) algorithmrdquoWireless Personal Commu-nications pp 1ndash28 2016

[54] B Das V Mukherjee and D Das ldquoDG placement in radial dis-tribution network by symbiotic organisms search algorithm forreal power loss minimizationrdquo Applied Soft Computing vol 49pp 920ndash936 2016

[55] M K Dosoglu U Guvenc S Duman Y Sonmez andH T Kahraman ldquoSymbiotic organisms search optimizationalgorithm for economicemission dispatch problem in powersystemsrdquo Neural Computing and Applications 2016

[56] R Hecht-Nielsen ldquoKolmogorovrsquos mapping neural networkexistence theoremrdquo in Proceedings of the IEEE 1st InternationalConference on Neural Networks vol 3 pp 11ndash13 IEEE Press SanDiego Calif USA 1987

[57] J-R Zhang J Zhang T-M Lok and M R Lyu ldquoA hybridparticle swarm optimization-back-propagation algorithm forfeedforward neural network trainingrdquoAppliedMathematics andComputation vol 185 no 2 pp 1026ndash1037 2007

[58] C L Blake andC JMerz UCI Repository ofMachine LearningDatabases httparchiveicsuciedumldatasetshtml

[59] I-C Yeh K-J Yang and T-M Ting ldquoKnowledge discoveryon RFM model using Bernoulli sequencerdquo Expert Systems withApplications vol 36 no 3 pp 5866ndash5871 2009

[60] R S Siegler ldquoThree aspects of cognitive developmentrdquo Cogni-tive Psychology vol 8 no 4 pp 481ndash520 1976

[61] R A Fisher ldquoHas Mendelrsquos work been rediscoveredrdquo Annals ofScience vol 1 no 2 pp 115ndash137 1936

[62] F Wilcoxon ldquoIndividual comparisons by ranking methodsrdquoBiometrics Bulletin vol 1 no 6 pp 80ndash83 1945

[63] J Derrac S Garcıa D Molina and F Herrera ldquoA practicaltutorial on the use of nonparametric statistical tests as amethodology for comparing evolutionary and swarm intelli-gence algorithmsrdquo Swarm and Evolutionary Computation vol1 no 1 pp 3ndash18 2011


Figures 9–19 (plots not reproduced here) show, for the Liver Disorders, Balance Scale, Seeds, Statlog (Heart), Haberman's Survival, and Wine datasets, the ANOVA tests (MSE obtained by CS, GA, MVO, GSA, PSO, BBO, and SOS) and the convergence curves (average MSE of the training samples over 0–500 iterations). The captions are:

Figure 9: The ANOVA test of algorithms for training Liver Disorders.
Figure 10: The convergence curves of algorithms (Balance Scale).
Figure 11: The ANOVA test of algorithms for training Balance Scale.
Figure 12: The convergence curves of algorithms (Seeds).
Figure 13: The ANOVA test of algorithms for training Seeds.
Figure 14: The convergence curves of algorithms (Statlog (Heart)).
Figure 15: The ANOVA test of algorithms for training Statlog (Heart).
Figure 16: The convergence curves of algorithms (Haberman's Survival).
Figure 17: The ANOVA test of algorithms for training Haberman's Survival.
Figure 18: The convergence curves of algorithms (Wine).
Figure 19: The ANOVA test of algorithms for training Wine.

in each dataset, and SOS provides the best performance on the testing datasets Blood, Seeds, and Iris. For the Wine dataset, the classification accuracy of SOS is 98.3607%, which indicates that only one example in the testing dataset is misclassified (60 of 61 test samples classified correctly). It is noticeable that, although MVO has the highest classification accuracy on the Balance Scale, Haberman's Survival, Liver Disorders, and Statlog (Heart) datasets, SOS also performs well in classification on them. By contrast, the accuracy achieved by GA is the lowest among the tested algorithms.
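As a minimal illustration (a Python sketch with placeholder values, not data from the paper), the Best/Worst/Mean entries in Table 3 are simple aggregates of the per-run test accuracies, and each accuracy is the percentage of correctly classified test samples; the 61-sample Wine test split is an inference from 98.3607% = 60/61.

import numpy as np

# Hypothetical per-run test accuracies (%) for one algorithm on one dataset;
# these are placeholders, not results reported in the paper.
run_accuracies = np.array([98.3607, 96.7213, 96.7213, 95.0820])
best, worst, mean = run_accuracies.max(), run_accuracies.min(), run_accuracies.mean()
print(f"Best {best:.4f}  Worst {worst:.4f}  Mean {mean:.4f}")

# Each accuracy is the fraction of correctly classified test samples,
# e.g. 60 of 61 Wine test samples: 100 * 60 / 61 = 98.3607 (%).
print(f"{100 * 60 / 61:.4f}")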

This comprehensive comparative study shows that the SOS algorithm is superior among the trainers compared in this paper. Training an FNN is challenging because of the large number of local solutions in this problem. Being simpler and more robust than the competing algorithms, SOS performs well on most of the datasets, which shows how flexible the algorithm is for solving problems with diverse search spaces. Further, in order to determine whether the results achieved by the algorithms are statistically different from each other, a nonparametric statistical significance test, Wilcoxon's rank sum test for equal medians [62, 63], was conducted between the results obtained by the algorithms: SOS versus CS, SOS versus PSO, SOS versus GA, SOS versus MVO, SOS versus GSA, and SOS versus BBO.


Table 3: Accuracy results (%).

Dataset               Metric    SOS       MVO       GSA       PSO       BBO       CS        GA
Blood                 Best      82.7451   81.1765   78.0392   80.3922   81.1765   78.8235   76.8627
                      Worst     77.6471   80.0000   33.3333   77.6471   72.5490   74.5098   67.0588
                      Mean      79.8039   80.7451   74.1765   79.2157   77.5294   76.9608   72.7843
                      Rank      1         2         6         4         2         5         7
Balance Scale         Best      92.0188   92.9577   91.0798   89.2019   91.5493   90.6103   80.2817
                      Worst     86.8545   89.2019   85.9155   83.5681   88.2629   82.1596   38.0282
                      Mean      90.0235   91.4319   87.7465   86.7136   90.1643   86.3146   59.7653
                      Rank      2         1         4         6         3         5         7
Haberman's Survival   Best      81.7308   82.6923   81.7308   82.6923   82.6923   81.7308   78.8462
                      Worst     71.1538   75.9615   74.0385   76.9231   69.2308   74.0385   65.3846
                      Mean      76.0577   79.5673   79.1827   80.4808   77.0192   78.2692   74.5673
                      Rank      4         1         4         1         1         6         7
Liver Disorders       Best      75.4237   76.2712   64.4068   72.0339   72.8814   67.7966   55.0847
                      Worst     66.1017   72.0339    6.7797   59.3220   45.7627   47.4576   27.1186
                      Mean      71.0593   74.1949   48.3051   66.4831   65.1271   56.1441   43.0508
                      Rank      2         1         6         4         3         5         7
Seeds                 Best      95.7746   95.7746   94.3662   92.9577   94.3662   91.5493   67.6056
                      Worst     87.3239   90.1408   85.9155   78.8732   61.9718   77.4648   28.1690
                      Mean      91.3380   93.4507   90.3521   87.4648   87.3239   82.2535   51.1268
                      Rank      1         1         3         5         3         6         7
Wine                  Best      98.3607   98.3607  100.0000  100.0000  100.0000   91.8033   49.1803
                      Worst     91.8033   96.7213   91.8033   88.5246   59.0164   78.6885   18.0328
                      Mean      95.4918   97.9508   96.1475   96.0656   90.4918   83.3607   32.6230
                      Rank      4         4         1         1         1         6         7
Iris                  Best      98.0392   98.0392   98.0392   98.0392   98.0392   98.0392   98.0392
                      Worst     64.7059   98.0392   33.3333   52.9412   29.4118   52.9412    7.8431
                      Mean      92.0588   98.0392   93.7255   91.4706   82.9412   85.0000   56.0784
                      Rank      1         1         1         1         1         1         1
Statlog (Heart)       Best      85.8696   88.0435   86.9565   86.9565   80.4348   83.6957   70.6522
                      Worst     77.1739   77.1739   33.6957   77.1739   66.3043   68.4783   39.1304
                      Mean      82.2283   82.6087   79.1848   82.8804   75.5435   77.4457   50.5435
                      Rank      4         1         2         2         6         5         7

In order to draw a statistically meaningful conclusion, the tests are performed on the optimal fitness values obtained for the training datasets, and the resulting P values are reported in Table 4. The rank sum test evaluates the null hypothesis that the two samples are drawn from continuous distributions with equal medians against the alternative that they are not. Almost all values reported in Table 4 are less than 0.05 (5% significance level), which is strong evidence against the null hypothesis. Such evidence therefore indicates that the advantage of SOS is statistically significant and has not occurred by coincidence (i.e., due to common noise contained in the process).
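To illustrate how the entries of Table 4 can be computed, the following Python sketch applies Wilcoxon's rank sum test to the best fitness values of SOS and one competitor over 20 runs; the fitness arrays are placeholders, since the raw per-run values are not reproduced here, and SciPy's ranksums function is used as the test implementation.

import numpy as np
from scipy.stats import ranksums

# Hypothetical best-MSE values over 20 independent training runs
# (placeholders; the actual per-run results are not listed in the paper).
sos_fitness = np.random.default_rng(0).normal(0.10, 0.01, 20)
pso_fitness = np.random.default_rng(1).normal(0.14, 0.02, 20)

# Null hypothesis: both samples come from distributions with equal medians.
stat, p_value = ranksums(sos_fitness, pso_fitness)
print(f"p = {p_value:.2e}")  # p < 0.05 rejects the null hypothesis at the 5% level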

5.4. Analysis of the Results. Statistically speaking, the SOS algorithm provides superior local optima avoidance and high classification accuracy in training FNNs. According to the mathematical formulation of the SOS algorithm, the first two interaction phases are devoted to exploration of the search space.


Table 4: P values produced by Wilcoxon's rank sum test for equal medians.

Dataset               SOS vs. CS   SOS vs. PSO  SOS vs. GA   SOS vs. MVO  SOS vs. GSA  SOS vs. BBO
Blood                 6.80E-08     6.80E-08     6.80E-08     3.07E-06     6.80E-08     4.68E-05
Balance Scale         6.80E-08     6.80E-08     6.80E-08     9.75E-06     6.80E-08     2.29E-01
Haberman's Survival   6.80E-08     6.80E-08     6.80E-08     1.79E-04     6.80E-08     2.56E-03
Liver Disorders       6.80E-08     6.80E-08     6.80E-08     1.26E-01     6.80E-08     1.35E-03
Seeds                 6.80E-08     6.80E-08     6.80E-08     3.99E-06     6.80E-08     1.63E-03
Wine                  6.78E-08     6.80E-08     6.80E-08     1.48E-03     1.05E-06     5.98E-01
Iris                  2.96E-06     6.47E-08     6.47E-08     7.62E-07     6.47E-08     5.73E-05
Statlog (Heart)       6.80E-08     6.80E-08     6.80E-08     6.80E-08     6.80E-08     6.04E-03

This exploration leads to finding the regions of the search space that contain the optimal weights and biases. For the exploitation phase, the third interaction phase of the SOS algorithm is helpful for resolving local optima stagnation. The results of this work show that, although metaheuristic optimizers have high exploration, the problem of training an FNN requires high local optima avoidance throughout the whole optimization process. The results show that SOS is very effective in training FNNs.
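The "problem of training an FNN" discussed here is the minimization of the MSE over candidate weight and bias vectors. The sketch below shows one common way of decoding a flat solution vector into a single-hidden-layer FNN and evaluating its MSE fitness; the sigmoid activation, the 4-9-3 architecture (2n + 1 hidden nodes for four input features), and the random placeholder data are illustrative assumptions, not a restatement of the authors' exact implementation.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fnn_mse(solution, X, y_onehot, n_in, n_hidden, n_out):
    """Decode a flat solution vector into FNN weights/biases and return the MSE."""
    # Split the vector: input-to-hidden weights, hidden biases,
    # hidden-to-output weights, output biases.
    i = 0
    W1 = solution[i:i + n_in * n_hidden].reshape(n_in, n_hidden); i += n_in * n_hidden
    b1 = solution[i:i + n_hidden]; i += n_hidden
    W2 = solution[i:i + n_hidden * n_out].reshape(n_hidden, n_out); i += n_hidden * n_out
    b2 = solution[i:i + n_out]

    hidden = sigmoid(X @ W1 + b1)              # hidden-layer activations
    output = sigmoid(hidden @ W2 + b2)         # network outputs
    return np.mean((output - y_onehot) ** 2)   # mean squared error over the training set

# Illustrative dimensions (e.g. a 4-input, 9-hidden, 3-output network).
n_in, n_hidden, n_out = 4, 9, 3
dim = n_in * n_hidden + n_hidden + n_hidden * n_out + n_out
X = np.random.rand(10, n_in)                               # placeholder training samples
y_onehot = np.eye(n_out)[np.random.randint(0, n_out, 10)]  # placeholder one-hot labels
print(fnn_mse(np.random.uniform(-1, 1, dim), X, y_onehot, n_in, n_hidden, n_out))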

The poor performance of GA is worth discussing in this subsection. The crossover and mutation rates are two problem-specific tuning parameters in GA that depend on empirical values for particular problems, which is one reason why GA failed to provide good results on all the datasets. In contrast, SOS uses only the two parameters of maximum evaluation number and population size, so it avoids the risk of compromised performance due to improper parameter tuning and enhances performance stability. A tendency to fall into local optima and low efficiency in the later stages of the search are the other two reasons for the poor performance of GA. Another finding in the results is the good performance of BBO and MVO, which benefit from mechanisms that allow significant abrupt movements in the search space.

The reason for the high classification rate provided by SOS is that the algorithm is equipped with three adaptive phases that smoothly balance exploration and exploitation: the first two phases are devoted to exploration and the third to exploitation. The three phases are simple to operate and require only simple mathematical operations to code. In addition, SOS uses greedy selection at the end of each phase to decide whether to retain the old or the modified solution. Consequently, search agents are always guided toward the most promising regions of the search space.
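For reference, a rough sketch of one SOS generation is given below, following the standard mutualism, commensalism, and parasitism updates with greedy selection as formulated in [44]; the bound handling, random-number choices, and the generic evaluate callback (e.g., the fnn_mse sketch above) are assumptions rather than the authors' exact code.

import numpy as np

rng = np.random.default_rng(42)

def sos_step(pop, fitness, evaluate, lb, ub):
    """One SOS generation: mutualism, commensalism, and parasitism with greedy selection."""
    n, dim = pop.shape
    for i in range(n):
        best = pop[np.argmin(fitness)]

        # Mutualism: organisms i and j both benefit from a mutual vector.
        j = rng.choice([k for k in range(n) if k != i])
        mutual = (pop[i] + pop[j]) / 2.0
        bf1, bf2 = rng.integers(1, 3), rng.integers(1, 3)    # benefit factors in {1, 2}
        cand_i = np.clip(pop[i] + rng.random(dim) * (best - mutual * bf1), lb, ub)
        cand_j = np.clip(pop[j] + rng.random(dim) * (best - mutual * bf2), lb, ub)
        for idx, cand in ((i, cand_i), (j, cand_j)):
            f = evaluate(cand)
            if f < fitness[idx]:                             # greedy selection
                pop[idx], fitness[idx] = cand, f

        # Commensalism: organism i benefits from j without affecting it.
        j = rng.choice([k for k in range(n) if k != i])
        cand = np.clip(pop[i] + rng.uniform(-1, 1, dim) * (best - pop[j]), lb, ub)
        f = evaluate(cand)
        if f < fitness[i]:
            pop[i], fitness[i] = cand, f

        # Parasitism: a parasite vector competes with a randomly chosen organism j.
        j = rng.choice([k for k in range(n) if k != i])
        parasite = pop[i].copy()
        mask = rng.random(dim) < rng.random()                # random subset of dimensions
        parasite[mask] = rng.uniform(lb, ub, dim)[mask]
        f = evaluate(parasite)
        if f < fitness[j]:
            pop[j], fitness[j] = parasite, f
    return pop, fitness

In the training setting, pop would be an (n, dim) array of candidate weight/bias vectors, fitness the corresponding training MSE values, evaluate a wrapper around the FNN's MSE on the training samples, and sos_step would be repeated for the 500 iterations shown in the convergence curves.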

6. Conclusions

In this paper, the recently proposed SOS algorithm was employed for the first time as an FNN trainer. The high level of exploration and exploitation of this algorithm was the motivation for this study. The problem of training an FNN was first formulated for the SOS algorithm, which was then employed to optimize the weights and biases of FNNs so as to achieve high classification accuracy. The results obtained on eight datasets with different characteristics show that the proposed approach is efficient for training FNNs compared with the other training methods used in the literature: CS, PSO, GA, MVO, GSA, and BBO. The MSE results over 20 runs show that the proposed approach performs best in terms of convergence rate and is robust, since the variances are relatively small. Furthermore, a comparison of the classification accuracy on the testing datasets using the optimal weights and biases shows that SOS has an advantage over the other algorithms employed. In addition, the significance of the results is statistically confirmed using Wilcoxon's rank sum test, which demonstrates that the results have not occurred by coincidence. It can be concluded that SOS is suitable for use as a training method for FNNs.

For future work, the SOS algorithm will be extended to find the optimal number of layers, hidden nodes, and other structural parameters of FNNs. More elaborate tests on higher-dimensional problems and a larger number of datasets will be carried out. Other types of neural networks, such as the radial basis function (RBF) neural network, are also worth further research.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Science Foundation of China under Grants no. 61463007 and 61563008 and by the Guangxi Natural Science Foundation under Grant no. 2016GXNSFAA380264.

References

[1] J. A. Anderson, An Introduction to Neural Networks, MIT Press, 1995.

[2] G. P. Zhang and M. Qi, "Neural network forecasting for seasonal and trend time series," European Journal of Operational Research, vol. 160, no. 2, pp. 501–514, 2005.

[3] C.-J. Lin, C.-H. Chen, and C.-Y. Lee, "A self-adaptive quantum radial basis function network for classification applications," in Proceedings of the IEEE International Joint Conference on Neural Networks, vol. 4, pp. 3263–3268, IEEE, Budapest, Hungary, July 2004.

[4] X.-F. Wang and D.-S. Huang, "A novel density-based clustering framework by using level set method," IEEE Transactions on Knowledge and Data Engineering, vol. 21, no. 11, pp. 1515–1531, 2009.

[5] N. A. Mat Isa and W. M. F. W. Mamat, "Clustered-Hybrid Multilayer Perceptron network for pattern recognition application," Applied Soft Computing Journal, vol. 11, no. 1, pp. 1457–1466, 2011.

[6] D. S. Huang, Systematic Theory of Neural Networks for Pattern Recognition, Publishing House of Electronic Industry of China, 1996 (Chinese).

[7] Z.-Q. Zhao, D.-S. Huang, and B.-Y. Sun, "Human face recognition based on multi-features using neural networks committee," Pattern Recognition Letters, vol. 25, no. 12, pp. 1351–1358, 2004.

[8] O. Nelles, "Nonlinear system identification: from classical approaches to neural networks and fuzzy models," Applied Therapeutics, vol. 6, no. 7, pp. 21–717, 2001.

[9] B. Malakooti and Y. Q. Zhou, "Approximating polynomial functions by feedforward artificial neural networks: capacity analysis and design," Applied Mathematics and Computation, vol. 90, no. 1, pp. 27–51, 1998.

[10] A. Cochocki and R. Unbehauen, Neural Networks for Optimization and Signal Processing, John Wiley & Sons, New York, NY, USA, 1993.

[11] X.-F. Wang, D.-S. Huang, and H. Xu, "An efficient local Chan-Vese model for image segmentation," Pattern Recognition, vol. 43, no. 3, pp. 603–618, 2010.

[12] D.-S. Huang, "A constructive approach for finding arbitrary roots of polynomials by neural networks," IEEE Transactions on Neural Networks, vol. 15, no. 2, pp. 477–491, 2004.

[13] G. Bebis and M. Georgiopoulos, "Feed-forward neural networks," IEEE Potentials, vol. 13, no. 4, pp. 27–31, 1994.

[14] T. Kohonen, "The self-organizing map," Neurocomputing, vol. 21, no. 1–3, pp. 1–6, 1998.

[15] J. Park and I. W. Sandberg, "Approximation and radial-basis-function networks," Neural Computation, vol. 3, no. 2, pp. 246–257, 1993.

[16] D.-S. Huang, "Radial basis probabilistic neural networks: model and application," International Journal of Pattern Recognition and Artificial Intelligence, vol. 13, no. 7, pp. 1083–1101, 1999.

[17] D.-S. Huang and J.-X. Du, "A constructive hybrid structure optimization methodology for radial basis probabilistic neural networks," IEEE Transactions on Neural Networks, vol. 19, no. 12, pp. 2099–2115, 2008.

[18] G. Dorffner, "Neural networks for time series processing," Neural Network World, vol. 6, no. 1, pp. 447–468, 1996.

[19] S. Ghosh-Dastidar and H. Adeli, "Spiking neural networks," International Journal of Neural Systems, vol. 19, no. 4, pp. 295–308, 2009.

[20] D. E. Rumelhart, E. G. Hinton, and R. J. Williams, "Learning representations by back-propagating errors," in Neurocomputing: Foundations of Research, vol. 5, no. 3, pp. 696–699, MIT Press, Cambridge, Mass, USA, 1988.

[21] N. Zhang, "An online gradient method with momentum for two-layer feedforward neural networks," Applied Mathematics and Computation, vol. 212, no. 2, pp. 488–498, 2009.

[22] J. H. Holland, Adaptation in Natural and Artificial Systems, MIT Press, Cambridge, Mass, USA, 1992.

[23] D. J. Montana and L. Davis, "Training feedforward neural networks using genetic algorithms," in Proceedings of the International Joint Conferences on Artificial Intelligence (IJCAI '89), vol. 89, pp. 762–767, Detroit, Mich, USA, 1989.

[24] C. R. Hwang, "Simulated annealing: theory and applications," Acta Applicandae Mathematicae, vol. 12, no. 1, pp. 108–111, 1988.

[25] D. Shaw and W. Kinsner, "Chaotic simulated annealing in multilayer feedforward networks," in Proceedings of the 1996 Canadian Conference on Electrical and Computer Engineering (CCECE '96), part 1 (of 2), pp. 265–269, University of Calgary, Alberta, Canada, May 1996.

[26] J.-R. Zhang, J. Zhang, T.-M. Lok, and M. R. Lyu, "A hybrid particle swarm optimization-back-propagation algorithm for feedforward neural network training," Applied Mathematics & Computation, vol. 185, no. 2, pp. 1026–1037, 2007.

[27] E. Rashedi, H. Nezamabadi-Pour, and S. Saryazdi, "GSA: a gravitational search algorithm," Information Sciences, vol. 179, no. 13, pp. 2232–2248, 2009.

[28] S. A. Mirjalili, S. Z. M. Hashim, and H. M. Sardroudi, "Training feedforward neural networks using hybrid particle swarm optimization and gravitational search algorithm," Applied Mathematics and Computation, vol. 218, no. 22, pp. 11125–11137, 2012.

[29] Z. Beheshti, S. M. H. Shamsuddin, E. Beheshti, and S. S. Yuhaniz, "Enhancement of artificial neural network learning using centripetal accelerated particle swarm optimization for medical diseases diagnosis," Soft Computing, vol. 18, no. 11, pp. 2253–2270, 2014.

[30] L. A. M. Pereira, D. Rodrigues, P. B. Ribeiro, J. P. Papa, and S. A. T. Weber, "Social-spider optimization-based artificial neural networks training and its applications for Parkinson's disease identification," in Proceedings of the 27th IEEE International Symposium on Computer-Based Medical Systems (CBMS '14), pp. 14–17, IEEE, New York, NY, USA, May 2014.

[31] E. Uzlu, M. Kankal, A. Akpınar, and T. Dede, "Estimates of energy consumption in Turkey using neural networks with the teaching-learning-based optimization algorithm," Energy, vol. 75, pp. 295–303, 2014.

[32] P. A. Kowalski and S. Łukasik, "Training neural networks with Krill Herd algorithm," Neural Processing Letters, vol. 44, no. 1, pp. 5–17, 2016.

[33] H. Faris, I. Aljarah, and S. Mirjalili, "Training feedforward neural networks using multi-verse optimizer for binary classification problems," Applied Intelligence, vol. 45, no. 2, pp. 322–332, 2016.

[34] J. Nayak, B. Naik, and H. S. Behera, "A novel nature inspired firefly algorithm with higher order neural network: performance analysis," Engineering Science and Technology, vol. 19, no. 1, pp. 197–211, 2016.

[35] C. Blum and K. Socha, "Training feed-forward neural networks with ant colony optimization: an application to pattern classification," in Proceedings of the 5th International Conference on Hybrid Intelligent Systems (HIS '05), pp. 233–238, IEEE Computer Society, Rio de Janeiro, Brazil, 2005.

[36] K. Socha and C. Blum, "An ant colony optimization algorithm for continuous optimization: application to feed-forward neural network training," Neural Computing and Applications, vol. 16, no. 3, pp. 235–247, 2007.

[37] E. Valian, S. Mohanna, and S. Tavakoli, "Improved cuckoo search algorithm for feedforward neural network training," International Journal of Artificial Intelligence & Applications, vol. 2, no. 3, pp. 36–43, 2011.

[38] D. Karaboga, B. Akay, and C. Ozturk, "Artificial Bee Colony (ABC) optimization algorithm for training feed-forward neural networks," in Proceedings of the International Conference on Modeling Decisions for Artificial Intelligence (MDAI '07), pp. 318–329, Springer, Kitakyushu, Japan, 2007.

[39] C. Ozturk and D. Karaboga, "Hybrid Artificial Bee Colony algorithm for neural network training," in Proceedings of the IEEE Congress of Evolutionary Computation (CEC '11), pp. 84–88, IEEE, New Orleans, La, USA, June 2011.

[40] L. A. M. Pereira, L. C. S. Afonso, J. P. Papa et al., "Multilayer perceptron neural networks training through charged system search and its application for non-technical losses detection," in Proceedings of the 2nd IEEE Latin American Conference on Innovative Smart Grid Technologies (ISGT LA '13), pp. 1–6, IEEE, Sao Paulo, Brazil, April 2013.

[41] S. Mirjalili, "How effective is the Grey Wolf optimizer in training multi-layer perceptrons," Applied Intelligence, vol. 43, no. 1, pp. 150–161, 2015.

[42] P. Moallem and N. Razmjooy, "A multi layer perceptron neural network trained by invasive weed optimization for potato color image segmentation," Trends in Applied Sciences Research, vol. 7, no. 6, pp. 445–455, 2012.

[43] S. Mirjalili, S. M. Mirjalili, and A. Lewis, "Let a biogeography-based optimizer train your multi-layer perceptron," Information Sciences: An International Journal, vol. 269, pp. 188–209, 2014.

[44] M.-Y. Cheng and D. Prayogo, "Symbiotic organisms search: a new metaheuristic optimization algorithm," Computers & Structures, vol. 139, pp. 98–112, 2014.

[45] M. Cheng, D. Prayogo, and D. Tran, "Optimizing multiple-resources leveling in multiple projects using discrete symbiotic organisms search," Journal of Computing in Civil Engineering, vol. 30, no. 3, 2016.

[46] R. Eki, F. Vincent, Y. S. Budi, and A. A. N. Perwira Redi, "Symbiotic Organism Search (SOS) for solving the capacitated vehicle routing problem," World Academy of Science, Engineering and Technology, International Journal of Mechanical, Aerospace, Industrial, Mechatronic and Manufacturing Engineering, vol. 9, no. 5, pp. 807–811, 2015.

[47] D. Prasad and V. Mukherjee, "A novel symbiotic organisms search algorithm for optimal power flow of power system with FACTS devices," Engineering Science & Technology, vol. 19, no. 1, pp. 79–89, 2016.

[48] M. Abdullahi, M. A. Ngadi, and S. M. Abdulhamid, "Symbiotic organism Search optimization based task scheduling in cloud computing environment," Future Generation Computer Systems, vol. 56, pp. 640–650, 2016.

[49] S. Verma, S. Saha, and V. Mukherjee, "A novel symbiotic organisms search algorithm for congestion management in deregulated environment," Journal of Experimental and Theoretical Artificial Intelligence, pp. 1–21, 2015.

[50] D.-H. Tran, M.-Y. Cheng, and D. Prayogo, "A novel Multiple Objective Symbiotic Organisms Search (MOSOS) for time-cost-labor utilization tradeoff problem," Knowledge-Based Systems, vol. 94, pp. 132–145, 2016.

[51] V. F. Yu, A. A. Redi, C. Yang, E. Ruskartina, and B. Santosa, "Symbiotic organisms search and two solution representations for solving the capacitated vehicle routing problem," Applied Soft Computing, 2016.

[52] A. Panda and S. Pani, "A Symbiotic Organisms Search algorithm with adaptive penalty function to solve multi-objective constrained optimization problems," Applied Soft Computing, vol. 46, pp. 344–360, 2016.

[53] S. Banerjee and S. Chattopadhyay, "Power optimization of three dimensional turbo code using a novel modified symbiotic organism search (MSOS) algorithm," Wireless Personal Communications, pp. 1–28, 2016.

[54] B. Das, V. Mukherjee, and D. Das, "DG placement in radial distribution network by symbiotic organisms search algorithm for real power loss minimization," Applied Soft Computing, vol. 49, pp. 920–936, 2016.

[55] M. K. Dosoglu, U. Guvenc, S. Duman, Y. Sonmez, and H. T. Kahraman, "Symbiotic organisms search optimization algorithm for economic/emission dispatch problem in power systems," Neural Computing and Applications, 2016.

[56] R. Hecht-Nielsen, "Kolmogorov's mapping neural network existence theorem," in Proceedings of the IEEE 1st International Conference on Neural Networks, vol. 3, pp. 11–13, IEEE Press, San Diego, Calif, USA, 1987.

[57] J.-R. Zhang, J. Zhang, T.-M. Lok, and M. R. Lyu, "A hybrid particle swarm optimization-back-propagation algorithm for feedforward neural network training," Applied Mathematics and Computation, vol. 185, no. 2, pp. 1026–1037, 2007.

[58] C. L. Blake and C. J. Merz, UCI Repository of Machine Learning Databases, http://archive.ics.uci.edu/ml/datasets.html.

[59] I.-C. Yeh, K.-J. Yang, and T.-M. Ting, "Knowledge discovery on RFM model using Bernoulli sequence," Expert Systems with Applications, vol. 36, no. 3, pp. 5866–5871, 2009.

[60] R. S. Siegler, "Three aspects of cognitive development," Cognitive Psychology, vol. 8, no. 4, pp. 481–520, 1976.

[61] R. A. Fisher, "Has Mendel's work been rediscovered," Annals of Science, vol. 1, no. 2, pp. 115–137, 1936.

[62] F. Wilcoxon, "Individual comparisons by ranking methods," Biometrics Bulletin, vol. 1, no. 6, pp. 80–83, 1945.

[63] J. Derrac, S. García, D. Molina, and F. Herrera, "A practical tutorial on the use of nonparametric statistical tests as a methodology for comparing evolutionary and swarm intelligence algorithms," Swarm and Evolutionary Computation, vol. 1, no. 1, pp. 3–18, 2011.


Page 10: Research Article Training Feedforward Neural Networks Using …downloads.hindawi.com/journals/cin/2016/9063065.pdf · 2019-07-30 · Research Article Training Feedforward Neural Networks

10 Computational Intelligence and Neuroscience

01

02

03

04

05

06

07

MSE

CS GA MVO GSA PSO SOSBBOAlgorithms

Figure 15 The ANOVA test of algorithms for training Statlog(Heart)

Aver

age M

SE o

f tra

inin

g sa

mpl

e

035

04

045

05

055

06

065

07

50 100 150 200 250 300 350 400 450 5000Iterations

BBOCSGAMVO

GSAPSOSOS

Figure 16 The convergence curves of algorithms (HabermanrsquosSurvival)

CS GA MVO GSA PSO SOSBBOAlgorithms

03032034036038

04042044046048

MSE

Figure 17 The ANOVA test of algorithms for training HabermanrsquosSurvival

002040608

112141618

2

Aver

age M

SE o

f tra

inin

g sa

mpl

e

50 100 150 200 250 300 350 400 450 5000Iterations

BBOCSGAMVO

GSAPSOSOS

Figure 18 The convergence curves of algorithms (Wine)

0010203040506070809

MSE

CS GA MVO GSA PSO SOSBBOAlgorithms

Figure 19 The ANOVA test of algorithms for training Wine

in each dataset and SOS provides the best performances ontesting datasets Blood Seeds and Iris For dataset Wine theclassification accuracy of SOS is 983607 which indicatesthat only one example in testing dataset cannot be classifiedcorrectly It is noticeable that though MVO has the highestclassification accuracy in datasets Balance Scale HabermanrsquosSurvival Liver Disorders and Statlog (Heart) SOS alsoperforms well in classification However the accuracy shownin GA is the lowest among the tested algorithms

This comprehensive comparative study shows that theSOS algorithm is superior among the compared trainers inthis paper It is a challenge for training FNN due to the largenumber of local solutions in solving this problemOn accountof being simpler andmore robust than competing algorithmsSOS performs well in most of the datasets which shows howflexible this algorithm is for solving problems with diversesearch space Further in order to determine whether theresults achieved by the algorithms are statistically differentfrom each other a nonparametric statistical significanceproof known as Wilcoxonrsquos rank sum test for equal medians[62 63] was conducted between the results obtained by thealgorithms SOS versus CS SOS versus PSO SOS versus GASOS versus MVO SOS versus GSA and SOS versus BBO

Computational Intelligence and Neuroscience 11

Table 3 Accuracy results

Dataset AlgorithmSOS MVO GSA PSO BBO CS GA

BloodBest 827451 811765 780392 803922 811765 788235 768627Worst 776471 800000 333333 776471 725490 745098 670588Mean 798039 807451 741765 792157 775294 769608 727843Rank 1 2 6 4 2 5 7

Balance ScaleBest 920188 929577 910798 892019 915493 906103 802817Worst 868545 892019 859155 835681 882629 821596 380282Mean 900235 914319 877465 867136 901643 863146 597653Rank 2 1 4 6 3 5 7

Habermanrsquos SurvivalBest 817308 826923 817308 826923 826923 817308 788462Worst 711538 759615 740385 769231 692308 740385 653846Mean 760577 795673 791827 804808 770192 782692 745673Rank 4 1 4 1 1 6 7

Liver DisordersBest 754237 762712 644068 720339 728814 677966 550847Worst 661017 720339 67797 593220 457627 474576 271186Mean 710593 741949 483051 664831 651271 561441 430508Rank 2 1 6 4 3 5 7

SeedsBest 957746 957746 943662 929577 943662 915493 676056Worst 873239 901408 859155 788732 619718 774648 281690Mean 913380 934507 903521 874648 873239 822535 511268Rank 1 1 3 5 3 6 7

WineBest 983607 983607 1000000 1000000 1000000 918033 491803Worst 918033 967213 918033 885246 590164 786885 180328Mean 954918 979508 961475 960656 904918 833607 326230Rank 4 4 1 1 1 6 7

IrisBest 980392 980392 980392 980392 980392 980392 980392Worst 647059 980392 333333 529412 294118 529412 78431Mean 920588 980392 937255 914706 829412 850000 560784Rank 1 1 1 1 1 1 1

Statlog (Heart)Best 858696 880435 869565 869565 804348 836957 706522Worst 771739 771739 336957 771739 663043 684783 391304Mean 822283 826087 791848 828804 755435 774457 505435Rank 4 1 2 2 6 5 7

In order to draw a statistically meaningful conclusion testsare performed on the optimal fitness for training datasetsand P values are computed as shown in Table 4 Rank sumtests the null hypothesis that the two datasets are samplesfrom continuous distributions with equal medians againstthe alternative that they are not Almost all values reportedin Table 4 are less than 005 (5 significant level) which isstrong evidence against the null hypothesis Therefore suchevidence indicates that SOS results are statistically significant

and that it has not occurred by coincidence (ie due tocommon noise contained in the process)

54 Analysis of the Results Statistically speaking the SOSalgorithm provides superior local avoidance and the highclassification accuracy in training FNNs According to themathematical formulation of the SOS algorithm the firsttwo interaction phases are devoted to exploration of thesearch space This promotes exploration of the search space

12 Computational Intelligence and Neuroscience

Table 4 P values produced by Wilcoxonrsquos rank sum test for equal medians

Dataset SOS versusCS PSO GA MVO GSA BBO

Blood 680119864 minus 08 680119864 minus 08 680119864 minus 08 307119864 minus 06 680119864 minus 08 468119864 minus 05Balance Scale 680119864 minus 08 680119864 minus 08 680119864 minus 08 975119864 minus 06 680119864 minus 08 229119864 minus 01Habermanrsquos Survival 680119864 minus 08 680119864 minus 08 680119864 minus 08 179119864 minus 04 680119864 minus 08 256119864 minus 03Liver Disorders 680119864 minus 08 680119864 minus 08 680119864 minus 08 126119864 minus 01 680119864 minus 08 135119864 minus 03Seeds 680119864 minus 08 680119864 minus 08 680119864 minus 08 399119864 minus 06 680119864 minus 08 163119864 minus 03Wine 678119864 minus 08 680119864 minus 08 680119864 minus 08 148119864 minus 03 105119864 minus 06 598119864 minus 01Iris 296119864 minus 06 647119864 minus 08 647119864 minus 08 762119864 minus 07 647119864 minus 08 573119864 minus 05Statlog (Heart) 680119864 minus 08 680119864 minus 08 680119864 minus 08 680119864 minus 08 680119864 minus 08 604119864 minus 03

that leads to finding the optimal weights and biases Forthe exploitation phase the third interaction phase of SOSalgorithm is helpful for resolving local optima stagnationThe results of this work show that although metaheuristicoptimizations have high exploration the problem of trainingan FNN needs high local optima avoidance during the wholeoptimization process The results prove that the SOS is veryeffective in training FNNs

It is worth discussing the poor performance of GA inthis subsection The rate of crossover and mutation are twospecific tuning parameters inGA dependent on the empiricalvalue for particular problems This is the reason why GAfailed to provide good results for all the datasets In thecontrast SOS uses only the two parameters of maximumevaluation number and population size so it avoids the riskof compromised performance due to improper parametertuning and enhances performance stability Easy to fall intolocal optimal and low efficiency in the latter of search periodare the other two reasons for the poor performance of GAAnother finding in the results is the good performances ofBBO and MVO which are benefit from the mechanism forsignificant abrupt movements in the search space

The reason for the high classification rate provided bySOS is that this algorithm is equipped with adaptive threephases to smoothly balance exploration and exploitationThefirst two phases are devoted to exploration and the rest toexploitation And the three phases are simple to operate withonly simple mathematical operations to code In additionSOS uses greedy selection at the end of each phase to selectwhether to retain the old ormodified solution Consequentlythere are always guiding search agents to the most promisingregions of the search space

6 Conclusions

In this paper the recently proposed SOS algorithm wasemployed for the first time as a FNN trainer The high levelof exploration and exploitation of this algorithm were themotivation for this studyThe problem of training a FNNwasfirst formulated for the SOS algorithm This algorithm wasthen employed to optimize the weights and biases of FNNsso as to get high classification accuracy The obtained resultsof eight datasets with different characteristic show that theproposed approach is efficient to train FNNs compared to

other training methods that have been used in the literaturesCS PSO GA MVO GSA and BBO The results of MSEover 20 runs show that the proposed approach performsbest in terms of convergence rate and is robust since thevariances are relatively small Furthermore by comparingthe classification accuracy of the testing datasets using theoptimal weights and biases SOS has advantage over the otheralgorithms employed In addition the significance of theresults is statistically confirmed by usingWilcoxonrsquos rank sumtest which demonstrates that the results have not occurred bycoincidence It can be concluded that SOS is suitable for beingused as a training method for FNNs

For future work the SOS algorithm will be extendedto find the optimal number of layers hidden nodes andother structural parameters of FNNs More elaborate tests onhigher dimensional problems and large number of datasetswill be done Other types of neural networks such as radialbasis function (RBF) neural network are worth furtherresearch

Competing Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work is supported by National Science Foundationof China under Grants no 61463007 and 61563008 andthe Guangxi Natural Science Foundation under Grant no2016GXNSFAA380264

References

[1] J A Anderson An Introduction to Neural Networks MIT Press1995

[2] G P Zhang and M Qi ldquoNeural network forecasting for sea-sonal and trend time seriesrdquo European Journal of OperationalResearch vol 160 no 2 pp 501ndash514 2005

[3] C-J Lin C-H Chen and C-Y Lee ldquoA self-adaptive quantumradial basis function network for classification applicationsrdquo inProceedings of the IEEE International Joint Conference on NeuralNetworks vol 4 pp 3263ndash3268 IEEE Budapest Hungary July2004

Computational Intelligence and Neuroscience 13

[4] X-F Wang and D-S Huang ldquoA novel density-based clusteringframework by using level set methodrdquo IEEE Transactions onKnowledge and Data Engineering vol 21 no 11 pp 1515ndash15312009

[5] N A Mat Isa and W M F W Mamat ldquoClustered-Hybrid Mul-tilayer Perceptron network for pattern recognition applicationrdquoApplied SoftComputing Journal vol 11 no 1 pp 1457ndash1466 2011

[6] D S Huang Systematic Theory of Neural Networks for PatternRecognition Publishing House of Electronic Industry of China1996 (Chinese)

[7] Z-Q Zhao D-S Huang and B-Y Sun ldquoHuman face recogni-tion based onmulti-features using neural networks committeerdquoPattern Recognition Letters vol 25 no 12 pp 1351ndash1358 2004

[8] O Nelles ldquoNonlinear system identification from classicalapproaches to neural networks and fuzzy modelsrdquo AppliedTherapeutics vol 6 no 7 pp 21ndash717 2001

[9] B Malakooti and Y Q Zhou ldquoApproximating polynomialfunctions by feedforward artificial neural networks capacityanalysis and designrdquo Applied Mathematics and Computationvol 90 no 1 pp 27ndash51 1998

[10] A Cochocki and R UnbehauenNeural Networks for Optimiza-tion and Signal Processing John Wiley amp Sons New York NYUSA 1993

[11] X-F Wang D-S Huang and H Xu ldquoAn efficient local Chan-Vese model for image segmentationrdquo Pattern Recognition vol43 no 3 pp 603ndash618 2010

[12] D-S Huang ldquoA constructive approach for finding arbitraryroots of polynomials by neural networksrdquo IEEE Transactions onNeural Networks vol 15 no 2 pp 477ndash491 2004

[13] G Bebis and M Georgiopoulos ldquoFeed-forward neural net-worksrdquo IEEE Potentials vol 13 no 4 pp 27ndash31 1994

[14] T Kohonen ldquoThe self-organizing maprdquo Neurocomputing vol21 no 1ndash3 pp 1ndash6 1998

[15] J Park and I W Sandberg ldquoApproximation and radial-basis-function networksrdquo Neural Computation vol 3 no 2 pp 246ndash257 1993

[16] D-S Huang ldquoRadial basis probabilistic neural networksmodeland applicationrdquo International Journal of Pattern Recognitionand Artificial Intelligence vol 13 no 7 pp 1083ndash1101 1999

[17] D-S Huang and J-X Du ldquoA constructive hybrid structureoptimization methodology for radial basis probabilistic neuralnetworksrdquo IEEE Transactions onNeural Networks vol 19 no 12pp 2099ndash2115 2008

[18] G Dorffner ldquoNeural networks for time series processingrdquoNeural Network World vol 6 no 1 pp 447ndash468 1996

[19] S Ghosh-Dastidar and H Adeli ldquoSpiking neural networksrdquoInternational Journal of Neural Systems vol 19 no 4 pp 295ndash308 2009

[20] D E Rumelhart E G Hinton and R J Williams ldquoLearningrepresentations by back-propagating errorsrdquo in Neurocomput-ing Foundations of Research vol 5 no 3 pp 696ndash699 MITPress Cambridge Mass USA 1988

[21] N Zhang ldquoAn online gradient method with momentum fortwo-layer feedforward neural networksrdquo Applied Mathematicsand Computation vol 212 no 2 pp 488ndash498 2009

[22] J H HollandAdaptation in Natural and Artificial Systems MITPress Cambridge Mass USA 1992

[23] D J Montana and L Davis ldquoTraining feedforward neuralnetworks using genetic algorithmsrdquo in Proceedings of the Inter-national Joint Conferences on Artificial Intelligence (IJCAI rsquo89)vol 89 pp 762ndash767 Detroit Mich USA 1989

[24] C R Hwang ldquoSimulated annealing theory and applicationsrdquoActa Applicandae Mathematicae vol 12 no 1 pp 108ndash111 1988

[25] D Shaw andWKinsner ldquoChaotic simulated annealing inmulti-layer feedforward networksrdquo in Proceedings of the 1996Canadian Conference on Electrical and Computer Engineering(CCECErsquo96) part 1 (of 2) pp 265ndash269 University of CalgaryAlberta Canada May 1996

[26] J-R Zhang J Zhang T-M Lok and M R Lyu ldquoA hybrid par-ticle swarm optimization-back-propagation algorithm for feed-forward neural network trainingrdquo Applied Mathematics ampComputation vol 185 no 2 pp 1026ndash1037 2007

[27] E Rashedi H Nezamabadi-Pour and S Saryazdi ldquoGSA agravitational search algorithmrdquo Information Sciences vol 179no 13 pp 2232ndash2248 2009

[28] S A Mirjalili S Z M Hashim and HM Sardroudi ldquoTrainingfeedforward neural networks using hybrid particle swarm opti-mization and gravitational search algorithmrdquo Applied Mathe-matics and Computation vol 218 no 22 pp 11125ndash11137 2012

[29] Z Beheshti S M H Shamsuddin E Beheshti and S SYuhaniz ldquoEnhancement of artificial neural network learningusing centripetal accelerated particle swarm optimization formedical diseases diagnosisrdquo Soft Computing vol 18 no 11 pp2253ndash2270 2014

[30] L A M Pereira D Rodrigues P B Ribeiro J P Papa and SA T Weber ldquoSocial-spider optimization-based artificial neuralnetworks training and its applications for Parkinsonrsquos diseaseidentificationrdquo in Proceedings of the 27th IEEE InternationalSymposiumonComputer-BasedMedical Systems (CBMS rsquo14) pp14ndash17 IEEE New York NY USA May 2014

[31] E Uzlu M Kankal A Akpınar and T Dede ldquoEstimates ofenergy consumption in Turkey using neural networks with theteaching-learning-based optimization algorithmrdquo Energy vol75 pp 295ndash303 2014

[32] P A Kowalski and S Łukasik ldquoTraining neural networks withKrill Herd algorithmrdquo Neural Processing Letters vol 44 no 1pp 5ndash17 2016

[33] H Faris I Aljarah and S Mirjalili ldquoTraining feedforwardneural networks using multi-verse optimizer for binary classifi-cation problemsrdquoApplied Intelligence vol 45 no 2 pp 322ndash3322016

[34] J Nayak BNaik andH S Behera ldquoA novel nature inspired fire-fly algorithm with higher order neural network performanceanalysisrdquo Engineering Science and Technology vol 19 no 1 pp197ndash211 2016

[35] C Blum and K Socha ldquoTraining feed-forward neural networkswith ant colony optimization an application to pattern clas-sificationrdquo in Proceedings of the 5th International Conferenceon Hybrid Intelligent Systems (HIS rsquo05) pp 233ndash238 IEEEComputer Society Rio de Janeiro Brazil 2005

[36] K Socha and C Blum ldquoAn ant colony optimization algorithmfor continuous optimization application to feed-forward neuralnetwork trainingrdquo Neural Computing and Applications vol 16no 3 pp 235ndash247 2007

[37] E Valian S Mohanna and S Tavakoli ldquoImproved cuckoosearch algorithm for feedforward neural network trainingrdquoInternational Journal of Artificial IntelligenceampApplications vol2 no 3 pp 36ndash43 2011

[38] D Karaboga B Akay and C Ozturk ldquoArtificial Bee Colony(ABC) optimization algorithm for training feed-forward neuralnetworksrdquo in Proceedings of the International Conference onModeling Decisions for Artificial Intelligence (MDAI rsquo07) pp318ndash329 Springer Kitakyushu Japan 2007

14 Computational Intelligence and Neuroscience

[39] C Ozturk andD Karaboga ldquoHybrid Artificial Bee Colony algo-rithm for neural network trainingrdquo in Proceedings of the IEEECongress of Evolutionary Computation (CEC rsquo11) pp 84ndash88IEEE New Orleans La USA June 2011

[40] L A M Pereira L C S Afonso J P Papa et al ldquoMultilayerperceptron neural networks training through charged systemsearch and its application for non-technical losses detectionrdquoin Proceedings of the 2nd IEEE Latin American Conference onInnovative Smart Grid Technologies (ISGT LA rsquo13) pp 1ndash6 IEEESao Paulo Brazil April 2013

[41] S Mirjalili ldquoHow effective is the Grey Wolf optimizer in train-ing multi-layer perceptronsrdquo Applied Intelligence vol 43 no 1pp 150ndash161 2015

[42] P Moallem and N Razmjooy ldquoA multi layer perceptron neuralnetwork trained by invasive weed optimization for potato colorimage segmentationrdquo Trends in Applied Sciences Research vol7 no 6 pp 445ndash455 2012

[43] S Mirjalili S M Mirjalili and A Lewis ldquoLet a biogeography-based optimizer train yourmulti-layer perceptronrdquo InformationSciences An International Journal vol 269 pp 188ndash209 2014

[44] M-Y Cheng and D Prayogo ldquoSymbiotic organisms searcha new metaheuristic optimization algorithmrdquo Computers ampStructures vol 139 pp 98ndash112 2014

[45] M Cheng D Prayogo and D Tran ldquoOptimizing multiple-resources leveling in multiple projects using discrete symbioticorganisms searchrdquo Journal of Computing in Civil Engineeringvol 30 no 3 2016

[46] R Eki F Vincent Y S Budi and A A N Perwira Redi ldquoSymbi-otic Organism Search (SOS) for solving the capacitated vehiclerouting problemrdquo World Academy of Science Engineering andTechnology International Journal of Mechanical AerospaceIndustrial Mechatronic and Manufacturing Engineering vol 9no 5 pp 807ndash811 2015

[47] D Prasad and V Mukherjee ldquoA novel symbiotic organismssearch algorithm for optimal power flow of power system withFACTS devicesrdquo Engineering Science amp Technology vol 19 no1 pp 79ndash89 2016

[48] M Abdullahi M A Ngadi and S M Abdulhamid ldquoSymbioticorganism Search optimization based task scheduling in cloudcomputing environmentrdquo Future Generation Computer Systemsvol 56 pp 640ndash650 2016

[49] S Verma S Saha and V Mukherjee ldquoA novel symbioticorganisms search algorithm for congestion management inderegulated environmentrdquo Journal of Experimental and Theo-retical Artificial Intelligence pp 1ndash21 2015

[50] D-H Tran M-Y Cheng and D Prayogo ldquoA novel MultipleObjective Symbiotic Organisms Search (MOSOS) for time-cost-labor utilization tradeoff problemrdquo Knowledge-Based Sys-tems vol 94 pp 132ndash145 2016

[51] V F Yu A A Redi C Yang E Ruskartina and B SantosaldquoSymbiotic organisms search and two solution representationsfor solving the capacitated vehicle routing problemrdquo AppliedSoft Computing 2016

[52] A Panda and S Pani ldquoA SymbioticOrganisms Search algorithmwith adaptive penalty function to solve multi-objective con-strained optimization problemsrdquo Applied Soft Computing vol46 pp 344ndash360 2016

[53] S Banerjee and S Chattopadhyay ldquoPower optimization ofthree dimensional turbo code using a novel modified symbioticorganism search (MSOS) algorithmrdquoWireless Personal Commu-nications pp 1ndash28 2016

[54] B Das V Mukherjee and D Das ldquoDG placement in radial dis-tribution network by symbiotic organisms search algorithm forreal power loss minimizationrdquo Applied Soft Computing vol 49pp 920ndash936 2016

[55] M K Dosoglu U Guvenc S Duman Y Sonmez andH T Kahraman ldquoSymbiotic organisms search optimizationalgorithm for economicemission dispatch problem in powersystemsrdquo Neural Computing and Applications 2016

[56] R Hecht-Nielsen ldquoKolmogorovrsquos mapping neural networkexistence theoremrdquo in Proceedings of the IEEE 1st InternationalConference on Neural Networks vol 3 pp 11ndash13 IEEE Press SanDiego Calif USA 1987

[57] J-R Zhang J Zhang T-M Lok and M R Lyu ldquoA hybridparticle swarm optimization-back-propagation algorithm forfeedforward neural network trainingrdquoAppliedMathematics andComputation vol 185 no 2 pp 1026ndash1037 2007

[58] C L Blake andC JMerz UCI Repository ofMachine LearningDatabases httparchiveicsuciedumldatasetshtml

[59] I-C Yeh K-J Yang and T-M Ting ldquoKnowledge discoveryon RFM model using Bernoulli sequencerdquo Expert Systems withApplications vol 36 no 3 pp 5866ndash5871 2009

[60] R S Siegler ldquoThree aspects of cognitive developmentrdquo Cogni-tive Psychology vol 8 no 4 pp 481ndash520 1976

[61] R A Fisher ldquoHas Mendelrsquos work been rediscoveredrdquo Annals ofScience vol 1 no 2 pp 115ndash137 1936

[62] F Wilcoxon ldquoIndividual comparisons by ranking methodsrdquoBiometrics Bulletin vol 1 no 6 pp 80ndash83 1945

[63] J Derrac S Garcıa D Molina and F Herrera ldquoA practicaltutorial on the use of nonparametric statistical tests as amethodology for comparing evolutionary and swarm intelli-gence algorithmsrdquo Swarm and Evolutionary Computation vol1 no 1 pp 3ndash18 2011

Submit your manuscripts athttpwwwhindawicom

Computer Games Technology

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Distributed Sensor Networks

International Journal of

Advances in

FuzzySystems

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014

International Journal of

ReconfigurableComputing

Hindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

Artificial Intelligence

HindawithinspPublishingthinspCorporationhttpwwwhindawicom Volumethinsp2014

Advances inSoftware EngineeringHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporation

httpwwwhindawicom Volume 2014

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

ArtificialNeural Systems

Advances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

RoboticsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Computational Intelligence and Neuroscience

Industrial EngineeringJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Modelling amp Simulation in EngineeringHindawi Publishing Corporation httpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Human-ComputerInteraction

Advances in

Computer EngineeringAdvances in

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Page 11: Research Article Training Feedforward Neural Networks Using …downloads.hindawi.com/journals/cin/2016/9063065.pdf · 2019-07-30 · Research Article Training Feedforward Neural Networks

Computational Intelligence and Neuroscience 11

Table 3 Accuracy results

Dataset AlgorithmSOS MVO GSA PSO BBO CS GA

BloodBest 827451 811765 780392 803922 811765 788235 768627Worst 776471 800000 333333 776471 725490 745098 670588Mean 798039 807451 741765 792157 775294 769608 727843Rank 1 2 6 4 2 5 7

Balance ScaleBest 920188 929577 910798 892019 915493 906103 802817Worst 868545 892019 859155 835681 882629 821596 380282Mean 900235 914319 877465 867136 901643 863146 597653Rank 2 1 4 6 3 5 7

Habermanrsquos SurvivalBest 817308 826923 817308 826923 826923 817308 788462Worst 711538 759615 740385 769231 692308 740385 653846Mean 760577 795673 791827 804808 770192 782692 745673Rank 4 1 4 1 1 6 7

Liver DisordersBest 754237 762712 644068 720339 728814 677966 550847Worst 661017 720339 67797 593220 457627 474576 271186Mean 710593 741949 483051 664831 651271 561441 430508Rank 2 1 6 4 3 5 7

SeedsBest 957746 957746 943662 929577 943662 915493 676056Worst 873239 901408 859155 788732 619718 774648 281690Mean 913380 934507 903521 874648 873239 822535 511268Rank 1 1 3 5 3 6 7

WineBest 983607 983607 1000000 1000000 1000000 918033 491803Worst 918033 967213 918033 885246 590164 786885 180328Mean 954918 979508 961475 960656 904918 833607 326230Rank 4 4 1 1 1 6 7

IrisBest 980392 980392 980392 980392 980392 980392 980392Worst 647059 980392 333333 529412 294118 529412 78431Mean 920588 980392 937255 914706 829412 850000 560784Rank 1 1 1 1 1 1 1

Statlog (Heart)Best 858696 880435 869565 869565 804348 836957 706522Worst 771739 771739 336957 771739 663043 684783 391304Mean 822283 826087 791848 828804 755435 774457 505435Rank 4 1 2 2 6 5 7

In order to draw a statistically meaningful conclusion testsare performed on the optimal fitness for training datasetsand P values are computed as shown in Table 4 Rank sumtests the null hypothesis that the two datasets are samplesfrom continuous distributions with equal medians againstthe alternative that they are not Almost all values reportedin Table 4 are less than 005 (5 significant level) which isstrong evidence against the null hypothesis Therefore suchevidence indicates that SOS results are statistically significant

and that it has not occurred by coincidence (ie due tocommon noise contained in the process)

54 Analysis of the Results Statistically speaking the SOSalgorithm provides superior local avoidance and the highclassification accuracy in training FNNs According to themathematical formulation of the SOS algorithm the firsttwo interaction phases are devoted to exploration of thesearch space This promotes exploration of the search space

12 Computational Intelligence and Neuroscience

Table 4 P values produced by Wilcoxonrsquos rank sum test for equal medians

Dataset SOS versusCS PSO GA MVO GSA BBO

Blood 680119864 minus 08 680119864 minus 08 680119864 minus 08 307119864 minus 06 680119864 minus 08 468119864 minus 05Balance Scale 680119864 minus 08 680119864 minus 08 680119864 minus 08 975119864 minus 06 680119864 minus 08 229119864 minus 01Habermanrsquos Survival 680119864 minus 08 680119864 minus 08 680119864 minus 08 179119864 minus 04 680119864 minus 08 256119864 minus 03Liver Disorders 680119864 minus 08 680119864 minus 08 680119864 minus 08 126119864 minus 01 680119864 minus 08 135119864 minus 03Seeds 680119864 minus 08 680119864 minus 08 680119864 minus 08 399119864 minus 06 680119864 minus 08 163119864 minus 03Wine 678119864 minus 08 680119864 minus 08 680119864 minus 08 148119864 minus 03 105119864 minus 06 598119864 minus 01Iris 296119864 minus 06 647119864 minus 08 647119864 minus 08 762119864 minus 07 647119864 minus 08 573119864 minus 05Statlog (Heart) 680119864 minus 08 680119864 minus 08 680119864 minus 08 680119864 minus 08 680119864 minus 08 604119864 minus 03

that leads to finding the optimal weights and biases Forthe exploitation phase the third interaction phase of SOSalgorithm is helpful for resolving local optima stagnationThe results of this work show that although metaheuristicoptimizations have high exploration the problem of trainingan FNN needs high local optima avoidance during the wholeoptimization process The results prove that the SOS is veryeffective in training FNNs

It is worth discussing the poor performance of GA inthis subsection The rate of crossover and mutation are twospecific tuning parameters inGA dependent on the empiricalvalue for particular problems This is the reason why GAfailed to provide good results for all the datasets In thecontrast SOS uses only the two parameters of maximumevaluation number and population size so it avoids the riskof compromised performance due to improper parametertuning and enhances performance stability Easy to fall intolocal optimal and low efficiency in the latter of search periodare the other two reasons for the poor performance of GAAnother finding in the results is the good performances ofBBO and MVO which are benefit from the mechanism forsignificant abrupt movements in the search space

The reason for the high classification rate provided bySOS is that this algorithm is equipped with adaptive threephases to smoothly balance exploration and exploitationThefirst two phases are devoted to exploration and the rest toexploitation And the three phases are simple to operate withonly simple mathematical operations to code In additionSOS uses greedy selection at the end of each phase to selectwhether to retain the old ormodified solution Consequentlythere are always guiding search agents to the most promisingregions of the search space

6. Conclusions

In this paper, the recently proposed SOS algorithm was employed for the first time as an FNN trainer. The high level of exploration and exploitation of this algorithm was the motivation for this study. The problem of training an FNN was first formulated for the SOS algorithm, which was then employed to optimize the weights and biases of FNNs so as to obtain high classification accuracy. The results obtained on eight datasets with different characteristics show that the proposed approach trains FNNs efficiently compared to the other training methods used in the literature: CS, PSO, GA, MVO, GSA, and BBO. The MSE results over 20 runs show that the proposed approach performs best in terms of convergence rate and is robust, since the variances are relatively small. Furthermore, a comparison of the classification accuracy on the testing datasets using the optimal weights and biases shows that SOS has an advantage over the other algorithms employed. In addition, the significance of the results is statistically confirmed by Wilcoxon's rank sum test, which demonstrates that the results have not occurred by coincidence. It can be concluded that SOS is suitable for use as a training method for FNNs.

In future work, the SOS algorithm will be extended to find the optimal number of layers, hidden nodes, and other structural parameters of FNNs. More elaborate tests on higher-dimensional problems and a larger number of datasets will be carried out. Other types of neural networks, such as the radial basis function (RBF) neural network, are also worth further research.

Competing Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is supported by the National Science Foundation of China under Grants no. 61463007 and 61563008 and the Guangxi Natural Science Foundation under Grant no. 2016GXNSFAA380264.

