

Bacterial foraging trained wavelet neural

networks: Application to bankruptcy

prediction in banks

Project Report

Institute for Research and Development in Banking Technology

(IDRBT)

Road No. 1, Castle Hills, Masab Tank,

Hyderabad-500057

Project Supervisor:

Dr. V. Ravi

(Assistant Professor, IDRBT)

Submitted by:

Paramjeet (07HS2004)

2nd year Student

Integrated M. Sc. in Economics

Department of Humanities and Social Sciences

IIT Kharagpur

Kharagpur West Bengal 721302


CERTIFICATE

This is to certify that this project has been completed to my satisfaction and that the goals set at the outset of this endeavor have been pursued to the best of the student's abilities and resources. I hereby allow this project to be presented for evaluation with my full consent.

Supervisor:

Dr. V. Ravi

(Assistant Professor, IDRBT )


ACKNOWLEDGEMENT

I would like to thank my supervisor, Dr. V. Ravi, who guided me through the project and helped me sort out all the problems, technical or otherwise; without his support, the project would not have reached its present state.

I would also like to thank Mr. Nikunj Chauhan (former M.Tech student at IDRBT) for helping me develop the final algorithm.

Paramjeet

Project Supervisor

Dr. V. Ravi


Table of contents

1. Certificate
2. Acknowledgement
3. Abstract and Keywords
4. Nomenclature and Abbreviations
5. Introduction
6. Bacterial Foraging Technique
7. BFOA Algorithm
8. Wavelet Neural Network
9. Training of WNN with BFT
10. Bankruptcy Prediction
11. Results & Discussion
12. Conclusion


ABSTRACT

The present report proposes training a wavelet neural network (WNN) with the recently proposed bacterial foraging technique (BFT) optimization algorithm in order to predict bankruptcy in banks. The translation and dilation parameters and the weights connecting the different layers of the WNN are tuned using the BFT algorithm. The resulting neural network is called BFTWNN. The performance of BFTWNN is compared with that of the threshold accepting trained wavelet neural network (TAWNN) [Vinay Kumar et al. [38]] and the original WNN. The efficacy of BFTWNN is tested on bankruptcy prediction datasets, viz. US banks, Turkish banks and Spanish banks, with full features. Further, it is also tested on benchmark datasets such as Iris, Wine and Wisconsin Breast Cancer with full features. The whole experimentation is conducted using the 10-fold cross-validation method. BFTWNN outperformed TAWNN and WNN on the benchmark dataset problems by a good margin, and it yielded results comparable to the Differential Evolution Wavelet Neural Network (DEWNN) developed by Chauhan et al. [26].

Keywords: Bacterial Foraging Technique, Wavelet Neural Network, Bankruptcy Prediction, Classification, Bacterial foraging trained wavelet neural network (BFTWNN), Threshold accepting trained wavelet neural network (TAWNN).


Nomenclature and abbreviations

Nomenclature

n_c    Number of chemotactic steps
n_s    Number of swim steps
n_r    Number of reproduction steps
n_ed   Number of elimination-dispersal steps
p_ed   Probability of elimination and dispersal
p      Dimension of the search space

Abbreviations

BFT      Bacterial Foraging Technique
WNN      Wavelet Neural Network
NWT      Non-decimated Wavelet Transform
ANN      Artificial Neural Network
BFTWNN   Bacterial Foraging Trained Wavelet Neural Network
TAWNN    Threshold Accepting trained Wavelet Neural Network
DEWNN    Differential Evolution Wavelet Neural Network
BPNN     Back Propagation Neural Network
AUC      Area Under the Receiver Operating Characteristic curve
CAMELS   Capital adequacy, Asset quality, Management expertise, Earning strength, Liquidity, Sensitivity to market risk
NRMSE    Normalized Root Mean Square Error


1. INTRODUCTION

To tackle complex real-world search problems, scientists have been drawing inspiration from nature and natural creatures over the years. Darwinian evolution, the group behavior of social insects and the foraging strategy of microbial organisms are some examples in this category. The core of all these animal search strategies is optimization: the animals usually try to maximize certain quantities in their searching strategy.

The Bacterial Foraging Technique is based entirely on the foraging strategy of the Escherichia coli (E. coli) bacterium found in our intestine. It was proposed by Passino [20] in 2002 and is basically an evolutionary algorithm. Over the years, animals have developed foraging strategies that maximize a function like E/T, where E is the energy obtained from a prey and T is the time taken during the whole process (i.e., searching for the prey, locating it and digesting it). The maximization of this function ensures that the animals get more time for other activities like fighting, fleeing, mating and shelter building. Animals with a better foraging strategy are likely to survive, while animals with a poor foraging strategy get eliminated. A key advantage of BFT is that, like other metaheuristics, it is a derivative-free method.

BFT has gained much popularity and is finding wide acceptance for solving a whole range of problems. Acharya et al. [1] applied BFT to convert non-Gaussian data to an independent linear form in order to recover all the components of a given source. Dasgupta et al. [34] applied adaptive computational chemotaxis in BFT to address BFT's slow convergence near the global minimum and to free bacteria trapped in local minima. Ulagammai et al. [37] applied BFT to train a feed-forward neural network preceded by wavelet transforms for short-term load forecasting. The inputs are fed in as a time series signal; the non-decimated wavelet transform (NWT) is used as the pre-signal processor, decomposing the signal into a number of wavelet coefficients, which are then fed to a multilayer network. The outputs are combined using wavelet recombination to produce the final output.

In this report, we propose a BFT based algorithm to train a wavelet neural network (WNN) and test its effectiveness on bank bankruptcy datasets. In this algorithm, the weights connecting the input and hidden layers and the hidden and output layers, together with the dilation and translation parameters, are updated using the BFT algorithm.

2. Overview of the BFT algorithm

BFT is based entirely on the foraging technique of the E. coli (Escherichia coli) bacterium found in the lower intestine of warm-blooded organisms. It follows a saltatory search method for locating nutrients. The bacterium has flagella on its body to facilitate movement, and the motion of the flagella decides in which direction it moves. If the flagella rotate clockwise, they move independently of one another, resulting in what we call the tumble step; this step is used for searching nutrient-rich places, or for moving away from harmful substances (such as various ions or acidic or basic environments). If, on the other hand, the flagella rotate anticlockwise, they form a bundle and propel the bacterium further in its current direction; this is called the swim step, and it is taken when the bacterium is moving in the direction of increasing nutrient concentration. The tumble step thus determines the direction of movement, and swim steps are taken in the direction set by the tumble.

The bacterial foraging system consists of four principal steps, namely chemotaxis, swarming, reproduction and elimination-dispersal. These steps are described briefly as follows.

2.1 Chemotaxis:

Chemotaxis refers to the step taken due to the presence of a chemical substance in the nearby area. If the substance is a nutrient, the bacterium is attracted to it and takes a step in that direction; otherwise it takes a step away in order to avoid the substance. This step consists of two types of movements: the tumble step and the swim step. The tumble step determines the direction in which the swim step is to be taken; it searches for the direction in which the nutrient concentration is increasing. After a tumble step, the bacterium swims in the direction specified by the tumble, but only up to a predefined maximum number of steps, after which it must tumble again. Note that the bacterium will not take a step if the nutrient concentration is lower than at the previous position. Suppose the bacterium is at the j-th chemotactic step, k-th reproduction step and l-th elimination-dispersal step, with step size C(i); then the movement can be represented by


θ^i(j+1, k, l) = θ^i(j, k, l) + C(i) Δ(i) / √(Δᵀ(i) Δ(i))

where Δ(i) denotes a vector in a random direction with elements in every dimension, and Δᵀ(i) represents its transpose. If the number of chemotactic steps taken is less than the specified limit, this step is repeated; otherwise the process stops.
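The tumble-and-move update above can be sketched in a few lines (a minimal illustration assuming NumPy; the function name and the use of a uniform random direction are our own):

```python
import numpy as np

def tumble_and_move(theta_i, step_size, rng):
    """One chemotactic move: pick a random direction (tumble) and take
    a step of length C(i) along its unit vector.

    theta_i is the current position of bacterium i in R^p."""
    delta = rng.uniform(-1.0, 1.0, size=theta_i.shape)  # random direction Delta(i)
    unit = delta / np.sqrt(delta @ delta)               # Delta / sqrt(Delta^T Delta)
    return theta_i + step_size * unit
```

Normalizing Δ(i) guarantees that every accepted move has length exactly C(i), regardless of the random direction drawn.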

2.2 Swarming:

An interesting behavior is shown by some bacteria, including E. coli and S. typhimurium, in which intricate and stable spatio-temporal patterns (swarms) are formed in a semisolid nutrient medium. A group of E. coli cells arranges itself in a traveling ring, moving up the nutrient gradient, when placed amidst a semisolid matrix with a single nutrient chemo-effecter. When stimulated by a high amount of succinate, the cells release an attractant, aspartate, which helps them group together and move as concentric patterns of swarms with high bacterial density. The cell-to-cell factor can be calculated with the help of the following function:

J_cc(θ, P(j, k, l)) = Σ_{i=1}^{S} J_cc(θ, θ^i(j, k, l))

                   = Σ_{i=1}^{S} [ −d_attract exp( −w_attract Σ_{m=1}^{p} (θ_m − θ_m^i)² ) ]
                   + Σ_{i=1}^{S} [ h_repellant exp( −w_repellant Σ_{m=1}^{p} (θ_m − θ_m^i)² ) ]

where d_attract is the depth of the attractant released by the cell, w_attract is the width of the attractant signal, h_repellant is the height of the repellant effect and w_repellant is a measure of the width of the repellant. J_cc(θ, P(j, k, l)) is the value added to the actual objective function (to be minimized) to produce a time-varying objective function, S is the total number of bacteria and p is the number of variables to be optimized. For a small step size, the value of the cell-to-cell factor is close to zero.
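The cell-to-cell function can be sketched as follows (a minimal illustration assuming NumPy; the default parameter values are illustrative only, not taken from the report):

```python
import numpy as np

def swarming_term(theta, positions, d_attract=0.1, w_attract=0.2,
                  h_repellant=0.1, w_repellant=10.0):
    """Cell-to-cell attraction/repulsion J_cc(theta, P) summed over the
    S bacteria in `positions` (an S x p array). The default parameter
    values are illustrative only."""
    sq_dists = np.sum((theta - positions) ** 2, axis=1)  # sum_m (theta_m - theta_m^i)^2
    attract = -d_attract * np.sum(np.exp(-w_attract * sq_dists))
    repel = h_repellant * np.sum(np.exp(-w_repellant * sq_dists))
    return attract + repel
```

With d_attract = h_repellant, the attraction and repulsion cancel exactly at zero distance, so a bacterium sitting on top of the swarm contributes nothing to the penalty.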

2.3 Reproduction:

(A) For the given k and l, and for each i = 1, 2, …, S, let

J^i_health = Σ_{j=1}^{N_c+1} J(i, j, k, l)

be the health value of bacterium i. Sort the bacteria in ascending order of J_health.


(B) J^i_health is the summation of all the nutrient values obtained over all the chemotactic steps the bacterium has taken, so a higher accumulated cost means lower health. The half of the bacteria with the worst health values is killed, and the other half undergoes reproduction; the offspring produced are placed at exactly the same location as their parents, which keeps the population size constant. If the number of reproduction steps taken is less than the specified value, this step is repeated.
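The sort-and-split rule of the reproduction step can be sketched as follows (a minimal NumPy illustration; the function and variable names are our own):

```python
import numpy as np

def reproduce(positions, health):
    """Reproduction step: J_health is accumulated cost, so the half of
    the bacteria with the highest J_health (worst health) dies and the
    healthier half is duplicated in place, keeping the population size."""
    order = np.argsort(health)                      # ascending cost = healthiest first
    survivors = positions[order[: len(positions) // 2]]
    return np.concatenate([survivors, survivors.copy()], axis=0)
```

Because each offspring is an exact copy at its parent's location, reproduction concentrates the population around the historically best-performing regions without introducing any new search points.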

2.4 Elimination and Dispersal Step:

Changes such as a rise of temperature in a local region, or a sudden increase in acidity, may displace or kill some of the bacteria. These events are random in nature, and they are simulated in the algorithm by displacing part of the population to new locations: some bacteria are chosen at random and moved elsewhere. This may help if a new location is near the global minimum region, or it may do the reverse, displacing a bacterium that was near the global minimum to another region. The probability of these events is controlled by the parameter p_ed (probability of elimination and dispersal). If the current number of elimination-dispersal events is less than the specified number, the process is repeated; otherwise this loop finishes.

The BFOA Algorithm

Parameters:

[STEP 1] Initialization:

The first step is the initialization of all the parameters p, S, N_c, N_s, N_r, N_ed, p_ed, C(i) (i = 1, 2, …, S) and θ^i (i = 1, 2, …, S):

p: dimension of the search space.
S: number of bacteria in the population.
N_c: number of chemotactic steps.
N_s: number of swim steps.
N_r: number of reproduction steps.
N_ed: number of elimination-dispersal steps.
p_ed: elimination-dispersal probability.
C(i): size of the step taken in the random direction specified by a tumble.

Algorithm:

[Step 2] Elimination-dispersal loop: l = l + 1.

[Step 3] Reproduction loop: k = k + 1.

[Step 4] Chemotaxis loop: j = j + 1.

For each bacterium i = 1, 2, …, S, take a chemotactic step as follows. Calculate the current objective function value:

J(i, j, k, l) = J(i, j, k, l) + J_cc(θ^i(j, k, l), P(j, k, l))

(i.e., add on the cell-to-cell factor discussed previously, if swarming has been chosen).

(a) Save this value so that a more favorable value may be found: J_last = J(i, j, k, l).

(b) Tumble: to simulate a tumble step, generate a random vector Δ(i) ∈ R^p, with each element Δ_m(i), m = 1, 2, …, p, chosen randomly within the optimization domain.

(c) Move: the movement of the bacterium can be represented as

θ^i(j+1, k, l) = θ^i(j, k, l) + C(i) Δ(i) / √(Δᵀ(i) Δ(i))

This moves bacterium i a step of size C(i) units in the direction of the tumble.

(d) Compute the objective function value at the new point:

J(i, j+1, k, l) = J(i, j, k, l) + J_cc(θ^i(j+1, k, l), P(j+1, k, l))

(e) Swim:

i) Let m = 0 (counter for the swim length).

ii) While m < N_s (the bacterium has not gone too far): let m = m + 1. If J(i, j+1, k, l) < J_last (a more favorable value was found), save the new value, J_last = J(i, j+1, k, l), let

θ^i(j+1, k, l) = θ^i(j+1, k, l) + C(i) Δ(i) / √(Δᵀ(i) Δ(i))

and use this θ^i(j+1, k, l) to compute the new J(i, j+1, k, l) as in (d). Else let m = N_s; this ends the while loop.

iii) Go to the next bacterium (i + 1) if i ≠ S.

[Step 5] If j < N_c, go to Step 4; in this case continue chemotaxis, since the life of the bacteria is not over.

[Step 6] Reproduction:

[a] For the given k and l, and for each i = 1, 2, …, S, let

J^i_health = Σ_{j=1}^{N_c+1} J(i, j, k, l)

be the measure of all the nutrients the bacterium has obtained during its lifetime, and also of how successful it was at avoiding noxious substances. Sort the bacteria and the chemotactic parameters C(i) in order of ascending cost J_health (higher cost means lower health).

[b] The S_r = S/2 bacteria with the highest J_health values (lowest health) die, and the remaining half undergo reproduction; the offspring produced are placed at exactly the same location as their parents.

[Step 7] If k < N_r, go to Step 3; in this case the specified number of reproduction steps has not been reached.

[Step 8] To perform the elimination-dispersal step, choose the bacteria to be eliminated (as decided by the parameter p_ed) and reinitialize them within the optimization domain. If l < N_ed, go to Step 2 (the elimination-dispersal loop) again; otherwise finish the loop. Select the minimum value obtained over all the bacteria; this gives the minimum value of the function. The flow chart of the above algorithm is given in Appendix 1.
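The nested loops of Steps 2–8 can be sketched end to end as follows (a minimal illustration assuming NumPy; the swarming term is omitted and all default parameter values are illustrative, not taken from the report):

```python
import numpy as np

def bfoa_minimize(J, p, S=20, Nc=30, Ns=4, Nr=4, Ned=2, ped=0.25,
                  step=0.1, bounds=(-5.0, 5.0), seed=0):
    """Minimal sketch of the BFOA loop of Steps 2-8. The swarming term
    is omitted and all defaults are illustrative. J maps R^p -> R."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    theta = rng.uniform(lo, hi, size=(S, p))
    best_x, best_f = None, float("inf")

    for _ in range(Ned):                    # [Step 2] elimination-dispersal loop
        for _ in range(Nr):                 # [Step 3] reproduction loop
            health = np.zeros(S)
            for _ in range(Nc):             # [Step 4] chemotaxis loop
                for i in range(S):
                    delta = rng.uniform(-1.0, 1.0, p)
                    unit = delta / np.sqrt(delta @ delta)
                    J_last = J(theta[i])
                    theta[i] += step * unit            # tumble + move
                    for _ in range(Ns + 1):            # swim while improving
                        J_new = J(theta[i])
                        if J_new < best_f:             # track best point seen
                            best_f, best_x = J_new, theta[i].copy()
                        if J_new < J_last:
                            J_last = J_new
                            theta[i] += step * unit
                        else:
                            break
                    health[i] += J_last
            order = np.argsort(health)      # [Step 6] lowest cost = healthiest
            theta = np.concatenate([theta[order[: S // 2]]] * 2)
        dispersed = rng.random(S) < ped     # [Step 8] random dispersal
        theta[dispersed] = rng.uniform(lo, hi, size=(int(dispersed.sum()), p))
    return best_x, best_f
```

For example, minimizing the sphere function J(x) = Σ x_m² in two dimensions drives best_f towards zero as the loops proceed.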

Wavelet Neural Network:

The word wavelet is due to Grossmann and Morlet [16]. Wavelets are a class of functions used to localize a given function in both space and scaling (http://mathworld.wolfram.com/wavelet.html). They have advantages over traditional Fourier methods in analyzing physical situations where the signal contains discontinuities and sharp spikes. Wavelets were developed independently in the fields of mathematics, quantum physics, electrical engineering and seismic geology. Interchanges between these fields during the last few years have led to many new wavelet applications such as image compression, radar and earthquake prediction.

A family of wavelets can be constructed from a function ψ(x), sometimes known as the "mother wavelet", which is confined to a finite interval. "Daughter wavelets" ψ_{a,b}(x) are then formed by translation (b) and dilation (a). Wavelets are especially useful for compressing image data. An individual wavelet is defined by

ψ_{a,b}(x) = |a|^{−1/2} ψ((x − b) / a)
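The mother/daughter relationship can be illustrated directly (a small sketch assuming NumPy; the Gaussian and Morlet forms anticipate equations (2) and (3) of the WNN training algorithm):

```python
import numpy as np

def daughter_wavelet(psi, a, b):
    """Return psi_{a,b}(x) = |a|^(-1/2) * psi((x - b) / a), the dilated
    (a) and translated (b) version of a mother wavelet psi."""
    return lambda x: np.abs(a) ** -0.5 * psi((x - b) / a)

# The two mother wavelets used later in the report:
gaussian = lambda t: np.exp(-t ** 2)                       # equation (3)
morlet = lambda t: np.cos(1.75 * t) * np.exp(-t ** 2 / 2)  # equation (2)
```

The prefactor |a|^{−1/2} keeps the energy (L2 norm) of every daughter wavelet equal to that of the mother wavelet.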

In the case of non-uniformly distributed training data, an efficient way of solving the problem is learning at multiple resolutions. Wavelets, in addition to forming an orthogonal basis, are capable of explicitly representing the behavior of a function at various resolutions of the input variables. Consequently, a wavelet network is first trained to learn the mapping at the coarsest resolution level; in subsequent stages, the network is trained to incorporate elements of the mapping at higher and higher resolutions. Such a hierarchical, multi-resolution approach has many attractive features for solving engineering problems, resulting in a more meaningful interpretation of the resulting mapping and more efficient training and adaptation of the network compared to conventional methods. Wavelet theory provides useful guidelines for the construction and initialization of networks, and consequently the training times are significantly reduced (http://www.ncl.ac.uk/pat/neural-networks.html).

Wavelet networks employ activation functions that are dilated and translated versions of a single function Ψ: R^d → R, where d is the input dimension (Zhang et al. [40]). This function, called the 'mother wavelet', is localized in both the space and frequency domains (Becerra et al. [5]). Based on wavelet theory, the wavelet neural network (WNN) was proposed as a universal tool for functional approximation, which shows surprising effectiveness in solving the conventional problem of poor convergence, or even divergence, encountered in other kinds of neural networks, and it can dramatically increase convergence speed (Zhang et al. [42]).

The WNN consists of three layers, namely the input layer, the hidden layer and the output layer. Each layer is fully connected to the nodes in the next layer. The numbers of input and output nodes depend on the numbers of inputs and outputs in the problem, while the number of hidden nodes can be any number from 3 to 15. The WNN is implemented here with the Gaussian wavelet function.


[Fig 1: Wavelet neural network: a fully connected input layer (x(1), x(2)), hidden layer and output layer (outputs 1–3), with weights w_ij and W_j connecting successive layers.]

The training algorithm for a WNN is as follows (Zhang et al. [42]):

1) Select the required number of hidden nodes. Initialize the dilation and translation parameters and the weights for the connections between the input and hidden layers and between the hidden and output layers. It should be kept in mind that the random values should be limited to the interval (0, 1); this gives small error values and the algorithm converges early.

2) The output value V_k of sample k, k = 1, 2, …, np, where np is the number of samples, is calculated with the following formula:


V_k = Σ_{j=1}^{nhn} W_j f( (Σ_{i=1}^{nin} w_ij x_i − b_j) / a_j ),  k = 1, 2, …, np        (1)

where nin is the number of input nodes and nhn is the number of hidden nodes.

When f(t) in (1) is taken as the Morlet mother wavelet, it has the form

f(t) = cos(1.75 t) exp(−t²/2)        (2)

and when taken as the Gaussian wavelet it becomes

f(t) = exp(−t²)        (3)

3) Reduce the prediction error by adjusting W_j, w_ij, a_j, b_j using ΔW_j, Δw_ij, Δa_j, Δb_j (see formulas (4)–(7)). In the WNN, the gradient descent algorithm is employed:

ΔW_j(t+1) = −η ∂E/∂W_j + α ΔW_j(t)        (4)

Δw_ij(t+1) = −η ∂E/∂w_ij + α Δw_ij(t)        (5)

Δa_j(t+1) = −η ∂E/∂a_j + α Δa_j(t)        (6)

Δb_j(t+1) = −η ∂E/∂b_j + α Δb_j(t)        (7)

where the error function can be taken as

E = (1/2) Σ_{k=1}^{np} (V̂_k − V_k)²        (8)

and η and α are the learning and momentum rates, respectively.

4) Return to step (2); the process is continued until E satisfies the given error criterion, at which point the training of the WNN is complete.

Some problems exist in the WNN, such as slow convergence, trapping in local minima of the search space, and oscillation (Pan et al. 2008). We propose BFTWNN to resolve these problems.
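The forward pass of equation (1) and the NRMSE objective used later for training can be sketched as follows (a minimal illustration assuming NumPy and a single output node; the function names are our own):

```python
import numpy as np

def wnn_forward(X, W, w, a, b, wavelet=lambda t: np.exp(-t ** 2)):
    """Forward pass of equation (1) with the Gaussian wavelet of (3):
    V_k = sum_j W_j * f((sum_i w_ij x_i - b_j) / a_j).

    X: (np, nin) samples; w: (nin, nhn) input-to-hidden weights;
    W: (nhn,) hidden-to-output weights; a, b: (nhn,) dilation and
    translation parameters. A single output node is assumed."""
    t = (X @ w - b) / a       # (np, nhn) wavelet arguments
    return wavelet(t) @ W     # (np,) network outputs

def nrmse(V_hat, V):
    """Normalized root mean square error, the training objective."""
    return np.sqrt(np.mean((V_hat - V) ** 2)) / (np.std(V) + 1e-12)
```

Swapping the default `wavelet` argument for the Morlet form of equation (2) changes the hidden-layer activation without touching the rest of the network.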

3. Metaheuristics used to train the WNN

3.1 Threshold accepting trained WNN (TAWNN)

The threshold accepting algorithm, originally proposed by Dueck and Scheuer [14], is a faster variant of the original simulated annealing algorithm wherein acceptance of a new move or solution is determined by a deterministic criterion rather than a probabilistic one.
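The deterministic acceptance rule that distinguishes threshold accepting from simulated annealing can be stated in one line (an illustrative sketch; the names are our own and the threshold-lowering schedule is omitted):

```python
def ta_accept(new_cost, old_cost, threshold):
    """Threshold accepting: unlike simulated annealing's probabilistic
    test, a candidate is accepted deterministically whenever it does not
    worsen the cost by more than the current threshold."""
    return new_cost < old_cost + threshold
```

In a full run the threshold starts high (accepting many uphill moves) and is gradually lowered towards zero, so the search degenerates into pure descent.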

3.2 Bacterial Foraging Technique:

BFT is a novel evolutionary algorithm proposed by Passino [20]. It is a population based optimization algorithm based entirely on the foraging method of the E. coli bacterium. A fixed number of solutions within an n-dimensional search space is initialized randomly and then evolved over time to explore the search space and locate the minima of the objective function. Inside a generation, new solutions are generated by adding a fixed step size to each solution (the chemotactic step). The half of the solutions that are better than the other half is selected in each reproduction step, and finally an elimination-dispersal step is taken to disperse bacteria to random locations.

4. Training of WNN with the BFT algorithm

Applying BFT to train the WNN modifies steps (3) and (4) of the WNN training algorithm described earlier. The output of the WNN is a function of the weights W (from the input layer to the hidden layer), the weights w (from the hidden layer to the output layer), the dilation parameters D, the translation parameters T and the input values X, i.e., Y = f(X, θ), where Y is the vector of output values and θ = (D, T, W, w). During the training phase both the input vector X and the output vector Y are known, and the synaptic weights W and w, the dilation parameters D and the translation parameters T are adapted by minimizing the network error E, so as to obtain the proper relationship from X to Y. In BFTWNN, the elements of the vectors D, T, W and w are the decision variables.

The vector θ consists of:

(i) weight values from input nodes to hidden nodes, W = {W_ij : i = 1, 2, …, nin; j = 1, 2, …, nhn}, where nin is the number of input nodes and nhn is the number of hidden nodes;

(ii) weight values from hidden nodes to output nodes, w = {w_jk : j = 1, 2, …, nhn; k = 1, 2, …, non}, where non is the number of output nodes;

(iii) dilation parameters D = (d_1, d_2, …, d_nhn);

(iv) translation parameters T = (t_1, t_2, …, t_nhn).

A population P in each generation consists of M such θ vectors, where M is the population size:

P = {θ_1, θ_2, θ_3, …, θ_M}        (9)

The initial population is randomly initialized using user-specified lower and upper bounds for the weights, dilation and translation parameters as follows:

θ_i = θ_i^min + rand(0, 1) × (θ_i^max − θ_i^min)

(for faster convergence, the initial values should lie between 0 and 1). The initial NRMSE value (represented by the function J) is stored for these initial values:

J(i, j, k, l) = J(i, j, k, l) + J_cc(θ^i(j, k, l), P(j, k, l))        (10)

Chemotaxis is basically a search step which, through the tumble and swim steps, directs the search towards potential areas of optimal solutions. In the tumble step we choose a unit random vector (choosing random values for all the weights, dilation and translation parameters and dividing them by the square root of their squared sum). This vector determines the direction in which the bacterium proceeds; note that the NRMSE value should decrease after this step is taken. The chemotactic step can be represented by the following equation:

θ^i(j+1, k, l) = θ^i(j, k, l) + C(i) Δ(i) / √(Δᵀ(i) Δ(i))        (11)

Compute J(i, j+1, k, l):

J(i, j+1, k, l) = J(i, j, k, l) + J_cc(θ^i(j+1, k, l), P(j+1, k, l))        (12)

where θ is the set of all vectors (i.e., consisting of all the weights, dilation and translation parameters), C(i) is the size of the step taken, and Δ(i) is a vector in a random direction containing random values for the input-to-hidden weights, the hidden-to-output weights and the dilation and translation parameters; i is the bacterium index, j the chemotactic index, k the reproduction index and l the elimination-dispersal index. The new NRMSE value is calculated and stored. After the tumble step, swim steps are taken in the direction of the tumble, but only up to a maximum swim length, after which a tumble step must be taken.


The next step is reproduction, in which the health value of each bacterium is calculated. For the given reproduction step counter and elimination step counter:

(a) For each i = 1, 2, …, P, let

J^i_health = Σ_{j=1}^{N_c+1} J(i, j, k, l)        (13)

be the health of bacterium i. Sort the bacteria in increasing order of J_health.

(b) The P_r = P/2 bacteria with the highest J_health values die, and the other P_r bacteria with the best values split; the copies formed are placed at exactly the same location as their parents.

The next step is elimination and dispersal. To simulate it, bacteria are chosen at random (with probability p_ed) and dispersed to random locations. The chemotactic step is repeated up to N_c times for each bacterium, the reproduction step is repeated N_r times and the elimination-dispersal step is completed N_ed times. After performing all these steps we have the M solutions retained by the bacteria, and we select the one with the minimum NRMSE value. This set of values gives our optimum weights, dilation and translation parameters, and these values are tested on the test data. We can also set another termination condition: if the change in the objective function value over two consecutive steps is less than a predefined value, the algorithm terminates.
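The encoding of the decision variables into a single θ vector, as optimized by BFT, can be sketched as follows (a minimal NumPy illustration; the parameter shapes are an assumption based on the definitions of W, w, D and T above, with a single output node):

```python
import numpy as np

def pack(W, w, a, b):
    """Flatten all WNN parameters into one theta vector: these are the
    decision variables tuned by BFT. Assumed shapes: W, a, b of length
    nhn and w of shape (nin, nhn)."""
    return np.concatenate([W.ravel(), w.ravel(), a.ravel(), b.ravel()])

def unpack(theta, nin, nhn):
    """Inverse of pack: split a flat theta back into W, w, a, b."""
    W, rest = theta[:nhn], theta[nhn:]
    w, rest = rest[: nin * nhn].reshape(nin, nhn), rest[nin * nhn:]
    a, b = rest[:nhn], rest[nhn:]
    return W, w, a, b
```

With this encoding, the objective J(θ) evaluated by each bacterium is simply the NRMSE of the WNN run with the unpacked parameters on the training data.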

5. Bankruptcy Prediction:

Bankruptcy prediction has been a subject of formal analysis since at least 1932, when FitzPatrick [15] published, in The Certified Public Accountant, a study of 20 pairs of firms, one failed and one surviving, matched by date, size and industry. He did not perform statistical analysis as is now common, but he thoughtfully interpreted the ratios and trends in the ratios; his interpretation was effectively a complex, multiple variable analysis. The prediction of bankruptcy has been the subject of extensive research since the late 1960s.

In 1967, William Beaver [4] applied t-tests to evaluate the importance of individual accounting ratios within a similar pair-matched sample. In 1968, in the first formal multiple variable analysis, Edward I. Altman [2] applied multiple discriminant analysis to a pair-matched sample. One of the most prominent early models of bankruptcy prediction is the Z-Score financial analysis tool, which is still applied today.

Banks are mostly monitored by regulators, who conduct on-site examinations on bank premises every 12–18 months, as stipulated by the Federal Deposit Insurance Corporation Improvement Act of 1991. Regulators indicate the safety and soundness of an institution using a six part rating system. This rating, referred to as the CAMELS rating, evaluates banks according to their basic functional areas: Capital adequacy, Asset quality, Management expertise, Earning strength, Liquidity, and Sensitivity to market risk. While CAMELS ratings clearly provide regulators with important information, Cole and Gunther [12] reported that these ratings decay rapidly.

Many statistical techniques, such as regression analysis and logistic regression, have been used to solve the bankruptcy prediction problem. These techniques make use of a company's financial data to predict its financial state. The bankruptcy prediction problem can also be solved using various other types of classifiers, such as case-based reasoning (Jo, Han, & Lee [18]), rough sets (McKee [24]) and data envelopment analysis (Cielen, Peters, & Vanhoof [11]), to mention a few. Recently, Ravi Kumar and Ravi [30] proposed a fuzzy rule based classifier for bankruptcy prediction. They reported that the fuzzy rule based classifier outperformed the well-known BPNN technique in the case of the US banks dataset. Cheng, Chen & Fu [10] combined an RBF network with logit analysis learning to predict financial distress. They compared the proposed technique with logit analysis and the back propagation neural network (BPNN) and found their method superior to both. Ravi Kumar and Ravi [29] proposed an ensemble classifier using a simple majority voting scheme for the bankruptcy prediction problem, based on a host of intelligent techniques such as ANFIS, RBF, SORBF1, SORBF2, Orthogonal RBF and BPNN. They reported that ANFIS, SORBF2 and BPNN are the most prominent, as they appeared in the best ensemble classifier combinations. Ravi, Ravi Kumar, Ravi Srinivas and Kasabov [34] proposed a semi-online training algorithm for radial basis function neural networks (SORBF) and applied it to bankruptcy prediction in banks.

Semi-online RBFN without linear terms performed better than techniques such as ANFIS, BPNN, RBF and Orthogonal RBF. In another work, Ravi Kumar and Ravi conducted a comprehensive review of all the works reported using statistical and intelligent techniques to solve the bankruptcy prediction problem in banks and firms during 1968-2005, comparing the techniques in terms of prediction accuracy, data sources and the timeline of each study wherever applicable. Recently, Pramodh and Ravi [28] employed a modified great deluge algorithm to train an auto-associative neural network and applied it to bankruptcy prediction. Further, Ravi, Kurniawan, Peter Nwee Kok Thai & Ravi Kumar [32] developed a novel soft computing system for bank performance prediction based on BPNN, RBF, CART, PNN, FRBC, and PCA based hybrid techniques.

Most recently, to solve bankruptcy prediction problems, Ravi and Pramodh [28] proposed a threshold accepting based training algorithm for a novel principal component neural network (PCNN) without a formal hidden layer. They employed PCNN for bankruptcy prediction problems and reported that PCNN outperformed BPNN, TANN, PCA-BPNN and PCA-TANN in terms of the area under the receiver operating characteristic curve (AUC) criterion. In PCA-BPNN and PCA-TANN, PCA is used as a preprocessor to BPNN and TANN respectively.

6. Results and discussion

The datasets analyzed in this work are three bankruptcy datasets, viz. the Turkish banks, Spanish banks and US banks datasets, and three benchmark datasets, viz. the Iris, Wine and Wisconsin breast cancer data. The Turkish banks dataset is obtained from Canbas, Cabuk & Kilic [9] and is available at http://www.tbb.org.tr/english/bulten/yillik/2000/ratios.xls. The Banks Association of Turkey published 49 financial ratios of the previous year for predicting the health of a bank in the present year. However, Canbas et al. [9] chose only 12 ratios as early warning indicators that have discriminating ability (i.e. significance level < 5%) between healthy and failed banks one year in advance. Among these variables, the 12th variable has some missing values, meaning that data for some of the banks are not given, so we filled those missing values with the mean value of the variable, following the general approach in data mining. The financial ratios, which are considered as predictor variables, are presented in Table 1. This dataset contains 40 banks, of which 22 went bankrupt and 18 were healthy. The Spanish banks data is obtained from Olmeda and Fernandez [27]. The ratios used for the failed banks were taken from the last financial statement before bankruptcy was declared, and the data for the non-failed banks were taken from 1982 statements. This dataset contains 66 banks, of which 37 went bankrupt and 29 were healthy. The US banks dataset is also obtained from Olmeda and Fernandez [27]; they obtained data on 129 banks from Moody's Industrial Manual, where the banks went bankrupt during 1975-1982. This 129-bank US dataset contains 65 bankrupt and 64 healthy banks. Again, the financial ratios used by them are presented in Table 1. The benchmark datasets are taken from the UCI repository (http://archives.ics.uci.edu/ml).
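The mean-imputation step applied to the 12th Turkish ratio can be sketched as follows; representing a ratio column as a Python list with `None` marking missing entries is an assumption of this illustration, not the format of the original spreadsheet.

```python
def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]

# Example: the gap is filled with the mean of the observed values.
impute_mean([1.0, None, 3.0])  # -> [1.0, 2.0, 3.0]
```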

The only parameter used for WNN is the number of hidden nodes. The parameters used for BFTWNN are the number of hidden nodes, the number of chemotactic steps, the number of reproduction steps, the number of elimination-dispersal steps, the number of bacteria, the number of swim steps, the step size and λ (if a dynamic step size is used). The number of bacteria is taken between 50 and 100, the number of chemotactic steps between 30 and 50, the number of reproduction steps between 20 and 40, the number of elimination-dispersal steps between 4 and 10, and the number of swim steps between 20 and 60. The λ value is taken as 400. All of these are flexible parameters and can be decreased in order to achieve faster convergence. The number of hidden nodes is taken in the range of 3-15, depending on the number of input nodes, for all three algorithms.
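Collected in one place, the ranges above look like the following; the specific values chosen here are illustrative points from each quoted range, not the exact settings used in the experiments.

```python
# Illustrative BFTWNN settings drawn from the ranges quoted in the text.
BFTWNN_PARAMS = {
    "n_bacteria": 50,        # range 50-100
    "n_chemotactic": 30,     # range 30-50
    "n_reproduction": 20,    # range 20-40
    "n_elim_dispersal": 4,   # range 4-10
    "n_swim": 20,            # range 20-60
    "lambda_": 400,          # used with the dynamic step size
    "n_hidden": 6,           # range 3-15, depending on input nodes
}
```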

All the datasets are analyzed with WNN, TAWNN and BFTWNN using 10-fold cross-validation. The average accuracy over all the folds is computed for each of the six datasets.
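The 10-fold protocol can be sketched generically; `train_fn` and `predict_fn` are hypothetical names standing in for any of the classifiers above, and the interleaved fold assignment is an illustrative choice.

```python
def cross_val_accuracy(X, y, train_fn, predict_fn, k=10):
    """Average accuracy over k folds.

    train_fn(X_train, y_train) returns a fitted model;
    predict_fn(model, X_test) returns predicted labels.
    """
    n = len(X)
    folds = [list(range(i, n, k)) for i in range(k)]  # interleaved folds
    accs = []
    for test_idx in folds:
        test_set = set(test_idx)
        train_idx = [i for i in range(n) if i not in test_set]
        model = train_fn([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = predict_fn(model, [X[i] for i in test_idx])
        correct = sum(p == y[i] for p, i in zip(preds, test_idx))
        accs.append(correct / len(test_idx))
    # The reported figure is the mean accuracy across the k folds.
    return sum(accs) / len(accs)
```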

Table 1. Financial ratios of the datasets.

S. No. Predictor variable name

Turkish banks’ data

1 Interest expenses/average profitable assets

2 Interest expenses/average non-profitable assets

3 (Share holders’ equity + total income)/(deposits + non-deposit funds)


4 Interest income/interest expenses

5 (Share holders’ equity + total income)/total assets

6 (Share holders’ equity + total income)/(total assets + contingencies & commitments)

7 Net working capital/total assets

8 (Salary and employees’ benefits + reserve for retirement)/no. of personnel

9 Liquid assets/(deposits + non-deposit funds)

10 Interest expenses/total expenses

11 Liquid assets/total assets

12 Standard capital ratio

Spanish banks’ data

1 Current assets/total assets

2 Current assets-cash/total assets

3 Current assets/loans

4 Reserves/loans

5 Net income/total assets

6 Net income/total equity capital

7 Net income/loans

8 Cost of sales/sales

9 Cash flow/loans

US banks’ data

1 Working capital/total assets

2 Retained earnings/total assets

3 Earnings before interest and taxes/total assets

4 Market value of equity/total assets

5 Sales/total assets

The average sensitivities and specificities are computed for the datasets with two-class problems. The results for the benchmark datasets are presented in Table 2 and those for the bankruptcy datasets in Table 3. It is observed that BFTWNN surpassed the WNN and TAWNN algorithms with better accuracy.

TABLE 2

Average results of 10-fold cross-validation for the benchmark datasets with all features:

Dataset   BFTWNN (%)   WNN (%)   TAWNN (%)   DEWNN (%)
Iris      95.33        94.67     95.99       97.99
Wine      95.6         91.76     92.8        97.6
WBC       97.4         95.29     95.43       97.05


TABLE 3

Average results of 10-fold cross-validation for the bankruptcy datasets with specified features:

Dataset                   BFTWNN (%)   DEWNN (%)   WNN (%)   TAWNN (%)
Turkish   Average         95           95          95        100
          Sensitivity     97.5         100         100       100
          Specificity     97.5         95          95        100
          AUC             9750         9750        9750      10000
Spanish   Average         88.33        89.99       86.67     88.33
          Sensitivity     91.66        91.66       89.67     79.66
          Specificity     86.5         93          81        90.5
          AUC             8908         9233        8533      8508
US        Average         91.47        93.33       85.83     90.83
          Sensitivity     88.9         97.323      85.82     90.46
          Specificity     85.5         89.78       87.5      91.54
          AUC             8720         9355.15     8666      9100
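The sensitivity and specificity columns in Tables 2 and 3 follow the standard two-class definitions. A minimal computation is shown below; taking the bankrupt class as the positive label 1 is an assumption of this sketch.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP)."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    return tp / (tp + fn), tn / (tn + fp)

# One bankrupt bank missed, all healthy banks recognized:
sensitivity_specificity([1, 1, 0, 0], [1, 0, 0, 0])  # -> (0.5, 1.0)
```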

7. Conclusion:

In this study BFTWNN is developed and compared with TAWNN and the original WNN on benchmark datasets, viz. the Iris, Wine and Wisconsin breast cancer datasets, as well as on bankruptcy datasets, viz. the Turkish, Spanish and US banks datasets. The results indicate that BFTWNN can be a very effective soft computing tool for classification problems: BFTWNN outperformed the other techniques on the benchmark datasets and yielded comparable results on the bank datasets. Hence the present research concludes that training a WNN with the bacterial foraging technique solves classification problems with very good accuracy.

Page 24: Bacterial foraging trained wavelet neural networks ......1 Bacterial foraging trained wavelet neural networks: Application to bankruptcy prediction in banks Project Report Institute

24

APPENDIX 1

FLOW CHART OF THE BACTERIAL FORAGING TECHNIQUE

Start: initialize all variables; set all loop counters and the bacterium index to 0.

1. Increase the elimination-dispersal loop counter: l = l + 1. If l < Ned does not hold, STOP.
2. Increase the reproduction loop counter: k = k + 1. If k < Nre does not hold, perform elimination-dispersal (for i = 1, 2, ..., S, with probability ped eliminate a bacterium and disperse it to a random location) and go back to step 1.
3. Increase the chemotactic loop counter: j = j + 1. If j < Nc does not hold, perform reproduction (kill the worse half of the population, i.e. the bacteria with higher cumulative health, and split the better half into two) and go back to step 2.
4. Otherwise perform the chemotactic step for each bacterium and return to step 3.
