

    Committee neural networks with fuzzy genetic algorithm

S.A. Jafari a,*, S. Mashohor a, M. Jalali Varnamkhasti b

a Computer and Communication Systems, Faculty of Engineering, University Putra Malaysia, 43400 Serdang, Selangor, Malaysia

b Laboratory of Applied and Computational Statistics, Institute for Mathematical Research, UPM, 43400 Serdang, Selangor, Malaysia

Article info

    Article history:

    Received 17 October 2009

    Accepted 10 January 2011

    Available online 28 January 2011

    Keywords:

    back propagation neural network

    committee neural network

    fuzzy genetic algorithm

    reservoir properties

Abstract

Combining a number of appropriate experts can improve the generalization performance of the group compared to a single network alone. There are different ways of combining the intelligent systems' outputs in the combiner of a committee neural network, such as simple averaging, gating networks, stacking, support vector machines, and genetic algorithms. Premature convergence is a classical problem in finding the optimal solution with genetic algorithms. In this paper, we propose a new technique for choosing the female chromosome during sexual selection to avoid premature convergence in a genetic algorithm. A bi-linear lifetime-allocation approach is used to label the chromosomes based on their fitness values, and these labels are then used to characterize the diversity of the population. The label of the selected male chromosome and the population diversity of the previous generation are then applied within a set of fuzzy rules to select a suitable female chromosome for recombination. Finally, we use the fuzzy genetic algorithm to combine the outputs of the experts and predict a reservoir parameter in the petroleum industry. The results show that the proposed method (fuzzy genetic algorithm) gives the smallest error and the highest correlation coefficient compared to the five individual members and the plain genetic algorithm, and produces significant information on the reliability of the permeability predictions.

© 2011 Elsevier B.V. All rights reserved.

    1. Introduction

There are several reasons for distributing a learning task among a number of individual networks. The main one is improved generalization ability, because the generalization of individual networks is not unique. The combination of several Artificial Neural Networks (ANNs) performing the same task is called an ensemble of neural networks or a committee of neural networks; when the networks are of different types, it is called a committee machine. In ensemble methods, the ensemble candidates differ, and there are a number of ways to create different individuals: varying the training data, the initial conditions, the topology of the nets, and the training algorithms. After selecting the individuals and training them, their results are combined by some method. The committee machine structure is shown in Fig. 1. In a committee machine, the expectation is that different experts converge to different local minima on the error surface, so that the overall output improves the performance (Wolpert, 1992; Efron and Tibshirani, 1993; Rezaee, 2001).

The mean square error (MSE) between an individual's output and the expected output (target) can be expressed as the bias squared plus the variance (Haykin, 1999).
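Written out (a standard decomposition; here $y(\mathbf{x})$ denotes a network's output for input $\mathbf{x}$, $T$ the target, and the expectation is taken over the ensemble of trained networks):

$$\mathbb{E}\big[(y(\mathbf{x})-T)^2\big] = \underbrace{\big(\mathbb{E}[y(\mathbf{x})]-T\big)^2}_{\text{bias}^2} + \underbrace{\mathbb{E}\big[(y(\mathbf{x})-\mathbb{E}[y(\mathbf{x})])^2\big]}_{\text{variance}}$$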

This decomposition makes it clear that we can reduce either the bias or the variance to reduce the neural network error. Unfortunately, for an individual ANN the bias is typically reduced at the cost of a large variance. The variance, however, can be reduced by an ensemble of ANNs. From the MSE equation, Naftaly et al. (1997) drew two conclusions:

(1) The bias of the ensemble-averaged function is exactly the same as that of a single NN's function.

(2) The variance of the ensemble-averaged function is less than that of a single NN's function.

These theoretical findings indicate that ensembles of ANNs can easily reduce the variance at little cost in bias. An effective approach is therefore to select or create a set of nets or experts that show high variance but low bias, because combining them reduces the variance. Several methods have been employed for creating committee members. Generally, these methods fall into three categories:

(1) Methods to select diverse training data sets from the original source data

(2) Methods to create different experts or individual neural networks

(3) Methods to combine these individuals and their results

2. Some methods for constructing committee members

In this section, we introduce some methods for committee member construction, as mentioned in Section 1. Several approaches have been used to select training data sets by


varying the source data sets; bagging, noise injection, cross-validation, stacking, and boosting are the most common techniques. Other methods that have been used by researchers to construct committee members include Fuzzy Logic (FL) with different fuzzy inference systems, the Genetic Algorithm (GA), Neuro-Fuzzy systems, and empirical formulas. Kadkhodaie-Ilkhchi et al. (2009a) used a neural network, fuzzy logic, and a fuzzy neural network as committee members. Chen and Lin (2006) used three empirical formulas as committee members. Kadkhodaie-Ilkhchi, Rezaee, et al. (2009b) used back-propagation neural networks with different training algorithms to construct a committee neural network. In this paper, we use five significant training algorithms for artificial neural networks as committee members: Levenberg–Marquardt (LM), Bayesian Regularization (BR), One Step Secant (OSS), Resilient Back Propagation (RP), and Scaled Conjugate Gradient (SCG). The following is a brief description of some important methods to create committee members.

    2.1. Bagging (Breiman, 1996)

One important method of manipulating the data set to create M training sets is bootstrap aggregation, or bagging. The basic idea of bagging is to generate a collection of experts such that every expert uses a bootstrap training data set. Given a data set X = (x1, x2, ..., xn), bootstrap sampling means creating N new data sets X1, X2, X3, ..., XN such that every Xi is generated by randomly picking n data points of X. Clearly, in creating Xi, some points of X may be repeated and some may be ignored. In bagging, we repeat this procedure to create M different training sets for M experts. The bagging method is designed to reduce the error variance, and it is very efficient for constructing a set of training data when the source data size is small.
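For illustration, a minimal bootstrap-sampling sketch in Python (NumPy arrays assumed; the function name is ours, not the paper's):

```python
import numpy as np

def bagging_sets(X, y, n_experts, seed=None):
    """Create one bootstrap training set per expert by drawing len(X)
    points with replacement; some points repeat, others are left out."""
    rng = np.random.default_rng(seed)
    n = len(X)
    sets = []
    for _ in range(n_experts):
        idx = rng.integers(0, n, size=n)  # n draws with replacement
        sets.append((X[idx], y[idx]))
    return sets
```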

    2.2. Noise injection (Raviv and Intrator, 1996)

As mentioned before, the simple bootstrap generates several training data sets from the source data, all of the same size. Efron and Tibshirani (1993) noted that the bootstrap can also be viewed as a method for simulating the noise inherent in the data, thus effectively increasing the number of training patterns. Raviv and Intrator (1996) presented another bootstrap algorithm, the Bootstrap Ensemble with Noise (BEN). In this method, a variable amount of noise is added to the input data before bootstrap sampling is used to assemble the training sets. This method can effectively reduce the variance, since the injection of noise increases the independence between the different training sets derived from the source data. Bhatt and Helle (2002) used noise injection for the construction of committee members.
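A sketch of BEN under these assumptions (Gaussian input noise with an illustrative noise_std; NumPy arrays assumed):

```python
import numpy as np

def ben_sets(X, y, n_experts, noise_std=0.05, seed=None):
    """Bootstrap Ensemble with Noise, sketched: perturb the inputs with
    Gaussian noise, then draw a bootstrap sample for each expert."""
    rng = np.random.default_rng(seed)
    n = len(X)
    sets = []
    for _ in range(n_experts):
        X_noisy = X + rng.normal(0.0, noise_std, size=X.shape)  # inject noise
        idx = rng.integers(0, n, size=n)                        # then bootstrap
        sets.append((X_noisy[idx], y[idx]))
    return sets
```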

    2.3. Cross-validation (Krogh and Vedelsby, 1995)

In this method the available data set is partitioned into M disjoint and equal subsets. We then select one of these subsets as the test set and the (M−1) remaining subsets as the training data. After M repetitions, we have M overlapping training sets and M independent test sets. Since the training sets are different, the trained networks are expected to fall into different local error minima and therefore give different results. The performance of each expert is measured on the corresponding test data set. Breiman, Friedman, Olshen, and Stone used cross-validation to prune classification-tree algorithms.
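A sketch of this partitioning (NumPy assumed; names are ours):

```python
import numpy as np

def cv_splits(n_points, m_folds, seed=None):
    """Partition indices into M disjoint, near-equal subsets and yield
    (train_idx, test_idx) pairs, holding out one fold at a time."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_points), m_folds)
    for i in range(m_folds):
        train = np.concatenate(folds[:i] + folds[i + 1:])
        yield train, folds[i]
```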

    2.4. Stacking (Wolpert, 1992)

The first part of the stacking method is similar to cross-validation. As mentioned above, there are M training sets and M test sets. We use the M training sets to train two generalizers G1 and G2, and then feed the M test sets into G1 and G2 (these outputs will be used as inputs of a second-space generalizer). The outputs of G1 and G2 together with the target values, (g1i, g2i, yi), are used as the training set of the generalizer G, the second-space generalizer.
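A sketch of assembling the second-space training set (names are ours; g1 and g2 are assumed to be trained first-level models exposing a scikit-learn-style .predict method):

```python
import numpy as np

def second_space_training_set(g1, g2, test_folds):
    """Build the (g1_i, g2_i, y_i) triples that train the second-space
    generalizer G; test_folds is a list of (X_test, y_test) pairs."""
    features, targets = [], []
    for X_test, y_test in test_folds:
        preds = np.column_stack([g1.predict(X_test), g2.predict(X_test)])
        features.append(preds)
        targets.append(y_test)
    return np.vstack(features), np.concatenate(targets)
```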

2.5. Boosting by filtering (Schapire, 1990) and AdaBoost (Freund and Schapire, 1995)

In this method, there are three experts. The first expert is trained with the M training data points of the source training data set, and the result of the first expert is applied to the second expert; the second expert is then trained with this filtered data set. After training the second expert, the training data of the source are passed through the first and second experts. Finally, the third expert is trained only on the data on which the outputs of the first and second experts disagree; that is, if the first and second experts disagree on a certain data point, this point is passed to the third expert. The final result is formed from the outputs of the three experts. Freund and Schapire (1995) and Drucker et al. (1994) have shown that the boosting algorithm is very effective in many experiments. Another boosting method is adaptive boosting (AdaBoost). In this method, the training data are selected according to a probability: if the predicted value for a data point is close to the target value, the probability of choosing that point is low; otherwise it is high. This gives poorly predicted points more chances to be used for retraining. For a classification problem we can use majority voting, and for a regression problem the result with the lowest error rate is selected. AdaBoost is sensitive to noisy data and outliers, but it is less sensitive to overfitting than most learning algorithms.
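As an illustration of these selection probabilities (a sketch of the idea, not the canonical AdaBoost weight update; function and parameter names are ours):

```python
import numpy as np

def selection_probabilities(predictions, targets, power=2.0):
    """Boosting-style selection, sketched: the worse a point is predicted,
    the more likely it is to be re-drawn for retraining. Sample with,
    e.g., rng.choice(n, size=n, p=probs)."""
    err = np.abs(np.asarray(predictions) - np.asarray(targets))
    weights = (err + 1e-12) ** power   # epsilon keeps the vector non-zero
    return weights / weights.sum()
```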

    3. Combination methods

The last stage of designing a Committee Machine (CM) is the combination of the expert outputs. Many investigations have been carried out to find combining methods that merge the expert outputs and produce the final output. In this section, we introduce some traditional combining methods for the CM; some are suitable for classifiers and some perform well in regression.

    3.1. Simple averaging (Lincoln and Skrzypek, 1990)

One of the most frequently used combination methods is simple averaging. In this method, after training the committee members, the final output is obtained by averaging the outputs of the committee members. It is easy to show by Cauchy's inequality that the Mean Square Error (MSE) of a committee machine using simple averaging is less than or equal to the average of the MSEs of the individual experts. This method is most useful when the variances of the ensemble members are different, because simple averaging can reduce the variance of the nets. The disadvantage of simple averaging is the equal weight given to every committee member, i.e. there is no difference between the weights of two committee members with low and high generalization ability.
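The inequality behind this claim (with equal weights $1/k$ and $e_j$ denoting the $j$-th expert's error on a given pattern) is

$$\Big(\frac{1}{k}\sum_{j=1}^{k} e_j\Big)^{2} \le \frac{1}{k}\sum_{j=1}^{k} e_j^{2},$$

and summing both sides over the training patterns gives $\mathrm{MSE}_{\text{committee}} \le \frac{1}{k}\sum_{j}\mathrm{MSE}_{j}$.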

[Fig. 1. Committee neural network with k members: the input X(n) is fed to networks NN 1, NN 2, ..., NN k, whose outputs y1(n), y2(n), ..., yk(n) are merged by a combiner into the final output Y(n).]



    3.2. Weighted averaging (Jacobs, 1995)

In this method, every committee member has a weight related to its generalization ability. In Jacobs (1995) the researcher introduced a gating method to determine the weight of every expert. Opitz and Shavlik (1996) used a Genetic Algorithm (GA) to determine the weight of each member. To obtain the optimal combination weights with the GA, the fitness function is defined as follows:

$$\mathrm{MSE}_{GA} = \frac{1}{n}\sum_{i=1}^{n}\left(w_1 y_{1i} + w_2 y_{2i} + \cdots + w_k y_{ki} - T_i\right)^2, \qquad \sum_{j=1}^{k} w_j = 1 \tag{1}$$

where y_{ji} is the output of the j-th network on the i-th training pattern, w_j is the weight of the j-th member, T_i is the target value of the i-th input, and n is the number of training data points.
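A sketch of this fitness evaluation (NumPy assumed; the surrounding GA loop that evolves candidate weight vectors is omitted):

```python
import numpy as np

def committee_mse(weights, member_outputs, targets):
    """Fitness for the GA weight search (Eq. (1)), sketched.

    member_outputs: (k, n) array whose j-th row holds expert j's outputs.
    weights: length-k vector, renormalized to enforce sum(w) = 1.
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                # constraint: the weights sum to one
    combined = w @ member_outputs  # w1*y1i + ... + wk*yki for each pattern i
    return np.mean((combined - targets) ** 2)
```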

    3.3. Majority voting (Hansen and Salamon, 1990)

This combination method is most popular for classification problems. If more than half of the individuals vote for a prediction, majority voting selects this prediction as the final output. Majority voting ignores the fact that networks in the minority can sometimes produce the correct result; at the combination stage, it ignores the very diversity that motivates ensembles.
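A minimal sketch of strict majority voting (standard library only; the function name is ours):

```python
from collections import Counter

def majority_vote(votes):
    """Return the prediction chosen by more than half of the experts,
    or None when no strict majority exists."""
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None
```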

    3.4. Ranking (Ho et al., 1994; Al-Ghoneim and Kumar, 1996)

This method uses experimental results obtained by a set of experts on a collection of data sets to generate a ranking of those experts (each expert has a rank associated with an input data set). The ranks of each expert are then aggregated by methods such as average rank, success-rate ratio, and significant wins to generate a final ranking of the experts. The final rank can be used to select one or more suitable experts for a test (unseen) data set (Brazdil and Soares, 2000).

There are no unique criteria for selecting among the combination methods mentioned above. The choice mainly depends on the characteristics of the particular application at hand, e.g. the nature of the application (classification or regression), the size and quality of the training data, and the errors generated in different regions of the input space. The same combination method applied to an ensemble may generate good results for a regression problem yet fail on a classification problem, and vice versa. Much work has been done on combining methods in ensemble approaches. Major contributions include weighted majority voting (Kuncheva, 2004), decision templates (Kuncheva et al., 2001), naive Bayesian fusion (Xu et al., 1992), Dempster–Shafer combination (Ahmadzadeh and Petrou, 2003), and the fuzzy integral (Cho and Kim, 1995).

    4. Fuzzy genetic algorithm (FGA) for combining

The Genetic Algorithm (GA) is a search and optimization technique that mimics some of the processes of natural selection and evolution. In optimization, when a GA fails to find the global optimum, the problem is often attributed to premature convergence, which means that the sampling process converges on a local optimum rather than the global optimum. Sexual selection by means of female preferences has promoted the evolution of complex male ornaments in many animal groups. A sex-determination system is a biological system that determines the development of sexual characteristics in an organism. Most sexual organisms have two sexes. In many cases, sex determination is genetic: males and females have different alleles or even different genes that determine their sexual morphology. In a classical GA, chromosomes reproduce asexually: any two chromosomes may be parents in crossover. Gender division and sexual selection inspired a model of gendered GA in which crossover takes place only between chromosomes of opposite sexes.

In this study, a relation between age and fitness, as in biological systems, affecting the selection procedure is proposed. A bi-linear lifetime-allocation approach is used to label the chromosomes based on their fitness values, and these labels are then used to characterize the diversity of the population. Inspired by the non-genetic sex-determination system that exists in some species of reptiles, including alligators and some turtles, where sex is determined by the temperature at which the egg is incubated, we divide the population into two groups, male and female, so that males and females can be selected in an alternating way. In each generation, the layout of the selection of males and females is different. During sexual selection, the male chromosome is selected randomly. The label of the selected male chromosome and the population diversity of the previous generation are then applied within a set of fuzzy rules to select a suitable female chromosome (Jalali and Lee, 2009).

Fuzzy systems are encountered in numerous areas of application. Fuzzy rules, for example, viewed as a generic mechanism of granular knowledge representation, are positioned at the center of knowledge-based systems. A fuzzy IF-THEN rule consists of an IF part (antecedent) and a THEN part (consequent). The antecedent is a combination of terms, whereas the consequent is exactly one term. In the antecedent, the terms can be combined using fuzzy conjunction, disjunction, and negation. A term is an expression of the form X = T, where X is a linguistic variable and T is one of its linguistic terms. In this paper, we use the linguistic variable age for chromosomes. Fig. 2 describes the linguistic variable age, where Infant, Teenager, Adult, and Elderly are the linguistic values.

The system applied in our study uses triangular membership functions, the (minimum) intersection operator, and the correlation-product inference procedure. Defuzzification of the outputs was performed using the fuzzy centroid method described by Kosko (1992). To find the membership function, we use the fitness value of each chromosome and the minimum, maximum, and average fitness values of the population in each generation. Each chromosome has its own label, determined by the age function. Let

$$\alpha = \frac{f_i - f_{\min}}{f_{avr} - f_{\min}}, \qquad \beta = \frac{f_i - f_{avr}}{f_{\max} - f_{avr}}, \qquad \gamma = f_{avr} - f_i \tag{2}$$

where f_i is the fitness value of chromosome i, and f_avr, f_min, and f_max are the average, minimum, and maximum fitness values of the population. Then the age function is:

$$\mathrm{age}(c_i) = \begin{cases} L + \eta\,\alpha, & \gamma \ge 0 \\ \mu + \eta\,\beta, & \gamma < 0 \end{cases} \tag{3}$$

    or

$$\mathrm{age}(c_i) = \begin{cases} U - \eta\,\alpha, & \gamma \ge 0 \\ \mu - \eta\,\beta, & \gamma < 0 \end{cases} \tag{4}$$

where c_i is chromosome i, L is the minimum age, U is the maximum age, n is the population size, $\eta = (U - L)/2$, $\mu = (U + L)/2$, and $\alpha$, $\beta$, $\gamma$ are defined in Eq. (2).
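A minimal sketch of this labeling in Python (the function name and scalar form are ours; it follows our reading of the reconstructed Eqs. (2)–(4) and assumes f_avr differs from f_min and f_max so the ratios are defined):

```python
def age_label(f_i, f_min, f_avr, f_max, L=2.0, U=10.0, maximization=True):
    """Bi-linear age label for chromosome c_i; L and U are the minimum
    and maximum ages (L = 2 and U = 10 in this study)."""
    eta, mu = (U - L) / 2.0, (U + L) / 2.0
    alpha = (f_i - f_min) / (f_avr - f_min)   # Eq. (2)
    beta = (f_i - f_avr) / (f_max - f_avr)
    gamma = f_avr - f_i
    if maximization:                          # Eq. (3): higher fitness is better
        return L + eta * alpha if gamma >= 0 else mu + eta * beta
    return U - eta * alpha if gamma >= 0 else mu - eta * beta  # Eq. (4)
```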

[Fig. 2. The linguistic variable age: syntactic rules map it to the linguistic terms Infant, Teenager, Adult, and Elderly.]



Eq. (3) is suited to maximization problems, where higher fitness values are better, while Eq. (4) is suited to minimization problems, where lower fitness values are better. This idea is inspired by the lifetime concept proposed in Arabas et al. (1994). The fuzzification interface defines the possibilities of the four linguistic values for each chromosome: {Infant, Teenager, Adult, Elderly}. These values determine the degree of truth of each rule premise. This computation takes into account all chromosomes in each generation and relies on the triangular membership functions shown in Fig. 3, with L = 2 and U = 10. A bi-linear lifetime-allocation approach, as proposed in Kosko (1992), is used to label the chromosomes based on their fitness values, which are then used to characterize the diversity of the population:

$$D(c_i) = \begin{cases} L + \eta\,\alpha, & \gamma \ge 0 \\ \mu + \eta\,\beta, & \gamma < 0 \end{cases} \tag{5}$$

Let $\sigma$ denote the label of half of the population; then the population can be divided into four diversity levels, Very Low, Low, Medium, and High, as follows:

$$\text{Population Diversity} = \begin{cases} \text{High}, & \sigma \le L + t\lambda \\ \text{Medium}, & L + t\lambda < \sigma \le L + (t+1)\lambda \\ \text{Low}, & L + (t+1)\lambda < \sigma \le L + (t+2)\lambda \\ \text{Very Low}, & \sigma > L + (t+2)\lambda \end{cases} \tag{6}$$

where $t = (L + U)/n$ is a parameter correlated with the domain of labels in the population, and $\lambda = [n/10]$, with [x] the integer nearest to x (for example, [2.3] = 2 and [2.8] = 3).

This computation is performed in every generation and relies on the triangular membership functions shown in Fig. 4. The inputs are combined logically using the AND operator to produce output response values for all expected inputs, and a firing strength is computed for each output membership function. All that remains is to combine these logical sums in a defuzzification process to produce the crisp output. The fuzzy outputs of all rules are finally aggregated into one fuzzy set. To obtain a crisp decision from this fuzzy output, we have to defuzzify the fuzzy set. Defuzzification of the outputs was performed using the fuzzy centroid method on the firing behavior (Kosko, 1992), which may show that some of the rules are unnecessary. The rule base contains 16 fuzzy rules; Table 1 lists the fuzzy rules for selecting the female chromosome. Although we can obtain Fage, we may not be able to find a female chromosome with exactly that age, so we select the female chromosome whose age label is nearest to Fage as the parent. If more than one female chromosome satisfies the Fage condition, we choose the one with the highest fitness value as the parent. This technique is called the Complement Method (Jalali and Lee, 2009).
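A sketch of this selection step (the data layout and names are ours; ages would come from the age labeling of Section 4 and f_age from the defuzzified fuzzy output):

```python
def select_female(females, ages, fitnesses, f_age):
    """Complement Method, sketched: pick the female whose age label is
    nearest to Fage; ties are broken by the highest fitness."""
    best_gap = min(abs(a - f_age) for a in ages)
    nearest = [i for i, a in enumerate(ages) if abs(a - f_age) == best_gap]
    return females[max(nearest, key=lambda i: fitnesses[i])]
```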

    5. Case study

In this section, we use a data set from oil wells in Iran. First, several crossplots were generated between well-log data and core permeability to find which logs have a good relationship with permeability. In this way, we found a logical relationship between five inputs, namely sonic transit time (DT), neutron log (NPHI), density log (RHOB), gamma ray (GR), and true formation resistivity (Rt), and rock permeability (K) as the target. The data points were divided randomly into three parts: sixty percent for training, twenty percent for validation, and twenty percent for testing. Five training algorithms for the back-propagation neural network were selected as committee members: Levenberg–Marquardt (LM), Bayesian Regularization (BR), One Step Secant (OSS), Resilient Back Propagation (RP), and Scaled Conjugate Gradient (SCG). As mentioned above, we used the five wireline logs as input data and core permeability as output data for the analysis of our combining methods.
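A sketch of that split (NumPy assumed; X holds the five logs and y the core permeability):

```python
import numpy as np

def split_60_20_20(X, y, seed=None):
    """Random 60/20/20 train/validation/test split, as in the case study."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    n_tr, n_va = int(0.6 * len(X)), int(0.2 * len(X))
    parts = np.split(idx, [n_tr, n_tr + n_va])  # train, validation, test indices
    return [(X[p], y[p]) for p in parts]
```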

A brief description of this data set follows.

Sonic log (DT): The sonic tool measures the time required for an acoustic wave to travel through a unit of formation thickness. The sonic transit time (DT) is used both in porosity determination and to compute secondary porosity in carbonate reservoirs (Service, 1999).

Neutron log (NPHI): A radioactivity well log used to determine formation porosity. The logging tool bombards the formation with neutrons; when the neutrons strike hydrogen atoms in water or oil, gamma rays are released. Since water and oil exist only in pore spaces, a measurement of the gamma rays indicates formation porosity; see radioactivity well logging (Service, 1999).

Density log (RHOB): A special radioactivity log for open-hole surveying that responds to variations in the specific gravity of formations. It is a contact log (i.e., the logging tool is held against the wall of the hole). It emits gamma rays and measures the secondary gamma radiation scattered back to the detector in the instrument. The density log is an excellent porosity-measuring device, especially for shaley sands (Service, 1999).

Gamma ray (GR): A type of radioactivity well log that records the natural radioactivity around the wellbore. Shales generally produce higher levels of gamma radiation and can therefore be detected and studied with the gamma-ray tool; see radioactivity well logging (Service, 1999).

True formation resistivity (Rt): With reference to log analysis, the resistivity of the undisturbed formation. It is derived from a resistivity log that has been corrected as far as possible for all environmental effects, such as the borehole, invasion, and surrounding-bed effects. Hence, it is taken as the true resistivity of the undisturbed formation in situ and is called Rt. With reference to core analysis, the resistivity

[Fig. 3. Triangular membership functions of the age linguistic variable age(ci) for male and female chromosomes, with terms Infant, Teenager, Adult, and Elderly centered near 0.25, 0.45, 0.65, and 0.85.]

[Fig. 4. Triangular membership functions of the population diversity linguistic variable D(ci), with terms High, Medium, Low, and Very Low centered near 2.5, 4.5, 6.5, and 8.5.]

Table 1
Fuzzy rules for selecting the female chromosome.

Male age (Mage)   Diversity   Female age (Fage)
Infant            High        Elderly or adult
Infant            Medium      Adult or teenager
Infant            Low         Teenager or infant
Infant            Very low    Infant
Teenager          High        Elderly or adult
Teenager          Medium      Adult or teenager
Teenager          Low         Teenager or infant
Teenager          Very low    Infant
Adult             High        Elderly or adult
Adult             Medium      Adult or teenager
Adult             Low         Teenager or infant
Adult             Very low    Infant
Elderly           High        Adult or teenager
Elderly           Medium      Teenager or infant
Elderly           Low         Infant
Elderly           Very low    Infant

[Fig. 5. The technique for a two-point cut in offspring: randomly chosen numbers set the cut points at which the male and female parents exchange segments.]



Mutation is performed in four steps (a sketch follows the list):

(1) A random real number in the interval (0, 1) is generated for the probability of mutation.

(2) The GA considers this probability and selects some chromosomes.

(3) For each selected chromosome, a random natural number k, ranging from 1 to the number of genes in the chromosome, is generated.

(4) Gene number k is replaced by another randomly generated gene.
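A minimal sketch of these four steps for a binary-coded GA (standard-library random; the per-chromosome probability draw is our reading of steps (1)–(2)):

```python
import random

def mutate(population, p_m=0.02, alphabet="01"):
    """Four-step mutation sketch (pm = 0.02 as in the experiment);
    each chromosome is a mutable list of genes."""
    for chrom in population:
        if random.random() < p_m:               # steps (1)-(2): select chromosomes
            k = random.randrange(len(chrom))    # step (3): gene index (0-based here)
            chrom[k] = random.choice(alphabet)  # step (4): replace gene k
    return population
```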

A standard GA is used in this experiment with a population size of 20, and the total length of the chromosomes is 85 bits. The crossover probability is pc = 0.50 and the mutation probability is pm = 0.02. Each test function is run on the GA 30 times, with a maximum of 5000 generations per run. Fig. 6 shows how the female's age varies as the male's age and the diversity change. Fig. 7 shows that as the age of the male chromosome increases, the system selects a decreasing age for the female chromosome. Fig. 8 shows that as the diversity of the population increases, the system likewise decreases the selected female age.

[Fig. 9(a–f). Crossplots showing R between core and predicted permeability for the five training algorithms and FGA.]

This technique maintains the diversity of the population, so the GA cannot converge too quickly and premature convergence is avoided. Fig. 9(a–f) shows the correlation coefficients between the core and predicted permeabilities for the five training algorithms and the FGA method. Table 2 shows both the MSE and R2 values for the overall data points using the five training algorithms, the GA, and weighted averaging with the FGA. This table helps us decide which combining model performs best: a good combining scheme should have a higher R2 and a lower MSE.

    6. Conclusion

There are different ways of combining the intelligent systems' outputs in the combiner of a committee neural network; one of these methods is the Genetic Algorithm (GA). The failure to find good results when using a GA in a CM is largely due to premature convergence, and the population diversity of the GA is an important parameter in premature convergence. A technique for controlling the population diversity using fuzzy rules and sexual selection is proposed in this paper. In conclusion, female choice by fuzzy logic is a suitable way to improve the performance of GAs: it maintains the diversity of the population, and premature convergence can be eliminated. In this paper, we used the FGA method to combine the outputs of experts for the prediction of permeability in the oil industry. From the simulation results, the correlation coefficients and MSEs for the five training algorithms are shown in Table 2. The R2 and MSE for the GA combining method are 0.8438 and 0.001 respectively, which are better than those of all the individual training algorithms. Applying the FGA method to the combination improves the correlation coefficient and MSE further, to 0.8523 and 0.00092 respectively.

    References

Ahmadzadeh, M.R., Petrou, M., 2003. Use of Dempster–Shafer theory to combine classifiers which use different class boundaries. Pattern Anal. Appl. 6 (1), 41–46.

Al-Ghoneim, K.A., Kumar, B.V., 1996. Combining neural networks using the ranking figure of merit. Proc. SPIE 2760, 213.

Arabas, J., Michalewicz, Z., et al., 1994. GAVaPS: a genetic algorithm with varying population size. Evolutionary Computation, 1994: IEEE World Congress on Computational Intelligence, vol. 1, pp. 73–78.

Bhatt, A., Helle, H.B., 2002. Committee neural networks for porosity and permeability prediction from well logs. Geophys. Prospect. 50 (6), 645–660.

Brazdil, P., Soares, C., 2000. A comparison of ranking methods for classification algorithm selection, pp. 63–75.

Breiman, L., 1996. Bagging predictors. Mach. Learn. 24 (2), 123–140.

Chen, C.-H., Lin, Z.-S., 2006. A committee machine with empirical formulas for permeability prediction. Comput. Geosci. 32 (4), 485–496.

Cho, S.-B., Kim, J.H., 1995. An HMM/MLP architecture for sequence recognition. Neural Comput. 7 (2), 358–369.

Drucker, H., Cortes, C., et al., 1994. Boosting and other ensemble methods. Neural Comput. 6 (6), 1289–1301.

Efron, B., Tibshirani, R.J., 1993. An Introduction to the Bootstrap. Chapman & Hall, New York.

Freund, Y., Schapire, R., 1995. A decision-theoretic generalization of on-line learning and an application to boosting. European Conference on Computational Learning Theory, pp. 23–37.

Hansen, L.K., Salamon, P., 1990. Neural network ensembles. IEEE Trans. Pattern Anal. Mach. Intell. 12 (10), 993–1001.

Haykin, S., 1999. Neural Networks: A Comprehensive Foundation. Prentice-Hall, Upper Saddle River, NJ.

Ho, T.K., Hull, J.J., et al., 1994. Decision combination in multiple classifier systems. IEEE Trans. Pattern Anal. Mach. Intell. 16 (1), 66–75.

Jacobs, R.A., 1995. Methods for combining experts' probability assessments. Neural Comput. 7 (5), 867–888.

Jalali, M., Lee, L.S., 2009. Fuzzy genetic algorithm with sexual selection (FGASS). Second Int. Conf. and Workshop on Basic and Applied Science, 2–4 June, Johor Bahru, Malaysia.

Kadkhodaie-Ilkhchi, A., Rahimpour-Bonab, H., et al., 2009a. A committee machine with intelligent systems for estimation of total organic carbon content from petrophysical data: an example from Kangan and Dalan reservoirs in South Pars Gas Field, Iran. Comput. Geosci. 35 (3), 459–474.

Kadkhodaie-Ilkhchi, A., Rezaee, M.R., et al., 2009b. A committee neural network for prediction of normalized oil content from well log data: an example from South Pars Gas Field, Persian Gulf. J. Petrol. Sci. Eng. 65 (1–2), 23–32.

Kosko, B., 1992. Neural Networks and Fuzzy Systems: A Dynamical Systems Approach to Machine Intelligence. Prentice-Hall, Inc., p. 449.

Krogh, A., Vedelsby, J., 1995. Neural network ensembles, cross validation, and active learning. Adv. Neural Inf. Process. Syst. 7, 231–238.

Kuncheva, L.I., 2004. Classifier ensembles for changing environments, pp. 1–15.

Kuncheva, L.I., Bezdek, J.C., et al., 2001. Decision templates for multiple classifier fusion: an experimental comparison. Pattern Recognit. 34 (2), 299–314.

Lincoln, W., Skrzypek, J., 1990. Synergy of clustering multiple back propagation networks. Adv. Neural Inf. Process. Syst. 2, 650–657.

Naftaly, U., Intrator, N., Horn, D., 1997. Optimal ensemble averaging of neural networks. Network 8, 283–296.

Opitz, D.W., Shavlik, J.W., 1996. Actively searching for an effective neural network ensemble. Connect. Sci. 8 (3), 337–354.

Raviv, Y., Intrator, N., 1996. Bootstrapping with noise: an effective regularization technique. Connect. Sci. 8 (3), 355–372.

Rezaee, M.R., 2001. Petroleum Geology. Alavi Publications, Tehran, Iran.

Schapire, R.E., 1990. The strength of weak learnability. Mach. Learn. 5 (2), 197–227.

Service, U. o. T. a. A. P. E., 1999. A Dictionary for the Petroleum Industry. Petroleum Extension Service.

Wolpert, D.H., 1992. Stacked generalization. Neural Netw. 5, 241–259.

Xu, L., Krzyzak, A., et al., 1992. Methods of combining multiple classifiers and their applications to handwriting recognition. IEEE Trans. Syst. Man Cybern. 22 (3), 418–435.

Table 2
Comparison of MSE and R2 for the test data using the five training algorithms, GA, and FGA.

Algorithm   R2       MSE
LM          0.8274   0.0012
BR          0.8239   0.0012
OSS         0.7257   0.0015
RP          0.751    0.0016
SCG         0.7885   0.0015
GA          0.8438   0.001
FGA         0.8523   0.00092
