Committee neural networks with fuzzy genetic algorithm
S.A. Jafari a,*, S. Mashohor a, M. Jalali Varnamkhasti b
a Computer and Communication Systems, Faculty of Engineering, University Putra Malaysia, 43400 Serdang, Selangor, Malaysia
b Laboratory of Applied and Computational Statistics, Institute for Mathematical Research, UPM, 43400 Serdang, Selangor, Malaysia
ARTICLE INFO
Article history:
Received 17 October 2009
Accepted 10 January 2011
Available online 28 January 2011
Keywords:
back propagation neural network
committee neural network
fuzzy genetic algorithm
reservoir properties
ABSTRACT
Combining numerous appropriate experts can improve the generalization performance of the group compared to a single network alone. There are different ways of combining the intelligent systems' outputs in the combiner of the committee neural network, such as simple averaging, gating network, stacking, support vector machine, and genetic algorithm. Premature convergence is a classical problem in finding the optimal solution in genetic algorithms. In this paper, we propose a new technique for choosing the female chromosome during sexual selection to avoid premature convergence in a genetic algorithm. A bi-linear allocation lifetime approach is used to label the chromosomes based on their fitness value, which will then be used to characterize the diversity of the population. The label of the selected male chromosome and the population diversity of the previous generation are then applied within a set of fuzzy rules to select a suitable female chromosome for recombination. Finally, we use the fuzzy genetic algorithm to combine the output of experts to predict a reservoir parameter in the petroleum industry. The results show that the proposed method (fuzzy genetic algorithm) gives the smallest error and highest correlation coefficient compared to the five members and the plain genetic algorithm, and produces significant information on the reliability of the permeability predictions.
© 2011 Elsevier B.V. All rights reserved.
1. Introduction
There are several reasons for distributing a learning task among a number of individual networks. The main one is to improve generalization ability, because the generalization of individual networks is not unique. The combination of several Artificial Neural Networks (ANNs) performing the same task is called an ensemble of neural networks or a committee of neural networks; when the networks are of different kinds, it is called a committee machine. In ensemble methods, the ensemble candidates are different. There are a number of ways to create different individuals: the training data, the initial conditions, the topology of the nets, and the training algorithms. After selecting individuals and training them, their generated results are combined by some method. The committee machine structure can be viewed in Fig. 1. In the committee machine, the expectation is that different experts converge to different local minima on the error surface, and the overall output improves the performance (Wolpert, 1992; Efron and Tibshirani, 1993; Rezaee, 2001). The mean square error (MSE) between the individual output and the expected output (target) can be expressed as the bias squared plus the variance (Haykin, 1999). The MSE equation makes it clear that we can reduce either the bias or the variance to reduce the neural network error. Unfortunately, for an individual ANN the bias is reduced at the cost of a large variance. However, the variance can be reduced by an ensemble of ANNs. From the MSE equation, Naftaly et al. (1997) obtained two conclusions:
(1) The bias of the ensemble-averaged function is exactly the same as that of the function of a single NN.
(2) The variance of the ensemble-averaged function is less than that of the function of a single NN.
These theoretical findings indicate that ensembles of ANNs can easily reduce the variance at little cost in bias. Therefore, an effective approach is to select or create a set of nets or experts that show high variance but low bias, because by combining them we can reduce the variance. Several methods have been employed for creating committee members. Generally, these methods can be divided into three categories:
(1) Methods to select diverse training data sets from the original source data
(2) Methods to create different experts or individual neural networks
(3) Methods to combine these individuals and their results
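The variance-reduction claim above can be checked numerically. The following sketch is our own illustration, not code from the paper: it simulates five unbiased experts with independent errors and compares the MSE of a single expert against the MSE of their ensemble average.

```python
import numpy as np

rng = np.random.default_rng(0)
target = np.zeros(10_000)

# Five "experts": unbiased predictors of the same target, each with
# independent noise (high variance, low bias), as the text recommends.
experts = [target + rng.normal(0.0, 1.0, target.size) for _ in range(5)]

individual_var = np.mean([np.mean((e - target) ** 2) for e in experts])
ensemble = np.mean(experts, axis=0)          # simple-averaging combiner
ensemble_var = np.mean((ensemble - target) ** 2)

# The averaged function keeps the (zero) bias but cuts the variance,
# here by roughly the number of independent members.
print(individual_var, ensemble_var)
```

With independent errors the ensemble variance drops by about a factor of the member count, matching Naftaly et al.'s second conclusion.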
* Corresponding author. Tel.: +60 124422445.
E-mail addresses: sajkenari@yahoo.com (S.A. Jafari), syamsiah@eng.upm.edu.my (S. Mashohor), jalali@inspem.upm.edu.my (M.J. Varnamkhasti).
Journal of Petroleum Science and Engineering 76 (2011) 217-223
0920-4105/$ - see front matter © 2011 Elsevier B.V. All rights reserved.
doi:10.1016/j.petrol.2011.01.006

2. Some methods for constructing committee members

In this section, we introduce some methods for committee member construction as mentioned in Part 1. Several approaches have been used for selecting training data sets by varying the source data sets. Bagging, noise injection, cross-validation,
stacking and boosting are the most common techniques. Other methods that have been used by researchers to construct committee members include Fuzzy Logic (FL) with different fuzzy inference systems, the Genetic Algorithm (GA), Neuro-Fuzzy systems, empirical formulas, etc. Kadkhodaie-Ilkhchi et al. (2009a) used a neural network, fuzzy logic and a fuzzy neural network as committee members. Chen and Lin (2006) used three empirical formulas as committee members. Kadkhodaie-Ilkhchi et al. (2009b) and Rezaee et al. (2009) used back-propagation neural networks with different training algorithms to construct a committee neural network. In this paper, we have used five significant training algorithms of the artificial neural network as committee members: Levenberg-Marquardt (LM), Bayesian Regularization (BR), One Step Secant (OSS), Resilient Back Propagation (RP), and Scaled Conjugate Gradient (SCG). The following is a brief description of some important methods to create committee members.
2.1. Bagging (Breiman, 1996)
One of the important methods of manipulating the data set to create M training sets is bootstrap aggregation, or bagging. The basic idea of bagging is to generate a collection of experts such that every expert uses a bootstrap training data set. Given a data set X = (x1, x2, ..., xn), bootstrap sampling means creating N new data sets X1, X2, X3, ..., XN such that every Xi is generated by randomly picking n data points of X with replacement. Clearly, in creating Xi, some points of X may be repeated and some may be ignored. In bagging, we repeat this procedure to create M different training sets for M experts. The bagging method is designed to reduce the error variance, and it is very efficient in constructing a set of training data when the source data size is small.
2.2. Noise injection (Raviv and Intrator, 1996)
As mentioned before, the simple bootstrap generates several training data sets from the source data, all with the same size. Efron and Tibshirani (1993) noted that the bootstrap can also be viewed as a method for simulating the noise inherent in the data, thereby effectively increasing the number of training patterns. Raviv and Intrator (1996) presented another bootstrap algorithm, Bootstrap Ensemble with Noise (BEN). In this method, a variable amount of noise is added to the input data before bootstrap sampling is used to assemble the training sets. This method can effectively reduce the variance, since the injection of noise increases the independence between the different training sets derived from the source data. Bhatt and Helle (2002) used noise injection for the construction of committee members.
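A BEN-style training-set generator can be sketched as follows; this is our hedged reading of the method (Gaussian noise level `noise_std` is an assumed parameter, not a value from the paper):

```python
import numpy as np

def ben_sets(X, M, noise_std, rng):
    """Bootstrap Ensemble with Noise (sketch): add Gaussian noise to
    the inputs, then bootstrap-sample a training set for each of the
    M experts, increasing independence between the sets."""
    n = len(X)
    out = []
    for _ in range(M):
        noisy = X + rng.normal(0.0, noise_std, size=X.shape)
        idx = rng.integers(0, n, size=n)      # sample with replacement
        out.append(noisy[idx])
    return out

rng = np.random.default_rng(2)
X = np.linspace(0.0, 1.0, 50)
sets = ben_sets(X, M=3, noise_std=0.05, rng=rng)
```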
2.3. Cross-validation (Krogh and Vedelsby, 1995)
In this method the available data set is partitioned into M disjoint and equal subsets. We then select one of these subsets as the test set and the (M-1) remaining subsets as the training data. After M repetitions, we have M overlapping training sets and M independent test sets. Since the training sets are different, the trained experts are expected to fall into different local error minima and therefore to give different results. The performance of each expert is measured on the corresponding test data set. Breiman, Friedman, Olshen and Stone used cross-validation to prune classification tree algorithms.
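The partitioning step can be sketched in a few lines (our illustration, under the assumption of near-equal fold sizes):

```python
import numpy as np

def cv_partitions(X, M, rng):
    """Split X into M disjoint, (near-)equal subsets; yield M
    (train, test) pairs where each subset serves once as the test set
    and the remaining M-1 subsets form the training data."""
    idx = rng.permutation(len(X))
    folds = np.array_split(idx, M)
    for i in range(M):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(M) if j != i])
        yield X[train], X[test]

rng = np.random.default_rng(3)
X = np.arange(12)
pairs = list(cv_partitions(X, M=4, rng=rng))
```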
2.4. Stacking (Wolpert, 1992)
The first part of the stacking method is similar to the cross-validation method. As mentioned above, there are M training sets and M test sets. We use the M training sets to train two generalizers, G1 and G2, and then the M test sets are put into G1 and G2 (these outputs will be used as inputs of the second-space generalizer). The outputs of G1 and G2 and the target value, (g1i, g2i, yi), are used as the training set of a generalizer G in the second space.
2.5. Boosting by filtering (Schapire, 1990), AdaBoost (Freund and Schapire)
In this method, there are three experts. The first expert is trained with M training data points of the source training data set, and the result of the first expert is applied to the second expert, which is then trained with this data set. After training the second expert, the training data of the source set are passed through the first and second experts. Finally, the third expert is trained only on the data on which the outputs of the first and second experts disagree. That is, if there is a disagreement between the first and second experts on a certain data point, this point is passed to the third expert. The final result is related to the outputs of the three experts. Freund and Schapire (1995) and Drucker et al. (1994) have shown that the boosting algorithm is very effective in many experiments. Another method of boosting is adaptive boosting (AdaBoost). In this method, the training data are selected according to a probability: for a data point whose predicted value is close to the target value, the probability of being chosen is low; otherwise the probability is high. This method gives such data more chances for retraining. For a classification problem we can use majority voting, and for a regression problem the result with the lowest error rate is selected. AdaBoost is sensitive to noisy data and outliers, but it is less sensitive to overfitting than most learning algorithms.
3. Combination methods
The last stage of designing a Committee Machine (CM) is the combination of the expert outputs. Many investigations have been carried out to find combining methods that merge the expert outputs and produce the final output. In this section, we introduce some traditional combining methods in the CM. Some of them are suitable for classifiers and some of them perform well in regression.
3.1. Simple averaging (Lincoln and Skrzypek, 1990)
One of the most frequently used combination methods is simple averaging. In this method, after training the committee members, the final output is obtained by averaging the outputs of the committee members. It is easy to show by Cauchy's inequality that the Mean Square Error (MSE) of a committee machine with the simple averaging method is less than or equal to the average of the MSEs of the individual experts. This method is most useful when the variances of the ensemble members are different, because simple averaging can reduce the variance of the nets. The disadvantage of simple averaging is the equal weight given to every committee member, i.e., there is no difference between the weights of two committee members with low and high generalization.
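The Cauchy-inequality bound above can be demonstrated on simulated member outputs (our own sketch, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(4)
target = rng.normal(size=1000)

# Outputs of k = 5 trained committee members, simulated here as the
# target plus independent errors of differing size.
outputs = [target + rng.normal(0.0, 0.2 * (j + 1), target.size)
           for j in range(5)]

mse = lambda y: np.mean((y - target) ** 2)
avg_member_mse = np.mean([mse(y) for y in outputs])
committee_mse = mse(np.mean(outputs, axis=0))   # simple averaging

# The committee MSE never exceeds the average member MSE.
print(committee_mse, avg_member_mse)
```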
Fig. 1. Committee neural network with k members: the input X(n) feeds k neural networks NN 1, ..., NN k, whose outputs y1(n), ..., yk(n) are merged by a combiner into the output Y(n).
3.2. Weighted averaging (Jacobs, 1995)
In this method, every committee member has a suitable weight related to its generalization ability. In Jacobs (1995) the researcher introduced a gating method to determine the weight of every expert. Opitz and Shavlik (1996) used a Genetic Algorithm (GA) to determine the weight of each member. To obtain the optimal weights for combining with the GA, the fitness function is defined as below:

$$\mathrm{MSE}_{GA} = \frac{1}{n}\sum_{i=1}^{n}\left(w_1 y_{1i} + w_2 y_{2i} + \cdots + w_k y_{ki} - T_i\right)^2, \qquad \sum_{i=1}^{k} w_i = 1 \tag{1}$$

where y1i is the output of the first network on the ith input (ith training pattern), wi is the weight of the ith member, Ti is the target value of the ith input, and n is the number of training data points.
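Eq. (1) is the quantity a GA would minimize over the weights. A minimal sketch of this fitness function (our illustration; the renormalization used to enforce the sum-to-one constraint is an assumption):

```python
import numpy as np

def fitness_mse(weights, member_outputs, targets):
    """Eq. (1): MSE of the weighted combination of the k member
    outputs, with the weights constrained to sum to one
    (renormalized here for simplicity)."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # enforce sum(w) == 1
    combined = w @ member_outputs        # shape (n,): sum_j w_j * y_ji
    return np.mean((combined - targets) ** 2)

targets = np.array([1.0, 2.0, 3.0])
member_outputs = np.array([[1.1, 2.1, 3.1],    # expert 1 on n=3 patterns
                           [0.9, 1.9, 2.9]])   # expert 2
print(fitness_mse([0.5, 0.5], member_outputs, targets))   # ~ 0
```

The GA would evolve the weight vector to drive this value down.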
3.3. Majority voting (Hansen and Salamon, 1990)
This combination method is most popular for classification problems. If more than half of the individuals vote for a prediction, majority voting selects this prediction as the final output. Majority voting ignores the fact that some networks that lie in a minority can sometimes produce the correct result. At this stage of combination, it ignores the existence of diversity, which is the motivation for ensembles.
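The "more than half" rule can be expressed directly (a sketch of ours; the class labels are hypothetical examples):

```python
from collections import Counter

def majority_vote(predictions):
    """Final class = the label predicted by more than half of the
    members; returns None when no label reaches a strict majority."""
    label, count = Counter(predictions).most_common(1)[0]
    return label if count > len(predictions) / 2 else None

print(majority_vote(["sand", "sand", "shale"]))   # strict majority
print(majority_vote(["sand", "shale", "coal"]))   # no majority
```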
3.4. Ranking (Ho et al., 1994; Al-Ghoneim and Kumar, 1996)
This method uses experimental results obtained by a set of experts on a set of data sets to generate a ranking of those experts (each expert has a rank associated with an input data set). The ranks of each expert are then aggregated by methods such as average rank, success rate ratio, and significant wins to generate a final ranking of the experts. The final rank can be used to select one or more suitable experts for a test (unseen) data set (Brazdil and Soares, 2000).
There are no unique criteria for the selection of the mentioned combination methods. The choice mainly depends on the characteristics of the particular application at hand, e.g., the nature of the application (classification or regression), the size and quality of the training data, and the generated errors on the region of the input space. The same combination method used on an ensemble may generate good results for a regression problem but not work for a classification problem, and vice versa. Much work has been done to introduce combining methods in ensemble approaches. Major contributions are weighted majority voting (Kuncheva, 2004), decision templates (Kuncheva et al., 2001), naive Bayesian fusion (Xu et al., 1992), Dempster-Shafer combination (Ahmadzadeh and Petrou, 2003) and the fuzzy integral (Cho and Kim, 1995).
4. Fuzzy genetic algorithm (FGA) for combining
The Genetic Algorithm (GA) is a search optimization technique that mimics some of the processes of natural selection and evolution. In optimization, when a GA fails to find the global optimum, the problem is often credited to premature convergence, which means that the sampling process converges on a local optimum rather than the global optimum. Sexual selection by means of female preferences has promoted the evolution of complex male ornaments in many animal groups. A sex-determination system is a biological system that determines the development of sexual characteristics in an organism. Most sexual organisms have two sexes. In many cases, sex determination is genetic: males and females have different alleles or even different genes that determine their sexual morphology. In a classical GA, chromosomes reproduce asexually: any two chromosomes may be parents in crossover. Gender division and sexual selection inspired a model of gendered GA in which crossover takes place only between chromosomes of opposite sexes. In this study, a relation between age and fitness affecting the selection procedure, as in biological systems, is proposed. A bi-linear allocation lifetime approach is used to label the chromosomes based on their fitness value, which will then be used to characterize the diversity of the population. Inspired by the non-genetic sex-determination system that exists in some species of reptiles, including alligators and some turtles, where sex is determined by the temperature at which the egg is incubated, we divided the population into two groups, male and female, so that males and females can be selected in an alternating way. In each generation, the layout of the selection of male and female is different. During sexual selection, the male chromosome is selected randomly. The label of the selected male chromosome and the population diversity of the previous generation are then applied within a set of fuzzy rules to select a suitable female chromosome (Jalali and Lee, 2009). Fuzzy systems are encountered in numerous areas of application. Fuzzy rules, for example, viewed as a generic mechanism of granular knowledge representation, are positioned at the center of knowledge-based systems. A fuzzy IF-THEN rule consists of an IF part (antecedent) and a THEN part (consequent). The antecedent is a combination of terms, whereas the consequent is exactly one term. In the antecedent, the terms can be combined by using fuzzy conjunction, disjunction and negation. A term is an expression of the form X = T, where X is a linguistic variable and T is one of its linguistic terms. In this paper, we use the linguistic variable age for chromosomes. Fig. 2 describes the linguistic variable age, where Infant, Teenager, Adult and Elderly are the linguistic values.
The system applied in our study uses triangular membership functions, the (minimum) intersection operator and the correlation-product inference procedure. Defuzzification of the outputs was performed using the fuzzy centroid method described by Kosko (1992). To find the membership function, we use the fitness value of each chromosome and the minimum, maximum and average fitness values of the population in each generation. Each chromosome has its own label determined by the age function. Let
$$\alpha = \frac{f_i - f_{\min}}{f_{avr} - f_{\min}}, \qquad \beta = \frac{f_i - f_{avr}}{f_{\max} - f_{avr}}, \qquad \gamma = f_{avr} - f_i \tag{2}$$

where fi = fitness value of chromosome i, favr = average fitness value, fmin = minimum fitness value, and fmax = maximum fitness value of the population. Then the age function is:

$$age(c_i) = \begin{cases} L + \eta\alpha, & \gamma \geq 0 \\ \mu + \eta\beta, & \gamma < 0 \end{cases} \tag{3}$$

or

$$age(c_i) = \begin{cases} U - \eta\alpha, & \gamma \geq 0 \\ \mu - \eta\beta, & \gamma < 0 \end{cases} \tag{4}$$

where ci = chromosome i, L = minimum age, U = maximum age, n = population size, η = (U − L)/2, μ = (U + L)/2, and α, β, γ are defined in Eq. (2).
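The maximization-case age labelling of Eqs. (2)-(3) can be sketched as below. This is our reading of the bi-linear allocation (the extraction of the equations is garbled, so the branch assignments here are an assumption): chromosomes at or below the average fitness receive ages in [L, μ], and chromosomes above it receive ages in [μ, U].

```python
def age(f_i, f_min, f_avr, f_max, L=2.0, U=10.0):
    """Bi-linear age label for a maximization problem (Eq. (3))."""
    eta = (U - L) / 2.0        # eta in Eq. (3)
    mu = (U + L) / 2.0         # mu  in Eq. (3)
    gamma = f_avr - f_i        # Eq. (2)
    if gamma >= 0:             # f_i at or below the average fitness
        alpha = (f_i - f_min) / (f_avr - f_min)
        return L + eta * alpha
    beta = (f_i - f_avr) / (f_max - f_avr)
    return mu + eta * beta

# Worst, average and best chromosomes map to ages L, mu and U.
print(age(0.0, 0.0, 0.5, 1.0),
      age(0.5, 0.0, 0.5, 1.0),
      age(1.0, 0.0, 0.5, 1.0))   # → 2.0 6.0 10.0
```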
Fig. 2. The linguistic variable age, with syntactic rules mapping it to the linguistic terms Infant, Teenager, Adult and Elderly.
Eq. (3) is suited for maximization problems, which relate to higher fitness values, while Eq. (4) is suited for minimization problems, which relate to lower fitness values. This idea is inspired by the idea of the lifetime proposed in Arabas et al. (1994). The fuzzification interface defines the possibilities of the four linguistic values for each chromosome: {Infant, Teenager, Adult, and Elderly}. These values determine the degree of truth of each rule premise. This computation takes into account all chromosomes in each generation and relies on the triangular membership functions shown in Fig. 3, with L = 2 and U = 10. A bi-linear allocation lifetime approach proposed in Kosko (1992) is used to label the chromosomes based on their fitness value, which will then be used to characterize the diversity of the population.
$$D(c_i) = \begin{cases} L + \alpha, & \gamma \geq 0 \\ \mu + \beta, & \gamma < 0 \end{cases} \tag{5}$$

Let λ = the label of half of the population; then the population can be divided into four levels of diversity, Very Low, Low, Medium and High, as follows:

$$PopulationDiversity = \begin{cases} \text{High}, & \lambda \leq L + \varepsilon t \\ \text{Medium}, & L + \varepsilon t < \lambda \leq L + \varepsilon(t+1) \\ \text{Low}, & L + \varepsilon(t+1) < \lambda \leq L + \varepsilon(t+2) \\ \text{Very Low}, & \lambda > L + \varepsilon(t+2) \end{cases} \tag{6}$$

where t = (L + U)/n is a parameter that has a correlation with the domain of labels in the population, and ε = [n/10] (where [x] means the nearest integer to x, for example [2.3] = 2 and [2.8] = 3). This
computation is performed in every generation and relies on the triangular membership functions shown in Fig. 4. The inputs are combined logically using the AND operator to produce output response values for all expected inputs. A firing strength for each output membership function is computed. All that remains is to combine these logical sums in a defuzzification process to produce the crisp output. The fuzzy outputs for all rules are finally aggregated into one fuzzy set. To obtain a crisp decision from this fuzzy output, we have to defuzzify the fuzzy set. Defuzzification of the outputs was performed using the fuzzy centroid method of the firing behavior (Kosko, 1992), which may show that some of the rules are unnecessary. The number of fuzzy rules in the rule base is 16. Table 1 lists the fuzzy rules for selecting the female chromosome. Although we can obtain the Fage, we may not be able to find a female chromosome that has the exact Fage. We select the female chromosome having the nearest fitness value to Fage as the parent. In case there is more than one female chromosome which satisfies the Fage condition, we choose the female chromosome with the highest fitness value as the parent. This technique is called the Complement Method (Jalali and Lee, 2009).
5. Case study
In this section, we used a data set from oil wells in Iran. First, several crossplots were generated between well log data and core permeability to find which logs have a good relationship with permeability. With this method, we found a logical relationship between five inputs, namely Sonic transit time (DT), Neutron log (NPHI), Density log (RHOB), Gamma Ray (GR), and True Formation Resistivity (Rt), and rock permeability (K) as the target. The total set of data points was divided randomly into three parts: sixty percent for training, twenty percent for validation and twenty percent for testing. Five training algorithms of the back propagation neural network were selected as committee members: Levenberg-Marquardt (LM), Bayesian Regularization (BR), One Step Secant (OSS), Resilient Back Propagation (RP), and Scaled Conjugate Gradient (SCG). As mentioned above, we used five wireline logs as input data and core permeability as output data for the analysis of our combining methods.
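The 60/20/20 split described above can be sketched as follows (our illustration; the paper does not give its splitting code):

```python
import numpy as np

def split_60_20_20(n, rng):
    """Randomly divide n data points into 60% training, 20% validation
    and 20% test index sets, as in the case study."""
    idx = rng.permutation(n)
    a, b = int(0.6 * n), int(0.8 * n)
    return idx[:a], idx[a:b], idx[b:]

rng = np.random.default_rng(5)
train, val, test = split_60_20_20(100, rng)
print(len(train), len(val), len(test))   # → 60 20 20
```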
A brief description of this data set is provided here.
Sonic log (DT): The sonic tool measures the time required for the transmission of an acoustic wave through a unit of formation thickness. Sonic transit time (DT) is used both in porosity determination and to compute secondary porosity in carbonate reservoirs (Service, 1999).
Neutron log (NPHI): A radioactivity well log used to determine formation porosity. The logging tool bombards the formation with neutrons. When the neutrons strike the hydrogen atoms in water or oil, gamma rays are released. Since water or oil exists only in pore spaces, a measurement of the gamma rays indicates formation porosity. See radioactivity well logging (Service, 1999).
Density log (RHOB): A special radioactivity log for open-hole surveying that responds to variations in the specific gravity of formations. It is a contact log (i.e., the logging tool is held against the wall of the hole). It emits neutrons and then measures the secondary gamma radiation that is scattered back to the detector in the instrument. The density log is an excellent porosity-measuring device, especially for shaley sands (Service, 1999).
Gamma Ray (GR): A type of radioactivity well log that records natural radioactivity around the wellbore. Shales generally produce higher levels of gamma radiation and can be detected and studied with the gamma ray tool. See radioactivity well logging (Service, 1999).
True formation resistivity (Rt): With reference to log analysis, the resistivity of the undisturbed formation. It is derived from a resistivity log that has been corrected as far as possible for all environmental effects, such as borehole, invasion and surrounding-bed effects. Hence, it is taken as the true resistivity of the undisturbed formation in situ and is called Rt. With reference to the core analysis, the resistivity
Fig. 3. The age linguistic variable for male and female (triangular membership functions for Infant, Teenager, Adult and Elderly).
Fig. 4. The population diversity linguistic variable (triangular membership functions for High, Medium, Low and Very Low).
Table 1
Fuzzy rules for selecting the female chromosome.

Male age (Mage)   Diversity   Female age (Fage)
Infant            High        Elderly or adult
                  Medium      Adult or teenager
                  Low         Teenager or infant
                  Very low    Infant
Teenager          High        Elderly or adult
                  Medium      Adult or teenager
                  Low         Teenager or infant
                  Very low    Infant
Adult             High        Elderly or adult
                  Medium      Adult or teenager
                  Low         Teenager or infant
                  Very low    Infant
Elderly           High        Adult or teenager
                  Medium      Teenager or infant
                  Low         Infant
                  Very low    Infant
Fig. 5. The technique for a two-point cut in offspring (cut points in the male and female parents are chosen by random numbers).
Mutation is performed in four steps:
(1) A random real number in the interval (0, 1) is generated for the probability of mutation.
(2) The GA compares this number with the mutation probability and some chromosomes are selected.
(3) For each chromosome that is selected, a random natural number k, varying from 1 to the number of genes in the chromosome, is generated.
(4) Gene number k is replaced by another randomly generated gene.
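The four steps above can be sketched as follows. This is our hedged reading of the procedure for a binary encoding (the paper's chromosomes are 85-bit strings): one random draw per chromosome decides selection, and the chosen gene is replaced by a fresh random bit.

```python
import random

def mutate_population(population, pm, rng):
    """Four-step mutation: draw a random number per chromosome
    (step 1), select those whose draw falls below the mutation
    probability pm (step 2), pick a random gene position k (step 3),
    and replace that gene with a randomly generated bit (step 4)."""
    for chrom in population:              # chrom: list of 0/1 genes
        if rng.random() < pm:             # steps (1)-(2)
            k = rng.randrange(len(chrom)) # step (3)
            chrom[k] = rng.randint(0, 1)  # step (4)
    return population

rng = random.Random(6)
pop = [[0] * 85 for _ in range(20)]       # 20 chromosomes of 85 bits
pop = mutate_population(pop, pm=0.02, rng=rng)
```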
Fig. 9(a-f). Crossplots showing R between core and predicted permeability using the five training algorithms and FGA.

A standard GA is used in this experiment with a population size of 20, and the total length of the chromosomes is 85 bits. The crossover probability is pc = 0.50 and the mutation probability is pm = 0.02. Each test function is run on the GA 30 times, with a maximum of 5000 generations per run. Fig. 6 shows how the female's age varies as the male's age and the diversity change. Fig. 7 shows that when the age of the male chromosome increases, the system assigns a decreasing age to the female chromosome. Fig. 8 shows that when the diversity of the population increases, the system likewise decreases the female chromosome's age. This technique will
maintain the diversity of the population so that the GA cannot converge too soon, and premature convergence is avoided. Fig. 9(a-f) shows the correlation coefficient between the core and predicted permeabilities from the five training algorithms and the FGA method. Table 2 shows both the MSE and R2 values for the overall data points using the five training algorithms, the GA and the weighted averaging (FGA). This table helps us to decide which combining model performs better: a good combining scheme should have a higher R2 and a lower MSE.
6. Conclusion
There are different ways of combining the intelligent systems' outputs in the combiner of the committee neural network; one of these methods is the Genetic Algorithm (GA). The failure to find good results when we use a GA in a CM is largely due to premature convergence, and the population diversity in a GA is an important parameter for premature convergence. A technique for controlling the population diversity using fuzzy rules and sexual selection is proposed in this paper. In conclusion, female choice by fuzzy logic is a suitable way of improving the performance of GAs by keeping up the diversity of the population, and premature convergence can be eliminated. In this paper, we used the FGA method for combining the output of experts for the prediction of permeability in the oil industry. From the simulation results, the correlation coefficients and MSEs for the five training algorithms are shown in Table 2. The R2 and MSE for the GA combining method are 0.8438 and 0.001 respectively, which are better than all of the training algorithms. By applying the FGA method to the combining, the correlation coefficient and MSE are improved further, to 0.8523 and 0.00092 respectively.
References

Ahmadzadeh, M.R., Petrou, M., 2003. Use of Dempster-Shafer theory to combine classifiers which use different class boundaries. Pattern Anal. Appl. 6 (1), 41-46.
Al-Ghoneim, K.A., Kumar, B.V., 1996. Combining neural networks using the ranking figure of merit. Proc. SPIE 2760, 213.
Arabas, J., Michalewicz, Z., et al., 1994. GAVaPS - a genetic algorithm with varying population size. Evolutionary Computation, 1994: IEEE World Congress on Computational Intelligence, vol. 1, pp. 73-78.
Bhatt, A., Helle, H.B., 2002. Committee neural networks for porosity and permeability prediction from well logs. Geophys. Prospect. 50 (6), 645-660.
Brazdil, P., Soares, C., 2000. A comparison of ranking methods for classification algorithm selection, pp. 63-75.
Breiman, L., 1996. Bagging predictors. Mach. Learn. 24 (2), 123-140.
Chen, C.-H., Lin, Z.-S., 2006. A committee machine with empirical formulas for permeability prediction. Comput. Geosci. 32 (4), 485-496.
Cho, S.-B., Kim, J.H., 1995. An HMM/MLP architecture for sequence recognition. Neural Comput. 7 (2), 358-369.
Drucker, H., Cortes, C., et al., 1994. Boosting and other ensemble methods. Neural Comput. 6 (6), 1289-1301.
Efron, B., Tibshirani, R.J., 1993. An Introduction to the Bootstrap. Chapman & Hall, New York.
Freund, Y., Schapire, R., 1995. A decision-theoretic generalization of on-line learning and an application to boosting. European Conference on Computational Learning Theory, pp. 23-37.
Hansen, L.K., Salamon, P., 1990. Neural network ensembles. IEEE Trans. Pattern Anal. Mach. Intell. 12 (10), 993-1001.
Haykin, S., 1999. Neural Networks - A Comprehensive Foundation. Prentice-Hall, Upper Saddle River, NJ.
Ho, T.K., Hull, J.J., et al., 1994. Decision combination in multiple classifier systems. IEEE Trans. Pattern Anal. Mach. Intell. 16 (1), 66-75.
Jacobs, R.A., 1995. Methods for combining experts' probability assessments. Neural Comput. 7 (5), 867-888.
Jalali, M., Lee, L.S., 2009. Fuzzy genetic algorithm with sexual selection (FGASS). Second Int. Conf. and Workshop on Basic and Applied Science, 24 June, Johor Bahru, Malaysia.
Kadkhodaie-Ilkhchi, A., Rahimpour-Bonab, H., et al., 2009a. A committee machine with intelligent systems for estimation of total organic carbon content from petrophysical data: an example from Kangan and Dalan reservoirs in South Pars Gas Field, Iran. Comput. Geosci. 35 (3), 459-474.
Kadkhodaie-Ilkhchi, A., Rezaee, M.R., et al., 2009b. A committee neural network for prediction of normalized oil content from well log data: an example from South Pars Gas Field, Persian Gulf. J. Petrol. Sci. Eng. 65 (1-2), 23-32.
Kosko, B., 1992. Neural Networks and Fuzzy Systems: A Dynamical Systems Approach to Machine Intelligence. Prentice-Hall, Inc., p. 449.
Krogh, A., Vedelsby, J., 1995. Neural network ensembles, cross validation, and active learning. Adv. Neural Inf. Process. Syst. 7, 231-238.
Kuncheva, L.I., 2004. Classifier ensembles for changing environments, pp. 1-15.
Kuncheva, L.I., Bezdek, J.C., et al., 2001. Decision templates for multiple classifier fusion: an experimental comparison. Pattern Recognit. 34 (2), 299-314.
Lincoln, W., Skrzypek, J., 1990. Synergy of clustering multiple back propagation networks. Adv. Neural Inf. Process. Syst. 2, 650-657.
Naftaly, U., Intrator, N., Horn, D., 1997. Optimal ensemble averaging of neural networks. Network 8, 283-296.
Opitz, D.W., Shavlik, J.W., 1996. Actively searching for an effective neural network ensemble. Connect. Sci. 8 (3), 337-354.
Raviv, Y., Intrator, N., 1996. Bootstrapping with noise: an effective regularization technique. Connect. Sci. 8 (3), 355-372.
Rezaee, M.R., 2001. Petroleum Geology. Alavi Publications, Tehran, Iran.
Schapire, R.E., 1990. The strength of weak learnability. Mach. Learn. 5 (2), 197-227.
Service, U. of T. at A. P. E., 1999. A Dictionary for the Petroleum Industry. Petroleum Extension Service.
Wolpert, D.H., 1992. Stacked generalization. Neural Netw. 5, 241-259.
Xu, L., Krzyzak, A., et al., 1992. Methods of combining multiple classifiers and their applications to handwriting recognition. IEEE Trans. Syst. Man Cybern. 22 (3), 418-435.
Table 2
Comparison of MSE and R2 for test data using the five training algorithms, GA and FGA.

Algorithm   R2       MSE
LM          0.8274   0.0012
BR          0.8239   0.0012
OSS         0.7257   0.0015
RP          0.751    0.0016
SCG         0.7885   0.0015
GA          0.8438   0.001
FGA         0.8523   0.00092