

Fuzzy Sets and Systems 152 (2005) 565–585
www.elsevier.com/locate/fss

Commutativity as prior knowledge in fuzzy modeling☆

P. Carmona (a,∗), J.L. Castro (b), J.M. Zurita (b)

(a) Departmento de Informática, Universidad de Extremadura, E. Ing. Industriales, Avda. Elvas, s/n, 06017 Badajoz, Spain
(b) Departmento Ciencias de la Computación e I.A., Universidad de Granada, E.T.S.I. Informática, 18071 Granada, Spain

Received 30 January 2004; received in revised form 13 September 2004; accepted 9 November 2004
Available online 30 November 2004

Abstract

In fuzzy modeling (FM), the quantity and quality of the training set are crucial to properly grasp the behavior of the system being modeled. However, the available data are often too scarce or poorly distributed over the input space, so they do not reveal the system behavior completely. In such cases, taking into account any prior knowledge about the system can be decisive for the accuracy achieved by the fuzzy model.

This paper addresses the integration of mathematical properties satisfied by a system as prior knowledge in FM, focusing on the commutativity property as a starting point. With this aim, several measures are developed to evaluate commutativity in a fuzzy environment, dealing with the different elements involved in FM.

Then, several approaches are proposed to measure the commutativity degree of a fuzzy rule with respect to the training set, and a simple method is presented to integrate these degrees into the FM task. The experimental results show the accuracy improvement gained by the proposed method.

© 2004 Elsevier B.V. All rights reserved.

Keywords: Fuzzy system models; Prior knowledge; Commutativity; Fuzzy measures

☆ This work has been supported by Research Project PB98-1379-C02-01.
∗ Corresponding author. Tel.: +34 924 289300; fax: +34 924 289601.
E-mail addresses: [email protected] (P. Carmona), [email protected] (J.L. Castro), [email protected] (J.M. Zurita).

0165-0114/$ - see front matter © 2004 Elsevier B.V. All rights reserved.
doi:10.1016/j.fss.2004.11.004

1. Introduction

Among the existing techniques for system identification, FM [4,13] stands out for building, from a set of examples (the training set), a model that represents the system by means of fuzzy rules. This representation paradigm provides models that are universal approximators [10] and that describe the behavior of the system linguistically.

In FM, the training set is often the main available information to describe the behavior of the system and, when any prior knowledge is available, it usually concerns the model structure, that is, the number of input and output variables, the granularity of the fuzzy domains, etc. Thus, the effectiveness of FM strongly depends on the informative content of the training set [12]. Unfortunately, this training set is often not enough to attain an accurate and complete description of the system: sometimes the examples are contaminated by perturbations (noise) that degrade their informative quality; sometimes the training set is too small to appropriately cover the input space of the system, or the examples are poorly distributed. In all these cases, the behavior of the system is not completely revealed, which prevents it from being properly modeled in certain regions.

A way to attenuate these deficiencies consists in adding any available prior knowledge about the system to the knowledge contained in the examples, enhancing their informative content. Such prior knowledge can consist of restrictions that the system complies with, which can be described as one or more properties completely or partially satisfied by the entire system or by a part of it. For example, a system whose output remains invariant when the values of two input variables are exchanged satisfies a commutativity property [6];¹ a system in which small changes in the input cause small changes in the output satisfies a consistency (smoothness) property [5,8]; a system whose output increases when one of its inputs increases satisfies a monotonicity property relating both variables [21]; in pattern recognition, the mapping between an object and its type usually satisfies an invariance property under scaling and rotation of the object [17,19]; foreign exchange markets comply with a symmetry property [2]; etc.

In the literature, two main strategies have been proposed to integrate such prior knowledge in system identification. The first one uses this knowledge to determine the structure of the model [9,15,18,19,22]. The second one uses the knowledge during the learning process, either by extending the example set with virtual examples on the basis of the properties the system obeys [1,3,17], or by imposing penalty functions that punish the models not complying with the properties [21]. However, so far all these techniques have been used in the neural-network framework.

In this paper, we propose the integration of this prior knowledge into FM by evaluating the degree of satisfaction of the properties directly on the rules of the fuzzy model, which acts as an additional criterion for the selection of the rule base. This requires defining measures of the degree of satisfaction of a property in a fuzzy environment. We will focus particularly on the commutativity property, but an equivalent strategy can be developed for other local properties, such as symmetry and other invariances, involving only two input regions (see below).

The next section states the notation used throughout the paper. In Section 3 the commutativity property is formally defined and its selection is justified. Section 4 introduces several commutativity measures in a fuzzy environment combining fuzzy rules, fuzzy rule bases, and examples. Section 5 proposes a way to include commutativity in FM, using a well-known method whose simplicity allows us to focus on the benefit of the proposal. The results are analyzed in Section 6, comparing commutativity non-sensitive FM (CNS-FM) methods with the proposed commutativity sensitive FM (CS-FM) method when applied to several numerical examples. Finally, concluding remarks can be found in Section 7.

¹ A first approach to the ideas presented in this paper is outlined in [6]. These ideas are extended here with new measures and with a thorough analysis of them and of the computational results.


2. Notation

We consider multiple input single output (MISO) systems² with $n$ input variables $X = \{X_1, \dots, X_n\}$ defined over the input domain of discourse $\mathcal{X} = \mathcal{X}_1 \times \cdots \times \mathcal{X}_n$ and one output variable $Y$ defined over the output domain of discourse $\mathcal{Y}$. A general crisp value is denoted as $x_i$ for the $i$th input variable $X_i$ and as $y$ for the output variable $Y$.

The fuzzy domain of the $i$th input variable $X_i$ is denoted as $\widetilde{X}_i = \{LX_{i,1}, \dots, LX_{i,p_i}\}$, where $p_i$ is the number of fuzzy values associated with the variable and $LX_{i,j}$ represents both the membership function and the linguistic label of the $j$th value. Analogously, $\widetilde{Y} = \{LY_1, \dots, LY_q\}$ is the output fuzzy domain, $q$ being the number of fuzzy values and $LY_j$ both the $j$th membership function and label. A specific input fuzzy value such as $NS$ (negative small) is subscripted as $NS_i$ for the $i$th input variable. The specific output fuzzy values are not subscripted, since single output systems are being considered.

The $i$th fuzzy rule is denoted as

$$R^{i_1 \dots i_n}_{LY^i} : LX_{1,i_1}, \dots, LX_{n,i_n} \to LY^i$$

using a superscript for the consequent $LY^i$ in order to distinguish it from $LY_i$ (the $i$th value of the output domain $\widetilde{Y}$). A general output fuzzy value is simply denoted as $LY$.

A training set with $m$ examples is denoted as $E = \{e^1, \dots, e^m\}$. The $i$th example takes the form $e^i = (\mathbf{x}^i, y^i)$, where $\mathbf{x}^i = [x^i_1, \dots, x^i_n] \in \mathcal{X}$ is the vector of input values and $y^i \in \mathcal{Y}$ is the output value.

² Since a multiple input multiple output (MIMO) system can be separated into a group of MISO systems [16], the results can be easily extrapolated to the MIMO case.

3. Commutativity in a system

The commutativity property has two main advantages that make it appropriate as a first approach to this research topic. On the one hand, its definition is very simple, since it only involves two input variables in its basic form. On the other hand, it is a local property that relates only two input regions of the system, referred to here as the analyzed and the complementary regions, so its measures only need to take into account the features of these two regions. Nevertheless, all the considerations made in this paper can be easily extrapolated to other local properties and also, in a further extension, to global ones.

Commutativity is present in many systems. For example, this is the case of systems with two or more inputs associated with similar information and such that the output depends on the total of these inputs rather than on each of them separately (e.g., a traffic control problem at an intersection, where the waiting time in one direction depends on the total traffic in both ways of the other direction, but not on each of them independently). Another example is the well-known truck backer-upper problem [11], where the subsystem

$$x(t + T) = x(t) + V T \cos(\phi(t)) \cos(\theta(t)) \tag{1}$$

describing the $x$ position of the truck is commutative with respect to the trailer axis angle $\phi$ and the steering angle $\theta$. In the first case, a waiting time known for certain traffic conditions in one direction and its opposite can be used to establish the waiting time when the traffic conditions in both directions are interchanged. In the second case, the position of the truck after a movement with steering angle $\theta'$ and trailer axis angle $\phi'$ allows us to know the position of the truck when the steering angle is $\phi'$ and the trailer axis angle is $\theta'$. Thus, the information provided by the available data in those problems can be enriched by exploiting the commutativity of the systems.

Definition 1. A system $f : \mathcal{X}_1 \times \cdots \times \mathcal{X}_n \to \mathcal{Y}$ complies with a bivariate commutativity property over two input variables $X_{l_1}$ and $X_{l_2}$ if and only if

$$f(X_1, \dots, X_{l_1}, \dots, X_{l_2}, \dots, X_n) = f(X_1, \dots, X_{l_2}, \dots, X_{l_1}, \dots, X_n). \tag{2}$$

The above definition states the basic form of commutativity. A multivariate commutativity involving $m$ variables $\mathcal{S} = \{X_{l_1}, \dots, X_{l_m}\}$ is present when any permutation of these variables provides the same output value. Nevertheless, it can be easily proved that this multivariate commutativity can be expressed as the simultaneous fulfillment of two different commutativities involving $(m-1)$ variables in $\mathcal{S}$, and it can ultimately be decomposed into the simultaneous fulfillment of $(m-1)$ bivariate commutativities.³ Thus, a single value for a multivariate commutativity can be obtained by decomposing it into a conjunction of bivariate commutativities and using some aggregation method (e.g., a convex sum). For this reason, in the sequel we use commutativity to refer to bivariate commutativity.
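As an illustration of Definition 1, the following minimal sketch tests Eq. (2) numerically on a finite grid of input values and checks a multivariate commutativity through $(m-1)$ bivariate ones. The function names, the grid, the tolerance `tol`, and the example system are assumptions introduced here for illustration; they are not part of the paper.

```python
from itertools import product


def is_bivariate_commutative(f, grid, l1, l2, tol=1e-9):
    """Check Eq. (2) on a finite grid: swapping inputs l1 and l2 must leave f unchanged."""
    for x in product(*grid):
        xc = list(x)
        xc[l1], xc[l2] = xc[l2], xc[l1]  # commutatively related input vector
        if abs(f(*x) - f(*xc)) > tol:
            return False
    return True


def is_multivariate_commutative(f, grid, variables, tol=1e-9):
    """Check the m - 1 bivariate commutativities pairing the first variable with each other one."""
    first = variables[0]
    return all(
        is_bivariate_commutative(f, grid, first, other, tol)
        for other in variables[1:]
    )


# Hypothetical system: commutative over inputs 0 and 1, but not over inputs 0 and 2.
def system(x1, x2, x3):
    return (x1 + x2) / 2 + x3


samples = [-1.0, 0.0, 0.5, 1.0]  # the same sample values are used for every input
grid = [samples, samples, samples]
print(is_bivariate_commutative(system, grid, 0, 1))
print(is_bivariate_commutative(system, grid, 0, 2))
```

A grid check of this kind can only refute commutativity or support it on the sampled points; it is not a proof over the whole input domain.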

4. Fuzzy measures of commutativity

The inclusion of commutativity as an additional criterion to guide FM requires defining measures that assess the degree of satisfaction of that property in a fuzzy environment. In this section, three types of measures are defined considering three different elements involved in FM: rules, rule bases, and examples.

4.1. Commutativity between two fuzzy rules

First, the commutativity degree is defined between the basic elements of a fuzzy model: the rules. It allows us to assess the degree to which two rules comply with the commutativity over two input variables.

Given the fuzzy rules

$$R^{i_1 \dots i_{l_1} \dots i_{l_2} \dots i_n}_{LY^i} : LX_{1,i_1}, \dots, LX_{l_1,i_{l_1}}, \dots, LX_{l_2,i_{l_2}}, \dots, LX_{n,i_n} \to LY^i,$$

$$R^{j_1 \dots j_{l_1} \dots j_{l_2} \dots j_n}_{LY^j} : LX_{1,j_1}, \dots, LX_{l_1,j_{l_1}}, \dots, LX_{l_2,j_{l_2}}, \dots, LX_{n,j_n} \to LY^j,$$

the commutativity between such rules over the input variables $X_{l_1}$ and $X_{l_2}$ can be expressed logically as

If $(LX_{l_1,i_{l_1}} = LX_{l_2,j_{l_2}})$ and $(LX_{l_2,i_{l_2}} = LX_{l_1,j_{l_1}})$ then $(LY^i = LY^j)$.

³ For example, if $f : \mathcal{X}_1 \times \mathcal{X}_2 \times \mathcal{X}_3 \to \mathcal{Y}$ is commutative with respect to $\{X_1, X_2, X_3\}$ then it can be expressed as the simultaneous fulfillment of two bivariate commutativities, e.g. over $\{X_1, X_2\}$ and $\{X_1, X_3\}$; if $f : \mathcal{X}_1 \times \mathcal{X}_2 \times \mathcal{X}_3 \times \mathcal{X}_4 \to \mathcal{Y}$ is commutative with respect to $\{X_1, X_2, X_3, X_4\}$ then it can be expressed as the simultaneous fulfillment of the trivariate commutativities over $\{X_1, X_2, X_3\}$ and $\{X_1, X_2, X_4\}$, and thus as the conjunction of three bivariate commutativities, e.g. over $\{X_1, X_2\}$, $\{X_1, X_3\}$, $\{X_1, X_4\}$; and so on.


The degree of fulfillment of that logical rule can be obtained in fuzzy logic by using the appropriate fuzzy operators of similarity, conjunction, and implication. A particular measure would be

$$C^{R}_{l_1,l_2}(R^{i_1 \dots i_n}_{LY^i}, R^{j_1 \dots j_n}_{LY^j}) = [S(LX_{l_1,i_{l_1}}, LX_{l_2,j_{l_2}}) *_t S(LX_{l_2,i_{l_2}}, LX_{l_1,j_{l_1}})] \to S(LY^i, LY^j), \tag{3}$$

where $S$ is a similarity measure between fuzzy sets, $*_t$ is a t-norm, and $\to$ is an R-implication.⁴
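A minimal sketch of Eq. (3) follows, under stated assumptions: the t-norm is the minimum, so its R-implication is the Gödel implication, and the similarity between fuzzy sets is replaced by a hypothetical index-distance similarity over an ordered label set. None of these specific choices is prescribed by the paper, which leaves $S$, $*_t$, and $\to$ open.

```python
# Hypothetical ordered label set, assumed here to be shared by all variables.
LABELS = ["NL", "NS", "Z", "PS", "PL"]


def label_similarity(a, b):
    """Hypothetical similarity: 1 for equal labels, decreasing linearly with index distance."""
    return 1.0 - abs(LABELS.index(a) - LABELS.index(b)) / (len(LABELS) - 1)


def t_norm_min(a, b):
    """Minimum t-norm (one possible choice for the conjunction)."""
    return min(a, b)


def godel_implication(a, b):
    """R-implication of the minimum t-norm: sup{c in [0, 1] | min(a, c) <= b}."""
    return 1.0 if a <= b else b


def rule_commutativity(rule_i, rule_j, l1, l2, sim=label_similarity):
    """Eq. (3) for rules given as (antecedent_labels, consequent_label) pairs."""
    ante_i, cons_i = rule_i
    ante_j, cons_j = rule_j
    premise = t_norm_min(sim(ante_i[l1], ante_j[l2]), sim(ante_i[l2], ante_j[l1]))
    return godel_implication(premise, sim(cons_i, cons_j))


# Two rules over commutatively related regions [NL, Z] and [Z, NL].
rule_a = (("NL", "Z"), "NS")
rule_b = (("Z", "NL"), "NS")
rule_c = (("Z", "NL"), "PS")
print(rule_commutativity(rule_a, rule_b, 0, 1))
print(rule_commutativity(rule_a, rule_c, 0, 1))
```

With these choices, a pair of rules whose antecedents are not commutatively related yields a degree of 1, because the premise of the implication is not satisfied.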

4.2. Commutativity of a fuzzy rule with respect to a fuzzy rule base

The previous measure provides a commutativity degree between two isolated rules. Although this definition offers a first approach to the concept of commutativity in a fuzzy environment, the rules of a fuzzy model do not act independently. Instead, they interact with one another due to the overlapping among antecedents, which gives fuzzy models their interpolative capabilities. Determining which pairs of rules must be compared between the two input regions involved in the evaluation of the commutativity is not a trivial question, because it depends on the semantics defined over the input variables (i.e., the input fuzzy domains). These facts are shown graphically in the fuzzy model depicted in Fig. 1: on the one hand, the analyzed region $[NL_1, Z_2]$ delimited by the rule $R^{1,3}_{NS}$ affects itself and its 5 neighboring rules (squared rules), that is, the rules in $\{NL_1, NS_1\} \times \{NS_2, Z_2, PS_2\}$; on the other hand, due to the different semantics of the input variables $X_1$ and $X_2$, the complementary region $[Z_2, NL_1]$ affects 15 rules (circled rules), that is, the rules in $\{NL_1, NS_1, Z_1, PS_1, PL_1\} \times \{NL_2, NS_2, Z_2\}$.

Because of that, a new measure must be defined that appraises the commutativity of a fuzzy rule contained in a fuzzy rule base with respect to this rule base. This measure states the degree of suitability of the rule to the fuzzy model regarding this property and allows us to establish a criterion for selecting rules accordingly. With this aim, two measures are proposed in this section.

As a first approach, the following idea can be used: the commutativity can be defined as the similarity between the output surfaces mapped from two commutatively related input fuzzy regions. This is translated into the following definition.

Definition 2 (Crisp approach (CA)). Let

$$R^{i_1 \dots i_{l_1} \dots i_{l_2} \dots i_n}_{LY^i} : LX_{1,i_1}, \dots, LX_{l_1,i_{l_1}}, \dots, LX_{l_2,i_{l_2}}, \dots, LX_{n,i_n} \to LY^i$$

be a fuzzy rule, $S_{i_k}$ be the discretized support of the fuzzy set $LX_{k,i_k}$, and $S = S_{i_1} \times \cdots \times S_{i_n}$ be the set of input vectors obtained from the Cartesian product of the discretized supports of all the fuzzy sets in the antecedent of the rule. The commutativity degree of $R^{i_1 \dots i_n}_{LY^i}$ over the input variables $X_{l_1}$ and $X_{l_2}$ with respect to a rule base $RB$ is defined as

$$C^{RB}_{l_1,l_2}(R^{i_1 \dots i_n}_{LY^i}, RB) = 1 - \frac{1}{r} \sum_{\mathbf{x}^j \in S} \left| y_{RB}(\mathbf{x}^j) - y_{RB}(\mathbf{x}^{j_c}) \right|, \tag{4}$$

where $\mathbf{x}^j = [x^j_1, \dots, x^j_{l_1}, \dots, x^j_{l_2}, \dots, x^j_n]$ and $\mathbf{x}^{j_c} = [x^j_1, \dots, x^j_{l_2}, \dots, x^j_{l_1}, \dots, x^j_n]$ are commutatively related input vectors, $r$ is the cardinality of $S$, and $y_{RB}(\mathbf{x})$ is the defuzzified and normalized output mapped in $RB$ from the input vector $\mathbf{x}$.

⁴ $a \to b = \sup\{c \in [0,1] \mid a *_t c \le b\}$.

Fig. 1. Commutativity evaluation for $R^{1,3}_{NS}$ involves 6 rules (squared) in the analyzed region $[NL_1, Z_2]$ and 15 rules (circled) in the complementary region $[Z_2, NL_1]$. (Rule grid over the input variables $X_1$ and $X_2$, each with the labels NL, NS, Z, PS, PL.)
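Eq. (4) can be sketched as follows, assuming the fuzzy model is available as a callable that returns the defuzzified, normalized output for an input vector, and that the discretized supports are given as plain lists of sample points. The names `ca_commutativity`, `y_rb`, and `supports`, as well as the two toy models, are hypothetical.

```python
from itertools import product


def ca_commutativity(y_rb, supports, l1, l2):
    """Eq. (4): 1 minus the mean absolute output difference over commutatively related points.

    y_rb     -- callable mapping an input vector (tuple) to a normalized crisp output in [0, 1]
    supports -- one list of sample points per input variable (the discretized supports)
    """
    points = list(product(*supports))  # the set S; r = len(points)
    total = 0.0
    for x in points:
        xc = list(x)
        xc[l1], xc[l2] = xc[l2], xc[l1]  # commutatively related input vector
        total += abs(y_rb(x) - y_rb(tuple(xc)))
    return 1.0 - total / len(points)


# Hypothetical models with inputs and outputs already normalized to [0, 1].
def commutative_model(x):
    return (x[0] + x[1]) / 2


def first_input_model(x):
    return x[0]


supports = [[0.0, 0.5, 1.0], [0.0, 0.5, 1.0]]  # three sample points per fuzzy set
print(ca_commutativity(commutative_model, supports, 0, 1))
print(ca_commutativity(first_input_model, supports, 0, 1))
```

Each point requires two evaluations of the model, so the cost grows with the product of the support sizes.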

Thus, the difference between both output surfaces is estimated by considering some significant points from the analyzed region. To do so, the support of every fuzzy set in the antecedent of the rule is discretized by selecting several points. In this paper, we select three points for each fuzzy set: the central point of its core and the central points of its left and right sides (points $x_C$, $x_L$, and $x_R$, respectively, in Fig. 2). The ends of the support are not considered since at such points the rule being evaluated is not fired and, therefore, they do not reveal differences among the different rules located in the analyzed region. Of course, a denser discretization provides a better approximation to the output surfaces, but it also increases the computational cost of the measure, since more inferences are required. In fact, even with a few points per fuzzy set, the measure suffers from the curse of dimensionality, since the number of inferences grows exponentially with the number of inputs ($2 \times 3^n$ inferences).

This fact leads us to define a second measure in which the comparison is made between fuzzy outputs instead of crisp ones. To do so, two fuzzy outputs are first inferred: the one obtained from the fuzzy input vector expressed in the antecedent of the evaluated rule, and the one obtained from the complementary fuzzy input vector, set up by interchanging in the former the fuzzy sets associated with the commutative input variables. Then, the similarity between both fuzzy outputs is evaluated. This is accomplished with the following definition.

Fig. 2. Support discretization of a fuzzy set (points $x_L$, $x_C$, and $x_R$ at the centers of the left side, the core, and the right side).

Definition 3 (Fuzzy approach (FA)). The commutativity of a rule

$$R^{i_1 \dots i_{l_1} \dots i_{l_2} \dots i_n}_{LY^i} : LX_{1,i_1}, \dots, LX_{l_1,i_{l_1}}, \dots, LX_{l_2,i_{l_2}}, \dots, LX_{n,i_n} \to LY^i$$

over the input variables $X_{l_1}$ and $X_{l_2}$ with respect to a rule base $RB$ is defined as

$$\widetilde{C}^{RB}_{l_1,l_2}(R^{i_1 \dots i_n}_{LY^i}, RB) = S(LY_{RB}(LX), LY_{RB}(LX^c)), \tag{5}$$

where $LX = [LX_{1,i_1}, \dots, LX_{l_1,i_{l_1}}, \dots, LX_{l_2,i_{l_2}}, \dots, LX_{n,i_n}]$ is the fuzzy input vector corresponding to the antecedent of the rule, $LX^c = [LX_{1,i_1}, \dots, LX_{l_2,i_{l_2}}, \dots, LX_{l_1,i_{l_1}}, \dots, LX_{n,i_n}]$ is its complementary fuzzy input vector, $LY_{RB}(LX)$ is the normalized fuzzy output mapped in $RB$ from the fuzzy input vector $LX$, and $S$ is a similarity measure between fuzzy sets.

Since the fuzzy output need not be convex, the similarity measure $S$ in (5) must deal properly with non-convex fuzzy sets. Zwick et al. [25] and Setnes [20] analyzed several similarity measures and, from their results and some additional experiments, we selected the family of Minkowski-based measures:

$$S_r(A, B) = 1 - \left( \int_X |\mu_A(x) - \mu_B(x)|^r \, dx \right)^{1/r}, \quad r \ge 1, \tag{6}$$

which extends the Minkowski metric to the fuzzy environment and resembles several well-known distance measures such as the city-block distance ($S_1$), the Euclidean distance ($S_2$), or the dominance distance ($S_\infty$). This family of measures (except $S_\infty$) deals properly with non-convex fuzzy sets since it involves all the membership values of both fuzzy sets being compared.
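The Minkowski-based similarity of Eq. (6) can be approximated numerically as in the sketch below. Two points are assumptions made here and not statements from the paper: the integral is computed with the trapezoidal rule, and it is divided by the length of the domain so that the result stays in $[0, 1]$ for a domain such as $[-1, 1]$. The helper `triangle` and the example sets are hypothetical.

```python
def triangle(a, b, c):
    """Triangular membership function with support (a, c) and peak at b."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu


def minkowski_similarity(mu_a, mu_b, lo, hi, r=2, steps=1000):
    """Eq. (6) approximated with the trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * abs(mu_a(x) - mu_b(x)) ** r
    # Assumption: normalize by the domain length so that the similarity lies in [0, 1].
    integral = total * h / (hi - lo)
    return 1.0 - integral ** (1.0 / r)


# Hypothetical fuzzy sets on the domain [-1, 1].
set_a = triangle(-0.5, 0.0, 0.5)
set_b = triangle(0.0, 0.5, 1.0)
print(minkowski_similarity(set_a, set_a, -1.0, 1.0, r=2))
print(minkowski_similarity(set_a, set_b, -1.0, 1.0, r=2))
```

Because the integrand uses every membership value, the same function applies unchanged to non-convex outputs, such as the union of clipped consequents produced by Mamdani inference.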

Example 1. Suppose we are trying to determine the best-suited rule according to its commutativity degree for the input fuzzy region $[NL_1, Z_2]$ with respect to the fuzzy rule base depicted in Fig. 1 and the fuzzy domains in Fig. 3. To do so, the rules $R^{1,3}_{NL}$, $R^{1,3}_{NS}$, $R^{1,3}_{Z}$, $R^{1,3}_{PS}$, and $R^{1,3}_{PL}$ are inserted in turn into the rule base and their commutativity degrees are measured. Assuming the minimum and-operator, Mamdani implication, and Zadeh's compositional rule of inference, Fig. 4 shows the output fuzzy sets involved in the evaluation of the commutativity degrees when FA is considered, the left column representing the fuzzy outputs corresponding to each analyzed rule in the input region $LX = [NL_1, Z_2]$ and the right column representing the respective fuzzy outputs for the complementary region $LX^c = [Z_2, NL_1]$. Additionally assuming the center of gravity defuzzification method, Fig. 5 represents the output surfaces to be compared in CA. In this figure, the exact outputs are shown in the upper row and their approximations using discretized supports are shown in the lower one. The commutativity degrees for both measures (using the similarity index $S_2$ for FA) are the following:

$$\begin{array}{l|ccccc}
 & R^{1,3}_{NL} & R^{1,3}_{NS} & R^{1,3}_{Z} & R^{1,3}_{PS} & R^{1,3}_{PL} \\ \hline
C^{RB}_{1,2}(\cdot, RB) & 0.885 & 0.933 & 0.881 & 0.741 & 0.684 \\
\widetilde{C}^{RB}_{1,2}(\cdot, RB) & 0.749 & 0.799 & 0.750 & 0.725 & 0.725
\end{array}$$

with similar results in both cases, the rule $R^{1,3}_{NS}$ being the best-suited rule according to its commutativity degree. Nevertheless, none of those maximum degrees reaches 1, since the different semantics associated with the input variables provide different fuzzy outputs for the two input regions involved in the measure. With identical semantics, the commutativity degree will always result in $\widetilde{C}^{RB}_{1,2}(R^{1,3}_{NS}) = C^{RB}_{1,2}(R^{1,3}_{NS}) = 1$. Besides, it must be noted that, whereas FA performs only 2 inferences, CA needs $2 \times 3^2 = 18$ inferences.

Fig. 3. Fuzzy domains in Example 1 (input variables $X_1$ and $X_2$ and output variable $Y$, each with the labels NL, NS, Z, PS, PL over $[-1, 1]$).

4.3. Commutativity of a fuzzy rule with respect to a set of examples

In order to introduce the commutativity property into FM, it is necessary to adapt the measure to the environment where this process is carried out. That environment mainly consists of a set of examples $E = \{e^1, \dots, e^m\}$ (the training set) which relates the input to the output of the system. We must therefore redefine the previous commutativity measures with respect to a set of examples instead of a predefined rule base. The simplest strategy consists in translating the examples into a set of rules.

Fig. 4. Fuzzy outputs $LY_{RB}(LX)$ in $LX = [NL_1, Z_2]$ (left column) and $LY_{RB}(LX^c)$ in $LX^c = [Z_2, NL_1]$ (right column) for rules $R^{1,3}_{NL}$, $R^{1,3}_{NS}$, $R^{1,3}_{Z}$, $R^{1,3}_{PS}$, and $R^{1,3}_{PL}$.

Fig. 5. Exact (upper row) and approximate (lower row) output surfaces in $LX = [NL_1, Z_2]$ and $LX^c = [Z_2, NL_1]$ for rules $R^{1,3}_{NL}$, $R^{1,3}_{NS}$, $R^{1,3}_{Z}$, $R^{1,3}_{PS}$, and $R^{1,3}_{PL}$.

Definition 4. Given a set of examples $E$, the rule base $RB_E$ representing them is the set of rules

$$R^{i_1 \dots i_n}_{LY_k} : LX_{1,i_1}, \dots, LX_{n,i_n} \to LY_k, \quad k = \arg\max_{j = 1, \dots, q} \rho(R^{i_1 \dots i_n}_{LY_j})$$

for all $(i_1, \dots, i_n) \in \{1, \dots, p_1\} \times \cdots \times \{1, \dots, p_n\}$, where $\rho$ is a certainty measure based on $E$.


This method, which takes the rules with maximum certainty degree in every input region of the fuzzy grid,⁵ is widely used in the literature (e.g., [14,24]).

In this paper we use the mixed method (MM), recently proposed by the authors in [7]. It combines Wang and Mendel's method (WMM) [24] with an extension of Ishibuchi's rule generation method [14] that deals with fuzzy consequents. The WMM first translates each example into the fuzzy rule that best covers it, i.e., the one whose labels present the highest membership degrees, and then, once all the examples are processed, selects among the conflicting rules those with the maximum certainty degree from (7):

$$\rho_{WM}(R^{i_1 \dots i_n}_{LY^i}) = \mu_{LX_{1,i_1}}(x_1) \times \cdots \times \mu_{LX_{n,i_n}}(x_n) \times \mu_{LY^i}(y). \tag{7}$$
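The two WMM steps described above can be sketched as follows. The data layout (one dictionary of membership functions per variable), the helper `tri`, the two-label domains, and the toy examples are assumptions made for illustration; ties between labels are resolved here by taking the first maximum, which the paper does not specify.

```python
def tri(a, b, c):
    """Triangular membership function with support (a, c) and peak at b."""
    return lambda x: max(0.0, min((x - a) / (b - a), (c - x) / (c - b)))


def wang_mendel(examples, input_domains, output_domain):
    """Generate one rule per covered region, keeping the rule with maximum degree from Eq. (7).

    examples      -- list of (x, y) pairs, x being a tuple of crisp inputs
    input_domains -- one dict {label: membership function} per input variable
    output_domain -- dict {label: membership function} for the output variable
    """
    best = {}  # antecedent labels -> (certainty degree, consequent label)
    for x, y in examples:
        degree = 1.0
        antecedent = []
        for xk, domain in zip(x, input_domains):
            label = max(domain, key=lambda name: domain[name](xk))  # best covering label
            antecedent.append(label)
            degree *= domain[label](xk)
        consequent = max(output_domain, key=lambda name: output_domain[name](y))
        degree *= output_domain[consequent](y)  # Eq. (7)
        key = tuple(antecedent)
        if key not in best or degree > best[key][0]:
            best[key] = (degree, consequent)  # resolve conflicts by maximum degree
    return {key: consequent for key, (_, consequent) in best.items()}


# Hypothetical two-label domains on [0, 1] and a toy training set.
domain = {"LOW": tri(-1.0, 0.0, 1.0), "HIGH": tri(0.0, 1.0, 2.0)}
examples = [((0.2, 0.1), 0.15), ((0.1, 0.3), 0.9), ((0.9, 0.8), 0.85)]
rules = wang_mendel(examples, [domain, domain], domain)
print(rules)
```

The first two examples fall in the same input region with different consequents, so only the one with the higher degree from (7) survives.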

The MM extends the WMM by adding rules in the input regions where the WMM did not identify rules. If there are examples covering one such fuzzy region, the rule in this region having the maximum degree from (8) is added:

$$\rho_I(R^{i_1 \dots i_n}_{LY^i}) = \frac{\beta(R^{i_1 \dots i_n}_{LY^i}) - \bar{\beta}(R^{i_1 \dots i_n}_{LY^i})}{\sum_{k=1}^{q} \beta(R^{i_1 \dots i_n}_{LY_k})}, \tag{8}$$

where

$$\beta(R^{i_1 \dots i_n}_{LY}) = \sum_{e^j \in E} \mu_{LX_{1,i_1}}(x^j_1) \times \cdots \times \mu_{LX_{n,i_n}}(x^j_n) \times \mu_{LY}(y^j), \tag{9}$$

$$\bar{\beta}(R^{i_1 \dots i_n}_{LY^i}) = \frac{\sum_{k=1,\ LY_k \ne LY^i}^{q} \beta(R^{i_1 \dots i_n}_{LY_k})}{q - 1}. \tag{10}$$
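Eqs. (8)-(10) can be sketched for a single input region as shown below. The function names, the helper `tri`, the two-label output domain, and the toy examples are assumptions; the sketch also assumes at least two output labels ($q \ge 2$) and returns an empty dictionary when no example covers the region, a convention chosen here rather than taken from the paper.

```python
def tri(a, b, c):
    """Triangular membership function with support (a, c) and peak at b."""
    return lambda x: max(0.0, min((x - a) / (b - a), (c - x) / (c - b)))


def compatibility(antecedent_mfs, consequent_mf, examples):
    """Eq. (9): sum over the examples of the product of input and output membership degrees."""
    total = 0.0
    for x, y in examples:
        degree = consequent_mf(y)
        for mf, xk in zip(antecedent_mfs, x):
            degree *= mf(xk)
        total += degree
    return total


def certainty_degrees(antecedent_mfs, output_domain, examples):
    """Eqs. (8) and (10) for every candidate consequent of one input region (q >= 2)."""
    beta = {
        label: compatibility(antecedent_mfs, mf, examples)
        for label, mf in output_domain.items()
    }
    total = sum(beta.values())
    q = len(beta)
    if total == 0.0:
        return {}  # assumption: no example covers the region, so no rule is proposed
    # (total - b) / (q - 1) is the mean compatibility of the other consequents, Eq. (10).
    return {label: (b - (total - b) / (q - 1)) / total for label, b in beta.items()}


# Hypothetical region [LOW, LOW] with a two-label output domain on [0, 1].
low, high = tri(-1.0, 0.0, 1.0), tri(0.0, 1.0, 2.0)
examples = [((0.2, 0.1), 0.15), ((0.1, 0.3), 0.4)]
degrees = certainty_degrees([low, low], {"LOW": low, "HIGH": high}, examples)
print(max(degrees, key=degrees.get))
```

The consequent with the largest degree is the rule that would be added to the region.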

Once the rule base is obtained, the measures presented in Section 4.2 can be directly used by inserting the rule to be evaluated in $RB_E$ and obtaining its commutativity with respect to $RB_E$. Only one modification of the measures is introduced: the output in the complementary region is obtained before inserting the rule to be evaluated. This avoids undesired effects caused by the inserted rule when the analyzed and the complementary regions overlap, since in such cases the inserted rule can greatly alter the information provided by the examples in the complementary region. It also brings a computational benefit, since only one output must be obtained in the complementary region for all the rules evaluated in the analyzed region.

Next, we present a third measure that further exploits the training set and can be viewed as an extension of CA that tries to reduce its computational cost. It takes the examples covering the complementary region and generates virtual examples [1,3,17] by interchanging the values of the commutative input variables. Then, it estimates the mean error between the expected and the inferred outputs.

⁵ If no example covers the region $[LX_{1,i_1}, \dots, LX_{n,i_n}]$, no rule will be selected; if several rules take the maximum certainty degree, one of them will be selected randomly.


Definition 5 (Example-based approach (EBA)). The commutativity of a fuzzy rule

$$R^{i_1 \dots i_{l_1} \dots i_{l_2} \dots i_n}_{LY^i} : LX_{1,i_1}, \dots, LX_{l_1,i_{l_1}}, \dots, LX_{l_2,i_{l_2}}, \dots, LX_{n,i_n} \to LY^i$$

over the input variables $X_{l_1}$ and $X_{l_2}$ with respect to a set of examples $E$ is defined as

$$C^{E}_{l_1,l_2}(R^{i_1 \dots i_n}_{LY^i}, E) = 1 - \frac{1}{r} \sum_{e^j \in E^*} \left| y^j - y_{RB_E}(\mathbf{x}^{j_c}) \right|, \tag{11}$$

where $E^* \subset E$ contains the $r$ examples covering the complementary fuzzy input region

$$E^* = \{ e^j \in E \mid (\mu_{LX_{l_2,i_{l_2}}}(x^j_{l_1}) > 0) \wedge (\mu_{LX_{l_1,i_{l_1}}}(x^j_{l_2}) > 0) \}, \tag{12}$$

$\mathbf{x}^{j_c} = [x^j_1, \dots, x^j_{l_2}, \dots, x^j_{l_1}, \dots, x^j_n]$ is a vector set up from $e^j$ by interchanging its values in the commutative variables $X_{l_1}$ and $X_{l_2}$, and $RB_E$ is the rule base obtained from Definition 4.

The EBA exploits the training set not only by using it to obtain $RB_E$, but also by integrating it directly into the measurement. As will be shown in the experimental results, this can lead to overvaluing that information, provoking an overfitting effect that diminishes the generalization capabilities of FM.

However, regardless of the approach being used, the lack of examples in the complementary region may make it impossible to obtain an output to compare with the analyzed region. How to deal with this situation depends on the approach adopted:

• In CA, the problem arises when some input vector $x^j_c$ does not fire any rule in the complementary region. In that case, we provide an output equal to the one for its commutatively related input $x^j$, i.e., we set $y_{RB}(x^j_c) = y_{RB}(x^j)$. That way, these points present an error equal to zero for any rule evaluated in the analyzed region, so the result of the measurement depends only on the significant points.

• In FA, the problem arises when no rule is fired from the complementary fuzzy input $LX_c$, providing a void fuzzy output with minimum membership degree in all its domain. Without any special treatment, the commutativity measure will favor rules in the analyzed region generating fuzzy outputs similar to the void fuzzy set, whereas it seems more adequate not to assert anything about the commutativity in these cases. Therefore, in such cases an average commutativity degree of 0.5 will be considered for the evaluated rule.

• In EBA, the problem arises when no example covers the complementary region. In such a case, again, a commutativity degree equal to 0.5 will be assigned to the evaluated rule.
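As an illustration of Eqs. (11) and (12) together with the 0.5 fallback for EBA, the following minimal Python sketch may help. It is not the authors' implementation (the paper reports Matlab code): the names `triangular`, `eba_commutativity`, `mu_l1`, `mu_l2`, and `y_rb` are hypothetical, the triangular membership functions and the three examples are invented for the demonstration, and the outputs are assumed to be scaled so that absolute errors stay within [0, 1].

```python
from typing import Callable, List, Sequence, Tuple

Example = Tuple[Sequence[float], float]  # (input vector x^j, output y^j)


def triangular(a: float, b: float, c: float) -> Callable[[float], float]:
    """Hypothetical triangular membership function: support (a, c), peak at b."""
    return lambda v: max(0.0, min((v - a) / (b - a), (c - v) / (c - b)))


def eba_commutativity(
    examples: Sequence[Example],
    mu_l1: Callable[[float], float],  # membership of the rule's label for X_l1
    mu_l2: Callable[[float], float],  # membership of the rule's label for X_l2
    y_rb: Callable[[Sequence[float]], float],  # stand-in for the output of RB_E
    l1: int,
    l2: int,
    fallback: float = 0.5,
) -> float:
    """Sketch of Eq. (11); returns `fallback` when E* is empty."""
    errors: List[float] = []
    for x, y in examples:
        # Eq. (12): keep only examples covering the complementary region.
        if mu_l2(x[l1]) > 0 and mu_l1(x[l2]) > 0:
            xc = list(x)
            xc[l1], xc[l2] = xc[l2], xc[l1]  # interchange the two values
            errors.append(abs(y - y_rb(xc)))
    if not errors:
        return fallback
    return 1.0 - sum(errors) / len(errors)


# Invented data: rule region [NL_1, Z_2], so the complementary region is [Z_1, NL_2].
NL = triangular(-1.5, -1.0, -0.5)
Z = triangular(-0.4, 0.0, 0.4)
E = [([-0.8, 0.1], -0.35), ([0.1, -0.8], -0.35), ([0.9, 0.9], 0.9)]

commutative = eba_commutativity(E, NL, Z, lambda x: (x[0] + x[1]) / 2, 0, 1)
asymmetric = eba_commutativity(E, NL, Z, lambda x: x[0], 0, 1)
uncovered = eba_commutativity(E[2:], NL, Z, lambda x: x[0], 0, 1)
```

Only the second example covers the complementary region here, so a commutative stand-in model scores close to 1, a model that ignores the second input scores lower, and an example set with no covering example falls back to 0.5.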

Example 2. Suppose we are trying to determine the best-suited rule according to its commutativity degree for the input region $[NL_1, Z_2]$, considering the fuzzy domains in Fig. 3 and with respect to the following


Fig. 6. Input space location of the training set in Example 2 (axes: $X_1$ and $X_2$).

set of examples:

([+0.026, −0.278], −0.126)    ([−0.079, −0.327], −0.203)
([−0.299, −0.653], −0.476)    ([−0.810, −0.828], −0.819)
([−0.133, −0.213], −0.173)    ([+0.418, +0.609], +0.514)
([−0.768, −0.978], −0.873)    ([−0.844, −0.534], −0.689)
([−0.261, +0.868], +0.303)    ([−0.933, −0.546], −0.740)
([−0.616, +0.572], −0.022)    ([−0.057, −0.179], −0.118)
([−0.710, −0.761], −0.736)    ([+0.436, +0.269], +0.352)
([+0.323, +0.725], +0.524)    ([−0.136, −0.684], −0.410)
([−0.108, +0.202], +0.047)    ([+0.017, −0.765], −0.374)
([+0.056, +0.252], +0.154)    ([+0.146, +0.670], +0.408)

which have been obtained randomly from the commutative system $y = (x_1 + x_2)/2$ and are located in the input space as shown in Fig. 6.

Using the MM, the partial $RB_E$ shown in Table 1 is identified for the input regions involved in the calculation (circled and squared rules in Fig. 1). It can be observed that no rule is identified in the analyzed region $[NL_1, Z_2]$, since no example covers it. Finally, the rules $R^{1,3}_{NL}$, $R^{1,3}_{NS}$, $R^{1,3}_{Z}$, $R^{1,3}_{PS}$, and $R^{1,3}_{PL}$ are inserted in turn into the rule base and their commutativity degrees are measured. For FA, the similarity


Table 1. Partial rule base from certainty degrees (columns: labels of $X_1$; rows: labels of $X_2$)

         NL1   NS1   Z1    PS1   PL1
NL2      NL    NS    NS    NS    ?
NS2      NL    NS    NS    NS    ?
Z2       ?     Z     Z     PS    PS
PS2      Z     Z
PL2

measure $S_2$ is selected again. The results for the three approaches are the following (one row per approach):

                       $R^{1,3}_{NL}$   $R^{1,3}_{NS}$   $R^{1,3}_{Z}$   $R^{1,3}_{PS}$   $R^{1,3}_{PL}$
$C_{1,2}(\cdot, E)$    0.902            0.959            0.873           0.708            0.630
$C_{1,2}(\cdot, E)$    0.689            0.798            0.712           0.723            0.724
$C_{1,2}(\cdot, E)$    0.864            0.940            0.931           0.818            0.764

All the approaches provide the maximum degree for the rule with consequent NS, which is predominant in the complementary region. It must be stressed that the consideration of the commutativity has allowed a rule to be selected in a region not covered by any example in the training set.
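The figures of Example 2 can be checked with a few lines of Python. The sketch below only re-uses numbers printed in the example; the names `EXAMPLES`, `DEGREES`, and `selected` are hypothetical, and the three rows of degrees are labelled by position because the measure names are not distinguishable in the results listing.

```python
# Training set of Example 2: ((x1, x2), y)
EXAMPLES = [
    ((+0.026, -0.278), -0.126), ((-0.079, -0.327), -0.203),
    ((-0.299, -0.653), -0.476), ((-0.810, -0.828), -0.819),
    ((-0.133, -0.213), -0.173), ((+0.418, +0.609), +0.514),
    ((-0.768, -0.978), -0.873), ((-0.844, -0.534), -0.689),
    ((-0.261, +0.868), +0.303), ((-0.933, -0.546), -0.740),
    ((-0.616, +0.572), -0.022), ((-0.057, -0.179), -0.118),
    ((-0.710, -0.761), -0.736), ((+0.436, +0.269), +0.352),
    ((+0.323, +0.725), +0.524), ((-0.136, -0.684), -0.410),
    ((-0.108, +0.202), +0.047), ((+0.017, -0.765), -0.374),
    ((+0.056, +0.252), +0.154), ((+0.146, +0.670), +0.408),
]

# Largest deviation of the listed outputs from y = (x1 + x2) / 2;
# the outputs are printed with three decimals, so it should stay below 1e-3.
max_dev = max(abs(y - (x1 + x2) / 2) for (x1, x2), y in EXAMPLES)

# Commutativity degrees of the five candidate rules, one list per approach.
CONSEQUENTS = ["NL", "NS", "Z", "PS", "PL"]
DEGREES = {
    "row 1": [0.902, 0.959, 0.873, 0.708, 0.630],
    "row 2": [0.689, 0.798, 0.712, 0.723, 0.724],
    "row 3": [0.864, 0.940, 0.931, 0.818, 0.764],
}

# Consequent with the maximum degree for each approach.
selected = {
    name: CONSEQUENTS[max(range(len(row)), key=row.__getitem__)]
    for name, row in DEGREES.items()
}
```

Under these inputs every row selects the consequent NS, in agreement with the conclusion of the example.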

5. Commutativity into fuzzy modeling

Next, we will focus on the mechanism to integrate the commutativity as prior knowledge in FM. We will select the MM presented in Section 4.3 as the underlying FM method, a simple method that allows the contribution of the integrated prior knowledge to be analyzed clearly.

During FM with the MM, it is possible to draw a certainty map $W \in [0,1]^{n+1}$ associating a certainty degree to each possible rule.⁶ Analogously, it is possible to draw a commutativity map $C \in [0,1]^{n+1}$ from the training set through the commutativity measures presented in Section 4.3 by inserting each possible rule in the identified rule base $RB_E$ and measuring its commutativity degree.

Then, a suitability map $S \in [0,1]^{n+1}$ can be obtained by means of some aggregation mechanism (e.g., a convex sum) that combines the certainty and the commutativity degrees of each rule. Finally, from this suitability map $S$, the rules with maximum degree in each input region can be selected to obtain a final rule base.

Therefore, the proposed CS-FM algorithm will consist of the following four steps:

(1) Obtain the certainty map $W$ from the examples using MM.

(2) Obtain the commutativity map $C$ using a measure of the commutativity of a fuzzy rule with respect to a set of examples (Section 4.3).

6 Since the values provided by (7) lie in [0,1] and the ones provided by (8) lie in [−1,1], in order to obtain rules with homogeneous degrees, (8) will be normalized to [0,1] by a simple transformation: the degree $d$ that (8) assigns to a rule $R^{i_1 \ldots i_n}_{LY_i}$ is replaced by $(d + 1)/2$.


(3) Obtain the suitability map $S$ from $W$ and $C$ by means of a convex sum,

$$S = \alpha W + (1 - \alpha) C, \qquad (13)$$

where $\alpha \in [0,1]$ weights the certainty degrees against the commutativity degrees.

(4) For each input region, select the rule with maximum degree in $S$.

It must be noted that the $RB_E$ required to compute the commutativity map $C$ can be obtained directly from the information computed in step 1 of the algorithm, with the subsequent computational saving.

Also, it must be stressed that this mechanism has been developed with the main goal of analyzing the benefits provided by the integration of prior knowledge about system properties in an FM method. Because of that, a simple method has been proposed that allows the results of a CNS-FM method to be contrasted clearly with those of a CS-FM one.
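Steps 3 and 4 of the algorithm can be sketched with NumPy as follows. This is only an illustration under stated assumptions: the function names `suitability_map` and `select_rules` are hypothetical, the maps are stored as arrays whose last axis indexes the candidate consequents, the weight of the convex sum is called `alpha`, and the certainty degrees in `W` are invented. `numpy.argmax` returns the first maximum, so ties are resolved deterministically here rather than randomly.

```python
import numpy as np


def suitability_map(W, C, alpha=0.5):
    """Convex sum of Eq. (13): S = alpha * W + (1 - alpha) * C."""
    W = np.asarray(W, dtype=float)
    C = np.asarray(C, dtype=float)
    if W.shape != C.shape:
        raise ValueError("W and C must have the same shape")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1] for a convex sum")
    return alpha * W + (1.0 - alpha) * C


def select_rules(S):
    """Step 4: index of the consequent with maximum degree per input region."""
    return np.argmax(S, axis=-1)


# Two input regions, five candidate consequents (NL, NS, Z, PS, PL).
W = [
    [0.1, 0.2, 0.7, 0.0, 0.0],  # region covered by examples (invented degrees)
    [0.0, 0.0, 0.0, 0.0, 0.0],  # region covered by no example
]
C = [
    [0.5, 0.5, 0.5, 0.5, 0.5],            # uninformative commutativity degrees
    [0.902, 0.959, 0.873, 0.708, 0.630],  # first row of degrees in Example 2
]

S = suitability_map(W, C, alpha=0.5)
selected = select_rules(S)
```

In the second region the certainty map carries no information, so the choice is driven entirely by the commutativity degrees, which is the situation illustrated in Example 2.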

6. Experimental results

Several experiments were applied to three numerical examples with increasing complexity. In this section, the results are compared among the proposed CS-FM methods and with respect to two CNS-FM methods.

6.1. Experiment design

The functions considered were the plane function $f_1$, the biplane function $f_2$, and the generalized Rastrigin's function [23] $f_3$:

$$f_1 : [-1,1] \times [-1,1] \to [-1,1], \quad y = (x_1 + x_2)/2, \qquad (14)$$

$$f_2 : [-1,1] \times [-1,1] \to [-1,1], \quad y = 1 - |x_1 - x_2|, \qquad (15)$$

$$f_3 : [0,0.5] \times [0,0.5] \to [-2,2.4], \quad y = x_1 + x_2 - \cos(18 x_1) - \cos(18 x_2), \qquad (16)$$

whose output surfaces are depicted in Fig. 7.

The input fuzzy domains, shown in Fig. 8a, were the same for all the functions.⁷ For the output variable, two different fuzzy domains were considered: one involving symmetrical fuzzy sets (Fig. 8b) and another composed of asymmetrical fuzzy sets (Fig. 8c). The intention was to analyze the effect of the shape of the output fuzzy sets on the performance of the commutativity measures.

The experiments involved five FM methods: two CNS-FM methods and three CS-FM ones. In the first group, the WMM and the MM described in Section 4.3 were considered. The former is a method frequently used for comparison purposes due to its simplicity and, at the same time, its good performance, therefore allowing the comparative analysis to be extrapolated to other methods from the literature. The latter makes it possible to establish the contribution of commutativity analysis to FM, as it is the method used to draw the certainty map $W$ in all CS-FM methods. The second group corresponds to the CA, FA, and EBA proposed in this paper. We considered $\alpha = 0.5$ to combine $W$ and $C$ in (13) and the similarity measure $S_2$ for FA.

Six training set sizes were considered: 5, 10, 15, 20, 30, and 50 examples, trying to range from poorly to significantly informative training sets. For each training set size, each method was run 200 times with

7 As each function has its own domain of discourse, the appropriate scaling factors were applied to each fuzzy domain.


Fig. 7. System surfaces: (a) plane function, $f_1$; (b) biplane function, $f_2$; (c) generalized Rastrigin's function, $f_3$.

different randomly generated training sets,⁸ and the model error was measured using a test set $T = \{e^1, \ldots, e^t\}$ and the normalized mean square error:

$$MSE_N = \frac{\sum_{j=1}^{t} [y^j_N - y_N(x^j)]^2}{t}, \qquad (17)$$

where $y^j_N$ is the normalized output of the $j$th example in the test set and $y_N(x^j)$ is the normalized output predicted by the model. In each evaluation, the test set consisted of 2500 examples randomly generated from the corresponding function.

Besides, an activity index was worked out for each run of the CS-FM methods as the number of rules it modified with respect to the model identified with MM (i.e., taking into account only $W$).
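The experimental ingredients described in this subsection can be sketched in Python with NumPy. The sketch makes several assumptions that are not stated in the text: outputs are normalized linearly to [0, 1] using a given output range, a rule base is encoded as an array of consequent indexes (one per input region), and the names `mse_n` and `activity_index` are hypothetical. The constant model used at the end is a placeholder, not one of the methods compared.

```python
import numpy as np


def f1(x1, x2):
    """Plane function, Eq. (14)."""
    return (x1 + x2) / 2


def f2(x1, x2):
    """Biplane function, Eq. (15)."""
    return 1 - np.abs(x1 - x2)


def f3(x1, x2):
    """Generalized Rastrigin's function as written in Eq. (16)."""
    return x1 + x2 - np.cos(18 * x1) - np.cos(18 * x2)


def mse_n(y_true, y_pred, y_range):
    """Eq. (17) after linear normalization of both outputs (assumed scheme)."""
    lo, hi = y_range
    yt = (np.asarray(y_true, dtype=float) - lo) / (hi - lo)
    yp = (np.asarray(y_pred, dtype=float) - lo) / (hi - lo)
    return float(np.mean((yt - yp) ** 2))


def activity_index(rb_mm, rb_cs):
    """Number of input regions whose rule differs between the two rule bases."""
    return int(np.sum(np.asarray(rb_mm) != np.asarray(rb_cs)))


# Test set of 2500 random examples drawn from f1 on [-1, 1] x [-1, 1].
rng = np.random.default_rng(0)
x1, x2 = rng.uniform(-1.0, 1.0, size=(2, 2500))
y = f1(x1, x2)

# Error of a placeholder model that always predicts 0.
error_of_constant_model = mse_n(y, np.zeros_like(y), (-1.0, 1.0))
```

All three functions return the same value when their arguments are interchanged, which is the commutativity property exploited as prior knowledge.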

8 The same 200 training sets were used for the 5 methods, so that the $i$th run of every method used the same training set.


Fig. 8. (a) Input fuzzy domains; (b) output fuzzy domain with symmetrical fuzzy sets; (c) output fuzzy domain with asymmetrical fuzzy sets. [Membership plots; axis ticks for $X_1$ and $X_2$: 0, 0.05, 0.2, 0.275, 0.35, 0.425, 0.5, 0.575, 0.65, 0.725, 0.8, 0.95, 1; for $Y$ in (b): 0, 0.167, 0.333, 0.5, 0.667, 0.833, 1; for $Y$ in (c): 0, 0.275, 0.425, 0.5, 0.575, 0.725, 1.]

6.2. Results and analysis

Fig. 9 represents the accuracy results for the functions with both output fuzzy set shapes (symmetrical and asymmetrical), each graph depicting the averaged $MSE_N$ of each method versus the training set sizes. Table 2 shows the numerical results, detailing for each group of experiments the averaged $MSE_N$, with the best one highlighted for each training set size, and, in the case of CS-FM, their averaged activity indexes (in parentheses) and the relation of superiority among them (last three columns).

CS-FM outperforms CNS-FM for all the proposed measures, especially for small training sets, which have a low informative content. Among the CS-FM methods, FA and CA present in general the best accuracy results, the best performance being provided by FA for small training sets and by CA for larger ones. The generally poor results of EBA reveal an overvaluation of the information provided by the examples, which are unable to guide the search for the best commutative rule if they are not numerous enough.

Nevertheless, analyzing each function separately, it can be observed that the superiority of FA and CA over EBA diminishes with the complexity of the system. The reason for this is that FA and CA establish the commutativity based on the information contained in the complementary region of the preliminary rule base $RB_E$. Thus, as the system complexity increases, the correctness of $RB_E$ diminishes and with it also the correctness of the final model, since it tries to reach a maximum commutativity degree with respect to $RB_E$. However, EBA obtains the information in the complementary region directly from the examples and, therefore, it does not depend on the accuracy of any preliminary rule base. Even so, the gain obtained by FA and CA over EBA in $f_1$ and $f_2$ is considerably higher than the one obtained by EBA over FA and CA in $f_3$. In fact, FA reduces to roughly a half the error provided by EBA for small training sets


Fig. 9. Accuracy results. [Six panels, one per function ($f_1$, $f_2$, $f_3$) and output fuzzy set shape (symmetrical, asymmetrical). Each panel plots $MSE_N$ (0 to 0.12) against training set size (5, 10, 15, 20, 30, 50) for WMM, MM, MM+EBA, MM+FA, and MM+CA.]

in $f_1$ (103.2% in the best case), whereas the superiority of EBA in $f_3$ is barely significant (8.1% in the best case).

Analogously, when comparing FA with CA, the former outperforms the latter to a greater extent than the contrary. In the best case, FA outperforms CA by 25.6%, whereas CA only reaches 7.4%.

The shape of the output fuzzy sets (symmetrical or asymmetrical) affects, in general, CS-FM errors in the same way that it affects MM errors (i.e., $RB_E$ errors). That is, if the asymmetrical output fuzzy sets reduce (or increase) the accuracy of the $RB_E$ obtained with symmetrical output fuzzy sets for a function and training set size, then, in general, they also reduce (or increase) the accuracy of the final model obtained with any CS-FM method. For example, with $f_1$, the asymmetrical output fuzzy sets improve the MM model for $m = 5$ and worsen it for the remainder, coinciding exactly with the case for FA and CA, and roughly with EBA (less sensitive to $RB_E$ accuracy, as mentioned before). The same happens with $f_3$. However, with


Table 2. Numerical results: accuracy and activity indexes (averaged $MSE_N$, best value per row marked with *; averaged activity indexes in parentheses)

m    WMM      MM       MM+EBA            MM+FA             MM+CA             FA vs. EBA   FA vs. CA   CA vs. EBA

Plane function, $f_1$, with symmetrical output fuzzy sets
 5   0.0826   0.0598   0.0421  ( 9.32)   0.0227* (27.00)   0.0285  (23.30)     85.5%        25.6%       47.7%
10   0.0565   0.0301   0.0193  (10.40)   0.0095* (22.72)   0.0106  (21.01)    103.2%        11.6%       82.1%
15   0.0391   0.0172   0.0109  ( 9.46)   0.0059* (17.11)   0.0062  (16.34)     84.7%         5.1%       75.8%
20   0.0272   0.0111   0.0065  ( 8.03)   0.0042* (12.77)   0.0043  (12.30)     54.8%         2.4%       51.2%
30   0.0159   0.0059   0.0042  ( 5.22)   0.0031* ( 7.34)   0.0031* ( 7.16)     35.5%         0.0%       35.5%
50   0.0069   0.0027   0.0025  ( 2.23)   0.0023* ( 2.73)   0.0023* ( 2.67)      8.7%         0.0%        8.7%

Plane function, $f_1$, with asymmetrical output fuzzy sets
 5   0.0819   0.0571   0.0395  ( 9.26)   0.0221* (27.25)   0.0277  (23.41)     78.7%        25.3%       42.6%
10   0.0576   0.0319   0.0195  (10.33)   0.0101* (22.67)   0.0118  (20.79)     93.1%        16.8%       65.3%
15   0.0388   0.0179   0.0108  ( 9.29)   0.0069  (16.68)   0.0067* (15.70)     56.5%        −3.0%       61.2%
20   0.0296   0.0132   0.0086  ( 7.99)   0.0060* (13.01)   0.0061  (12.02)     43.3%         1.7%       41.0%
30   0.0165   0.0070   0.0051  ( 5.24)   0.0047  ( 7.64)   0.0046* ( 6.96)      8.5%        −2.2%       10.9%
50   0.0081   0.0042   0.0038  ( 2.38)   0.0038  ( 3.37)   0.0037* ( 2.85)      0.0%        −2.7%        2.7%

Biplane function, $f_2$, with symmetrical output fuzzy sets
 5   0.1116   0.0839   0.0623  ( 9.46)   0.0446* (27.72)   0.0484  (23.72)     39.7%         8.5%       28.7%
10   0.0814   0.0505   0.0340  (10.78)   0.0227* (22.89)   0.0245  (21.30)     49.8%         7.9%       38.8%
15   0.0578   0.0312   0.0206  ( 9.70)   0.0147* (17.11)   0.0147* (16.41)     40.1%         0.0%       40.1%
20   0.0443   0.0221   0.0152  ( 7.93)   0.0122  (13.07)   0.0120* (12.73)     24.6%        −1.7%       26.7%
30   0.0273   0.0138   0.0104  ( 5.55)   0.0090  ( 7.97)   0.0087* ( 7.75)     15.6%        −3.4%       19.5%
50   0.0126   0.0074   0.0063  ( 2.50)   0.0065  ( 3.48)   0.0062* ( 3.45)     −3.2%        −4.8%        1.6%

Biplane function, $f_2$, with asymmetrical output fuzzy sets
 5   0.1136   0.0839   0.0622  ( 8.91)   0.0475* (27.02)   0.0514  (22.92)     30.9%         8.2%       21.0%
10   0.0796   0.0494   0.0305  (10.58)   0.0229* (22.66)   0.0246  (21.08)     33.2%         7.4%       24.0%
15   0.0569   0.0318   0.0188  ( 9.59)   0.0149* (17.14)   0.0151  (16.33)     26.2%         1.3%       24.5%
20   0.0432   0.0220   0.0145  ( 7.80)   0.0131  (12.96)   0.0124* (12.32)     10.7%        −5.6%       16.9%
30   0.0279   0.0139   0.0101  ( 5.69)   0.0100  ( 8.40)   0.0097* ( 7.89)      1.0%        −3.1%        4.1%
50   0.0141   0.0089   0.0078* ( 2.48)   0.0080  ( 3.77)   0.0079  ( 3.29)     −2.6%        −1.3%       −1.3%

Generalized Rastrigin function, $f_3$, with symmetrical output fuzzy sets
 5   0.1031   0.0947   0.0810  ( 9.27)   0.0761* (27.01)   0.0794  (23.40)      6.4%         4.3%        2.0%
10   0.0819   0.0704   0.0577  (10.63)   0.0542  (22.50)   0.0534* (21.21)      6.5%        −1.5%        8.1%
15   0.0651   0.0524   0.0418  ( 9.85)   0.0422  (16.72)   0.0405* (16.74)     −1.0%        −4.2%        3.2%
20   0.0543   0.0428   0.0337  ( 8.81)   0.0342  (13.14)   0.0327* (13.22)     −1.5%        −4.6%        3.1%
30   0.0391   0.0297   0.0238* ( 5.77)   0.0257  ( 7.58)   0.0244  ( 8.03)     −8.0%        −5.3%       −2.5%
50   0.0244   0.0183   0.0161* ( 3.16)   0.0174  ( 3.86)   0.0162  ( 4.33)     −8.1%        −7.4%       −0.6%

Generalized Rastrigin function, $f_3$, with asymmetrical output fuzzy sets
 5   0.1022   0.0929   0.0769  ( 9.31)   0.0739* (27.35)   0.0772  (23.67)      4.1%         4.5%       −0.4%
10   0.0825   0.0725   0.0551* (10.65)   0.0572  (22.68)   0.0580  (21.45)     −3.8%         1.4%       −5.3%
15   0.0682   0.0563   0.0417* ( 9.57)   0.0446  (17.02)   0.0439  (16.93)     −7.0%        −1.6%       −5.3%
20   0.0566   0.0444   0.0335* ( 8.30)   0.0356  (13.15)   0.0346  (13.04)     −6.3%        −2.9%       −3.3%
30   0.0412   0.0315   0.0248* ( 5.76)   0.0265  ( 7.71)   0.0262  ( 8.05)     −6.9%        −1.1%       −5.6%
50   0.0270   0.0210   0.0177* ( 3.13)   0.0191  ( 4.50)   0.0179  ( 4.23)     −7.9%        −6.7%       −1.1%

$f_2$ some exceptions can be appreciated also for FA and CA, which seem to indicate that asymmetrical output fuzzy sets slightly degrade the performance of these two approaches. In fact, this effect is reflected in the significance of the differences among the CS-FM methods, since FA and CA diminish their superiority over EBA when using asymmetrical output fuzzy sets. Nevertheless, such an effect is not relevant enough to significantly alter the ranking of the methods.

Concerning the activity indexes, they seem to be independent of the system identified and of the output fuzzy set shapes, and do not clearly influence the model accuracy. The comparison among the


Fig. 10. Computational cost. [Time (in seconds, 0 to 3) against training set size (5, 10, 15, 20, 30, 50, 100) for WMM, MM, MM+EBA, MM+FA, and MM+CA.]

CS-FM methods presents similar results in all the functions, the activity being higher for FA and lower for EBA.

Finally, Fig. 10 presents the results concerning computational cost.⁹ An additional training set size has been included to better appreciate the tendency of each method. The results are shown only for one function and one output fuzzy set shape, as all the cases provide similar results. Obviously, CS-FM increases the computational cost of CNS-FM, but different behaviors are observed among the approaches:

• The CA is an expensive method, due to the high number of inferences required to obtain the approximation to the output surfaces to be compared. As mentioned before, it suffers highly from the curse of dimensionality and, therefore, in spite of its good performance, it could be unfeasible in the identification of high-dimensional systems. However, its computational cost does not depend on the training set size except to obtain the certainty map $W$ (and $RB_E$ from it). Moreover, it presents the same trend as MM, whose computational cost decreases when the training set size increases due to its approximation to WMM.

• The cost of EBA depends on the training set size, due to the intensive use of the examples during the evaluation of the commutativity. Therefore, it could be suitable for small training sets, although its low accuracy in such cases limits in general its application.

• The FA presents the best computational cost in general. It depends on the training set size only to obtain $W$, and the dimensionality of the system affects the cost of obtaining $C$ (as more measurements are needed) but does not affect (or only slightly affects) each single measurement.
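The text does not state how the relation-of-superiority columns of Table 2 are computed. The percentages checked below are consistent with dividing the error difference by the smaller of the two errors and giving the result a negative sign when the second method is the better one. The following Python sketch encodes that reading as an assumption; the function name `superiority` and the selection of rows are illustrative only.

```python
def superiority(err_a: float, err_b: float) -> float:
    """Signed percentage by which method A beats method B (assumed formula).

    The error gap is taken relative to the smaller error; the sign is
    positive when A has the lower error and negative otherwise.
    """
    best, worst = min(err_a, err_b), max(err_a, err_b)
    gain = 100.0 * (worst - best) / best
    return gain if err_a <= err_b else -gain


# (EBA, FA, CA) averaged MSE_N values copied from three rows of Table 2.
ROWS = {
    "f1 symmetrical, m=5": (0.0421, 0.0227, 0.0285),
    "f1 symmetrical, m=10": (0.0193, 0.0095, 0.0106),
    "f3 symmetrical, m=50": (0.0161, 0.0174, 0.0162),
}

# Columns: FA vs. EBA, FA vs. CA, CA vs. EBA, rounded to one decimal.
reproduced = {
    name: (
        round(superiority(fa, eba), 1),
        round(superiority(fa, ca), 1),
        round(superiority(ca, eba), 1),
    )
    for name, (eba, fa, ca) in ROWS.items()
}
```

With these inputs the sketch returns the values printed in the corresponding rows of Table 2, including the 103.2%, 25.6%, 8.1%, and 7.4% figures quoted in the analysis. Because the table reports errors rounded to four decimals, other rows could differ in the last digit.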

7. Conclusions

In this paper, the integration of prior knowledge about properties satisfied by the system has been considered, focusing on the commutativity as a first approach.

9 The programs were coded using Matlab and were run on a PC with Pentium IV-1,7 CPU and 512 MB RAM.


The measurement of commutativity degrees between isolated fuzzy rules, while being formally feasible, does not capture the existing relation among the rules in the fuzzy model and, therefore, is not useful for our goal. In short, we need to measure the commutativity between input fuzzy regions instead of between isolated fuzzy rules.

The three proposed approaches outperform the results obtained from CNS-FM methods. In general, FA presents the best results, improving to a great extent the model accuracy with an acceptable computational cost. The CA will be suitable for large training set sizes, undemanding computational requirements, and systems with low/medium complexity. The EBA must be relegated to highly complex systems with large training sets when no computational restrictions exist.

Regardless of the approach being used, the benefits rise with small or low informative training sets, since it is in such cases that the prior knowledge contributes in a decisive way to the FM results. Thus, these are the situations where the ideas proposed in this paper will play a more significant role.

References

[1] Y.S. Abu-Mostafa, Learning from hints in neural networks, J. Complex. 6 (1990) 192–198.

[2] Y.S. Abu-Mostafa, Financial market applications of learning from hints, in: A. Refenes (Ed.), Neural Networks in the Capital Markets, Wiley, New York, NY, USA, 1995, pp. 221–232 (Chapter 15).

[3] Y.S. Abu-Mostafa, Hints, Neural Comput. 7 (1995) 639–671.

[4] R. Babuška, Fuzzy Modeling for Control, Kluwer Academic Publishers, Boston, MA, USA, 1998.

[5] P. Carmona, J.L. Castro, J.M. Zurita, Contradiction sensitive fuzzy model-based adaptive control, Internat. J. Approx. Reasoning 30 (2) (2002) 107–129.

[6] P. Carmona, J.L. Castro, J.M. Zurita, Conmutativity as prior knowledge in fuzzy modeling, in: T. Bilgiç, B.D. Baets, O. Kaynak (Eds.), Lecture Notes in Computer Science, vol. 2715, Springer, Berlin, Germany, 2003, pp. 620–627; Fuzzy Sets and Systems – IFSA 2003: Proc. of the 10th Internat. Fuzzy Systems Association World Congr. Istanbul, Turkey, June 30–July 2, 2003.

[7] P. Carmona, J.L. Castro, J.M. Zurita, Strategies to identify fuzzy rules directly from certainty degrees: A comparison and a proposal, IEEE Trans. Fuzzy Systems 12 (5) (2004) 631–640.

[8] P. Carmona, J.M. Zurita, Sobre la consistencia en la identificación difusa de sistemas (ideas preliminares), in: Actas Del X Congreso ESTYLF’00, Sevilla, Spain, 2000, pp. 515–519 (in Spanish).

[9] E. Castillo, A. Cobo, J.M. Gutiérrez, R.E. Pruneda, Functional Networks with Applications. A Neural-Based Paradigm, The Kluwer International Series in Engineering and Computer Science, vol. 473, Kluwer Academic Publishers, Boston, MA, USA, 1998.

[10] J.L. Castro, Fuzzy logic controllers are universal approximators, IEEE Trans. System Man Cybernet. 25 (4) (1995) 629–635.

[11] G. Chen, D. Zhang, Back-driving a truck with suboptimal distance trajectories: A fuzzy logic control approach, IEEE Trans. Fuzzy Systems 5 (3) (1997) 369–380.

[12] P.J. Costa Branco, J.A. Dente, Fuzzy systems modeling in practice, Fuzzy Sets and Systems 121 (1) (2001) 73–93.

[13] H. Hellendorn, D. Driankov (Eds.), Fuzzy Model Identification, Springer, Berlin, Germany, 1997.

[14] H. Ishibuchi, K. Nozaki, N. Yamamoto, H. Tanaka, Construction of fuzzy classification systems with rectangular fuzzy rules using genetic algorithms, Fuzzy Sets and Systems 65 (1994) 237–253.

[15] K. Kohara, T. Ishikawa, Multivariate prediction using prior knowledge and neural heuristics, in: Proc. ISIKNH’94, Pensacola Beach, FL, USA, 1994, pp. 179–188.

[16] C.C. Lee, Fuzzy logic in control systems: Fuzzy logic controller, part II, IEEE Trans. System Man Cybernet. 20 (2) (1990) 419–435.

[17] P. Niyogi, F. Girosi, T. Poggio, Incorporating prior information in machine learning by creating virtual examples, Proc. IEEE 86 (11) (1998) 2196–2209.

[18] C.W. Omlin, C.L. Giles, Training second-order recurrent neural networks using hints, in: D. Sleeman, P. Edwards (Eds.), Proc. 9th Internat. Workshop Machine Learn, Morgan Kaufmann, San Mateo, CA, USA, 1992, pp. 363–368.


[19] S.J. Perantones, P.J.G. Lisboa, Translation, rotation and scale-invariant pattern recognition by high-order neural networks and moment classifiers, IEEE Trans. Neural Networks 3 (2) (1992) 241–251.

[20] M. Setnes, Fuzzy rule-base simplification using similarity measures, Master’s Thesis, Delft University of Technology, Dept. Electron. Eng., Control Laboratory, Delft, The Netherlands, July 1995.

[21] J. Sill, Y.S. Abu-Mostafa, Monotonicity hints, in: Proc. NIPS’96, Denver, CO, USA, 1996, pp. 634–640.

[22] A. Suddarth, A.D.C. Holden, Symbolic-neural systems and the use of hints for developing complex systems, Internat. J. Man-Mach (1991) 291–311.

[23] A. Törn, A. Žilinskas, Global Optimization, Lecture Notes in Computer Science, vol. 350, Springer, Berlin, Germany, 1989.

[24] L.X. Wang, J.M. Mendel, Generating fuzzy rules by learning from examples, IEEE Trans. Systems Man Cybernet. 22 (6) (1992) 1415–1427.

[25] R. Zwick, E. Carlstein, D.V. Budescu, Measures of similarity among fuzzy concepts: A comparative analysis, Internat. J. Approx. Reasoning 1 (1987) 221–242.