

Intelligent Data Analysis 7 (2003) 59–73. IOS Press.

Integrating rough set theory and fuzzy neural network to discover fuzzy rules

Shi-tong Wang (a), Dong-jun Yu (b) and Jing-yu Yang (b)

(a) Department of Computer Science, School of Information, Southern Yangtse University, Jiangsu 214036, P.R. China

(b) Department of Computer Science, Nanjing University of Science & Technology, Nanjing, Jiangsu 210094, P.R. China

    Received 15 April 2002

    Revised 15 June 2002

    Accepted 25 June 2002

Abstract. Most fuzzy systems use the complete combination rule set based on partitions to discover fuzzy rules, which often results in low generalization capability and high computational complexity. To a large extent, this is because such fuzzy systems do not utilize the field knowledge contained in the data. In this paper, based on rough set theory, a new generalized incremental rule extraction algorithm (GIREA) is presented to extract rough domain knowledge, namely certain and possible rules. A fuzzy neural network (FNN) is then used to refine the obtained rules and produce the final fuzzy rule set. Our approach and the experimental results demonstrate its superiority in both rule length and the number of fuzzy rules.

    Keywords: Rough set, fuzzy set, neural networks, incremental rule extraction

    1. Introduction

In the real world, almost every problem eventually requires processing data characterized by uncertainty and imprecision. To date, many scholars have developed a variety of approaches, such as neural networks [1], fuzzy systems [2], rough set theory [3] and genetic algorithms. Each approach has its own advantages and disadvantages, and using only one of them is not enough to provide a flexible and robust information processing system. There is already a trend to integrate different computing paradigms, such as neural networks, fuzzy systems, rough set theory and genetic algorithms, to generate more efficient hybrid systems such as neural-fuzzy systems [4].

Typically, a fuzzy neural network (FNN) embodies the advantages of both neural networks (NN) and fuzzy systems. In other words, an FNN can be used to construct a knowledge-based NN: human field knowledge can be incorporated into the NN, making the FNN more suitable for the problem to be solved. But problems remain; in some circumstances people cannot even derive appropriate rules for a given system. Of course, we can divide every input dimension into several fuzzy subsets and then combine all the fuzzy subsets of every input dimension to construct the complete rule set. However, such an FNN contains no field knowledge, i.e., it may not fit the given system at the very beginning. In recent years, rough set theory has been attracting more and more

1088-467X/03/$8.00 © 2003 IOS Press. All rights reserved.


attention and has been used in various applications, due to its excellent capability of extracting knowledge from data. In this paper, we first apply rough set theory to extract certain and possible rules, which are then used to determine the initial structure of the FNN, so that the FNN works from the beginning with

this type of useful knowledge.

As to fuzzy rule extraction, two important problems are worth studying. One is how to extract a rule set from data; the other is how to refine/simplify the obtained rule set. Several approaches [1] can be applied to extract rules from data, such as fuzzy rule extraction based on product space clustering, on ellipsoidal covariance learning, or on direct matching. The fuzzy rule simplification approach [12] based on similarity measures can effectively reduce the number of fuzzy rules by merging similar fuzzy sets in fuzzy rules. This paper addresses the above two problems from a different angle. The main contribution of our approach is the effective integration of rough set theory and the FNN to discover fuzzy rules from data. Concisely, the approach first extracts certain and possible rules from data in an incremental mode using the new generalized incremental rule extraction algorithm GIREA, and then applies the FNN to refine/simplify the extracted fuzzy rules.

This paper is organized as follows. Section 2 gives a brief description of fuzzy systems and the FNN. Section 3 introduces basic concepts of rough set theory. In Section 4, the new generalized incremental fuzzy rule extraction algorithm GIREA is presented. Section 5 deals with the method of mapping the fuzzy rule set to the corresponding FNN. Simulation results are demonstrated in Section 6. Section 7 concludes the paper.

    2. Fuzzy system and its fuzzy neural network

Generally speaking, a fuzzy system consists of a set of fuzzy rules as follows [5]:

Rule 1: if $x_1$ is $A_1^1$ and $x_2$ is $A_2^1$ and ... $x_n$ is $A_n^1$, then $y$ is $B^1$
Rule 2: if $x_1$ is $A_1^2$ and $x_2$ is $A_2^2$ and ... $x_n$ is $A_n^2$, then $y$ is $B^2$
...
Rule N: if $x_1$ is $A_1^N$ and $x_2$ is $A_2^N$ and ... $x_n$ is $A_n^N$, then $y$ is $B^N$

Fact: $x_1$ is $A'_1$ and $x_2$ is $A'_2$ and ... $x_n$ is $A'_n$
Conclusion: $y$ is $B'$.

With max-product inference and centroid defuzzification, the final output of this fuzzy system can be written as:

$$y = \frac{\int B'(y)\, y\, dy}{\int B'(y)\, dy} \qquad (1)$$

where $B'(y) = \max_{x_1, x_2, \dots, x_n} \left[ \prod_{i=1}^{n} A'_i(x_i) \cdot \max_{j=1}^{N} \left( \prod_{i=1}^{n} A_i^j(x_i) \cdot B^j(y) \right) \right]$.

L.X. Wang [6] has proved that the fuzzy system of Eq. (1) is a universal approximator.


    Fig. 1. The FNN implementation of the fuzzy system.

In practice, one can often take the output fuzzy sets $B^j$ to be singletons $\beta^j$, i.e.,

$$B^j(y) = \begin{cases} 1, & \text{if } y = \beta^j \\ 0, & \text{otherwise} \end{cases} \qquad j = 1, 2, \dots, N \qquad (2)$$

thus we have

$$B'^j(y) = \begin{cases} \prod_{i=1}^{n} A_i^j(x_i), & \text{if } y = \beta^j \\ 0, & \text{otherwise} \end{cases} \qquad j = 1, 2, \dots, N \qquad (3)$$

and the final output can be rewritten as:

$$y = \frac{\sum_{j=1}^{N} \beta^j \prod_{i=1}^{n} A_i^j(x_i)}{\sum_{j=1}^{N} \prod_{i=1}^{n} A_i^j(x_i)} \qquad (4)$$

The I/O relationship of the fuzzy system defined in Eq. (4) can be implemented by a corresponding FNN. The FNN consists of four layers: the input layer, the fuzzification layer, the inference layer and the defuzzification layer, as shown in Fig. 1.
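As an illustration, the singleton fuzzy system of Eq. (4) can be sketched in a few lines of Python. The Gaussian membership functions and the rule centers, widths and consequents below are illustrative assumptions, not values from the paper:

```python
import math

def gaussian_mf(x, center, sigma):
    """Gaussian membership function A(x)."""
    return math.exp(-((x - center) ** 2) / (2 * sigma ** 2))

def singleton_fuzzy_output(x, rules):
    """Eq. (4): weighted average of the singleton consequents beta_j,
    weighted by the product firing strengths prod_i A_i^j(x_i)."""
    num = den = 0.0
    for centers, sigmas, beta in rules:
        w = 1.0
        for xi, c, s in zip(x, centers, sigmas):
            w *= gaussian_mf(xi, c, s)   # product inference over the premises
        num += beta * w
        den += w
    return num / den

# two illustrative rules on a 2-dimensional input (all numbers are made up)
rules = [
    ((-1.0, -1.0), (1.0, 1.0), 0.0),  # if x1 is "low"  and x2 is "low",  then y = 0
    (( 1.0,  1.0), (1.0, 1.0), 1.0),  # if x1 is "high" and x2 is "high", then y = 1
]
print(round(singleton_fuzzy_output((1.0, 1.0), rules), 3))  # 0.982
```

The four FNN layers correspond directly to the steps in `singleton_fuzzy_output`: reading `x`, evaluating the membership functions, forming the product firing strengths, and taking the weighted average.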

Generally speaking, an FNN can be utilized in two modes, the series-parallel mode and the parallel mode [13,14]; see Figs 2(a) and (b), where TDL represents time-delayed logic, RS represents the real system, FNN represents the fuzzy neural network, $u_k$ is the input (excitation) signal, $y_k$ and $\hat{y}_k$ are the outputs of the RS and the FNN, respectively, and $e_k$ is the difference between $y_k$ and $\hat{y}_k$. Figure 2(a) shows the series-parallel mode and Fig. 2(b) the parallel mode. When the FNN works in series-parallel mode, all the delayed output data (used as the input data of the FNN) are observations of the real system. In this circumstance high observation precision is needed; too much observation noise will greatly degrade the performance of the FNN. In parallel mode, by contrast, the delayed output data (used as the input data of the FNN) are independent of the observations of the real system and relate only to the FNN itself. No



Fig. 2. Two modes in which the FNN can be applied: (a) series-parallel mode; (b) parallel mode.

matter which mode is used, once the FNN approximates the real system well enough, it can be applied independently. The FNN has been widely used, but the question described in Section 1 remains: when there is no prior field knowledge, how can appropriate rules be obtained to construct the FNN so as to reduce its search space and training time? The rest of this paper tries to solve this problem.

    3. Rough set, decision matrix and rule extraction

    3.1. Basic concepts of rough sets

Here we introduce only the concepts needed in this paper; for details, please refer to [3].

An information system is $K = (U, C \cup D)$, where $U$ denotes the domain of discourse, $C$ a non-empty condition attribute set, and $D$ a non-empty decision attribute set. Let $A = C \cup D$; an attribute $a$ ($a \in A$) can be regarded as a function from the domain of discourse $U$ to the value set $Val_a$.

An information system may be represented in the form of an attribute-value table, in which rows are labeled by the objects of the domain of discourse and columns by the attributes.

For every subset of attributes $B \subseteq C$, an equivalence relation $I_B$ on $U$ can be defined as:

$$I_B = \{(x, y) \in U \times U : \text{for every } a \in B,\ a(x) = a(y)\} \qquad (5)$$

thus the equivalence class of an object $x \in U$ relative to $I_B$ can be defined as:

$$[x]_B = \{y \in U : y\, I_B\, x\} \qquad (6)$$

An equivalence class is also called an indiscernible class, because any two objects in it are indiscernible.

Lower and upper approximations are two further important concepts in rough set theory. Given $X \subseteq U$ and $B \subseteq C$, $X$'s B-lower and B-upper approximations are defined as $\underline{B}X = \{x \in U : [x]_B \subseteq X\}$ and $\overline{B}X = \{x \in U : [x]_B \cap X \neq \emptyset\}$, respectively. The boundary set is defined as $BN_B(X) = \overline{B}X - \underline{B}X$. If $BN_B(X) \neq \emptyset$, i.e. $\underline{B}X \neq \overline{B}X$, then $X$ is B-rough; otherwise $X$ is B-exact.
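The lower and upper approximations follow directly from the equivalence classes. The Python sketch below illustrates this; the object names and the attribute coding are our own, chosen to match the flu example used later in the paper:

```python
from collections import defaultdict

def approximations(objects, concept):
    """objects: dict name -> tuple of condition attribute values;
    concept: set of object names.  Returns (lower, upper) approximations
    under the indiscernibility relation of Eqs (5)-(6)."""
    classes = defaultdict(set)          # equivalence classes [x]_B
    for name, attrs in objects.items():
        classes[attrs].add(name)
    lower, upper = set(), set()
    for eq in classes.values():
        if eq <= concept:               # [x]_B entirely inside X -> certainly in X
            lower |= eq
        if eq & concept:                # [x]_B intersects X -> possibly in X
            upper |= eq
    return lower, upper

# the inconsistent flu table, coded as (Headache, Temperature)
# with Yes=0 / No=1 and Normal=0 / High=1 / Very High=2
objs = {'o1': (0, 0), 'o2': (0, 1), 'o3': (0, 2), 'o4': (1, 0),
        'o5': (1, 1), 'o6': (1, 2), 'o7': (1, 1), 'o8': (1, 2)}
X1 = {'o2', 'o3', 'o6', 'o7'}           # the "flu infected" concept
low, up = approximations(objs, X1)
print(sorted(low))  # ['o2', 'o3']
print(sorted(up))   # ['o2', 'o3', 'o5', 'o6', 'o7', 'o8']
```

Note that `o5`/`o7` and `o6`/`o8` fall into the same equivalence classes, which is exactly why the concept is rough.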

    3.2. Rule extraction using decision matrix

The decision matrix is a generalized form of rough set theory. The concept of the decision matrix derives from discernibility matrices [8]; it can be used to compute decision rules and reducts of an information system. It provides a way to generate the simplest set of rules while preserving all classification information [9].


Table 1. Consistent information table

            Headache   Temperature   Flu
Object1     Yes        Normal        No
Object2     Yes        High          Yes
Object3     Yes        Very High     Yes
Object4     No         Normal        No
Object5     No         High          No
Object6     No         Very High     Yes

Table 2. Decision matrix for class 0 (flu infected)

            Obj1          Obj4          Obj5
Obj2        (T,1)         (T,1)(H,0)    (H,0)
Obj3        (T,2)         (T,2)(H,0)    (T,2)(H,0)
Obj6        (H,1)(T,2)    (T,2)         (T,2)

    3.2.1. Rule extraction from consistent information table

Let us introduce the decision matrix first. For an information system $K = (U, C \cup D)$, suppose $U$ is divided into $m$ classes $(c_1, c_2, \dots, c_m)$ by the equivalence relation defined on $D$. Given any class $c \in (c_1, c_2, \dots, c_m)$, the objects that belong to this class are numbered with subscripts $i$ ($i = 1, 2, \dots$) and the objects that do not belong to it with subscripts $j$ ($j = 1, 2, \dots$). The decision matrix $M(K) = (M_{ij})$ of information system $K$ is defined as a matrix whose entry at position $(i, j)$ is a set of attribute-value pairs:

$$M_{ij} = \{(a, a(i)) : a(i) \neq a(j)\}, \quad (i = 1, 2, \dots;\ j = 1, 2, \dots) \qquad (7)$$

where $a(i)$ is the value of attribute $a$ for object $i$. For a given object $i$ belonging to class $c$, we can compute its minimal-length decision rule

$$|B_i| = \bigwedge_j \bigvee M_{ij} \qquad (8)$$

where $\wedge$ and $\vee$ are the generalized conjunction and disjunction operators, respectively. So for the given class $c$, its decision rule set can be represented as

$$RUL = \bigvee_i |B_i|, \quad (i = 1, 2, \dots) \qquad (9)$$

Let H represent Headache, and T and F represent Temperature and Flu, respectively:

$Val_H = \{0, 1\}$ represents $Val_{Headache} = \{\text{Yes}, \text{No}\}$.
$Val_T = \{0, 1, 2\}$ represents $Val_{Temperature} = \{\text{Normal}, \text{High}, \text{Very High}\}$.
$Val_F = \{0, 1\}$ represents $Val_{Flu} = \{\text{Yes}, \text{No}\}$.

Tables 2 and 3 show the decision matrices for class 0 (flu infected) and class 1 (flu not infected), respectively.

Let $|B_i^0|$ ($i = 1, 2, 3$) denote the $i$-th minimal-length rule in the decision matrix of class 0. So:

$|B_1^0| = (T,1) \wedge ((T,1) \vee (H,0)) \wedge (H,0) = (T,1) \wedge (H,0)$


Table 3. Decision matrix for class 1 (flu not infected)

            Obj2          Obj3          Obj6
Obj1        (T,0)         (T,0)         (H,0)(T,0)
Obj4        (T,0)(H,1)    (T,0)(H,1)    (T,0)
Obj5        (H,1)         (H,1)(T,1)    (T,1)

Table 4. Inconsistent information table

            Headache   Temperature   Flu
Object1     Yes        Normal        No
Object2     Yes        High          Yes
Object3     Yes        Very High     Yes
Object4     No         Normal        No
Object5     No         High          No
Object6     No         Very High     Yes
Object7     No         High          Yes
Object8     No         Very High     No

$|B_2^0| = (T,2) \wedge ((T,2) \vee (H,0)) \wedge ((T,2) \vee (H,0)) = (T,2)$

$|B_3^0| = ((T,2) \vee (H,1)) \wedge (T,2) \wedge (T,2) = (T,2)$

Similarly, the $i$-th minimal-length rule in the decision matrix of class 1 can be computed as follows:

$|B_1^1| = (T,0) \wedge (T,0) \wedge ((H,0) \vee (T,0)) = (T,0)$

$|B_2^1| = ((T,0) \vee (H,1)) \wedge ((T,0) \vee (H,1)) \wedge (T,0) = (T,0)$

$|B_3^1| = (H,1) \wedge ((H,1) \vee (T,1)) \wedge (T,1) = (T,1) \wedge (H,1)$

The final minimal-length decision rule sets for class 0 and class 1 can be represented as:

$RUL^0 = (T,2) \vee ((T,1) \wedge (H,0))$

$RUL^1 = (T,0) \vee ((T,1) \wedge (H,1))$
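The computation above can be sketched in code. Finding the minimal-length rule $|B_i| = \bigwedge_j \bigvee M_{ij}$ amounts to finding a smallest set of the object's attribute-value pairs that intersects every matrix entry (a minimal hitting set). The brute-force search below is an illustrative sketch of the idea, not the paper's algorithm:

```python
from itertools import combinations

def decision_matrix_row(obj, others, attrs=('H', 'T')):
    """M_ij entries for one in-class object against every out-of-class object:
    the attribute-value pairs of obj on which the two objects differ (Eq. 7)."""
    return [{(a, obj[k]) for k, a in enumerate(attrs) if obj[k] != other[k]}
            for other in others]

def minimal_rule(obj, others, attrs=('H', 'T')):
    """Smallest set of obj's attribute-value pairs hitting every row entry,
    i.e. the shortest conjunction |B_i| = AND_j OR M_ij (Eq. 8), by brute force."""
    row = decision_matrix_row(obj, others, attrs)
    pairs = sorted(set().union(*row))
    for r in range(1, len(pairs) + 1):
        for cand in combinations(pairs, r):
            if all(set(cand) & entry for entry in row):
                return set(cand)
    return set(pairs)

# Table 1 coding: H: Yes=0 / No=1, T: Normal=0 / High=1 / Very High=2
not_infected = [(0, 0), (1, 0), (1, 1)]        # Object1, Object4, Object5

print(minimal_rule((0, 1), not_infected))      # Obj2 -> {('H', 0), ('T', 1)}
print(minimal_rule((0, 2), not_infected))      # Obj3 -> {('T', 2)}
```

The outputs reproduce $|B_1^0| = (T,1) \wedge (H,0)$ and $|B_2^0| = (T,2)$ above; brute force is exponential in the number of attribute-value pairs, which is fine for toy tables but not for large ones.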

    3.3. Rule extraction from inconsistent information table using decision matrix

In real-life applications a consistent information table often does not exist, so inconsistent information has to be coped with.

Suppose we add Object7 and Object8 to Table 1, obtaining Table 4. Table 4 is an inconsistent information table, since some objects have the same condition attribute values but different decision attribute values. For example, Object5 and Object7 have the same condition attribute values but different decision attribute values.

From Table 4 we can form two concepts, $X_1 = \{Object2, Object3, Object6, Object7\}$ and $X_2 = \{Object1, Object4, Object5, Object8\}$, representing flu infected and flu not infected, respectively. These two concepts are rough because neither of them is definable. In order to extract rules from an inconsistent information table, lower and upper approximations are needed. Rules extracted from the lower approximation are certain rules; rules extracted from the upper approximation are possible rules.


Table 5. Decision matrix for computing concept X1's certain rules

            Object1    Object4       Object5       Object6       Object7       Object8
Object2     (T,1)      (H,0)(T,1)    (H,0)         (H,0)(T,1)    (H,0)         (H,0)(T,1)
Object3     (T,2)      (H,0)(T,2)    (H,0)(T,2)    (H,0)         (H,0)(T,2)    (H,0)

Table 6. Decision matrix for computing concept X1's possible rules

            Object1       Object4
Object2     (T,1)         (T,1)(H,1)
Object3     (T,2)         (T,2)(H,1)
Object5     (H,1)(T,1)    (T,1)
Object6     (H,1)(T,2)    (T,2)
Object7     (H,1)(T,1)    (T,1)
Object8     (H,1)(T,2)    (T,2)

First, we compute the lower and upper approximations of concepts $X_1$ and $X_2$:

$\underline{B}X_1 = \{Object2, Object3\}$

$\underline{B}X_2 = \{Object1, Object4\}$

$\overline{B}X_1 = \{Object2, Object3, Object5, Object6, Object7, Object8\}$

$\overline{B}X_2 = \{Object1, Object4, Object5, Object6, Object7, Object8\}$

Let $|B_i^0|_{certain}$ ($i = 1, 2$) denote the $i$-th minimal-length certain rule in the decision matrix of class 0. Using the method of Section 3.2.1, we can compute the certain rules for concept $X_1$ (class 0) from Table 5 as follows:

$|B_1^0|_{certain} = (T,1) \wedge ((T,1) \vee (H,0)) \wedge (H,0) \wedge ((T,1) \vee (H,0)) \wedge (H,0) \wedge ((T,1) \vee (H,0)) = (T,1) \wedge (H,0)$

$|B_2^0|_{certain} = (T,2) \wedge ((T,2) \vee (H,0)) \wedge ((T,2) \vee (H,0)) \wedge (H,0) \wedge ((T,2) \vee (H,0)) \wedge (H,0) = (T,2) \wedge (H,0)$

thus we obtain the certain rule set for class 0:

$RUL^0_{certain} = ((T,1) \wedge (H,0)) \vee ((T,2) \wedge (H,0))$

For certain rules, we define the belief function $df = 1$; in other words, rules with $df = 1$ are positively believable.

Let $|B_i^0|_{possible}$ denote the $i$-th minimal-length possible rule in the decision matrix of class 0. Similarly, we can compute the possible rules for concept $X_1$ from Table 6 as follows:

$|B_1^0|_{possible} = (T,1) \wedge ((T,1) \vee (H,1)) = (T,1)$

$|B_2^0|_{possible} = (T,2) \wedge ((T,2) \vee (H,1)) = (T,2)$

$|B_3^0|_{possible} = ((T,1) \vee (H,1)) \wedge (T,1) = (T,1)$


$|B_4^0|_{possible} = ((T,2) \vee (H,1)) \wedge (T,2) = (T,2)$

$|B_5^0|_{possible} = ((T,1) \vee (H,1)) \wedge (T,1) = (T,1)$

$|B_6^0|_{possible} = ((T,2) \vee (H,1)) \wedge (T,2) = (T,2)$

thus we obtain the possible rule set for class 0:

$RUL^0_{possible} = (T,1) \vee (T,2) \vee (T,1) \vee (T,2) \vee (T,1) \vee (T,2) = (T,1) \vee (T,2)$

For possible rules, we define the belief function

$$df = 1 - \frac{card(\overline{B}X - \underline{B}X)}{card(U)}$$

where $card(\cdot)$ denotes the cardinality of a set. In other words, possible rules are believable with degree $df$, $0 < df < 1$. The rationale of this definition is intuitive: the greater the difference between $\underline{B}X$ and $\overline{B}X$, the more inexact the concept $X$, so the belief degree of the possible rules extracted from $X$ should decrease accordingly. As $\underline{B}X$ approaches $\overline{B}X$, $df$ approaches 1.

Similarly, we can compute concept $X_2$'s certain and possible rules.
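The belief degree is a one-line computation; a minimal sketch, applied to the approximations of $X_1$ computed above (object names are our own shorthand for Object1-Object8):

```python
def belief_degree(lower, upper, universe_size):
    """df = 1 - card(upper - lower) / card(U); df = 1 for an exact concept."""
    return 1.0 - len(upper - lower) / universe_size

# X1 from the inconsistent table: lower {o2, o3}, upper adds {o5, o6, o7, o8}
df = belief_degree({'o2', 'o3'},
                   {'o2', 'o3', 'o5', 'o6', 'o7', 'o8'}, 8)
print(df)  # 0.5
```

This reproduces the value $df = 0.5$ attached to the possible rules in Section 5.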

    4. New Generalized Incremental Rule Extracting Algorithm (GIREA)

Suppose we have extracted certain and possible rules from an information table. When new objects are added to it, the rule set may change. In this circumstance an incremental rule extraction algorithm is required; otherwise it takes much longer to re-compute the rule set from the very beginning. It should be pointed out that the incremental rule extraction algorithm in [9] does not simultaneously compute certain and possible rules and cope with inconsistent information tables. The new generalized incremental rule extraction algorithm (GIREA) presented here, a generalization of the algorithm in [9], can deal with both consistent and inconsistent information tables and extract certain and possible rule sets at the same time. The main idea of the new algorithm can be summarized as follows:

Given a newly added object:

- Does the new object give rise to a new concept? If it does, update the concept set.
- Collision detection: Object_a collides with Object_b if and only if Object_a and Object_b have the same condition attribute values and their decision attribute values differ. For example, Object6 and Object8 collide with each other (see Section 3.3).
- Update the certain and possible rule sets according to the collision detection.

With this algorithm, when a new object is added to the information system it is unnecessary to re-compute the rule sets from the very beginning; we can update the rule sets by partly modifying the original ones, saving considerable time. This is especially useful when extracting rules from large databases.
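Collision detection as defined above is a simple scan over the table; a minimal sketch, where the row encoding (condition tuple plus decision value) is our own assumption:

```python
def collides(new_obj, table):
    """new_obj and each table row are (condition_tuple, decision).
    A collision is identical condition attribute values
    with a different decision attribute value."""
    cond, dec = new_obj
    return any(c == cond and d != dec for c, d in table)

# Object6-style and Object5-style rows: (Headache, Temperature), Flu
table = [((1, 2), 'yes'), ((1, 1), 'no')]
print(collides(((1, 2), 'no'), table))   # True  (like adding Object8)
print(collides(((0, 0), 'no'), table))   # False (new condition values)
```

In GIREA this check decides which branch of STEP 3 is taken (FLAG = 1 or 0).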

GIREA Algorithm:

Condition: the rule sets and the concept set $X = \{X_1, X_2, \dots, X_m\}$ computed from the given information system; a new object Object_new is added to the information system.


BEGIN

STEP 1.
Determine which concept the newly added object belongs to. If it belongs to no concept in the concept set X = {X1, X2, ..., Xm}, create a new concept X(m+1) and add it to X, i.e. X = X ∪ {X(m+1)}.

STEP 2.
// Collision detection
IF (the new object Object_new collides with an original object in the information table)
    FLAG = 1;
ELSE
    FLAG = 0;

STEP 3.
Get a concept Xi from X, and let X = X − {Xi}.
IF (FLAG = 0)    // no collision
{
    IF (Val(Xi) = Val(Object_new))
    {   add a new row to concept Xi's certain and possible decision matrices
        (labeled k1 and k2, respectively):
            M(k1,j) = {(a, a(k1)) | a(k1) ≠ a(j)}
            M(k2,j) = {(a, a(k2)) | a(k2) ≠ a(j)}
        compute the decision rule for each added row:
            |B(k1)| = ∧_j ∨ M(k1,j)
            |B(k2)| = ∧_j ∨ M(k2,j)
        update concept Xi's certain and possible rule sets as follows:
            RUL_i_certain = RUL_i_certain ∨ |B(k1)|
            RUL_i_possible = RUL_i_possible ∨ |B(k2)|
    }
    ELSE
    {   add a new column to concept Xi's certain and possible decision matrices
        (labeled k1 and k2, respectively):
            M(i,k1) = {(a, a(i)) | a(i) ≠ a(k1)}
            M(i,k2) = {(a, a(i)) | a(i) ≠ a(k2)}
        update every row's decision rule:
            |B_i|certain = |B_i|certain ∧ ∨ M(i,k1)
            |B_i|possible = |B_i|possible ∧ ∨ M(i,k2)
        update concept Xi's certain and possible rule sets as follows:
            RUL_i_certain = ∨_i |B_i|certain
            RUL_i_possible = ∨_i |B_i|possible
    }
}
ELSE    // collision detected
{   IF (Object_new collides with an Object in concept Xi's lower approximation)
    {   delete the row containing that Object from the certain decision matrix of
        concept Xi (labeled l), and update the certain rule set as follows:
            RUL_i_certain = RUL_i_certain − |B_l|certain
        Then add a new column to the certain decision matrix of concept Xi (labeled k).
        Update every row's decision rule:
            |B_i|certain = |B_i|certain ∧ ∨ M(i,k)
        Update the final certain rule set:
            RUL_i_certain = ∨_i |B_i|certain
        Add a new row to the possible decision matrix of concept Xi (labeled k):
            M(k,j) = {(a, a(k)) | a(k) ≠ a(j)}
        compute the possible decision rule for this row:
            |B_k|possible = ∧_j ∨ M(k,j)
        update the final possible rule set:
            RUL_i_possible = RUL_i_possible ∨ |B_k|possible
    }
    ELSE IF (Val(Xi) = Val(Object_new))
    {   add a new column to the certain decision matrix of concept Xi (labeled k):
            M(i,k) = {(a, a(i)) | a(i) ≠ a(k)}
        update every row's decision rule:
            |B_i|certain = |B_i|certain ∧ ∨ M(i,k)
        update the final certain rule set:
            RUL_i_certain = ∨_i |B_i|certain
        delete the column containing the colliding Object from the possible decision
        matrix of concept Xi and add a new row (Object_new) to it;
        re-calculate each row's possible rule |B_i|possible;
        calculate RUL_i_possible as: RUL_i_possible = ∨_i |B_i|possible
    }
    ELSE
    {   add a new column to concept Xi's certain and possible decision matrices
        (labeled k1 and k2, respectively):
            M(i,k1) = {(a, a(i)) | a(i) ≠ a(k1)}
            M(i,k2) = {(a, a(i)) | a(i) ≠ a(k2)}
        update each row's decision rule:
            |B_i|certain = |B_i|certain ∧ ∨ M(i,k1)
            |B_i|possible = |B_i|possible ∧ ∨ M(i,k2)
        update concept Xi's certain and possible rule sets as follows:
            RUL_i_certain = ∨_i |B_i|certain
            RUL_i_possible = ∨_i |B_i|possible
    }
}

STEP 4.
IF (X ≠ ∅)
    GOTO STEP 3.
ELSE
    STOP.

END

A question one may raise here is that when a new object is added to the domain of discourse U, the cardinality of U changes, so the belief degrees of the possible rules must be recomputed; this affects the entire learned rule set and might seem to make the algorithm non-incremental. We analyze this as follows. According to the definition of the belief function in Section 3.3, the belief degrees of the possible rules extracted from the same concept are equal, so when a new object is added, recomputing each concept's belief function yields the belief degrees of all possible rules. Moreover, the incrementality of the proposed algorithm comes from properly modifying the already existing rules, and the belief-degree recomputation is only a small part of this modification. Compared with the computational cost of rule modification, the cost of recomputing belief degrees is rather small.

    5. Mapping rules into the FNN

When certain and possible rules have been extracted from the information table, we need to map them into the corresponding FNN, just as fuzzy rules are mapped to the FNN as described in Section 2.

Taking the rules extracted in Section 3.3 as an example, there are 3 certain rules and 3 possible rules in the rule set:

Certain rules:

$RUL^0_{certain} = ((T,1) \wedge (H,0)) \vee ((T,2) \wedge (H,0))$

$RUL^1_{certain} = (T,0)$

Possible rules:

$RUL^0_{possible} = (T,1) \vee (T,2)$

$RUL^1_{possible} = (H,1)$

    We can describe these rules in the form of natural language as follows:

(1) If Temperature is High and Headache is Yes, then Flu is Infected. (df1 = 1)
(2) If Temperature is Very High and Headache is Yes, then Flu is Infected. (df2 = 1)
(3) If Temperature is Normal, then Flu is not Infected. (df3 = 1)

Rules (1), (2) and (3) are certain rules, whose belief degrees (df) are all 1, i.e., these certain rules are definitely believable.

(4) If Temperature is High, then Flu is Infected. (df4 = 0.5)
(5) If Temperature is Very High, then Flu is Infected. (df5 = 0.5)
(6) If Headache is No, then Flu is not Infected. (df6 = 0.5)

Rules (4), (5) and (6) are possible rules, whose belief degrees (df) lie between 0 and 1, i.e., these possible rules are partially believable.

As there are two kinds of rules (certain and possible), the inference layer of the corresponding FNN consists of two parts, as shown in Fig. 3: a certain part, which contains the certain rules, and a possible part, which contains the possible rules.


    Fig. 3. Mapping rules to FNN.

Let $df_i$ be the belief degree of the $i$-th rule. The final fitness of the $i$-th rule in the FNN can be measured by $df_i \cdot \mu_i$, where $\mu_i$ is the fitness (firing strength) of the $i$-th rule in the conventional sense. Let $x$ be the input variable for Headache and $y$ the input variable for Temperature, and let $C_1$ represent flu not infected and $C_2$ flu infected. Define two fuzzy sets, Yes and No, on input dimension $x$ and three fuzzy sets, N, H and V, on input dimension $y$, where N, H and V represent Normal, High and Very High, respectively. Then the six rules described above can be mapped into the FNN as shown in Fig. 3.
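A minimal sketch of how the six rules could fire with belief-degree weighting, taking the final fitness of rule $i$ as $df_i$ times its firing strength. The Gaussian membership functions over the crisp attribute codes are illustrative assumptions, not the paper's actual fuzzification:

```python
import math

def gauss(x, c, s):
    return math.exp(-((x - c) ** 2) / (2 * s ** 2))

# the six rules of Section 5: (firing strength over (headache, temperature), class, df)
# crisp coding: headache Yes=0 / No=1, temperature Normal=0 / High=1 / Very High=2
rules = [
    (lambda h, t: gauss(t, 1, 0.5) * gauss(h, 0, 0.5), 'infected', 1.0),      # (1) certain
    (lambda h, t: gauss(t, 2, 0.5) * gauss(h, 0, 0.5), 'infected', 1.0),      # (2) certain
    (lambda h, t: gauss(t, 0, 0.5), 'not infected', 1.0),                     # (3) certain
    (lambda h, t: gauss(t, 1, 0.5), 'infected', 0.5),                         # (4) possible
    (lambda h, t: gauss(t, 2, 0.5), 'infected', 0.5),                         # (5) possible
    (lambda h, t: gauss(h, 1, 0.5), 'not infected', 0.5),                     # (6) possible
]

def classify(h, t):
    """Winning class by the maximum of df_i * mu_i over all rules."""
    scores = {}
    for mu, cls, df in rules:
        scores[cls] = max(scores.get(cls, 0.0), df * mu(h, t))
    return max(scores, key=scores.get)

print(classify(0, 2))  # headache = Yes, temperature = Very High -> 'infected'
```

Note how the df factor lets a certain rule dominate a conflicting possible rule when both fire strongly.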

    6. Numerical simulations

In this section, numerical simulations demonstrate our approach's superiority over the rule extraction approach that uses only the conventional FNN [1].

Given a nonlinear system:

$$y(t+1) = \frac{y(t)\, y(t-1)\, (y(t) + 2.5)}{1 + y^2(t) + y^2(t-1)} + u(t), \qquad u(t) = \sin(2\pi t / 25) \qquad (10)$$

with initial values $y(0) = 0.9$, $y(1) = 0.5$.
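The plant of Eq. (10) is easy to iterate directly; a minimal sketch, assuming the input signal is the usual benchmark $\sin(2\pi t/25)$:

```python
import math

def u(t):
    # assumed input signal sin(2*pi*t/25)
    return math.sin(2 * math.pi * t / 25)

def simulate(n, y0=0.9, y1=0.5):
    """Iterate Eq. (10): y(t+1) = y(t)y(t-1)(y(t)+2.5)/(1+y(t)^2+y(t-1)^2) + u(t),
    returning the first n values of the trajectory."""
    y = [y0, y1]
    while len(y) < n:
        t = len(y) - 1
        y.append(y[t] * y[t - 1] * (y[t] + 2.5)
                 / (1 + y[t] ** 2 + y[t - 1] ** 2) + u(t))
    return y

traj = simulate(100)
print(round(traj[2], 3))  # 0.904
```

A trajectory like this, generated from the true plant, is what serves as the sample data for both rule extraction methods below.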

Method 1: use the conventional FNN [1].

First, we divide the input interval into three equal sub-intervals on each dimension, and then define three fuzzy subsets on them (see Fig. 4). Figure 4 shows how to define fuzzy sets on the sub-intervals, where S, M and L represent the fuzzy sets Small, Middle and Large, respectively; $y_{min}$ and $y_{max}$ are the minimum and maximum values that may be taken on dimension $y$.


    Fig. 4. Defining fuzzy sets on y dimension.

Table 7. Performance comparison between method 1 and method 2

            R    ARL   No. of iterations
Method 1    27   3     200
Method 2    20   2.2   89

We define the average rule length ARL as:

$$ARL = \frac{\sum_{i=1}^{R} P_i}{R} \qquad (11)$$

where $R$ is the number of rules and $P_i$ is the number of premise variables in the $i$-th rule.

Using the complete combination rule set, there are 27 ($3 \times 3 \times 3$) rules, and the ARL is 3 (because there are 3 premise variables in each rule).
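Eq. (11) in code is a one-liner; for the complete combination rule set of method 1:

```python
def average_rule_length(premise_counts):
    """Eq. (11): ARL = (sum of P_i over all rules) / R."""
    return sum(premise_counts) / len(premise_counts)

# the complete combination rule set of method 1: 27 rules, 3 premises each
print(average_rule_length([3] * 27))  # 3.0
```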

Method 2: use the approach of this paper, i.e.:

- Discretize the samples (quantify the continuous attribute values). For comparability with method 1, the input interval is also divided into 3 equal sub-intervals on each dimension.
- To demonstrate the incrementality of the proposed algorithm GIREA, set the information table to null at the beginning, then gradually add samples into it (one at a time), extracting certain and possible rules using GIREA until all samples have been processed.
- Map the rules to the FNN, and use the FNN to refine the rules obtained in the previous step.

Using method 2, we obtained 20 rules with an average rule length of 2.2.

In our experiment, in order to reach the same approximation level, the numbers of iterations for method 1 and method 2 were 200 and 89, respectively. Figure 5 shows the final identification results of method 1 and method 2 (using each FNN independently after training, with initial state values different from those of the real system: the real system uses $y(0) = 0.9$, $y(1) = 0.5$, while both FNNs use the same initial state values $y(0) = 0.4$, $y(1) = 0.2$). Table 7 compares the performance of method 1 and method 2.

From Fig. 5 we can see that, compared with method 1, method 2 has a simpler rule set and faster learning. The reason is that the FNN based on our approach contains knowledge obtained from the sample data.

Figure 6 shows the final identification results of method 1 and method 2 after 20% white Gaussian noise was added. It is easy to see that the FNN based on method 2 is more robust than the FNN based on method 1.

Another experiment was done to demonstrate the performance superiority of the proposed GIREA over the conventional rule extraction algorithm.



Fig. 5. (a) and (b) are identification results using method 1 and method 2, respectively. Small dots: real system (initial state $y(0) = 0.9$, $y(1) = 0.5$); big dots: FNN (initial state $y(0) = 0.4$, $y(1) = 0.2$).


Fig. 6. (a) and (b) are identification results using method 1 and method 2, respectively, with 20% white Gaussian noise added. Small dots: real system (initial state $y(0) = 0.9$, $y(1) = 0.5$); big dots: FNN (initial state $y(0) = 0.4$, $y(1) = 0.2$).

    Suppose there are 100 samples in the original sample set and that rules have been extracted from it using the conventional rule extraction algorithm; take the time used as the benchmark time 1. Now another 20 samples are added to the sample set. Re-extracting the rules with the conventional algorithm takes 1.19, while re-extracting them with GIREA takes only 1.08, as shown in Table 8. The reason is that, when new objects are added, the proposed GIREA updates the rule set by partly modifying the original rules, whereas the conventional rule extraction algorithm must recompute the rule set from the very beginning.
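    The source of this saving can be illustrated with a toy sketch. The names and the drastic simplification below are ours, not GIREA itself (which works on rough-set approximations to produce certain and possible rules): the point is only that new objects touch just the rules whose condition part they match, so the rest of the rule set is reused unchanged.

```python
from collections import defaultdict

def extract_rules(samples):
    """Batch extraction: map each condition vector to the set of decisions
    observed for it. A condition with one decision corresponds to a certain
    rule; with several decisions, to a possible rule."""
    rules = defaultdict(set)
    for *cond, decision in samples:
        rules[tuple(cond)].add(decision)
    return rules

def update_rules(rules, new_samples):
    """Incremental update: only entries whose condition part occurs in the
    new objects are modified; nothing is recomputed from scratch."""
    for *cond, decision in new_samples:
        rules[tuple(cond)].add(decision)
    return rules

base = [(0, 1, 'A'), (1, 1, 'B'), (0, 0, 'A')]
rules = extract_rules(base)
rules = update_rules(rules, [(0, 1, 'B')])   # (0, 1) now admits two decisions
```

    After the update, the condition (0, 1) carries the decisions {'A', 'B'}, i.e. a formerly certain rule has become a possible one, while the rules for (0, 0) and (1, 1) are untouched.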

    7. Conclusions

    How to extract rules from data without expert knowledge is a bottleneck of knowledge discovery. Our approach attempts to integrate rough set theory and an FNN to discover knowledge. The rule set obtained by GIREA has fewer rules and shorter rule lengths. Simulation results show the effectiveness of our approach and its advantages over a conventional FNN: because the approach exploits the distribution characteristics of the sample data, it extracts a better rule set, and the FNN built on this rule set has a better topology and, accordingly, better robustness and learning speed. Further


    Table 8
    Performance comparison between the conventional rule extraction algorithm and GIREA (note: the times listed are relative to the benchmark time 1)

    Algorithm                                     Time used
    The conventional rule extraction algorithm    1.19
    GIREA                                         1.08

    studies should focus on the theoretical and practical aspects of the static-dynamic topology-changeable FNN and on knowledge discovery.

    Acknowledgement

    The work here is financially supported by the National Science Foundation of China. The authors would like to thank the anonymous reviewers for their valuable comments.

    About the authors

    Shi-tong Wang: Professor in computer science.

    Dong-jun Yu: Ph.D. candidate in computer science.

    Jing-yu Yang: Professor in computer science.

    References

    [1] S.T. Wang, Fuzzy System and Fuzzy Neural Networks, Shanghai Science and Technology Press, 1998, Edition 1.
    [2] L.A. Zadeh, Fuzzy sets, Inform. Contr. 8 (1965), 338–353.
    [3] Z. Pawlak, Rough Sets: Theoretical Aspects of Reasoning About Data, Kluwer, Dordrecht, The Netherlands, 1991.
    [4] M. Banerjee et al., Rough fuzzy MLP: knowledge encoding and classification, IEEE Trans. Neural Networks 9(6) (1998), 1203–1216.
    [5] C.T. Lin, Neural Fuzzy Systems, Prentice-Hall, USA, 1997.
    [6] L.X. Wang, A Course on Fuzzy Systems, Prentice-Hall, USA, 1999.
    [7] S. Wang and D. Yu, Error analysis in nonlinear system identification using fuzzy system, J. of Software Research 11(4) (2000), 447–452.
    [8] A. Skowron and C. Rauszer, The discernibility matrices and functions in information systems, in: Intelligent Decision Support: Handbook of Applications and Advances of Rough Sets Theory, R. Slowinski, ed., Kluwer, Dordrecht, The Netherlands, 1992, pp. 331–362.
    [9] N. Shan and W. Ziarko, An incremental learning algorithm for constructing decision rules, in: Rough Sets, Fuzzy Sets and Knowledge Discovery, W. Ziarko, ed., Springer-Verlag, 1994, pp. 326–334.
    [10] P. Wang, Constructive theory for fuzzy systems, Fuzzy Sets and Systems 88(2) (1997), 1040–1045.
    [11] Z. Mao et al., Topology-changeable neural network, Control Theory and Applications 16(1), 54–60.
    [12] M. Setnes et al., Similarity measures in fuzzy rule base simplification, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 28(3) (June 1998).
    [13] K.S. Narendra and K. Parthasarathy, Identification and control of dynamical systems using neural networks, IEEE Trans. Neural Networks 1(1) (March 1990), 4–23.
    [14] J. Lu, W. Xu and Z. Han, Research on parallel identification algorithm of neural networks, Control Theory and Applications 15(5) (1998), 741–745.
