International Journal of Innovative Computing, Information and Control
ICIC International © 2014, ISSN 1349-4198
Volume 10, Number 6, December 2014, pp. 1-13

A GENERAL FRAMEWORK FOR CONSTRUCTING POSSIBILISTIC
MEMBERSHIP FUNCTIONS IN ALTERNATING CLUSTER ESTIMATION

Qina Wang¹, Jian Zhou¹ and Chih-Cheng Hung²,³

¹School of Management, Shanghai University, Shanghai 200444, P. R. China
²Anyang Normal University, Anyang 455000, P. R. China
³Southern Polytechnic State University, Georgia 30060, USA
zhou [email protected]

Received December 2013; revised February 2014

Abstract. Clustering models based on objective functions are among the most popular approaches in cluster analysis. These models are often solved by alternating optimization algorithms built on the necessary conditions for local extrema. Alternating cluster estimation is a generalized scheme of alternating optimization which dispenses with the objective function and allows users to select membership and prototype functions directly within an alternating iteration architecture. A general approach to alternating cluster estimation based on possibilistic membership functions was developed by Zhou and Hung [1], where the memberships of feature points are estimated by predetermined membership functions, and the cluster centers are updated via a performance index with possibilistic membership weights. The membership functions play a vital role in that approach and should be chosen properly in order to obtain good clustering results. Nevertheless, a method for determining appropriate functions was not given. As a further investigation of alternating cluster estimation for possibilistic clustering, a general framework for constructing membership functions in this model is introduced by combining the clustering performance with the fuzzy set theory. In addition, four specific generalized possibilistic clustering algorithms are recommended for applications. Finally, a comparative study based on real data experiments is presented to demonstrate the performance and efficiency of the proposed algorithms.
Keywords: Fuzzy clustering, Possibilistic clustering, Alternating cluster estimation, Fuzzy set theory

1. Introduction. Cluster analysis is the process of partitioning a data set into subsets of objects with similar properties. Clustering methods have been used extensively in computer vision and pattern recognition. Fuzzy clustering uses the fuzzy set theory as a tool for data grouping and has advantages over traditional clustering in many applications such as pattern recognition [2, 3] and image segmentation [4, 5]. In the field of fuzzy clustering analysis, the fuzzy c-means (FCM) clustering algorithm presented by Bezdek [6] is the most well-known and widely used method, based on a fuzzy objective function with probabilistic membership weights. Several recent papers dealing with the FCM and its derivatives can be found in Chen et al. [7], Li et al. [8], and Miyamoto et al. [9]. Since the memberships resulting from the FCM do not always correspond to the intuitive concept of degrees of belongingness or compatibility, Krishnapuram and Keller [10, 11] developed the possibilistic clustering algorithms (PCAs), where the memberships provide a good interpretation of degrees of belongingness for the data. Compared with the FCM, the PCAs are more robust to noise and outliers. Therefore, theoretical and practical studies on the PCAs have received extensive attention in the last two decades. For instance, Höppner and Klawonn [12] and Zhou et al. [13] presented

* Corresponding author. Tel.: +86-21-66134414-805. E-mail address: zhou [email protected] (J. Zhou).


some rigorous proofs on the convergence property of the PCAs as a complement to the experimental results. Krishnapuram et al. [14, 15] generalized the prototype-based fuzzy and possibilistic clustering algorithms by introducing the shell clustering approach, and illustrated their applications to boundary detection and surface approximation. Since the clustering performance of the PCAs in [10, 11] heavily depends on the parameters used, Yang and Wu [16] suggested a new PCA whose performance can be easily controlled. Rhee et al. [17] applied the kernel approach to the possibilistic c-means algorithm by incorporating a variance updating method for Gaussian kernels in each iteration. Other recent contributions in this area were made by Anderson et al. [18], Lai et al. [19], Li et al. [21], Treerattanapitak and Jaruskulchai [20], Wen et al. [22], and Xie et al. [23].

Essentially, both the FCM and the PCAs are alternating optimization algorithms driven by necessary conditions for local extrema of their respective objective functions. The objective functions play an extremely important role in fuzzy clustering analysis, and hence have been studied extensively in the literature. In this area, more attention has been focused on the improvement of the membership functions of the FCM (see, e.g., Chen and Wang [24], Miyamoto [25], Rodriguez et al. [26]). Recently, Barcelo-Rico et al. [27] presented a new possibilistic method for discovering linear local behavior by suggesting an objective function with hyper-Gaussian distributed memberships, whose parameters are found by global optimization so as to avoid local optima. Instead of using alternating optimization, Runkler and Bezdek [28] developed a generalized model with an alternating iteration architecture, called alternating cluster estimation (ACE), in which the membership and prototype functions are selected directly by the users. They stated that ACE is a new and effective tool for clustering and function approximation [29], and that virtually every clustering model can be realized as an instance of ACE. However, they did not explain how to select proper functions for different real data sets. Besides, Zhou and Hung [1] proposed a similar procedure especially for possibilistic clustering, and discussed in detail the general characteristics of the membership and prototype functions used in the approach. The derived algorithms are called generalized possibilistic clustering algorithms (GPCAs). Zhou et al. [30, 31] also proposed some other alternating cluster estimation algorithms by extending the possibilistic membership weights to normalized possibilistic and credibilistic membership weights.

It is obvious that different membership functions may have diverse efficiency when dealing with real problems. In order to define good clusters, it is important to provide a suitable membership function for each cluster. To tackle this problem, we introduce a general procedure for users to determine the membership functions in ACE for possibilistic clustering by combining the clustering performance and the fuzzy set theory, which leads to a definite GPCA once proper values of the parameters are provided. Furthermore, four specific algorithms are recommended for various real data sets, and their experimental results are illustrated with real data including the Iris data, car evaluation data, and beer tastes data. The comparative results with the K-means algorithm and the existing PCAs show that the proposed algorithms are effective with better performance when solving these practical application problems.

The rest of this paper is organized as follows. In Section 2, the approach of possibilistic clustering as well as the existing PCAs are briefly reviewed. Subsequently, a method of determining the membership function for a cluster in the generalized approach to possibilistic clustering algorithms is suggested, and four specific functions are then recommended for real applications in Section 3. In Section 4, four new GPCAs are obtained by using the specific functions in the membership evaluation equations. Finally, some experimental results on real data sets for different applications are presented in Section 5.

    2. Possibilistic Clustering Algorithms. Since its introduction in 1965 by Zadeh [32],the fuzzy set theory has been well developed and applied in a wide variety of real problems.


In clustering, a great deal of research on fuzzy clustering has been accomplished in the literature, in which the fuzzy c-means algorithm developed by Bezdek [6] is the most well known. Since the normalization condition in the FCM assigns noise points improper memberships in each cluster according to the fuzzy set theory, the memberships resulting from the FCM do not always correspond to the intuitive concept of degree of belongingness or compatibility. In order to produce memberships with a good interpretation of degree of belongingness for the data, the PCAs were proposed by Krishnapuram and Keller [10, 11].

Given a data set X = {x_1, x_2, ..., x_n} in a p-dimensional Euclidean space ℜ^p with an ordinary Euclidean norm || · ||, and the number of clusters c > 1, the approach of possibilistic clustering initiated by Krishnapuram and Keller [10, 11] is to find the optimal membership matrix U and the cluster center matrix A which minimize the objective functions

$$ J_{PCA93}(U, A) = \sum_{i=1}^{c}\sum_{j=1}^{n}(\mu_{ij})^m d_{ij}^2 + \sum_{i=1}^{c}\eta_i\sum_{j=1}^{n}(1-\mu_{ij})^m \qquad (1) $$

and

$$ J_{PCA96}(U, A) = \sum_{i=1}^{c}\sum_{j=1}^{n}\mu_{ij} d_{ij}^2 + \sum_{i=1}^{c}\eta_i\sum_{j=1}^{n}(\mu_{ij}\ln\mu_{ij}-\mu_{ij}), \qquad (2) $$

respectively, where m > 1 is the fuzzifier, subject to the membership constraint

$$ U \in U_X = \left\{ U \;\middle|\; 0 \le \mu_{ij} \le 1,\ \sum_{i=1}^{c}\mu_{ij} > 0,\ i = 1, 2, \ldots, c,\ j = 1, 2, \ldots, n \right\}, \qquad (3) $$

where μ_ij represents the degree of compatibility or membership of feature point x_j belonging to cluster i, and d_ij represents the Euclidean distance between the ith cluster center a_i and the jth feature point x_j, i.e.,

$$ d_{ij} = \sqrt{\|a_i - x_j\|^2}, \quad i = 1, 2, \ldots, c,\ j = 1, 2, \ldots, n. \qquad (4) $$

For convenience, we denote the index sets of clusters and feature points as I = {1, 2, ..., c} and J = {1, 2, ..., n}, respectively. The distance parameters η_i, i ∈ I, are user specified, and were recommended by Krishnapuram and Keller [10] as

$$ \eta_i = K\,\frac{\sum_{j=1}^{n}(\mu_{ij})^m d_{ij}^2}{\sum_{j=1}^{n}(\mu_{ij})^m} \qquad \text{or} \qquad \eta_i = \frac{\sum_{j=1}^{n}(\mu_{ij})_\alpha\, d_{ij}^2}{\sum_{j=1}^{n}(\mu_{ij})_\alpha}, \qquad (5) $$

where the parameter K > 0 is typically chosen to be one, and α is a predetermined threshold with 0 < α < 1 which gives the crisp partition

$$ (\mu_{ij})_\alpha = \begin{cases} 1, & \text{if } \mu_{ij} \ge \alpha \\ 0, & \text{otherwise.} \end{cases} \qquad (6) $$

The necessary conditions for a minimizer (U, A) of J_PCA93 and J_PCA96 subject to the constraint U ∈ U_X lead to two related possibilistic clustering algorithms, denoted as PCA93 and PCA96 respectively, which are both iterative algorithms with the following update equations for memberships,

$$ \mu_{ij} = \frac{1}{1 + \left(d_{ij}^2/\eta_i\right)^{1/(m-1)}} \quad \text{for } (i, j)\in(I, J) \qquad \text{(PCA93)} $$
$$ \mu_{ij} = \exp\left(-\frac{d_{ij}^2}{\eta_i}\right) \quad \text{for } (i, j)\in(I, J) \qquad \text{(PCA96)} \qquad (7) $$

in PCA93 and PCA96, respectively, and the following update equation for cluster centers,

$$ a_i = \frac{\sum_{j=1}^{n}(\mu_{ij})^m x_j}{\sum_{j=1}^{n}(\mu_{ij})^m} \quad \text{for } i \in I \qquad (8) $$


in both PCA93 and PCA96.

Since the performance of the possibilistic clustering proposed in [10, 11] heavily depends on the selection of the parameters η_i, a new PCA, denoted as PCA06, was developed by Yang and Wu [16], whose performance can be easily controlled. The objective function used in PCA06 is

$$ J_{PCA06}(U, A) = \sum_{i=1}^{c}\sum_{j=1}^{n}(\mu_{ij})^m d_{ij}^2 + \frac{\beta}{m^2\sqrt{c}}\sum_{i=1}^{c}\sum_{j=1}^{n}\left[(\mu_{ij})^m\ln(\mu_{ij})^m - (\mu_{ij})^m\right], \qquad (9) $$

and accordingly the update equation for memberships is

$$ \mu_{ij} = \exp\left(-\frac{m\sqrt{c}\, d_{ij}^2}{\beta}\right) \quad \text{for } (i, j)\in(I, J), \qquad (10) $$

where the parameter β is defined by

$$ \beta = \frac{\sum_{j=1}^{n}\|x_j - \bar{x}\|^2}{n} \quad \text{with} \quad \bar{x} = \frac{\sum_{j=1}^{n} x_j}{n}. \qquad (11) $$

The same update equation (8) for cluster centers in PCA93 and PCA96 is also used in PCA06.
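To make the shared structure of PCA93, PCA96 and PCA06 concrete, the following sketch implements the membership updates (7) and (10) and the common center update (8) with NumPy. It is only an illustrative reading of the equations above, not the authors' code; the array layout (clusters in rows, points in columns) and the helper names are our own assumptions.

```python
import numpy as np

def squared_distances(X, A):
    """Pairwise squared Euclidean distances d_ij^2 between centers A (c x p) and points X (n x p)."""
    return ((A[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)   # shape (c, n)

def memberships_pca93(d2, eta, m=2.0):
    """PCA93 membership update, Equation (7): mu = 1 / (1 + (d^2/eta)^(1/(m-1)))."""
    return 1.0 / (1.0 + (d2 / eta[:, None]) ** (1.0 / (m - 1.0)))

def memberships_pca96(d2, eta):
    """PCA96 membership update, Equation (7): mu = exp(-d^2/eta)."""
    return np.exp(-d2 / eta[:, None])

def memberships_pca06(d2, beta, m=2.0):
    """PCA06 membership update, Equation (10): mu = exp(-m*sqrt(c)*d^2/beta)."""
    c = d2.shape[0]
    return np.exp(-m * np.sqrt(c) * d2 / beta)

def update_centers(X, mu, m=2.0):
    """Center update shared by all three algorithms, Equation (8)."""
    w = mu ** m                                   # possibilistic membership weights
    return (w @ X) / w.sum(axis=1, keepdims=True)
```

A complete algorithm simply alternates one of the membership updates with the center update until the centers stop moving, exactly the architecture formalized as Algorithm 3 in Section 4.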

3. Membership Function in ACE for Possibilistic Clustering. It is easy to see that the only difference among PCA93, PCA96 and PCA06 is the evaluation equation of the memberships, which is a function of the distance d_ij. Based on the analogous principle of alternating cluster estimation in [28], Zhou and Hung generalized the PCAs in [10, 11, 16] to a family of iterative clustering algorithms, called the generalized possibilistic clustering algorithms [1], in which the memberships are calculated by

$$ \mu_{ij} = f_i(d_{ij}) \quad \text{for } (i, j)\in(I, J), \qquad (12) $$

with the membership function f_i satisfying

$$ f_i(0) = 1, \quad f_i(+\infty) = 0, \quad f_i \text{ monotone decreasing on } [0, +\infty) \qquad (13) $$

for i ∈ I.

3.1. Membership Function of a Fuzzy Set. In the fuzzy set theory, the membership function of a fuzzy set is usually provided by the decision-maker. Two typical monotone decreasing functions satisfying the conditions in (13) are

$$ f_p(y) = \frac{1}{1 + k y^b} \qquad \text{and} \qquad f_e(y) = \exp\{-k y^b\} \qquad \text{with } k, b > 0. \qquad (14) $$

The following example illustrates a methodology to provide an appropriate membership function for a fuzzy set.


Example 3.1. Suppose that V is the set of ages of people in a residential area. Then the set "old people" in this residential area can be modeled as a fuzzy set on V, denoted as Ã. Let the standard of Ã be x_0 = 80 years old, and define the dissimilarity between the standard x_0 and an element x ∈ V by

$$ d(x, x_0) = \begin{cases} 0, & \text{if } x \ge x_0 \\ x_0 - x, & \text{if } 0 < x < x_0, \end{cases} \qquad (15) $$

where the "standard" represents the semantic centre of the fuzzy set, and the "dissimilarity" is a measurement of the distance or difference. In order to give a proper monotone decreasing function f for evaluating the memberships of x ∈ V, additional preference information on the fuzzy set "old people" must be used. For example, it is reasonable to assume that the membership of 40 years old is 0.01, and the membership of 60 years old is 0.5. After that, suppose that the monotone decreasing function f_p in (14) is employed. Considering all the assumptions in this example, we then obtain the following equations,

$$ \begin{cases} \dfrac{1}{1 + k(80-40)^b} = 0.01 \\[2mm] \dfrac{1}{1 + k(80-60)^b} = 0.5. \end{cases} \qquad (16) $$

The solution of (16) is k ≈ 2.4 × 10⁻⁹ and b ≈ 6.6, and the deduced function is f'_p(y) = 1/(1 + 2.4 × 10⁻⁹ y^{6.6}). Consequently, the membership function of Ã is

$$ \mu_p(x) = f'_p(d(x, x_0)) = \begin{cases} 1, & \text{if } x \ge 80 \\ \left(1 + 2.4\times 10^{-9}(80-x)^{6.6}\right)^{-1}, & \text{if } 0 < x < 80. \end{cases} \qquad (17) $$

If the monotone function f_e in (14) is adopted instead, the corresponding membership function of Ã is

$$ \mu_e(x) = \exp\left\{-2\times 10^{-4} d(x, x_0)^{2.7}\right\} = \begin{cases} 1, & \text{if } x \ge 80 \\ \exp\left\{-2\times 10^{-4}(80-x)^{2.7}\right\}, & \text{if } 0 < x < 80. \end{cases} \qquad (18) $$

The two membership functions μ_p in (17) and μ_e in (18) are shown in Figure 1.

Figure 1. Two Membership Functions of the Fuzzy Set "Old People in a Residential Area"
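The parameter values quoted after (16) can be checked with a few lines of NumPy. This is only a verification sketch of the closed-form solution of the two reference conditions in Example 3.1; the variable names are our own.

```python
import numpy as np

# Reference information from Example 3.1: membership 0.01 at dissimilarity 40 (age 40)
# and 0.5 at dissimilarity 20 (age 60), with standard x0 = 80.
d1, mu1 = 40.0, 0.01
d2, mu2 = 20.0, 0.5

# f_p(y) = 1 / (1 + k * y**b):  k * y**b = 1/mu - 1 at each reference point.
bp = np.log((1 / mu1 - 1) / (1 / mu2 - 1)) / np.log(d1 / d2)
kp = (1 / mu2 - 1) / d2**bp

# f_e(y) = exp(-k * y**b):  k * y**b = -ln(mu) at each reference point.
be = np.log(np.log(mu1) / np.log(mu2)) / np.log(d1 / d2)
ke = -np.log(mu2) / d2**be

print(f"f_p: k = {kp:.2e}, b = {bp:.2f}")   # roughly 2.4e-09 and 6.63
print(f"f_e: k = {ke:.2e}, b = {be:.2f}")   # roughly 1.9e-04 and 2.73
```

The same two-condition solve is what Algorithm 1 below asks the decision-maker to do for any chosen function form.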


Remark 3.1. For a fuzzy set, many functions can be used to evaluate its memberships properly according to the opinion of the decision-maker. In Example 3.1, all the parameters, including the standard x_0 = 80, the dissimilarity d(x, x_0) defined in (15), the two reference elements 40 and 60 together with the corresponding memberships 0.01 and 0.5, and the monotone decreasing function f_p or f_e used, are predetermined by the decision-maker, and obviously different selections of these parameters will lead to diverse membership functions.

From Example 3.1, a procedure to provide a proper membership function μ_Ã for a fuzzy set Ã is summarized as follows.

Algorithm 1: (Determination of the Membership Function of a Fuzzy Set)
Step 1: Predetermine a standard x_0 and a dissimilarity d(x, x_0) between an element x of the fuzzy set Ã and x_0.
Step 2: Decide a monotone decreasing function f satisfying f(0) = 1 and f(+∞) = 0.
Step 3: Give some reference elements and their corresponding memberships according to the knowledge of experts or by estimation, in which the number of reference elements equals the number of unknown parameters in f.
Step 4: Solve the equations that combine the information of the functions d and f with the reference elements and their memberships, and obtain the membership function μ_Ã(x) = f(d(x, x_0)) of Ã, as shown in (16)-(18).

3.2. Membership Function of a Cluster. The methodology for deciding a membership function for a fuzzy set in Section 3.1 can be applied in fuzzy clustering to obtain a membership function for each cluster in each iteration of the generalized approach of possibilistic clustering algorithms [1]. The procedure of providing a proper membership function for each cluster i (i ∈ I) is introduced in detail as follows.

Algorithm 2: (Determination of the Membership Function of a Cluster)
Step 1: It is natural to select the cluster center a_i, i ∈ I, as the standard of fuzzy cluster i and to measure the dissimilarity between a feature point x_j ∈ X, j ∈ J, and the standard a_i by the Euclidean distance d_ij in (4).
Step 2: Generally speaking, any monotone decreasing function f with f(0) = 1 and f(+∞) = 0 can be used for evaluating the memberships in fuzzy clustering. In this section, we only investigate the functions f_p and f_e defined in (14) for the sake of simplicity.
Step 3: Decide a max-distance d_max and a mid-distance d_mid for each cluster i with f(d_max) = ε_0 and f(d_mid) = ε_1, in which ε_0 ∈ (0, 1) is a small number and ε_1 ∈ (0, 1) is a relatively large number. Typically set ε_0 = 0.01 and ε_1 = 0.5. It follows that the mid-distance d_mid is the distance at which the membership value of a feature point in the cluster becomes 0.5, and the max-distance d_max is the distance at which the membership value of a point becomes 0.01.
Step 4: Combine the information of all the parameters and functions to obtain the membership functions μ_p and μ_e corresponding to f_p and f_e, respectively. With ε_0 = 0.01 and ε_1 = 0.5, the two conditions in Step 3 determine the parameters in (14) as

$$ b'_p = \frac{\ln 99}{\ln(d_{max}/d_{mid})}, \quad k'_p = d_{mid}^{-b'_p}, \qquad b'_e = \frac{\ln(\ln 100/\ln 2)}{\ln(d_{max}/d_{mid})}, \quad k'_e = (\ln 2)\, d_{mid}^{-b'_e}. \qquad (19) $$

The max-distance d_max may be interpreted as the maximum distance that a cluster can "see". Thus, it is a good idea to set d_max = λ d_mid by introducing a weighting parameter λ > 1. An appropriate vector (d_mid, λ) then determines a corresponding membership function when the functions f_p and f_e are used, respectively. In this paper, we set the weighting parameter λ = 10 for our purpose. Consequently, the setting of λ = 10 in (19) leads to b'_p ≈ 2 and b'_e ≈ 0.82, and we obtain the following membership functions,

$$ f'_p(d) = \left[1 + (d/d_{mid})^2\right]^{-1} \qquad \text{and} \qquad f'_e(d) = \exp\left\{-\ln 2\cdot (d/d_{mid})^{0.82}\right\}. \qquad (20) $$


3.3. Recommended Membership Functions. Now let us decide some appropriate distances d_mid. It is reasonable to assume that the mid-distance d_mid is proportional to the distance η_i^f or η_i for i ∈ I, i.e., to set d_mid = W η_i^f or d_mid = W η_i, where W is an appropriate constant that remains to be determined, η_i is given in (5), and

$$ \eta_i^f = \sqrt{\frac{\sum_{j=1}^{n}(\mu_{ij})^m d_{ij}^2}{\sum_{j=1}^{n}(\mu_{ij})^m}} \quad \text{for } i \in I. \qquad (21) $$

Obviously, different constants would result in clustering algorithms with diverse efficiency. In order to obtain some proper values for the parameter W, extensive numerical experiments based on randomly generated data sets were run instead of a theoretical analysis, which seems difficult for this problem. According to the results of these numerical experiments, the following four membership functions perform well with better clustering performance, and are thus recommended for real applications,

$$ \begin{cases} \mu_1(x_j) = \left[1 + (mc)^3 (d_{ij}/\eta_i^f)^2\right]^{-1} & \text{by letting } d_{mid} = (mc)^{-3/2}\eta_i^f \text{ in } f'_p \\ \mu_2(x_j) = \left[1 + (mc)^3 (d_{ij}/\eta_i)^2\right]^{-1} & \text{by letting } d_{mid} = (mc)^{-3/2}\eta_i \text{ in } f'_p \\ \mu_3(x_j) = \exp\left\{-\ln 2\cdot (mc)^{0.41} (d_{ij}/\eta_i^f)^{0.82}\right\} & \text{by letting } d_{mid} = (mc)^{-1/2}\eta_i^f \text{ in } f'_e \\ \mu_4(x_j) = \exp\left\{-\ln 2\cdot (mc)^{0.41} (d_{ij}/\eta_i)^{0.82}\right\} & \text{by letting } d_{mid} = (mc)^{-1/2}\eta_i \text{ in } f'_e \end{cases} \qquad (22) $$

where f'_p and f'_e are defined in (20), c is the number of clusters, and m is the fuzzifier. It is easy to deduce that the constant W of the distance d_mid used in the above four membership functions is (mc)^{-3/2} or (mc)^{-1/2}.
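Read as code, the four functions in (22) differ only in the exponent pair and in which scale parameter (η_i^f or η_i) is plugged in. The sketch below, with our own function name and array layout, evaluates any of them given the distances and scales; it assumes the exponents as reconstructed above.

```python
import numpy as np

def memberships_gpca(d, eta, m, c, variant="mu1"):
    """Four recommended membership functions of Equation (22).

    d   : (c, n) array of distances d_ij between cluster centers and feature points
    eta : (c,) array of scale parameters, eta_i^f for mu1/mu3 or eta_i for mu2/mu4
    """
    r = d / eta[:, None]                      # normalized distance d_ij / eta
    if variant in ("mu1", "mu2"):             # derived from f'_p with d_mid = (mc)^(-3/2) * eta
        return 1.0 / (1.0 + (m * c) ** 3 * r ** 2)
    if variant in ("mu3", "mu4"):             # derived from f'_e with d_mid = (mc)^(-1/2) * eta
        return np.exp(-np.log(2.0) * (m * c) ** 0.41 * r ** 0.82)
    raise ValueError("variant must be one of mu1, mu2, mu3, mu4")
```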

4. Four Specific GPCAs. In this section, four GPCAs are presented by combining the generalized approach of possibilistic clustering algorithms in [1] with the four specific membership functions for clusters recommended in Section 3.3. These GPCAs are also iterative algorithms based on ACE, in which the membership evaluation equations in each iteration are

$$ \mu_{ij} = \mu_k(x_j), \quad k = 1, 2, 3, 4, \qquad (23) $$

and the update equation for cluster centers is

$$ a_i = \frac{\sum_{j=1}^{n}(\mu_{ij})^m x_j}{\sum_{j=1}^{n}(\mu_{ij})^m} \quad \text{for } i \in I, \qquad (24) $$

where the μ_k are the functions given in (22). For convenience, the GPCAs with the update equations (23) and (24) based on the membership functions μ_1, μ_2, μ_3, μ_4 are denoted as GPCA1, GPCA2, GPCA3 and GPCA4, respectively. The four new GPCAs are summarized as follows, taking μ_1 as an example.

Algorithm 3: (GPCA1 Based on μ_1)

Step 1: Initialize the memberships μ_ij^(0) ∈ [0, 1] and the cluster centers a_i^(0) ∈ ℜ^p, give ε > 0 and the iteration counter t = 0, and set the number of clusters c and the fuzzifier m.

Step 2: Calculate (η_i^f)^(t) with

$$ (\eta_i^f)^{(t)} = \sqrt{\frac{\sum_{j=1}^{n}\left(\mu_{ij}^{(t)}\right)^m \left(d_{ij}^{(t)}\right)^2}{\sum_{j=1}^{n}\left(\mu_{ij}^{(t)}\right)^m}} \qquad (25) $$

for i ∈ I, and then estimate the distance d_mid^(t) by d_mid^(t) = (mc)^{-3/2}(η_i^f)^(t), where

$$ d_{ij}^{(t)} = \sqrt{\left\|a_i^{(t)} - x_j\right\|^2} \quad \text{for } (i, j)\in(I, J). \qquad (26) $$

Step 3: Compute μ_ij^(t+1) for all (i, j) ∈ (I, J) with

$$ \mu_{ij}^{(t+1)} = \left[1 + (mc)^3\left(d_{ij}^{(t)}/(\eta_i^f)^{(t)}\right)^2\right]^{-1}. \qquad (27) $$


Step 4: Compute a_i^(t+1) for all i ∈ I with

$$ a_i^{(t+1)} = \frac{\sum_{j=1}^{n}\left(\mu_{ij}^{(t+1)}\right)^m x_j}{\sum_{j=1}^{n}\left(\mu_{ij}^{(t+1)}\right)^m}. \qquad (28) $$

Step 5: Increment t and return to Step 2 until max_{i∈I} ||a_i^(t+1) − a_i^(t)|| < ε.
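For concreteness, Algorithm 3 can be written as a short NumPy routine. This is a sketch under the assumptions stated in Section 3.3 (in particular the exponents reconstructed in (27)); the function name, the default tolerance, and the expectation of externally supplied initial centers (e.g., from K-means) are our own choices, not part of the paper.

```python
import numpy as np

def gpca1(X, init_centers, m=2.0, eps=1e-5, max_iter=1000):
    """GPCA1 (Algorithm 3): alternating cluster estimation with membership function mu_1."""
    X = np.asarray(X, dtype=float)                       # (n, p) data
    A = np.asarray(init_centers, dtype=float)            # (c, p) initial centers a_i^(0)
    c = A.shape[0]
    mu = np.full((c, X.shape[0]), 1.0 / c)               # any values in [0, 1] serve as mu^(0)

    for _ in range(max_iter):
        d = np.sqrt(((A[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))   # Eq. (26)
        w = mu ** m
        eta_f = np.sqrt((w * d**2).sum(axis=1) / w.sum(axis=1))           # Eq. (25)
        mu = 1.0 / (1.0 + (m * c) ** 3 * (d / eta_f[:, None]) ** 2)       # Eq. (27)
        w_new = mu ** m
        A_new = (w_new @ X) / w_new.sum(axis=1, keepdims=True)            # Eq. (28)
        if np.max(np.linalg.norm(A_new - A, axis=1)) < eps:               # Step 5 stopping rule
            return A_new, mu
        A = A_new
    return A, mu
```

GPCA2, GPCA3 and GPCA4 follow the same loop, replacing η_i^f with η_i and/or the rational membership with the exponential membership of (22).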

5. Computational Experiments. In this section, three real data sets, the Iris data, the car evaluation data and the beer tastes data, are used to illustrate the clustering performance of the four new GPCAs recommended in Section 4; in all of them the real labels of the feature points are known. We implement PCA93k(1), PCA93α(0.3), PCA96k(1), PCA96α(0.3), PCA06, GPCA1, GPCA2, GPCA3, and GPCA4, where PCA93k(1) and PCA96k(1) denote the algorithms using the parameter

$$ \eta_i = K\,\frac{\sum_{j=1}^{n}(\mu_{ij})^m d_{ij}^2}{\sum_{j=1}^{n}(\mu_{ij})^m} \qquad (29) $$

with K = 1, and PCA93α(0.3) and PCA96α(0.3) denote the algorithms using the parameter

$$ \eta_i = \frac{\sum_{j=1}^{n}(\mu_{ij})_\alpha\, d_{ij}^2}{\sum_{j=1}^{n}(\mu_{ij})_\alpha} \qquad (30) $$

with α = 0.3 in the membership update equation (7). Note that the parameters η_i are not re-estimated in each iteration during the clustering process, following the suggestion presented in [10]. Besides, we set the parameter α to 0.3 in both GPCA2 and GPCA4.
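The two parameter choices (29) and (30) can be computed directly from a current membership matrix. The helpers below are a small sketch with our own naming that mirrors both definitions; they assume each cluster has at least one membership at or above the threshold α.

```python
import numpy as np

def eta_k(mu, d, m=2.0, K=1.0):
    """Distance parameter of Equation (29): K-scaled, fuzzily weighted mean squared distance."""
    w = mu ** m
    return K * (w * d**2).sum(axis=1) / w.sum(axis=1)

def eta_alpha(mu, d, alpha=0.3):
    """Distance parameter of Equation (30): mean squared distance over the alpha-cut of each cluster."""
    w = (mu >= alpha).astype(float)          # crisp partition (mu_ij)_alpha of Equation (6)
    return (w * d**2).sum(axis=1) / w.sum(axis=1)
```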

Since the results of the PCAs and the GPCAs heavily depend on the initialization, a reasonably good initialization is required for all the algorithms. In this section, the K-means algorithm is used to obtain the initial cluster centers. For each clustering algorithm, 1000 experiments with initial centers generated by the K-means algorithm are carried out to assess the robustness of all the PCAs and GPCAs to the initial settings. The fuzzifier m has different influences on these algorithms; in this paper, all the algorithms are run with fuzzifier m = 2 in all 1000 experiments for comparison, since it is recommended that this value generates relatively desirable results. In order to compare the results obtained by these clustering algorithms, the clustering results are evaluated by their overall accuracies. A larger overall accuracy means a better clustering result, and a lower variance indicates more robustness to the initialization. The maximal mapping number criterion introduced in [1] is employed to obtain the clustering map of each result; the interested reader may consult [1] for a detailed explanation.
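The overall accuracy of a run requires mapping cluster indices to the true class labels, for which the paper uses the maximal mapping number criterion of [1]. As a rough stand-in for that criterion, the sketch below scores a partition by the best one-to-one mapping between cluster indices and class labels found by brute force over permutations; it is our own approximation, not the exact procedure of [1], and it assumes the number of clusters equals the number of classes.

```python
import numpy as np
from itertools import permutations

def overall_accuracy(true_labels, mu):
    """Assign each point to its highest-membership cluster, then report the best
    accuracy over all one-to-one mappings of cluster indices to class labels."""
    true_labels = np.asarray(true_labels)
    pred = np.argmax(mu, axis=0)                      # crisp assignment from memberships
    classes = np.unique(true_labels)
    best = 0.0
    for perm in permutations(classes):                # brute force is fine for small c
        mapped = np.array([perm[k] for k in pred])
        best = max(best, float(np.mean(mapped == true_labels)))
    return best
```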

5.1. Iris Data. The Iris flower data, or Fisher's Iris data [33, 34], is a multivariate data set introduced by Ronald Fisher in 1936 as an example of discriminant analysis. In this subsection, we use the recommended new GPCAs to cluster the Iris data in comparison with the existing PCAs. The data consist of 50 samples from each of three species of Iris (Iris setosa, Iris virginica and Iris versicolor). Four features were measured in centimetres for each sample: the length and the width of the sepals and petals. Among the three clusters, two have substantial overlap.

The analysis of the overall accuracies of all the clustering results is given in Table 1, where the columns "Highest", "Mean" and "Variance" below "Overall Accuracy φ" provide the highest value, mean and variance of the overall accuracies of the 1000 experiments for each algorithm, respectively. In order to show the accuracy distribution of the 1000 clustering results, the numbers of experiments with overall accuracies in the regions [0.9, 1], [0.8, 0.9) and [0, 0.8) are counted and given in the three columns below "Accuracy Distribution". The last column "Iteration Number" shows the average iteration number of the 1000 experiments for each algorithm.


Table 1. A Comparison of 1000 Experiments for the Iris Data

Clustering        Overall Accuracy φ (%)         Accuracy Distribution         Iteration
Algorithm         Highest   Mean    Variance     [0.9,1]  [0.8,0.9)  [0,0.8)   Number
The Existing PCAs
PCA93k(1)          74.00    69.45      1.87          0        0       1000      20.98
PCA93α(0.3)        70.00    69.29      0.15          0        0       1000      21.96
PCA96k(1)          75.33    68.88      5.61          0        0       1000      25.16
PCA96α(0.3)        69.33    68.30      0.19          0        0       1000      26.87
PCA06              92.00    80.83    324.18        722        0        278      45.57
The New GPCAs
GPCA1              94.00    86.84    181.97        620      239        141      50.70
GPCA2              94.00    86.72    199.64        723      122        155      54.21
GPCA3              94.00    86.14    251.89        803        0        197     251.89
GPCA4              94.00    83.08    320.95        729        0        271      78.49

Table 2. Analysis of 1000 Experiments for the Iris Data by PCA93k(0.02)

Clustering        Overall Accuracy φ (%)         Accuracy Distribution         Iteration
Algorithm         Highest   Mean    Variance     [0.9,1]  [0.8,0.9)  [0,0.8)   Number
PCA93k(0.02)       93.33    88.08    128.76        915        0         85      24.312

As shown in Table 1, all the algorithms except PCA93k(1), PCA93α(0.3), PCA96k(1) and PCA96α(0.3) are efficient for clustering the Iris data with good results. More than 62% of the GPCA experiments provide clustering results with overall accuracies larger than 90%. Furthermore, although the clustering algorithms PCA06, GPCA1, GPCA2, GPCA3, and GPCA4 can provide clustering results with higher overall accuracies than PCA93k(1), PCA93α(0.3), PCA96k(1) and PCA96α(0.3) (the highest even reaches 94.00% for the four new GPCAs), the variances of PCA06 and the GPCAs are much larger. This means that PCA93k(1), PCA93α(0.3), PCA96k(1) and PCA96α(0.3) are more robust to the initialization than the new GPCAs in the experiments on the Iris data. In other words, GPCA1, GPCA2, GPCA3 and GPCA4 are better than PCA93k(1), PCA93α(0.3), PCA96k(1) and PCA96α(0.3) in overall accuracy, but at the cost of robustness.

Remark 5.1. The parameter K in (29) plays an important role in PCA93 and PCA96. It follows from Table 1 that the typical value K = 1 is not appropriate for PCA93 and PCA96 when clustering the Iris data. In fact, we ran the algorithm PCA93k(0.02) (i.e., K = 0.02 in (29)) on the Iris data for 1000 experiments and obtained the results in Table 2, which are much better than those of PCA93k(1). Furthermore, the influence of K on PCA93 and on PCA96 is different. In this paper, we do not discuss the selection of the parameter K for PCA93 and PCA96.


Table 3. A Comparison of 1000 Experiments for the Car Evaluation Data

Clustering        Overall Accuracy φ (%)         Accuracy Distribution         Iteration
Algorithm         Highest   Mean    Variance     [0.8,1]  [0.7,0.8)  [0,0.7)   Number
The Existing PCAs
PCA96k(1)          49.50    49.27      0.07          0        0       1000      42.42
PCA06              89.00    81.25    137.35        696        0        304      55.31
The New GPCAs
GPCA1              93.50    84.44    189.17        720        0        280      80.32
GPCA2              93.50    81.90    217.48        638        0        362      75.30
GPCA3              93.00    89.07     92.20        872        0        128      61.45
GPCA4              93.00    89.09     90.02        875        0        125      60.30

Remark 5.2. Similarly, the parameters λ and α have a great influence on the GPCAs. However, in this paper, we only consider the case λ = 10 and α = 0.3 in the four new GPCAs.

5.2. Car Evaluation Data. The second data set comes from an evaluation of a particular car by different users. The quality of the car is measured by two main criteria: price and technical characteristics. 200 users were asked to rate the same car based on price and technology, and the car evaluation data set was generated in this way. This data set contains 200 points in a 2-dimensional space and is classified into 4 clusters, which represent unacceptable, acceptable, good and very good; the clusters contain 46, 60, 45 and 49 points, respectively, as shown in Figure 3.

    Figure 3. The Car Evaluation Data Set

In this example, we run PCA96k(1), PCA06, GPCA1, GPCA2, GPCA3, and GPCA4 for 1000 experiments with fuzzifier m = 2, and the results are shown in Table 3.

It follows from Table 3 that the new GPCAs are superior to PCA96k(1), since PCA96k(1) does not appear to be effective on this data set. It can also be seen that GPCA1 and GPCA2 perform comparably to PCA06 in terms of both average accuracy and variance. In addition, GPCA3 and GPCA4 are clearly superior to PCA06, with a higher average accuracy. Furthermore, in terms of the variance, Table 3 shows that GPCA3 and GPCA4 give relatively more stable results, which is a significant factor in practical applications. From this example, we can see that the new GPCAs provide results with relatively better average accuracy as well as more robustness to initialization (the variance is relatively lower).


Table 4. A Comparison of 1000 Experiments for the Beer Tastes Data

Clustering        Overall Accuracy φ (%)         Accuracy Distribution         Iteration
Algorithm         Highest   Mean    Variance     [0.8,1]  [0.7,0.8)  [0,0.7)   Number
The Existing PCAs
PCA06              77.33    75.65      8.62          0      877        123      90.88
The New GPCAs
GPCA1              84.67    78.48     39.50        648      258         94      19.69
GPCA2              84.67    80.50     30.61        804      139         57      20.62
GPCA3              85.00    81.36     21.468       830      127         43       8.395
GPCA4              85.00    81.50     29.39        852       55         93       8.279
K-Means
K-Means            86.00    80.61     43.82        777       82        141      13.421


5.3. Beer Tastes Data. There are various brands of beer in the market, and different kinds of beer have different tastes. In this subsection, we choose six top brands of beer and obtain a data set with two attributes, representing bitterness and alcohol concentration. In this way, a beer tastes data set is generated with 300 points and 6 clusters in a 2-dimensional space; the clusters contain 91, 66, 41, 60, 13 and 29 points, respectively, as shown in Figure 4.

    Figure 4. The Beer Tastes Data Set

In this example, we run the K-means algorithm, PCA06, GPCA1, GPCA2, GPCA3, and GPCA4 for 1000 experiments with fuzzifier m = 2, and Table 4 presents all the clustering results.

It follows from Table 4 that more than 64% of the GPCA experiments provide clustering results with overall accuracies larger than 80%. Comparing PCA06 with the new GPCAs, we can see that although the generalized possibilistic clustering algorithms can provide clustering results with higher overall accuracies than PCA06 (the highest even reaches 85.00% for GPCA3 and GPCA4), the variances of all the GPCAs, including GPCA1, GPCA2, GPCA3 and GPCA4, are much larger than that of PCA06. This means that PCA06 is more robust to the initialization than the GPCAs in the experiments on the beer tastes data. In addition, from the perspective of average accuracy, it is obvious that the accuracy of the new GPCAs exceeds that of PCA06. When comparing the new GPCAs with the K-means algorithm, all of them show similar efficiency in terms of both average accuracy and variance, which are almost at the same level.

5.4. Summary. Finally, we summarize the analysis of all the numerical experiments on the possibilistic clustering algorithms and the generalized possibilistic clustering algorithms in Tables 1-4 by emphasizing the following points.
a) PCA93 and PCA96 can provide good clustering results with an appropriate parameter K. However, the typical selection K = 1 is not appropriate for all data sets, and the influence of K on PCA93 and on PCA96 is different. Hence, in order to perform the clustering efficiently, the selection of the parameter K must be discussed.
b) PCA06 is a good clustering algorithm among the existing PCAs because it is easy to control for diverse real applications.
c) The four new algorithms GPCA1, GPCA2, GPCA3 and GPCA4 are also efficient for clustering all the data sets in this paper when λ = 10 and α = 0.3.
d) Compared with the existing PCAs in terms of convergence speed (i.e., the iteration numbers shown in the last columns of Tables 1-4), the four new GPCAs converge much faster while giving better clustering results.

Remark 5.3. The conclusions drawn above are based on the clustering results of the three real data set experiments. In order to extend these conclusions, theoretical analysis or experiments on more data sets are necessary.

6. Conclusion. Fuzzy clustering is an approach that uses the fuzzy set theory as a tool for data grouping. In the literature, many fuzzy clustering algorithms based on objective functions have been proposed, including PCA93, PCA96 and PCA06. However, finding an appropriate objective function with good clustering performance is not easy, and it is also not necessary from a practical application point of view. Hence, alternating cluster estimation dispenses with the objective function and allows users to select membership and prototype functions directly within an alternating iteration architecture. In [1], an application of alternating cluster estimation to possibilistic clustering was initiated, where the membership function of a cluster can be any monotone decreasing function satisfying (13). The membership functions should be predetermined by the decision-maker in practice, and different functions lead to clustering results with different accuracies. However, how to provide the membership functions was not discussed in [1], and a further discussion of the membership functions is obviously necessary.

As a continuation of [1], this paper presents a method for determining an appropriate membership function for each cluster in the clustering process. By combining the clustering process with the fuzzy set theory, a methodology for determining a proper membership function for each cluster is demonstrated, and four specific functions are then recommended for real applications. By introducing these new functions into the membership evaluation equations of the generalized possibilistic clustering algorithms in [1], four new GPCAs are obtained, which have been illustrated with three real data experiments. The comparison with the existing PCAs shows that the newly proposed algorithms are efficient for clustering these real data sets and are easy to apply.

Stability of algorithms is necessary for real applications. From the results of the numerical experiments in Section 5, we see that neither the PCAs nor the GPCAs are very stable when clustering all the data sets. As future work in this area, it is necessary to explore why the GPCAs are unstable and how to improve their stability through theoretical analysis and more real data experiments.

Acknowledgments. This work was supported in part by grants from the Innovation Program of Shanghai Municipal Education Commission (No. 13ZS065), the National Social Science Foundation of China (No. 13CGL057), and the Ministry of Education Funded Project for Humanities and Social Sciences Research (No. 12JDXF005).

    REFERENCES

[1] J. Zhou and C. C. Hung, A generalized approach to possibilistic clustering algorithms, International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, vol.15, supp.2, pp.110-132, 2007.
[2] A. Baraldi and P. Blonda, A survey of fuzzy clustering algorithms for pattern recognition - part I, IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics, vol.29, no.6, pp.778-785, 1999.
[3] A. Baraldi and P. Blonda, A survey of fuzzy clustering algorithms for pattern recognition - part II, IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics, vol.29, no.6, pp.786-801, 1999.
[4] D. Li and C. Zhong, Segmentation of images with damaged blocks based on fuzzy clustering, ICIC Express Letters, vol.6, no.10, pp.2679-2684, 2012.
[5] P. Wen, J. Zhou and L. Zheng, Hybrid methods of spatial credibilistic clustering and particle swarm optimization in high noise image segmentation, International Journal of Fuzzy Systems, vol.10, no.3, pp.174-184, 2008.
[6] J. C. Bezdek, Pattern Recognition with Fuzzy Objective Function Algorithms, Plenum Press, New York, 1981.
[7] Z. Chen, W. Hong and C. Wang, Fuzzy clustering algorithm of kernel for gene expression data analysis, ICIC Express Letters, vol.3, no.4, pp.1435-1440, 2009.
[8] D. Li, H. Gu and L. Zhang, A hybrid genetic algorithm-fuzzy c-means approach for incomplete data clustering based on nearest-neighbor intervals, Soft Computing, vol.17, no.10, pp.1787-1796, 2013.
[9] S. Miyamoto, H. Ichihashi and K. Honda, Algorithms for Fuzzy Clustering, Springer-Verlag, Berlin, 2008.
[10] R. Krishnapuram and J. M. Keller, A possibilistic approach to clustering, IEEE Transactions on Fuzzy Systems, vol.1, no.2, pp.98-110, 1993.
[11] R. Krishnapuram and J. M. Keller, The possibilistic c-means algorithm: Insights and recommendations, IEEE Transactions on Fuzzy Systems, vol.4, no.3, pp.385-393, 1996.
[12] F. Höppner and F. Klawonn, A contribution to convergence theory of fuzzy c-means and derivatives, IEEE Transactions on Fuzzy Systems, vol.11, no.5, pp.682-694, 2003.
[13] J. Zhou, L. Cao and N. Yang, On the convergence of some possibilistic clustering algorithms, Fuzzy Optimization and Decision Making, vol.12, no.4, pp.415-432, 2013.
[14] R. Krishnapuram, H. Frigui and O. Nasraoui, Fuzzy and possibilistic shell clustering algorithms and their application to boundary detection and surface approximation - Part I, IEEE Transactions on Fuzzy Systems, vol.3, no.1, pp.29-43, 1995.
[15] R. Krishnapuram, H. Frigui and O. Nasraoui, Fuzzy and possibilistic shell clustering algorithms and their application to boundary detection and surface approximation - Part II, IEEE Transactions on Fuzzy Systems, vol.3, no.1, pp.44-60, 1995.
[16] M. S. Yang and K. L. Wu, Unsupervised possibilistic clustering, Pattern Recognition, vol.39, no.1, pp.5-21, 2006.
[17] F. C. H. Rhee, K. S. Choi and B. I. Choi, Kernel approach to possibilistic C-means clustering, International Journal of Intelligent Systems, vol.24, no.3, pp.272-292, 2009.
[18] D. T. Anderson, J. C. Bezdek, M. Popescu and J. M. Keller, Comparing fuzzy, probabilistic, and possibilistic partitions, IEEE Transactions on Fuzzy Systems, vol.18, no.5, pp.906-918, 2010.
[19] J. Z. C. Lai, E. Y. T. Juan and F. J. C. Lai, Rough clustering using generalized fuzzy clustering algorithm, Pattern Recognition, vol.46, no.9, pp.2538-2547, 2013.
[20] K. Treerattanapitak and C. Jaruskulchai, Possibilistic exponential fuzzy clustering, Journal of Computer Science and Technology, vol.28, no.2, pp.311-321, 2013.
[21] X. Li, H. S. Wong and S. Wu, A fuzzy minimax clustering model and its applications, Information Sciences, vol.186, no.1, pp.114-125, 2012.
[22] P. Wen, J. Zhou and L. Zheng, A modified hybrid method of spatial credibilistic clustering and particle swarm optimization, Soft Computing, vol.15, no.5, pp.855-865, 2011.
[23] Z. Xie, S. Wang and F. L. Chung, An enhanced possibilistic C-Means clustering algorithm EPCM, Soft Computing, vol.12, no.6, pp.593-611, 2008.
[24] M. S. Chen and S. W. Wang, Fuzzy clustering analysis for optimizing fuzzy membership functions, Fuzzy Sets and Systems, vol.103, no.2, pp.239-254, 1999.
[25] S. Miyamoto, Different objective functions in fuzzy c-means algorithms and kernel-based clustering, International Journal of Fuzzy Systems, vol.13, no.2, pp.89-97, 2011.
[26] A. Rodriguez, M. S. Tomas and J. Rubio-Martinez, A benchmark calculation for the fuzzy c-means clustering algorithm: Initial memberships, Journal of Mathematical Chemistry, vol.50, no.10, pp.2703-2715, 2012.
[27] F. Barcelo-Rico, J. L. Díez and J. Bondia, New possibilistic method for discovering linear local behavior using hyper-Gaussian distributed membership function, Knowledge and Information Systems, vol.30, pp.377-403, 2012.
[28] T. A. Runkler and J. C. Bezdek, Alternating cluster estimation: A new tool for clustering and function approximation, IEEE Transactions on Fuzzy Systems, vol.7, no.4, pp.377-393, 1999.
[29] T. A. Runkler and J. C. Bezdek, Function approximation with polynomial membership functions and alternating cluster estimation, Fuzzy Sets and Systems, vol.101, no.2, pp.207-218, 1999.
[30] J. Zhou, C. C. Hung, J. He and Y. Luo, Normalized possibilistic clustering algorithms, Proceedings of the Sixth International Conference on Information and Management Sciences, Lhasa, China, July 1-6, 2007, pp.397-403.
[31] J. Zhou, C. C. Hung, X. Wang and S. Chen, Fuzzy clustering based on credibility measure, Proceedings of the Sixth International Conference on Information and Management Sciences, Lhasa, China, July 1-6, 2007, pp.404-411.
[32] L. A. Zadeh, Fuzzy sets, Information and Control, vol.8, no.3, pp.338-353, 1965.
[33] E. Anderson, The irises of the Gaspé Peninsula, Bulletin of the American Iris Society, vol.59, pp.2-5, 1935.
[34] J. C. Bezdek, J. M. Keller, R. Krishnapuram, L. I. Kuncheva and N. R. Pal, Will the real Iris data please stand up?, IEEE Transactions on Fuzzy Systems, vol.7, no.3, pp.368-369, 1999.