
Appl. Math. J. Chinese Univ. 2014, 29(1): 1-17

Hierarchical hesitant fuzzy K-means clustering algorithm

CHEN Na1,2 XU Ze-shui1,3,∗ XIA Mei-mei4

Abstract. Due to the limitation of and hesitation in one's knowledge, the membership degree of an element to a given set often takes several different values, a situation that conventional fuzzy sets cannot handle. Hesitant fuzzy sets are a powerful tool for treating this case. The present paper investigates a clustering technique for hesitant fuzzy sets based on the K-means clustering algorithm, taking the results of hierarchical clustering as the initial clusters. Finally, two examples demonstrate the validity of our algorithm.

Received: 2012-07-21. MR Subject Classification: 90B50, 68T10, 62H30.
Keywords: hesitant fuzzy set, hierarchical clustering, K-means clustering, intuitionistic fuzzy set.
Digital Object Identifier (DOI): 10.1007/s11766-014-3091-8.
Supported by the National Natural Science Foundation of China (61273209).
∗ Corresponding author.

§1 Introduction

Clustering analysis, as one of the key tools widely adopted for handling statistical data, has been applied in fields such as pattern recognition [2,22], data mining [9,24], information retrieval [17,28], and so on [3,10,23]. Clustering refers to a process that groups a set of objects into clusters according to the characteristics of the data, so that objects within a cluster are more similar to one another than to objects in different clusters. It can be employed to deal with various types of data, such as numerical information, interval-valued information and fuzzy information [12,15,31,32]. Since several extensions of fuzzy set theory [39] were proposed [1,7,18,36], a lot of fuzzy clustering algorithms have been developed, e.g., intuitionistic fuzzy clustering [5,31-33,40], type-2 fuzzy clustering [11,37], and clustering based on fuzzy multisets [17,29]. In [29],

Torra and Miyamoto focused on clustering techniques and, more specifically, on the use of fuzzy c-means. They have also studied the consistency of applying fuzzy c-means and fuzzy c-means with weights to both fuzzy sets and fuzzy multisets. Torra and Narukawa recently introduced the concept of hesitant fuzzy sets (HFSs) [26,27], which generalizes that of fuzzy sets. HFSs permit the membership degree of an element to a given set to have a few different values, which is very useful in describing human hesitance. Thus, it is necessary to carry out research related

to HFSs. In Refs.[30,35], the hesitant fuzzy aggregation operators and distance measures have

been developed. Correlation coefficients for HFSs [4] have been recently developed and applied


to clustering issues. In the present work we propose a new clustering algorithm for HFSs based

on a combination of hierarchical and K-means methods.

In the last decade, research on clustering methods [6,8,16,21,25,38] has attracted considerable interest due to its increasing application to various types of problems. The principal clustering approaches include hierarchical algorithms, partitioning algorithms, density-based algorithms, etc. Hierarchical clustering, a crucial method, is either agglomerative or divisive. It consists of a sequence of iterative steps that partition the data at different layers. The layers are constructed by using merge-and-split techniques. Once a group of objects is merged or split, the next step operates on the newly generated cluster [6]. The agglomerative method is the more commonly used one [3]. In contrast to the hierarchical approach, the partitioning method requires the number of clusters to be specified in advance.

K-means is a representative of the latter. The K-means algorithm proceeds as follows [20]: first, select an initial partition of the database into K clusters and calculate the centroid of each cluster, or select the initial seeds randomly; then compare every object with each centroid by means of the distance measure and assign it to the closest cluster. These steps are repeated until the changes in the cluster centers from one stage to the next are close to zero or smaller than a pre-specified value. It has been shown [13,14,20] that the K-means algorithm is sensitive to its initialization, i.e., to the initial clusters or initial seeds.
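As a minimal illustration of the loop just described (and only of the generic numeric case; the hesitant fuzzy variant is sketched in §4), the following Python fragment assigns each object to its nearest centroid and recomputes the centroids until they change by less than a tolerance. All names are ours and the distance is plain Euclidean.

```python
import math

def kmeans(points, centroids, tol=1e-9, max_iter=100):
    """Generic K-means on numeric vectors; `centroids` holds the initial seeds."""
    for _ in range(max_iter):
        # Assignment step: attach every point to its closest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            j = min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            clusters[j].append(p)
        # Update step: recompute each centroid as the mean of its cluster.
        new_centroids = [[sum(col) / len(c) for col in zip(*c)] if c else centroids[j]
                         for j, c in enumerate(clusters)]
        # Stop when the centroids have (almost) stopped moving.
        if all(math.dist(a, b) < tol for a, b in zip(centroids, new_centroids)):
            break
        centroids = new_centroids
    return clusters, centroids
```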

The formulation of K-means clustering algorithms usually assumes that appropriate initial clusters can be found in advance. However, there is no universal method to determine the initial clusters or the initial cluster centers [20]. Here we utilize the results that hierarchical clustering provides as the initial clusters for the K-means algorithm. Such utilization greatly reduces the number of iterations caused by selecting the initial seeds randomly, as in the original K-means algorithm; hence, the hybrid method can significantly accelerate the clustering process.

As mentioned previously, HFSs can incorporate all possible opinions of all members and thus provide an intuitive description of the differences among group members. For example, suppose three experts discuss the membership degree of an element x to a set A and assign it the values 0.2, 0.4 and 0.6, respectively, and cannot reach agreement with one another. In such a circumstance, the degree can be represented by {0.2, 0.4, 0.6}. This is different from representing it by the fuzzy number 0.2 (or 0.4 or 0.6), by the interval-valued fuzzy number [0.2, 0.6], or by the intuitionistic fuzzy number (0.2, 0.4); such data are called hesitant fuzzy type data here.

To date, studies on handling hesitant fuzzy type data are still scarce. The present paper is devoted to clustering this new type of data by combining the initial clusters provided by the hierarchical method with the K-means approach.


§2 Preliminaries

This section reviews some basic concepts concerning HFSs and intuitionistic fuzzy sets (IFSs)

[1], which will be used in the present work.

2.1 Hesitant fuzzy set

In real life, it is often difficult for people to reach agreement, owing to their hesitation. For instance, two experts discuss the membership of x to A; one wants to assign 0.3 and the other 0.6. To describe the case in which the membership of an element to a set originates from a doubt among a few different values, Torra [26] and Torra and Narukawa [27] suggested the concept of the hesitant fuzzy set (HFS), which permits the membership degree of an element to a set to be presented as several possible values between 0 and 1.

Definition 1 ([26,27]). Let X be a reference set; a hesitant fuzzy set on X is defined in terms of a function h(x) that, when applied to X, returns a subset of [0,1].

For ease of understanding, Xia and Xu [30] expressed an HFS by the mathematical symbol

A = {< x, hA(x) > |x ∈ X}, (1)

where hA(x) is a set of some different values in [0,1], representing the possible membership

degree of the element x ∈ X to the set A. For convenience, hA(x) is called a hesitant fuzzy

element (HFE), a basic unit of HFS [30].

Example 1. Let X = {x1, x2, x3} be a reference set, hA(x1) = {0.2, 0.4, 0.5}, hA(x2) =

{0.3, 0.4} and hA(x3) = {0.3, 0.2, 0.5, 0.6} be the membership of xi(i = 1, 2, 3) to a set A,

respectively. Then A can be considered as a hesitant fuzzy set represented as:

A = {< x1, {0.2, 0.4, 0.5} >,< x2, {0.3, 0.4} >,< x3, {0.3, 0.2, 0.5, 0.6}>}.
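In code, an HFS over a finite reference set can be stored simply as a mapping from each element of X to its HFE, i.e., to the set of its possible membership degrees. A minimal sketch (Python; the variable names are ours) for the set A of Example 1:

```python
# Hesitant fuzzy set A of Example 1: each element of X maps to its HFE,
# a set of possible membership degrees in [0, 1].
A = {
    "x1": {0.2, 0.4, 0.5},
    "x2": {0.3, 0.4},
    "x3": {0.3, 0.2, 0.5, 0.6},
}
```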

Definition 2 ([26,27]). Given a HFE h, its lower and upper bounds are defined below:

Lower bound: h−(x) = min h(x);

Upper bound: h+(x) = max h(x).

Definition 3 ([26,27]). Given a HFE h, we define Aenv(h) as the envelope of h which is

represented by (h−, 1− h+) , with h− and h+ being its lower and upper bounds.
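Per Definitions 2 and 3, the envelope of an HFE is determined by its extreme values alone; the short sketch below (Python; the function name is ours) returns the pair (h−, 1 − h+), which Section 2.2 identifies with an intuitionistic fuzzy value.

```python
def envelope(hfe):
    """Envelope A_env(h) = (h-, 1 - h+) of a hesitant fuzzy element."""
    lower, upper = min(hfe), max(hfe)   # h- and h+
    return (lower, 1 - upper)           # read as an intuitionistic fuzzy value (mu, nu)

# Example: the HFE hA(x1) = {0.2, 0.4, 0.5} of Example 1 has envelope (0.2, 0.5).
print(envelope({0.2, 0.4, 0.5}))        # -> (0.2, 0.5)
```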

2.2 Intuitionistic fuzzy set

In Ref.[1], Atanassov extended Zadeh’s fuzzy set to a general form, called intuitionistic fuzzy

set (IFS).

Definition 4 ([1]). Let X be a fixed set, an intuitionistic fuzzy set (IFS) A on X is defined as

follows:

A = {< x, μA(x), νA(x) > |x ∈ X}, (2)


where the functions μA(x) and νA(x) denote the degrees of membership and non-membership

of the element x ∈ X to the set A, respectively, with the condition:

0 ≤ μA(x) ≤ 1, 0 ≤ νA(x) ≤ 1, 0 ≤ μA(x) + νA(x) ≤ 1 (3)

and πA(x) = 1−μA(x)− νA(x) is usually called the degree of indeterminacy of x to A. Xu [35]

named α = (μα, να) an intuitionistic fuzzy value (IFV).

As can be seen from Definitions 3 and 4, the envelope of an HFE is just an IFV [26,27].

§3 Distance measures and aggregation operators

Let X = {x1, x2, . . . xn} be a discrete universe of discourse. Considering that the elements

xi(i = 1, 2, . . . , n) in the universe X may have different importance, let w = (w1, w2, . . . , wn)T

be the weight vector of xi (i = 1, 2, . . . , n), with wi ≥ 0, i = 1, 2, . . . , n, and $\sum_{i=1}^{n} w_i = 1$. For

two HFSs M = {< xi, hM (xi) > |xi ∈ X} and N = {< xi, hN (xi) > |xi ∈ X}, the distance

measure between M and N , denoted as d(M,N), should satisfy the following properties:

1) 0 ≤ d(M,N) ≤ 1;

2) d(M,N) = 0 if and only if M = N ;

3) d(M,N) = d(N,M).

It is noted that the number of values in different HFEs may be different, and thus, the

traditional distance measures cannot be used.

Let l(h(xi)) be the number of values in the HFE h(xi). We arrange the elements of h(xi) in decreasing order, and let h^{σ(j)}(xi) (j = 1, 2, · · · , l(h(xi))) be the jth largest value in h(xi). To calculate the distance between two HFSs, let lxi = max{l(hM(xi)), l(hN(xi))} for each xi in X. To compare the HFEs correctly, we adopt the following regulation:

When two HFEs have different numbers of elements, we can make them comparable by adding elements to the HFE that has fewer elements. Specifically, if l(hM(xi)) ≤ l(hN(xi)), then hM(xi) is extended by repeating its minimum value until it has the same length as hN(xi); if l(hM(xi)) ≥ l(hN(xi)), then hN(xi) is extended by repeating its minimum value until it has the same length as hM(xi).

In principle, we can extend the short set by adding any value belonging to it. The selection

of this value mainly depends on the decision makers’ risk preferences. Optimists anticipate

desirable outcomes and may add the maximum value, while pessimists expect unfavorable

outcomes and may add the minimum value. Although the results may be different if we extend

the short set by adding different values that are contained in the set, they are reasonable because

the decision makers’ risk preferences can directly influence the final decision [34]. In our present

paper, we assume that the decision makers are all pessimistic, so we extend the short set by

adding the minimum value.

It may be mentioned that if the short set contains zero, then we will extend it by adding the

minimum value, i.e., zero, to the set. That is, our principle is that only those values that belong

to the short set will be chosen and added to the set. Such processing can keep more original

information contained in the short set, and it yields less deviation from the real problem as


compared to the case by adding a value that is not present in the original short set.
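Under this pessimistic regulation, aligning two HFEs for comparison amounts to sorting both in decreasing order and padding the shorter one with its own minimum value. A small sketch (Python; the helper name is ours) that prepares two HFEs for the value-by-value comparison used in the distance measures below:

```python
def align(h_m, h_n):
    """Sort two HFEs in decreasing order; pad the shorter one with its own minimum value."""
    a, b = sorted(h_m, reverse=True), sorted(h_n, reverse=True)
    l = max(len(a), len(b))                  # l_{x_i}
    a += [min(a)] * (l - len(a))             # pessimistic extension of the shorter HFE
    b += [min(b)] * (l - len(b))
    return a, b

# Example: align({0.5, 0.6}, {0.45, 0.5, 0.65}) -> ([0.6, 0.5, 0.5], [0.65, 0.5, 0.45])
```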

Based on the well-known Hamming distance and Euclidean distance, the hesitant weighted

Hamming distance and the hesitant weighted Euclidean distance [34] can be defined as follows:

$$d_{hwh}(M,N)=\sum_{i=1}^{n} w_i \left[ \frac{1}{l_{x_i}} \sum_{j=1}^{l_{x_i}} \left| h_M^{\sigma(j)}(x_i) - h_N^{\sigma(j)}(x_i) \right| \right], \tag{4}$$

$$d_{hwe}(M,N)=\left\{ \sum_{i=1}^{n} w_i \left[ \frac{1}{l_{x_i}} \sum_{j=1}^{l_{x_i}} \left| h_M^{\sigma(j)}(x_i) - h_N^{\sigma(j)}(x_i) \right|^2 \right] \right\}^{1/2}. \tag{5}$$

Especially, if w = (1/n, 1/n, · · · , 1/n)T, then Eqs.(4) and (5) are reduced to the hesitant normalized Hamming distance and the hesitant normalized Euclidean distance as follows:

$$d_{hnh}(M,N)=\frac{1}{n}\sum_{i=1}^{n} \left[ \frac{1}{l_{x_i}} \sum_{j=1}^{l_{x_i}} \left| h_M^{\sigma(j)}(x_i) - h_N^{\sigma(j)}(x_i) \right| \right], \tag{6}$$

$$d_{hne}(M,N)=\left\{ \frac{1}{n}\sum_{i=1}^{n} \left[ \frac{1}{l_{x_i}} \sum_{j=1}^{l_{x_i}} \left| h_M^{\sigma(j)}(x_i) - h_N^{\sigma(j)}(x_i) \right|^2 \right] \right\}^{1/2}, \tag{7}$$

where $h_M^{\sigma(j)}(x_i)$ and $h_N^{\sigma(j)}(x_i)$ are the jth largest values in hM(xi) and hN(xi), respectively.
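A direct transcription of Eqs.(4) and (5) is sketched below (Python; all names are ours). Each HFS is assumed to be a dict mapping the elements of X, listed in the order of the weight vector w, to their HFEs; HFEs are aligned with the pessimistic padding described above. With w = (1/n, · · · , 1/n)^T these functions give the normalized distances of Eqs.(6) and (7).

```python
def _align(h_m, h_n):
    """Decreasingly sorted HFEs, the shorter one padded with its own minimum value."""
    a, b = sorted(h_m, reverse=True), sorted(h_n, reverse=True)
    l = max(len(a), len(b))
    return a + [min(a)] * (l - len(a)), b + [min(b)] * (l - len(b))

def d_hwh(M, N, w):
    """Hesitant weighted Hamming distance, Eq.(4)."""
    total = 0.0
    for i, x in enumerate(M):                # keys of M follow the order of w
        a, b = _align(M[x], N[x])
        total += w[i] * sum(abs(p - q) for p, q in zip(a, b)) / len(a)
    return total

def d_hwe(M, N, w):
    """Hesitant weighted Euclidean distance, Eq.(5)."""
    total = 0.0
    for i, x in enumerate(M):
        a, b = _align(M[x], N[x])
        total += w[i] * sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)
    return total ** 0.5
```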

Torra and Narukawa [26,27] proposed an aggregation principle for HFSs:

Definition 5 ([26,27]). Let E = {h1, h2, · · · , hn} be a set of n HFSs and let Θ : [0,1]^N → [0,1] be a function on E; then

$$\Theta_E = \bigcup_{\gamma \in h_1 \times h_2 \times \cdots \times h_n} \{\Theta(\gamma)\}. \tag{8}$$

Xia and Xu [30] gave some operational rules of two HFSs M and N :

1) $M \oplus N = \left\{\left\langle x_i, \bigcup_{r_1 \in h_M(x_i),\, r_2 \in h_N(x_i)} \{r_1 + r_2 - r_1 r_2\} \right\rangle \mid x_i \in X\right\}$;

2) $\lambda M = \left\{\left\langle x_i, \bigcup_{r \in h_M(x_i)} \{1 - (1-r)^{\lambda}\} \right\rangle \mid x_i \in X\right\}$;

3) $M \otimes N = \left\{\left\langle x_i, \bigcup_{r_1 \in h_M(x_i),\, r_2 \in h_N(x_i)} \{r_1 r_2\} \right\rangle \mid x_i \in X\right\}$;

4) $M^{\lambda} = \left\{\left\langle x_i, \bigcup_{r \in h_M(x_i)} \{r^{\lambda}\} \right\rangle \mid x_i \in X\right\}$.

Let Aj = {< xi, hAj (xi) > |xi ∈ X}(j = 1, 2, · · · ,m) be a collection of HFSs, then their

average can be given as:

$$f(A_1, A_2, \cdots, A_m) = \frac{1}{m}\,(A_1 \oplus A_2 \oplus \cdots \oplus A_m), \tag{9}$$


which can be further transformed into Eq.(10) based on the operations for HFSs:

$$f(A_1, A_2, \cdots, A_m) = \left\{\left\langle x_i, \bigcup_{r_1 \in h_{A_1}(x_i),\, r_2 \in h_{A_2}(x_i),\, \cdots,\, r_m \in h_{A_m}(x_i)} \left\{1 - \prod_{j=1}^{m} (1 - r_j)^{1/m}\right\}\right\rangle \;\Big|\; x_i \in X\right\}. \tag{10}$$

Below we will prove Eq. (10) by making use of the mathematical induction on m.

Proof. (1) When m = 2, using the operational rules given above, we have

$$\begin{aligned}
A_1 \oplus A_2 &= \left\{\left\langle x_i, \bigcup_{r_1 \in h_{A_1}(x_i),\, r_2 \in h_{A_2}(x_i)} \{r_1 + r_2 - r_1 r_2\}\right\rangle \mid x_i \in X\right\}\\
&= \left\{\left\langle x_i, \bigcup_{r_1 \in h_{A_1}(x_i),\, r_2 \in h_{A_2}(x_i)} \{1 - (1-r_1)(1-r_2)\}\right\rangle \mid x_i \in X\right\}\\
&= \left\{\left\langle x_i, \bigcup_{r_1 \in h_{A_1}(x_i),\, r_2 \in h_{A_2}(x_i)} \left\{1 - \prod_{j=1}^{2}(1-r_j)\right\}\right\rangle \mid x_i \in X\right\}.
\end{aligned} \tag{11}$$

Further, with Eq.(9) and the operational rules we can obtain

$$\begin{aligned}
f(A_1, A_2) &= \frac{1}{2}(A_1 \oplus A_2)\\
&= \frac{1}{2}\left\{\left\langle x_i, \bigcup_{r_1 \in h_{A_1}(x_i),\, r_2 \in h_{A_2}(x_i)} \{1 - (1-r_1)(1-r_2)\}\right\rangle \mid x_i \in X\right\}\\
&= \left\{\left\langle x_i, \bigcup_{r_1 \in h_{A_1}(x_i),\, r_2 \in h_{A_2}(x_i)} \left\{1 - \big[1 - \big(1 - (1-r_1)(1-r_2)\big)\big]^{1/2}\right\}\right\rangle \mid x_i \in X\right\}\\
&= \left\{\left\langle x_i, \bigcup_{r_1 \in h_{A_1}(x_i),\, r_2 \in h_{A_2}(x_i)} \left\{1 - \big[(1-r_1)(1-r_2)\big]^{1/2}\right\}\right\rangle \mid x_i \in X\right\}\\
&= \left\{\left\langle x_i, \bigcup_{r_1 \in h_{A_1}(x_i),\, r_2 \in h_{A_2}(x_i)} \left\{1 - \prod_{j=1}^{2}(1-r_j)^{1/2}\right\}\right\rangle \mid x_i \in X\right\}.
\end{aligned}$$

(2) Assume that Eq. (10) holds for m = k, namely

$$f(A_1, A_2, \cdots, A_k) = \frac{1}{k}(A_1 \oplus A_2 \oplus \cdots \oplus A_k)
= \left\{\left\langle x_i, \bigcup_{r_1 \in h_{A_1}(x_i),\, \cdots,\, r_k \in h_{A_k}(x_i)} \left\{1 - \prod_{j=1}^{k}(1-r_j)^{1/k}\right\}\right\rangle \mid x_i \in X\right\}. \tag{12}$$

(3) When m = k + 1, following Eq.(9) we have

$$f(A_1, A_2, \cdots, A_k, A_{k+1}) = \frac{1}{k+1}(A_1 \oplus A_2 \oplus \cdots \oplus A_k \oplus A_{k+1})
= \frac{1}{k+1}\big(k f(A_1, A_2, \cdots, A_k) \oplus A_{k+1}\big).$$

With the help of Eq.(12), we further obtain

$$\begin{aligned}
f(A_1, \cdots, A_{k+1}) &= \frac{1}{k+1}\left\{ k\left\{\left\langle x_i, \bigcup_{r_1 \in h_{A_1}(x_i),\, \cdots,\, r_k \in h_{A_k}(x_i)} \left\{1 - \prod_{j=1}^{k}(1-r_j)^{1/k}\right\}\right\rangle \mid x_i \in X\right\} \oplus A_{k+1}\right\}\\
&= \frac{1}{k+1}\left\{\left\{\left\langle x_i, \bigcup_{r_1 \in h_{A_1}(x_i),\, \cdots,\, r_k \in h_{A_k}(x_i)} \left\{1 - \prod_{j=1}^{k}(1-r_j)\right\}\right\rangle \mid x_i \in X\right\} \oplus A_{k+1}\right\}\\
&= \frac{1}{k+1}\left\{\left\langle x_i, \bigcup_{r_1 \in h_{A_1}(x_i),\, \cdots,\, r_{k+1} \in h_{A_{k+1}}(x_i)} \left\{1 - \prod_{j=1}^{k}(1-r_j)\,(1-r_{k+1})\right\}\right\rangle \mid x_i \in X\right\}\\
&= \frac{1}{k+1}\left\{\left\langle x_i, \bigcup_{r_1 \in h_{A_1}(x_i),\, \cdots,\, r_{k+1} \in h_{A_{k+1}}(x_i)} \left\{1 - \prod_{j=1}^{k+1}(1-r_j)\right\}\right\rangle \mid x_i \in X\right\}\\
&= \left\{\left\langle x_i, \bigcup_{r_1 \in h_{A_1}(x_i),\, \cdots,\, r_{k+1} \in h_{A_{k+1}}(x_i)} \left\{1 - \Big[1 - \Big(1 - \prod_{j=1}^{k+1}(1-r_j)\Big)\Big]^{1/(k+1)}\right\}\right\rangle \mid x_i \in X\right\}\\
&= \left\{\left\langle x_i, \bigcup_{r_1 \in h_{A_1}(x_i),\, \cdots,\, r_{k+1} \in h_{A_{k+1}}(x_i)} \left\{1 - \prod_{j=1}^{k+1}(1-r_j)^{1/(k+1)}\right\}\right\rangle \mid x_i \in X\right\},
\end{aligned}$$

i.e., Eq. (10) holds for m = k + 1.

This completes the proof of Eq. (10).
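Eq.(10) can be evaluated directly by enumerating every combination of one value from each HFE. A minimal sketch (Python; the names are ours) for the average of m HFSs given as dicts over the same reference set; rounding to four decimals mirrors how the centroid values are reported in the examples of §5 (for H4 and H5 of Table 1, for instance, it yields {0.6536, 0.4343, 0.3675} at x5).

```python
from itertools import product
import math

def hfs_average(*hfss):
    """Average of m HFSs per Eq.(10): at each x, collect
    1 - prod_{j=1}^{m} (1 - r_j)^(1/m) over every combination (r_1, ..., r_m)
    with r_j drawn from the j-th HFE at x."""
    m = len(hfss)
    result = {}
    for x in hfss[0]:                        # assumes all HFSs share the same reference set
        values = set()
        for combo in product(*(h[x] for h in hfss)):
            values.add(round(1 - math.prod((1 - r) ** (1.0 / m) for r in combo), 4))
        result[x] = values
    return result
```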

§4 Hierarchical hesitant fuzzy K-means clustering algorithm

The K-means algorithm is the most widely used iterative partitional clustering algorithm [20]. However, when using the algorithm, one must first specify the number of clusters. Then an initial partition of the data is provided and the centroids of these initial clusters are calculated in advance. In the original K-means method, the initial clusters are usually selected randomly and optimized by an iterative process. Obviously, such a random procedure may require a large number of iterations to arrive at the final solution.

To fix this problem, many initialization methods [13,14,20] for the K-means algorithm have been proposed. In particular, the hierarchical structure is a very useful and widely adopted technique in information processing [14]. Using the output of hierarchical agglomerative clustering as an initialization for clustering algorithms (e.g., the EM algorithm) has been suggested and used in [19]. It is clear that this hierarchical technique makes it easier to find good initial clusters than a random choice does. Moreover, the technique is very flexible, because it can provide initial clusters for whatever number of clusters the K-means approach demands. Thus, it matches well with the K-means approach. In this paper, we combine a hierarchical technique with the K-means algorithm and employ it for clustering analysis of HFSs. Specific details of our proposed algorithm are presented below.

Procedure I (Hierarchical agglomerative clustering)


Given a collection of m HFSs Hj(j = 1, 2, · · · ,m), we place each of the m HFSs in its own

cluster. The HFSs Hj(j = 1, 2, · · · ,m) are then compared with each other by using a distance

measure such as the hesitant weighted Hamming distance (Eq.(4)) or the hesitant weighted

Euclidean distance (Eq.(5)). The two clusters with the smallest distance are joined, after which they cannot be separated. Then we calculate the center of each cluster by using Eq.(10). The same procedure is repeated until all objects are in a single cluster. In this way, the results of agglomerative hierarchical clustering are obtained from Procedure I.

Procedure II (K-means clustering)

Step 1. Give the number of clusters;

Step 2. Select the results of Procedure I for the corresponding number of clusters as initial

clusters, and calculate initial cluster centroids by Eq.(10);

Step 3. Calculate the distances between HFSs Hj(j = 1, 2, · · · ,m) and centroids by Eq.(4)

or Eq.(5); Assign Hj to the closest centroid;

Step 4. Recalculate the centroids of the clusters;

Step 5. Repeat Steps 3 and 4 until the centroids stabilize.
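To make the two procedures concrete, the sketch below (Python; all function names are ours, and the distance and averaging helpers of §3 are restated in compact form so that the fragment is self-contained) builds the agglomerative merge sequence of Procedure I and then runs the K-means loop of Procedure II from the clustering that Procedure I produced for the requested number of clusters. Each HFS is assumed to be a dict keyed by the elements of X in the order of the weight vector.

```python
from itertools import product
import math

def _align(h_m, h_n):
    """Decreasing order; the shorter HFE is padded with its own minimum (pessimistic rule)."""
    a, b = sorted(h_m, reverse=True), sorted(h_n, reverse=True)
    l = max(len(a), len(b))
    return a + [min(a)] * (l - len(a)), b + [min(b)] * (l - len(b))

def d_hwh(M, N, w):
    """Hesitant weighted Hamming distance, Eq.(4)."""
    total = 0.0
    for i, x in enumerate(M):
        a, b = _align(M[x], N[x])
        total += w[i] * sum(abs(p - q) for p, q in zip(a, b)) / len(a)
    return total

def center(cluster):
    """Centroid of a cluster of HFSs via the hesitant average of Eq.(10)."""
    m = len(cluster)
    return {x: {1 - math.prod((1 - r) ** (1.0 / m) for r in combo)
                for combo in product(*(h[x] for h in cluster))}
            for x in cluster[0]}

def hierarchical(hfss, w):
    """Procedure I: agglomerative merging; records the clustering obtained at each level K."""
    clusters = [[h] for h in hfss]
    levels = {len(clusters): [list(c) for c in clusters]}
    while len(clusters) > 1:
        # Join the two clusters whose centers are closest in the sense of Eq.(4).
        pairs = [(i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        i, j = min(pairs, key=lambda p: d_hwh(center(clusters[p[0]]),
                                              center(clusters[p[1]]), w))
        clusters[i] += clusters.pop(j)
        levels[len(clusters)] = [list(c) for c in clusters]
    return levels

def kmeans_hfs(hfss, initial_clusters, w, max_iter=100):
    """Procedure II: K-means started from the hierarchical result for the chosen K."""
    clusters = initial_clusters
    for _ in range(max_iter):
        centroids = [center(c) for c in clusters]            # Steps 2 and 4
        new_clusters = [[] for _ in centroids]
        for h in hfss:                                        # Step 3: assign to closest centroid
            k = min(range(len(centroids)), key=lambda j: d_hwh(h, centroids[j], w))
            new_clusters[k].append(h)
        if new_clusters == clusters:                          # Step 5: memberships have stabilized
            break
        clusters = [c for c in new_clusters if c]             # drop empty clusters, if any arise
    return clusters

# Usage sketch: levels = hierarchical(hfss, w); result = kmeans_hfs(hfss, levels[3], w)
```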

§5 Illustrative examples

Example 2. This example clusters five kinds of marketing programs Hj (j = 1, 2, · · · , 5) for new products put forward by an enterprise. Several experts evaluate these programs on eight aspects, according to their fields of expertise. These eight aspects are denoted by the feature space X = {x1, x2, · · · , x8}, and the weight vector is w = (0.15, 0.10, 0.12, 0.15, 0.10, 0.13, 0.14, 0.11)^T. Since different experts may give different values for a certain attribute of a program, we employ HFSs (for convenience, also denoted by Hj (j = 1, 2, · · · , 5)) to represent the evaluation information for the five kinds of marketing programs. The corresponding data are listed in Table 1.

Then we use the proposed method to classify the five marketing programs.

Procedure I (Agglomerative hierarchical clustering)

Step 1. In this step, each of the HFSs Hj(j = 1, 2, · · · , 5) is considered as a unique cluster:

{H1}, {H2}, {H3}, {H4} and {H5}.

Table 1: Hesitant fuzzy evaluated information

       H1              H2               H3               H4               H5
x1   {0.2,0.3,0.5}   {0.5,0.6}        {0.45,0.5,0.65}  {1}              {0.9,0.95,1}
x2   {0.1,0.2}       {0.6,0.7,0.85}   {0.6,0.7}        {1}              {0.9}
x3   {0.5,0.6,0.7}   {1}              {0.9,0.95,1}     {0.85,0.9}       {0.8,0.85,0.9}
x4   {0.9,0.95,1}    {0.15,0.2,0.35}  {0.1,0.15,0.2}   {0.75,0.8,0.85}  {0.7,0.75,0.8}
x5   {0.4,0.5,0.65}  {0,0.1,0.2}      {0.2,0.3}        {0.2}            {0.5,0.6,0.85}
x6   {0.1}           {0.7,0.8,0.85}   {0.6,0.7,0.8}    {0.15}           {0.3,0.35}
x7   {0.3,0.4,0.5}   {0.5,0.6,0.7}    {0.15,0.2}       {0.1,0.2,0.3}    {0.15,0.2,0.25}
x8   {1}             {0.65,0.7,0.8}   {0.2,0.3,0.35}   {0.3}            {0.4,0.5,0.7}

Step 2. Compare each of the HFSs Hj (j = 1, 2, · · · , 5) with the other four HFSs using the hesitant weighted Hamming distance (Eq.(4)):

d(H1, H2) = d(H2, H1) = 0.4335, d(H1, H3) = d(H3, H1) = 0.4598

d(H1, H4) = d(H4, H1) = 0.3827, d(H1, H5) = d(H5, H1) = 0.3494

d(H2, H3) = d(H3, H2) = 0.1643, d(H2, H4) = d(H4, H2) = 0.3900

d(H2, H5) = d(H5, H2) = 0.3682, d(H3, H4) = d(H4, H3) = 0.3038

d(H3, H5) = d(H5, H3) = 0.3132, d(H4, H5) = d(H5, H4) = 0.1251.

Then

d(H1, H5) = min{d(H1, H2), d(H1, H3), d(H1, H4), d(H1, H5)} = 0.3494

d(H2, H3) = min{d(H2, H1), d(H2, H3), d(H2, H4), d(H2, H5)} = 0.1643

d(H4, H5) = min{d(H4, H1), d(H4, H2), d(H4, H3), d(H4, H5)} = 0.1251.

Considering that only two clusters can be joined in each step, the HFSs Hj (j = 1, 2, · · · , 5) are thus clustered into the following four clusters: {H1}, {H2}, {H3} and {H4, H5}.
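As a quick check of the figures above, the following self-contained snippet (Python; variable names are ours) recomputes d(H2, H3) from the Table 1 data with the weight vector of this example, using the pessimistic extension and Eq.(4); it prints approximately 0.1643.

```python
w = [0.15, 0.10, 0.12, 0.15, 0.10, 0.13, 0.14, 0.11]
H2 = [{0.5, 0.6}, {0.6, 0.7, 0.85}, {1}, {0.15, 0.2, 0.35},
      {0, 0.1, 0.2}, {0.7, 0.8, 0.85}, {0.5, 0.6, 0.7}, {0.65, 0.7, 0.8}]
H3 = [{0.45, 0.5, 0.65}, {0.6, 0.7}, {0.9, 0.95, 1}, {0.1, 0.15, 0.2},
      {0.2, 0.3}, {0.6, 0.7, 0.8}, {0.15, 0.2}, {0.2, 0.3, 0.35}]

d = 0.0
for wi, hm, hn in zip(w, H2, H3):
    a, b = sorted(hm, reverse=True), sorted(hn, reverse=True)
    l = max(len(a), len(b))
    a += [min(a)] * (l - len(a))      # pessimistic extension of the shorter HFE
    b += [min(b)] * (l - len(b))
    d += wi * sum(abs(p - q) for p, q in zip(a, b)) / l
print(round(d, 4))                    # -> 0.1643, matching d(H2, H3) above
```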

Step 3. Calculate the center of each cluster using Eq.(10):

c{H1} = H1, c{H2} = H2, c{H3} = H3

c{H4, H5} = f(H4, H5) = {< x1, {1} >,< x2, {1} >,< x3, {0.9, 0.8586, 0.8775, 0.85, 0.8268}>,

< x4, {0.8268, 0.8064, 0.7879, 0.8, 0.7551, 0.7764, 0.75, 0.7261}>,

< x5, {0.6536, 0.4343, 0.3675}>,< x6, {0.2567, 0.2286}>,

< x7, {0.2754, 0.2517, 0.2286, 0.2254, 0.2, 0.1754, 0.1784, 0.1515, 0.1254}>,

< x8, {0.5417, 0.4084, 0.3519}>}.

Compare each cluster with the other three clusters with the hesitant weighted Hamming

distance (Eq.(4)):

d(H1, H2) = d(H2, H1) = 0.4335, d(H1, H3) = d(H3, H1) = 0.4598

d(H1, c{H4, H5}) = d(c{H4, H5}, H1) = 0.3450, d(H2, H3) = d(H3, H2) = 0.1643


d(H2, c{H4, H5}) = d(c{H4, H5}, H2) = 0.3889

d(H3, c{H4, H5}) = d(c{H4, H5}, H3) = 0.3211.

Then the HFSs Hj(j = 1, 2, · · · , 5) are clustered into the following three clusters: {H1},{H2, H3} and {H4, H5}.

Step 4. Calculate the center of each cluster by using Eq.(10):

c{H1} = H1,

c{H2, H3} = f(H2, H3) = {< x1, {0.6258, 0.5528, 0.5310, 0.5817, 0.5, 0.4756}>,

< x2, {0.7879, 0.7551, 0.7, 0.6536, 0.6}>,< x3, {1} >,

< x4, {0.2789, 0.2567, 0.2351, 0.2, 0.1754, 0.1515, 0.15, 0.1254}>,

< x5, {0.2517, 0.2, 0.2063, 0.1515, 0.1633, 0.1056}>,

< x6, {0.8268, 0.7879, 0.7551, 0.8, 0.7172, 0.7, 0.6536}>,

< x7, {0.5101, 0.4950, 0.4343, 0.4169, 0.3675, 0.3481}>,

< x8, {0.6394, 0.6258, 0.6, 0.5584, 0.5417, 0.5101, 0.5230, 0.5050, 0.4708}>},

c{H4, H5} = f(H4, H5) = {< x1, {1} >,< x2, {1} >,< x3, {0.9, 0.8586, 0.8775, 0.85, 0.8268}>,

< x4, {0.8268, 0.8064, 0.7879, 0.8, 0.7551, 0.7764, 0.75, 0.7261}>,

< x5, {0.6536, 0.4343, 0.3675}>,

< x6, {0.2567, 0.2286}>,

< x7, {0.2754, 0.2517, 0.2286, 0.2254, 0.2, 0.1754, 0.1784, 0.1515, 0.1254}>,

< x8, {0.5417, 0.4084, 0.3519}>}.

Subsequently, we compare each cluster with the other two clusters by Eq.(4):

d(c{H1}, c{H2, H3}) = d(c{H2, H3}, c{H1}) = 0.4283

d(c{H1}, c{H4, H5}) = d(c{H4, H5}, c{H1}) = 0.3440

d(c{H2, H3}, c{H4, H5}) = d(c{H4, H5}, c{H2, H3}) = 0.3411.

Then the HFSs Hj(j = 1, 2, · · · , 5) can be clustered into the following two clusters: {H1} and

{H2, H3, H4, H5}.

Finally, the above two clusters are clustered into a unique cluster: {H1, H2, H3, H4, H5}.

The hierarchical representation of the results of the example is displayed in Fig.1.

Figure 1: The hierarchical representation of the clustering results of Example 2.

Procedure II (K-means clustering)

To carry out K-means clustering, we choose the results of hierarchical clustering as initial

clusters. Since the results at K = 1 and K = 5 are unique, we shall illustrate our algorithm

with K = 2, 3, 4.

1) K = 4: We use the result obtained from hierarchical clustering, {H1}, {H2}, {H3} and {H4, H5}, as the initial clusters to compute the centroids and distances.

C1 = H1, C2 = H2, C3 = H3, C4 = f(H4, H5)

d(H1, C1) = 0, d(H1, C2) = 0.4335, d(H1, C3) = 0.4598, d(H1, C4) = 0.3450

d(H2, C1) = 0.4335, d(H2, C2) = 0, d(H2, C3) = 0.1643, d(H2, C4) = 0.3889

d(H3, C1) = 0.4598, d(H3, C2) = 0.1643, d(H3, C3) = 0, d(H3, C4) = 0.3211

d(H4, C1) = 0.3827, d(H4, C2) = 0.3900, d(H4, C3) = 0.3038, d(H4, C4) = 0.08209

d(H5, C1) = 0.3494, d(H5, C2) = 0.3682, d(H5, C3) = 0.3132, d(H5, C4) = 0.07412.

Based on the above distances, we get the classification {H1}, {H2}, {H3} and {H4, H5}. Since the center of each cluster is unchanged, the iteration stops.

2) K = 3: Taking {H1}, {H2, H3} and {H4, H5} as the initial clusters, the corresponding results are:

C1 = H1, C2 = f(H2, H3), C3 = f(H4, H5)


d(H1, C1) = 0, d(H1, C2) = 0.4283, d(H1, C3) = 0.3450

d(H2, C1) = 0.4335, d(H2, C2) = 0.6408, d(H2, C3) = 0.3889

d(H3, C1) = 0.4598, d(H3, C2) = 0.1260, d(H3, C3) = 0.3211

d(H4, C1) = 0.3827, d(H4, C2) = 0.3478, d(H4, C3) = 0.08209

d(H5, C1) = 0.3494, d(H5, C2) = 0.3193, d(H5, C3) = 0.07412

and thus, we get the classification {H1}, {H2, H3} and {H4, H5}. The center of each cluster remains unchanged, so we finish the iteration.

3) K = 2: Using {H1} and {H2, H3, H4, H5} as the initial clusters, the results for this case are as follows:

C1 = H1, C2 = f(H2, H3, H4, H5)

d(H1, C1) = 0, d(H1, C2) = 0.4115, d(H2, C1) = 0.4335, d(H2, C2) = 0.2555,

d(H3, C1) = 0.4598, d(H3, C2) = 0.2612, d(H4, C1) = 0.3827, d(H4, C2) = 0.1634,

d(H5, C1) = 0.3494, d(H5, C2) = 0.1396.

Obviously, the classifications are {H1} and {H2, H3, H4, H5}. Again the center of each

cluster is not changed, so the iterative calculations are completed.

The above example indicates that taking the results provided by hierarchical clustering as the initial clusters reduces the number of iterations. That is to say, it can substantially improve the iterative efficiency of K-means clustering compared with randomly chosen initial values, which helps to obtain the ideal clustering results quickly.

As Torra [26] and Torra and Narukawa [27] have shown that the envelope of an HFE is just an intuitionistic fuzzy value (IFV), one can transform the hesitant fuzzy information (Table 1) into the intuitionistic fuzzy information shown in Table 2.

Table 2: Intuitionistic fuzzy information

       H1           H2            H3            H4            H5
x1   (0.2,0.5)    (0.5,0.4)     (0.45,0.35)   (1,0)         (0.9,0)
x2   (0.1,0.8)    (0.6,0.15)    (0.6,0.3)     (1,0)         (0.9,0.1)
x3   (0.5,0.3)    (1,0)         (0.9,0)       (0.85,0.1)    (0.8,0.1)
x4   (0.9,0)      (0.15,0.65)   (0.1,0.8)     (0.75,0.15)   (0.7,0.2)
x5   (0.4,0.35)   (0,0.8)       (0.2,0.7)     (0.2,0.8)     (0.5,0.15)
x6   (0.1,0.9)    (0.7,0.15)    (0.6,0.2)     (0.15,0.85)   (0.3,0.65)
x7   (0.3,0.5)    (0.5,0.3)     (0.15,0.8)    (0.1,0.7)     (0.15,0.75)
x8   (1,0)        (0.65,0.2)    (0.2,0.65)    (0.3,0.7)     (0.4,0.3)

It is worth pointing out here that Xu [31] has clustered the data of Table 2 using an intuitionistic fuzzy clustering method, whose results are presented in Table 3.

We can see from Table 3 that the clustering results of HFSs, to a large extent, agree with


Table 3: Comparison of two different types of cluster methods

Classes  Hierarchical hesitant fuzzy K-means    Intuitionistic fuzzy hierarchical
         clustering algorithm                   clustering algorithm
5        {H1},{H2},{H3},{H4},{H5}               {H1},{H2},{H3},{H4},{H5}
4        {H1},{H2},{H3},{H4,H5}                 {H1},{H2},{H3},{H4,H5}
3        {H1},{H2,H3},{H4,H5}                   {H1},{H2,H3},{H4,H5}
2        {H1},{H2,H3,H4,H5}                     {H1},{H2,H3,H4,H5}
1        {H1,H2,H3,H4,H5}                       {H1,H2,H3,H4,H5}

those of IFSs, as the envelope of an HFE is just an IFV.

Example 3. Five tourism resources are evaluated on six aspects: scale, environmental conditions, integrity, service, tour routes and convenience of traffic. The six aspects are denoted by the feature space X = {x1, x2, · · · , x6}, and the five resources are represented by the HFSs Hj (j = 1, 2, · · · , 5). Let w = (1/6, 1/6, · · · , 1/6)^T be the weight vector of xi (i = 1, 2, · · · , 6). The data are listed in Table 4.

Table 4: Hesitant fuzzy information

       H1              H2              H3              H4              H5
x1   {0.3,0.5}       {0.6,0.7}       {0.4,0.6}       {0.2,0.6}       {0.5,0.8}
x2   {0.6,0.8,0.9}   {0.5,0.6,0.8}   {0.8,0.9}       {0.4,0.5,0.9}   {0.3,0.4}
x3   {0.4,0.7}       {0.6,0.8,0.9}   {0.5,0.9}       {0.9,1}         {0.6,0.7}
x4   {0.8,0.9}       {0.7,0.9}       {0.6,0.7,0.8}   {0.8,0.9}       {0.7,0.9}
x5   {0.1,0.2,0.4}   {0.3,0.4}       {0.4,0.5}       {0.2,0.5}       {0.6,0.8}
x6   {0.5,0.6}       {0.4,0.7}       {0.3,0.8}       {0.7,0.9}       {0.5,0.7}

The tourism department divides the five scenic areas into three categories.

Step 1. At this step, each of the HFSs Hj(j = 1, 2, · · · , 5) is considered as a unique cluster:

{H1}, {H2}, {H3}, {H4} and {H5}.

Step 2. Compare each of the HFSs Hj (j = 1, 2, · · · , 5) with the other four HFSs by using Eq.(6). We find that d(H2, H3) = min{d(Hi, Hj)}, i, j = 1, 2, · · · , 5 and i ≠ j. Considering that only two clusters can be joined in each stage, the HFSs Hj (j = 1, 2, · · · , 5) can be clustered into the following four clusters at the second stage: {H1}, {H2, H3}, {H4} and {H5}.

Step 3. Calculate the center of each cluster by using Eq.(10), and then compare each cluster with the other three clusters by using Eq.(6). Subsequently, the HFSs Hj (j = 1, 2, · · · , 5) can be clustered into the following three clusters at the third stage:


{H1, H4}, {H2, H3} and {H5}

Step 4. Select {H1, H4}, {H2, H3} and {H5} as the initial clusters of K-means, and calculate the centroids of all clusters and their distances to each set:

C1 = f(H2, H3) = (1/2)(H2 ⊕ H3), C2 = f(H1, H4) = (1/2)(H1 ⊕ H4), C3 = H5

d(H1, C1) = 0.1697, d(H1, C2) = 0.1469, d(H1, C3) = 0.2194

d(H2, C1) = 0.0977, d(H2, C2) = 0.1457, d(H2, C3) = 0.1556

d(H3, C1) = 0.1163, d(H3, C2) = 0.1598, d(H3, C3) = 0.2111

d(H4, C1) = 0.1816, d(H4, C2) = 0.1070, d(H4, C3) = 0.2361

d(H5, C1) = 0.1832, d(H5, C2) = 0.2300, d(H5, C3) = 0.

The new clusters obtained from the above distances are {H1, H4}, {H2, H3} and {H5}. Obviously, the center of each cluster is unchanged, so the iterative process stops.

To illustrate the effectiveness and stability of the hierarchical K-means clustering method, we conducted a simple test, described below.

Let K = 3. Instead of selecting the hierarchical clustering results as the initial classification, we randomly select {H1, H2, H3}, {H4} and {H5} as the initial clusters, whose centroids are:

C1 = f(H1, H2, H3) = (1/3)(H1 ⊕ H2 ⊕ H3), C2 = H4, C3 = H5

The distances between each set and Ci(i = 1, 2, 3) are:

d(H1, C1) = 0.1656, d(H1, C2) = 0.1639, d(H1, C3) = 0.2194

d(H2, C1) = 0.1170, d(H2, C2) = 0.1528, d(H2, C3) = 0.1556

d(H3, C1) = 0.1354, d(H3, C2) = 0.1778, d(H3, C3) = 0.2111

d(H4, C1) = 0.1889, d(H4, C2) = 0, d(H4, C3) = 0.2361

d(H5, C1) = 0.1806, d(H5, C2) = 0.2361, d(H5, C3) = 0.

Examining the above distances, one can see that, except for H1, which now belongs to the second cluster, the other sets remain in their initial clusters. Consequently, the classification becomes {H1, H4}, {H2, H3} and {H5}, which means that we return to Step 4 for a further iteration. The experiment clearly shows that using the results of hierarchical clustering as the initial clusters of the K-means algorithm is more efficient than randomly choosing initial clusters, i.e., it requires fewer iterations. Besides, the initial choice does not affect the final result of K-means clustering, indicating that the presented clustering method is stable.

We can also transform the hesitant fuzzy information of Example 3 into intuitionistic fuzzy information through Definition 3. Table 5 compares the proposed method with Zhang et al.'s method [40].

Table 5: Comparison of tourism scenic classification

Classes  The present method           Zhang et al.'s method
5        {H1},{H2},{H3},{H4},{H5}     {H1},{H2},{H3},{H4},{H5}
4        {H1},{H2,H3},{H4},{H5}       {H1},{H2,H3},{H4},{H5}
3        {H1,H4},{H2,H3},{H5}         {H1,H2,H3},{H4},{H5}
2        {H1,H2,H3,H4},{H5}           {H1,H2,H3,H4},{H5}
1        {H1,H2,H3,H4,H5}             {H1,H2,H3,H4,H5}

From Table 5, we can see that the results of the two methods differ slightly when the number of classes is 3 and are identical in the other cases. This is because the proposed method takes the hesitant factors into account. From a mathematical point of view, an IFV, being an interval value, contains the HFE (a set of discrete values) through Torra's definition of the envelope. The difference that appears in certain circumstances is caused by the difference in the data type and in the distribution of the discrete values. This demonstrates the importance of clustering methods for HFSs. The results suggest that when one performs clustering for discrete hesitant fuzzy data, the clustering method for HFSs should be applied; it is not accurate to handle such data in the form of IFSs.

§6 Conclusions

In this paper, on the basis of the operations of HFSs, we have carefully investigated the clustering of HFSs with a hybrid clustering method, namely, a K-means clustering algorithm whose initial clusters are provided by the hierarchical agglomerative technique. Two actual examples have been analyzed in detail; they show that our approach is a valuable tool for clustering HFSs. Furthermore, we have discussed the relationship between the results of HFS clustering and those of IFS clustering, which illustrates the importance of clustering for HFSs.

References

[1] K Atanassov. Intuitionistic fuzzy sets, Fuzzy Sets and Systems, 1986, 20: 87-96.

[2] J C Bezdek. Pattern recognition with fuzzy objective function algorithms, Plenum, New York,

1981.

[3] G Bordogna, M Pagani, G Pasi. A dynamical Hierarchical fuzzy clustering algorithm for infor-

mation filtering, Stud Fuzziness Soft Comput, 2006, 197: 3-23.

[4] N Chen, Z S Xu, M M Xia. Correlation coefficients of hesitant fuzzy sets and their applications

to clustering analysis, Appl Math Model, 2013, 37: 2197-2211.

[5] D F Chen, Y J Lei, Y Tian. Clustering algorithm based on intuitionistic fuzzy equivalent relations,

J Air Force Eng Univ, 2007, 8: 63-66.

[6] Y H Dong, Y T Zhuang, K Chen, X Y Tai. A hierarchical clustering algorithm based on fuzzy

graph connectedness, Fuzzy Sets and Systems, 2006, 157: 1760-1774.

[7] D Dubois, H Prade. Fuzzy Sets and Systems: Theory and Applications, New York: Academic

Press, 1980.


[8] J L Fan, W Z Zhen, W X Xie. Suppressed fuzzy c-means clustering algorithm, Patt Recogn Lett,

2003, 24: 1607-1612.

[9] J Han, M Kamber. Data Mining: Concepts and Techniques, Morgan Kaufmann, San Mateo, CA,

2000.

[10] Z Hilal Inan, M Kuntalp. A study on fuzzy C-means clustering-based systems in automatic spike

detection, Comput Biol Med, 2007, 37: 1160-1166.

[11] C Hwang, F C H Rhee. Uncertain fuzzy clustering: Interval type-2 fuzzy approach to C-means,

IEEE T Fuzzy Syst, 2007, 15: 107-120.

[12] J T Jeng, C C Chuang, C W Tao. Interval competitive agglomeration clustering algorithm, Expert

Syst Appl, 2010, 37: 6567-6578.

[13] S S Khan, A Ahmad. Cluster center initialization algorithm for K-means clustering, Patt Recogn

Lett, 2004, 25: 1293-1302.

[14] J F Lu, J B Tang, Z M Tang, J Y Yang. Hierarchical initialization approach for K-means

clustering, Patt Recogn Lett, 2008, 29: 787-795.

[15] F Masulli, S Rovetta. Soft transition from probabilistic to possibilistic fuzzy clustering, IEEE T

Fuzzy Syst, 2006, 14: 516-527.

[16] S A Mingoti, J O Lima. Comparing SOM neural network with Fuzzy c-means, K-means and

traditional hierarchical clustering algorithms, European J Oper Res, 2006, 174: 1742-1759.

[17] S Miyamoto. Information clustering based on fuzzy multisets, Inform Process Manag, 2003, 39:

195-213.

[18] S Miyamoto. Remarks on basics of fuzzy sets and fuzzy multisets, Fuzzy Sets and Systems, 2005,

156: 427-431.

[19] M Meilă, D Heckerman. An experimental comparison of several clustering and initialization meth-

ods, In: Proceedings of the 14th Conference on Uncertainty in Artificial Intelligence, Morgan

Kaufmann, San Francisco, CA, 1998, 386-395.

[20] J M Pena, J A Lozano, P Larranaga. An empirical comparison of four initialization methods for

the K-Means algorithm, Patt Recogn Lett, 1999, 20: 1027-1040.

[21] H F Pop, C Sarbu. The fuzzy hierarchical cross-clustering algorithm, improvements and compar-

ative study, J Chem Inf Comp Sci, 1997, 37: 510-516.

[22] L Rueda, Y Q Zhang. Geometric visualization of clusters obtained from fuzzy clustering algo-

rithms, Pattern Recogn, 2006, 39: 1415-1429.

[23] C Sarbu, K Zehl, J W Einax. Fuzzy divisive hierarchical clustering of soil data using Gustafson-

Kessel algorithm, Chemometr Intell Lab, 2007, 86: 121-129.

[24] Y Sun, Q M Zhu, Z X Chen. An iterative initial-points refinement algorithm for categorical data

clustering, Patt Recogn Lett, 2002, 23: 875-884.

[25] S Tokushige, H Yadohisa, K Inada. Crisp and fuzzy k-means clustering algorithms for multivari-

ate functional data, Comput Statist, 2007, 22: 1-16.

[26] V Torra. Hesitant Fuzzy Sets, Int J Intell Syst, 2010, 25: 529-539.


[27] V Torra, Y Narukawa. On hesitant fuzzy sets and decision, the 18th IEEE International Confer-

ence on Fuzzy Systems, Jeju Island, Korea, 2009, 1378-1382.

[28] V Torra, S Miyamoto, S Lanau. Exploration of textual document archives using a fuzzy hierar-

chical clustering algorithm in the GAMBAL system, Inform Process Manag, 2005, 41: 587-598.

[29] V Torra, S Miyamoto. On the consistency of a fuzzy C-means algorithm for multisets, Artif Intell

Res Dev, IOS Press, 2005, 289-295.

[30] M M Xia, Z S Xu. Hesitant fuzzy information aggregation in decision making, Internat J Approx

Reason, 2011, 52: 395-407.

[31] Z S Xu. Intuitionistic fuzzy hierarchical clustering algorithms, J Syst Eng Electron, 2009, 20:

1-5.

[32] Z S Xu, J J Wu. Intuitionistic fuzzy c-means clustering algorithms, J Syst Eng Electron, 2010,

21: 580-590.

[33] Z S Xu, J Chen, J J Wu. Clustering algorithm for intuitionistic fuzzy sets, Inform Sci, 2008, 178:

3775-3790.

[34] Z S Xu, M M Xia. Distance and similarity measures for hesitant fuzzy sets, Inform Sci, 2011,

181: 2128-2138.

[35] Z S Xu. Intuitionistic fuzzy aggregation operators, IEEE T Fuzzy Syst, 2007, 15: 1179-1187.

[36] R R Yager. On the theory of bags, Int J Gen Syst, 1986, 13: 23-37.

[37] M S Yang, D C Lin. On similarity and inclusion measures between type-2 fuzzy sets with an

application to clustering, Comput Math Appl, 2009, 57: 896-907.

[38] M S Yang, H M Shih. Cluster analysis based on fuzzy relations, Fuzzy Sets and Systems, 2001,

120: 197-212.

[39] L A Zadeh. Fuzzy sets, Inform Control, 1965, 8: 338-353.

[40] H M Zhang, Z S Xu, Q Chen. On clustering approach to intuitionistic fuzzy sets, Control Decis,

2007, 22: 882-888.

1 School of Economics and Management, Southeast University, Nanjing 211189, China.
2 School of Applied Mathematics, Nanjing University of Finance and Economics, Nanjing 210023, China. Email: chenna [email protected]
3 Business School, Sichuan University, Chengdu 610064, China. Email: [email protected]
4 School of Economics and Management, Tsinghua University, Beijing 100084, China. Email: [email protected]