enhanced neural gas network for prototype-based clustering

Intelligent Database Systems Lab

國立雲林科技大學 National Yunlin University of Science and Technology

Enhanced neural gas network for prototype-based clustering

Presenter: Shao-Wei Cheng

Authors: A.K. Qin, P.N. Suganthan

Pattern Recognition (PR), 2005


N.Y.U.S.T.

I. M.


Outline

Motivation

Objective

Methodology

Experiments and Results

Conclusion

Personal Comments



Motivation

Prototype-based clustering (PBC) and the neural gas (NG) algorithm suffer from several problems:

Sensitivity to initialization.

Sensitivity to input sequence ordering.

The adverse influence of outliers.


Objectives

Present an improved PBC algorithm, called ENG, based on the enhanced NG network framework.

Tackle several known problems of PBC.


Methodology - original

PBC algorithms: k-means, fuzzy k-means

NG network algorithm: a single-layered neural network

Faster convergence to low distortion errors.

Lower distortion error than other methods.

Obeys a stochastic gradient descent.

The original NG algorithm

The original NG algorithm combined with the fuzzy concept
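The equations on this slide did not survive, so the following is only a minimal sketch of the standard rank-based NG update as it is commonly stated, not necessarily the exact form used in the paper. The function name `ng_update` and the argument names `eps` (step size) and `lam` (neighborhood range) are assumptions made for illustration.

```python
import numpy as np

def ng_update(W, v, eps, lam):
    """One neural gas step (sketch): move every prototype toward sample v,
    weighted by its distance rank (the closest prototype has rank 0)."""
    dists = np.linalg.norm(W - v, axis=1)    # distance of each prototype to v
    ranks = np.argsort(np.argsort(dists))    # rank k_i of each prototype
    h = np.exp(-ranks / lam)                 # neighborhood weight exp(-k_i / lam)
    return W + eps * h[:, None] * (v - W)    # w_i <- w_i + eps * h_i * (v - w_i)

if __name__ == "__main__":
    W = np.array([[0.0, 0.0], [10.0, 10.0]])  # two hypothetical prototypes
    v = np.array([1.0, 0.0])                  # one hypothetical input sample
    print(ng_update(W, v, eps=0.5, lam=1.0))
```

Because every prototype moves on each step, with a weight that decays with its rank, the result depends less on which single prototype happens to win than in hard k-means.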



Methodology - enhanced

Enhanced NG network framework

Eq. (3): accounts for the influence of outliers; derived from Eq. (1).

Eq. (4): the new update formula, derived from Eq. (3).

Eqs. (5), (6), (7): define the parameters used in Eq. (4).


Methodology – MDL framework

The minimum description length (MDL) principle is employed as the performance measure.

Original MDL

MDL as used in this approach


Methodology - processes

Initialize

c synaptic weights W = {w1, w2, . . . , wc} randomly.

βm is the middle value of β, used to control how quickly β changes.

κ and η are the parameters used to calculate the MDL value.

Initial training epoch number: m = 0.

Initial iteration step number t in training epoch m: t = 1.

Total iteration step number: iter = m · N + t.

Maximum training epoch: Max_epoch.

Epoch at which dislocated prototypes are relocated: RP_epoch.

Training dataset: V = {v1, v2, . . . , vN}.

Flowchart nodes (each "If" node branches on Y / N)

(a) If m < Max_epoch

(b) If training set is not empty

(c) If training stage is at RP_epoch

(d) For j = 1 to size(V)

(e) For j = 1 to size(Torelocate)

(f) If current utifactor value < previous utifactor value

Flowchart actions: draw data from the training set and compute the update; change; restore; training epoch += 1; End.
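The initialization and the outer loop above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the authors' ENG: it uses the plain rank-based NG update with an exponential decay of the step size and neighborhood range, it initializes prototypes from randomly chosen training samples, and the relocation steps (c) to (f) are left as a placeholder because their details are not recoverable from the slide. The name `train_ng` and its arguments are hypothetical.

```python
import numpy as np

def train_ng(V, c, max_epoch=10, rp_epoch=5,
             eps=(0.8, 0.05), lam=(10.0, 0.01), seed=0):
    """Epoch loop sketch: plain NG training with an (empty) relocation hook."""
    rng = np.random.default_rng(seed)
    N = len(V)
    # Assumption: "random" initialization picks c distinct training samples.
    W = V[rng.choice(N, size=c, replace=False)].astype(float)
    total = max_epoch * N
    for m in range(max_epoch):                        # (a) while m < Max_epoch
        # (b) visit every sample once per epoch, in random order
        for t, idx in enumerate(rng.permutation(N), start=1):
            frac = (m * N + t) / total                # iter = m * N + t
            e = eps[0] * (eps[1] / eps[0]) ** frac    # assumed decay of step size
            l = lam[0] * (lam[1] / lam[0]) ** frac    # assumed decay of range
            v = V[idx]
            ranks = np.argsort(np.argsort(np.linalg.norm(W - v, axis=1)))
            W += e * np.exp(-ranks / l)[:, None] * (v - W)
        if m + 1 == rp_epoch:
            pass  # (c)-(f): prototype relocation, not reproduced in this sketch
    return W

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Hypothetical data: two Gaussian blobs
    V = np.vstack([rng.normal(0, 0.1, (50, 2)), rng.normal(5, 0.1, (50, 2))])
    print(train_ng(V, c=2))
```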



Experiments

ENG is compared with nine algorithms: HCM, FCM, NG, FPCM, CFCM-F1, CFCM-F2, HRC-FRC, AHCM, and AFCM.

Datasets: artificial (D1, D2) and UCI datasets.

Each clustering algorithm is run 10 times.

Parameter settings: εi = 0.8, εf = 0.05; λi = 10, λf = 0.01; βi = 50, βm = 10, βf = 0.01

κ = 2, η = 1e-4

Max_epoch = 10, RP_epoch = 5.
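The initial/final pairs (εi, εf) and (λi, λf) suggest parameters that are annealed during training. The sketch below assumes the exponential decay commonly used with NG; the slides do not state the schedule, so treat the formula as an assumption. The helper name `anneal` and the dataset size `N` are hypothetical.

```python
def anneal(initial, final, step, total_steps):
    """Exponential decay from `initial` (step 0) to `final` (step total_steps)."""
    return initial * (final / initial) ** (step / total_steps)

if __name__ == "__main__":
    N = 1000                       # hypothetical dataset size
    MAX_EPOCH = 10                 # Max_epoch from the settings above
    total = MAX_EPOCH * N
    for m in (0, 5, 10):           # start, RP_epoch, and end of training
        it = m * N                 # iteration count at the start of epoch m
        eps = anneal(0.8, 0.05, it, total)   # step size, from eps_i to eps_f
        lam = anneal(10.0, 0.01, it, total)  # neighborhood range, lam_i to lam_f
        print(m, eps, lam)
```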



Conclusion

ENG tackles several problems of PBC:

Sensitivity to initialization.

Sensitivity to input sequence ordering.

The adverse influence of outliers.

Experimental results show that the proposed method outperforms several existing PBC algorithms.

The MDL framework can handle the case where compact clusters and sparse clusters coexist.



Personal Comments

Advantage: a heuristic way to tackle the outlier problem.

Drawback

Application: clustering, classification
