Intelligent Database Systems Lab
National Yunlin University of Science and Technology
Enhanced neural gas network for prototype-based clustering
Presenter : Shao-Wei Cheng
Authors : A.K. Qin, P.N. Suganthan
Pattern Recognition (PR), 2005
Intelligent Database Systems Lab
N.Y.U.S.T.
I. M.
Outline
Motivation
Objective
Methodology
Experiments and Results
Conclusion
Personal Comments
Motivation
Prototype-based clustering (PBC) and the neural gas (NG) algorithm suffer from several problems:
Sensitivity to initialization.
Sensitivity to the ordering of the input sequence.
Adverse influence of outliers.
Objectives
Present an improved PBC algorithm, called ENG, based on an enhanced NG network framework.
Tackle several known problems of PBC.
Methodology - original
PBC algorithms: k-means, fuzzy k-means
NG network algorithm: a single-layered neural network
Faster convergence to low distortion errors.
Lower distortion error than other methods.
Obeys a stochastic gradient descent.
The original NG algorithm
The original NG algorithm extended with the fuzzy concept
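The equations for the original NG algorithm are not reproduced in this transcript. As a rough illustration, the widely used rank-based NG update can be sketched as below: each prototype moves toward the sample v by a step that shrinks exponentially with the prototype's distance rank. The function name `ng_update`, the toy prototypes, and the values of `eps` and `lam` are assumptions made for this example; the paper's exact notation may differ.

```python
import numpy as np

def ng_update(W, v, eps, lam):
    """One neural gas step: move every prototype toward v, weighted by its rank."""
    # Squared Euclidean distance from the sample to each prototype.
    dists = np.sum((W - v) ** 2, axis=1)
    # ranks[i] = number of prototypes closer to v than prototype i (0 = winner).
    ranks = np.argsort(np.argsort(dists))
    # Neighborhood function h(k) = exp(-k / lam); the winner gets weight 1.
    h = np.exp(-ranks / lam)
    return W + eps * h[:, None] * (v - W)

# Toy data (hypothetical): three prototypes in 2-D and one input sample.
W = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])
v = np.array([0.2, 0.0])
W_new = ng_update(W, v, eps=0.5, lam=1.0)
```

Because every prototype is updated on every sample, not only the winner, a distant outlier also pulls all prototypes toward itself, which is the kind of adverse influence the enhanced framework is meant to reduce.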
Methodology - enhanced
Enhanced NG network framework
Eq. (3): explains the influence of outliers; it is a modification of Eq. (1).
Eq. (4): the new update formula, obtained from Eq. (3).
Eqs. (5), (6), (7): explain the parameters used in Eq. (4).
Methodology – MDL framework
The minimum description length (MDL) principle is employed as the performance measure.
Original MDL
MDL as used in this approach
Methodology - processes
Initialize
c synaptic weights W = {w1, w2, . . . , wc}, set randomly
βm is the middle value of β, used to control the acceleration of β's change
κ and η are the parameters used to calculate the MDL value
The initial training epoch number: m = 0
The initial iteration step number t in training epoch m: t = 1
The total iteration step number: iter = m · N + t
The maximum training epoch is set as Max_epoch
The epoch at which dislocated prototypes are relocated is defined as RP_epoch
The dataset for training is V = {v1, v2, . . . , vN}
Flowchart decisions (each with a Y and an N branch)
(a) If m < Max_epoch
(b) If the training set is not empty
(c) If the training stage is at RP_epoch
(d) For j = 1 to size(V)
(e) For j = 1 to size(Torelocate)
(f) If current utifactor value < previous utifactor value
Flowchart actions
Draw data from the training set and compute
Training epoch += 1
change
restore
End
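The epoch and iteration bookkeeping above (m, t, and iter = m · N + t) can be sketched as a small generator. This is only an assumed reading of decisions (a) and (b): each epoch draws every sample once, in random order, until the training set is empty. The prototype relocation at RP_epoch and the MDL evaluation are deliberately left out, and `training_schedule` and its arguments are hypothetical names.

```python
import random

def training_schedule(n_samples, max_epoch, seed=0):
    """Yield (m, t, iter, index) for every draw of the training loop."""
    rng = random.Random(seed)
    for m in range(max_epoch):             # (a) continue while m < Max_epoch
        remaining = list(range(n_samples))
        rng.shuffle(remaining)             # random presentation order per epoch
        t = 1                              # t restarts at 1 in every epoch
        while remaining:                   # (b) epoch ends when the set is empty
            index = remaining.pop()        # draw one sample without replacement
            yield m, t, m * n_samples + t, index
            t += 1

# Toy run (hypothetical sizes): N = 4 samples, Max_epoch = 2.
steps = list(training_schedule(n_samples=4, max_epoch=2))
```

With N = 4 and Max_epoch = 2, the total iteration counter runs from 1 to 8 while t returns to 1 at the start of the second epoch.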
Experiments
ENG is compared with 9 algorithms: HCM, FCM, NG, FPCM, CFCM-F1, CFCM-F2, HRC-FRC, AHCM, and AFCM.
Datasets: artificial (D1, D2) and UCI datasets.
Each clustering algorithm is run 10 times.
Parameter settings: εi = 0.8, εf = 0.05; λi = 10, λf = 0.01; βi = 50, βm = 10, βf = 0.01
κ = 2, η = 1e−4
Max_epoch = 10, RP_epoch = 5.
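The initial and final values (subscripts i and f) suggest parameters that are annealed over training. A minimal sketch follows, assuming the exponential interpolation commonly used with NG for ε and λ. The slides do not state the schedule, and β, which also has a middle value βm, presumably follows a different rule that is not modeled here. The function `anneal` and the dataset size N = 150 are hypothetical.

```python
def anneal(initial, final, step, total_steps):
    """Exponential interpolation from `initial` (step 0) to `final` (last step)."""
    return initial * (final / initial) ** (step / total_steps)

N = 150                       # hypothetical dataset size
total = 10 * N                # Max_epoch * N iteration steps in total

eps_start = anneal(0.8, 0.05, 0, total)            # learning rate at the start
eps_end = anneal(0.8, 0.05, total, total)          # learning rate at the end
lam_mid = anneal(10.0, 0.01, total // 2, total)    # neighborhood range halfway
```

Under this assumed schedule the neighborhood range λ has already dropped from 10 to about 0.32 halfway through training, so late updates affect almost only the winning prototype.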
Conclusion
ENG tackles several problems of PBC:
Sensitivity to initialization.
Sensitivity to the ordering of the input sequence.
Adverse influence of outliers.
Experimental results show that the proposed method outperforms several existing PBC algorithms.
The MDL framework can handle datasets in which compact and sparse clusters coexist.
Personal Comments
Advantage: a heuristic way to tackle the outlier problem.
Drawback
Application: clustering, classification