C. Ding, NMF => Unsupervised Clustering 1
(Semi-)Nonnegative Matrix Factorization and
K-means Clustering

Chris Ding, Lawrence Berkeley National Laboratory

with: Xiaofeng He (Lawrence Berkeley Nat'l Lab), Horst Simon (Lawrence Berkeley Nat'l Lab), Tao Li (Florida Int'l Univ.), Michael Jordan (UC Berkeley), Haesun Park (Georgia Tech)
Nonnegative Matrix Factorization (NMF)

Data matrix, n points in p dimensions:  X = (x_1, x_2, ..., x_n)

Each x_i is an image, document, webpage, etc.

Decomposition (low-rank approximation):  X ≈ FG^T

F = (f_1, f_2, ..., f_k),   G = (g_1, g_2, ..., g_k)

Nonnegative matrices:  X_ij ≥ 0,  F_ij ≥ 0,  G_ij ≥ 0
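Lee and Seung's multiplicative update algorithm (noted on the history slide) can be sketched in NumPy in this X ≈ FG^T notation. This is an illustrative sketch, not code from the talk; the eps guard against division by zero is an implementation convenience:

```python
import numpy as np

def nmf(X, k, iters=200, eps=1e-9, seed=0):
    """Frobenius-norm NMF, X ≈ F @ G.T with F, G >= 0 (Lee-Seung updates)."""
    rng = np.random.default_rng(seed)
    p, n = X.shape
    F = rng.random((p, k))
    G = rng.random((n, k))
    for _ in range(iters):
        F *= (X @ G) / (F @ (G.T @ G) + eps)    # multiplicative: F stays >= 0
        G *= (X.T @ F) / (G @ (F.T @ F) + eps)  # multiplicative: G stays >= 0
    return F, G

X = np.random.default_rng(1).random((20, 30))   # toy nonnegative data
F, G = nmf(X, k=5)
err = np.linalg.norm(X - F @ G.T)               # updates are non-increasing in this error
```

Each column g_j of G holds the coefficients of point x_j over the k basis vectors f_1, ..., f_k.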
Some historical notes

• Earlier work by statistics people (G. Golub)
• P. Paatero (1994), Environmetrics
• Lee and Seung (1999, 2000)
  – Parts of whole (no cancellation)
  – A multiplicative update algorithm
(Figure: a pixel vector — an image flattened into a single column of nonnegative gray-level values.)
Lee and Seung (1999): Parts-based Perspective

Original images:  X = (x_1, x_2, ..., x_n)

X ≈ FG^T,   F = (f_1, f_2, ..., f_k),   G = (g_1, g_2, ..., g_k)
“Parts of Whole” Picture

X ≈ FG^T,   F = (f_1, f_2, ..., f_k)

Straightforward NMF doesn't give the parts-based picture.
Several people explicitly sparsify F to get the parts-based picture (Li et al., 2001; Hoyer, 2003).
Donoho & Stodden (2003) study conditions for parts-of-whole.
Meanwhile ... a number of studies empirically show the usefulness of NMF for pattern discovery/clustering:
Xu et al. (SIGIR'03), Brunet et al. (PNAS'04), and many others.

We claim: NMF factors give holistic pictures of the data.
Our Experiments: NMF gives holistic pictures
Task: prove NMF is doing "data clustering"

NMF => K-means Clustering
NMF–K-means Theorem

min_{F ≥ 0, G ≥ 0, G^T G = I} ||X − FG^T||²   ⟺   min_{G ≥ 0, G^T G = I} Tr(X^T X − G^T X^T X G)

G-orthogonal NMF is equivalent to relaxed K-means clustering.
Proof: (Ding, He, Simon, SDM 2005)
K-means clustering

• Also called "isodata", "vector quantization"
• Developed in the 1960s (Lloyd, MacQueen, Hartigan, etc.)
• Computationally efficient (order mN)
• Most widely used in practice
  – Benchmark to evaluate other algorithms

Given n points in m dimensions:  X = (x_1, x_2, ..., x_n)

K-means objective:   min J_K = Σ_{k=1}^{K} Σ_{i ∈ C_k} ||x_i − c_k||²
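The objective can be evaluated directly; a small sketch (points stored as columns, matching the data-matrix convention used throughout the talk):

```python
import numpy as np

def kmeans_objective(X, labels):
    """J_K = sum over clusters k of sum_{i in C_k} ||x_i - c_k||^2.

    X: (m, n) data matrix with points as columns."""
    J = 0.0
    for k in np.unique(labels):
        pts = X[:, labels == k]
        c = pts.mean(axis=1, keepdims=True)   # cluster centroid c_k
        J += ((pts - c) ** 2).sum()
    return J

# Two 1-D points at 0 and 2 in one cluster: centroid at 1, J = 1 + 1 = 2.
X = np.array([[0.0, 2.0]])
J = kmeans_objective(X, np.array([0, 0]))
```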
Reformulate K-means Clustering

J_K = Σ_i ||x_i||² − Σ_{k=1}^{K} (1/n_k) Σ_{i,j ∈ C_k} x_i^T x_j

Cluster membership indicators:

h_k = (0, ..., 0, 1, ..., 1, 0, ..., 0)^T / n_k^{1/2}   (n_k ones, one per point in cluster k)

J_K = Σ_i ||x_i||² − Σ_{k=1}^{K} h_k^T X^T X h_k,   H = (h_1, ..., h_K)

Solving K-means  ⇒   max_{H ≥ 0, H^T H = I} Tr(H^T X^T X H)

(Zha, Ding, Gu, He, Simon, NIPS 2001) (Ding & He, ICML 2004)
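The identity between the centroid form and the indicator form can be checked numerically (a sanity check, not part of the talk):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 12))                      # 12 points in 5 dims, points as columns
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2])

# Direct objective: sum of squared distances to cluster centroids.
J_direct = sum(((X[:, labels == k]
                 - X[:, labels == k].mean(axis=1, keepdims=True)) ** 2).sum()
               for k in range(3))

# Indicator form: J_K = sum_i ||x_i||^2 - sum_k h_k^T X^T X h_k,
# with h_k the 0/1 indicator of cluster k scaled by 1/sqrt(n_k).
H = np.zeros((12, 3))
for k in range(3):
    idx = labels == k
    H[idx, k] = 1.0 / np.sqrt(idx.sum())
J_indicator = (X ** 2).sum() - np.trace(H.T @ X.T @ X @ H)
```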
Reformulate K-means Clustering

Cluster membership indicators, e.g. n = 7 points in 3 clusters:

H = (h_1, h_2, h_3),   H^T = [ 1 1 0 0 0 0 0 ]   (C1)
                             [ 0 0 1 1 1 0 0 ]   (C2)
                             [ 0 0 0 0 0 1 1 ]   (C3)

(shown unnormalized; each h_k is scaled by 1/√n_k as on the previous slide)
NMF–K-means Theorem (recap)

min_{F ≥ 0, G ≥ 0, G^T G = I} ||X − FG^T||²   ⟺   min_{G ≥ 0, G^T G = I} Tr(X^T X − G^T X^T X G)

G-orthogonal NMF is equivalent to relaxed K-means clustering.
Proof: (Ding, He, Simon, SDM 2005)
Kernel K-means Clustering

Map feature vectors to a higher-dimensional space:  x_i → φ(x_i)

Kernel K-means objective:

min J_K^φ = Σ_{k=1}^{K} Σ_{i ∈ C_k} ||φ(x_i) − φ(c_k)||²,   φ(c_k) ≡ (1/n_k) Σ_{i ∈ C_k} φ(x_i)

Kernel K-means optimization:

J_K^φ = Σ_i ||φ(x_i)||² − Σ_{k=1}^{K} (1/n_k) Σ_{i,j ∈ C_k} φ(x_i)^T φ(x_j)

Minimizing J_K^φ  ⟺   max Σ_{k=1}^{K} (1/n_k) Σ_{i,j ∈ C_k} ⟨φ(x_i), φ(x_j)⟩ = Tr(H^T W H)

where W = (⟨φ(x_i), φ(x_j)⟩) is the matrix of pairwise similarities.
Symmetric NMF

Symmetric nonnegative matrix W:   W ≈ HH^T

Symmetric NMF   min_{H ≥ 0, H^T H = I} ||W − HH^T||²   is equivalent to   max_{H ≥ 0, H^T H = I} Tr(H^T W H)

Orthogonal symmetric NMF is equivalent to kernel K-means clustering.
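A concrete sketch: one commonly used multiplicative rule for min ||W − HH^T||², H ≥ 0 — the β = 1/2 update below is borrowed from the symmetric-NMF literature, not stated on this slide — recovers the blocks of a two-block similarity matrix:

```python
import numpy as np

def sym_nmf(W, k, iters=500, beta=0.5, eps=1e-9, seed=0):
    """Approximate W ≈ H @ H.T with H >= 0 via a multiplicative rule
    (an assumed, commonly used heuristic; not specified in the talk)."""
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[0], k))
    for _ in range(iters):
        H *= (1 - beta) + beta * (W @ H) / (H @ (H.T @ H) + eps)
    return H

# Two-block similarity matrix: items 0-3 mutually similar, items 4-7 likewise.
W = np.zeros((8, 8))
W[:4, :4] = 1.0
W[4:, 4:] = 1.0
H = sym_nmf(W, k=2)
labels = H.argmax(axis=1)   # read the factor as (soft) cluster indicators
```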
Orthogonality in NMF

X = (x_1, x_2, ..., x_n),   H = (h_1, h_2)

• Strictly orthogonal G: hard clustering
• Non-orthogonal G: soft clustering — ambiguous/outlier points get mixed memberships
K-means Clustering Theorem

min_{G ≥ 0, G^T G = I} ||X_± − F_± G_+^T||²

G-orthogonal NMF is equivalent to relaxed K-means clustering.   (Ding, Li, Jordan, 2006)
The proof requires only G-orthogonality and nonnegativity.

F = (f_1, f_2, ..., f_k)  => cluster centroids
G = (g_1, g_2, ..., g_k)  => cluster indicators
NMF Generalizations

SVD:         X_± = F_± G_±^T = UΣV^T
Semi-NMF:    X_± = F_± G_+^T                 (Ding, Li, Jordan, 2006)
Convex-NMF:  X_± = X_± W_+ G_+^T             (Ding, Li, Jordan, 2006)
Kernel-NMF:  φ(X)_± = φ(X)_± W_+ G_+^T       (Ding, Li, Jordan, 2006)
Tri-NMF:     X_± = F_+ S_± G_+^T             (Ding, Li, Peng, Park, KDD 2006)
Semi-NMF:   X_± = F_± G_+^T   (Ding, Li, Jordan, 2006)

• For any mixed-sign input data (e.g. centered data)
• Clustering and low-rank approximation

min ||X − FG^T||²

Update F:   F = XG(G^T G)^{-1}

Update G:   G_ik ← G_ik · sqrt( ((X^T F)^+_ik + [G(F^T F)^-]_ik) / ((X^T F)^-_ik + [G(F^T F)^+]_ik) )

where A^+ = (|A| + A)/2 and A^- = (|A| − A)/2 denote the positive and negative parts of a matrix.
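The two updates translate directly to NumPy; a sketch under the slide's notation (the eps guard and the random initialization are implementation choices):

```python
import numpy as np

def semi_nmf(X, k, iters=200, eps=1e-9, seed=0):
    """Semi-NMF X_± ≈ F_± G_+^T via the slide's alternating updates (a sketch)."""
    rng = np.random.default_rng(seed)
    p, n = X.shape
    G = rng.random((n, k)) + 0.1          # nonnegative indicator-like factor
    pos = lambda A: (np.abs(A) + A) / 2   # A^+
    neg = lambda A: (np.abs(A) - A) / 2   # A^-
    for _ in range(iters):
        F = X @ G @ np.linalg.inv(G.T @ G)            # unconstrained least-squares F
        A, B = X.T @ F, F.T @ F
        G *= np.sqrt((pos(A) + G @ neg(B) + eps)
                     / (neg(A) + G @ pos(B) + eps))   # keeps G >= 0
    return F, G

X = np.random.default_rng(2).normal(size=(10, 40))    # mixed-sign data
F, G = semi_nmf(X, k=3)
err = np.linalg.norm(X - F @ G.T)
```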
Convex-NMF

In NMF, X_+ = F_+ G_+^T; in Semi-NMF, X_± = F_± G_+^T.
In Semi-NMF, F = (f_1, f_2, ..., f_k) lives in a large space.

For a factor f_k to capture the notion of a cluster centroid, require f_k to be a convex combination of the input data:

f_k = w_1k x_1 + ... + w_nk x_n,   i.e.   F = X W_+

Convex-NMF:   X_± = X_± W_+ G_+^T   (Ding, Li, Jordan, 2006)

(For F interpretability one can also take F = X W_±, an affine combination.)
Convex-NMF: Computing Algorithm

X_± = X_± W_+ G_+^T,   min ||X − XWG^T||²

Update W:   W_ik ← W_ik · sqrt( ([(X^T X)^+ G]_ik + [(X^T X)^- W G^T G]_ik) / ([(X^T X)^- G]_ik + [(X^T X)^+ W G^T G]_ik) )

Update G:   G_ik ← G_ik · sqrt( ([(X^T X)^+ W]_ik + [G W^T (X^T X)^- W]_ik) / ([(X^T X)^- W]_ik + [G W^T (X^T X)^+ W]_ik) )

(Ding, Li, Jordan, 2006)
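A sketch of these updates; note the algorithm touches X only through Y = X^T X, which is what makes the kernel extension possible later:

```python
import numpy as np

def convex_nmf(X, k, iters=300, eps=1e-9, seed=0):
    """Convex-NMF X ≈ X W G^T (W, G >= 0) using the slide's updates (a sketch)."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    W = rng.random((n, k)) + 0.1
    G = rng.random((n, k)) + 0.1
    Y = X.T @ X                                   # data enters only via X^T X
    Yp, Yn = (np.abs(Y) + Y) / 2, (np.abs(Y) - Y) / 2
    for _ in range(iters):
        G *= np.sqrt((Yp @ W + G @ (W.T @ Yn @ W) + eps)
                     / (Yn @ W + G @ (W.T @ Yp @ W) + eps))
        W *= np.sqrt((Yp @ G + Yn @ W @ (G.T @ G) + eps)
                     / (Yn @ G + Yp @ W @ (G.T @ G) + eps))
    return W, G

X = np.random.default_rng(3).normal(size=(8, 30))
W, G = convex_nmf(X, k=3)
err = np.linalg.norm(X - X @ W @ G.T)
```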
(Figures, two slides: Semi-NMF factors vs. Convex-NMF factors on image data.)
Sparsity of Convex-NMF

- Sparse factorization is a recent trend.
- Sparsity is usually explicitly enforced.
- Convex-NMF factors are naturally sparse.

||X − XWG^T||_F² = ||X(I − WG^T)||² = Σ_k σ_k² ||v_k^T (I − WG^T)||²

(σ_k, v_k: singular values and right singular vectors of X)

Consider   ||I − WG^T||² = Σ_k ||e_k^T (I − WG^T)||².   The solution is

G = W = (e_1, ..., e_k)   (unit vectors: a single 1 per column, zeros elsewhere)

From this we infer that convex-NMF factors are naturally sparse.
A Simple Example

Data: two clusters of points,  x x x x x x x x (cluster 1)  |  x x x x x x x x (cluster 2)

||F_semi − C_Kmeans|| = 0.53,   ||F_convex − C_Kmeans|| = 0.08

||X − FG^T||:   SVD 0.27940,   Semi 0.27944,   Convex 0.30877
Experiments on 7 datasets
NMF variants always perform better than K-means
Kernel-NMF: Generalized Convex-NMF

Map feature vectors to a higher-dimensional space:  x_i → φ(x_i),   φ(X) = [φ(x_1), φ(x_2), ..., φ(x_n)]

NMF/semi-NMF   φ(X) = FG^T   depends on the explicit mapping function φ(·).

Kernel-NMF:   φ(X) = [φ(X)W]G^T

The minimization objective depends on the kernel only:

||φ(X) − φ(X)WG^T||² = Tr[(I − GW^T) φ(X)^T φ(X) (I − WG^T)]

(Ding & He, ICML 2004)
NMF and PLSI: Equivalence

So far we used only the Frobenius norm as the NMF objective function. Another objective is the KL divergence.
Kernel-NMF Algorithm

The convex-NMF updates with X^T X replaced by the kernel matrix K = φ(X)^T φ(X):

Update W:   W_ik ← W_ik · sqrt( ([K^+ G]_ik + [K^- W G^T G]_ik) / ([K^- G]_ik + [K^+ W G^T G]_ik) )

Update G:   G_ik ← G_ik · sqrt( ([K^+ W]_ik + [G W^T K^- W]_ik) / ([K^- W]_ik + [G W^T K^+ W]_ik) )

The computing algorithm depends only on the kernel K = φ(X)^T φ(X).   (Ding, Li, Jordan, 2006)
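Since the updates touch the data only through K, the same code runs on any positive-semidefinite kernel matrix; the RBF kernel below is an illustrative choice, not from the talk:

```python
import numpy as np

def kernel_nmf(K, k, iters=300, eps=1e-9, seed=0):
    """Kernel-NMF: convex-NMF updates driven only by a kernel matrix K,
    which plays the role of phi(X)^T phi(X) (a sketch)."""
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    W = rng.random((n, k)) + 0.1
    G = rng.random((n, k)) + 0.1
    Kp, Kn = (np.abs(K) + K) / 2, (np.abs(K) - K) / 2
    for _ in range(iters):
        G *= np.sqrt((Kp @ W + G @ (W.T @ Kn @ W) + eps)
                     / (Kn @ W + G @ (W.T @ Kp @ W) + eps))
        W *= np.sqrt((Kp @ G + Kn @ W @ (G.T @ G) + eps)
                     / (Kn @ G + Kp @ W @ (G.T @ G) + eps))
    return W, G

# Example: an RBF kernel over 1-D points forming two well-separated clusters.
x = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 5.2])
K = np.exp(-(x[:, None] - x[None, :]) ** 2)
W, G = kernel_nmf(K, k=2)
labels = G.argmax(axis=1)
```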
Orthogonal Nonnegative Tri-Factorization

min_{F ≥ 0, F^T F = I, G ≥ 0, G^T G = I} ||X − F S G^T||²

3-factor NMF with explicit orthogonality constraints:
simultaneous K-means clustering of the rows and columns of X.

F = (f_1, f_2, ..., f_k)  => row cluster indicators
G = (g_1, g_2, ..., g_k)  => column cluster indicators

1. The solution is unique.
2. It cannot be reduced to 2-factor NMF.

(Ding, Li, Peng, Park, KDD 2006)
NMF-like algorithms are different ways to relax F, G!

X = (x_1, x_2, ..., x_n)  = input data
F = (f_1, f_2, ..., f_k)  = cluster centroids
G = (g_1, g_2, ..., g_k)  = cluster indicators

K-means clustering objective function:

J_K = Σ_{k=1}^{K} Σ_{i ∈ C_k} ||x_i − f_k||² = Σ_{k=1}^{K} Σ_i ||x_i − f_k||² g_ik = ||X − FG^T||²

With f_k = X g_k / n_k and D_n = diag(n_1, ..., n_k):

J_K = ||X − X G D_n^{-1} G^T||² = ||X − X G̃ G̃^T||²,   G̃ = G D_n^{-1/2},   G̃^T G̃ = I
NMF ⟺ PLSI

NMF objective functions:
• Frobenius norm
• KL-divergence:   J_NMF-KL = Σ_{i=1}^{m} Σ_{j=1}^{n} [ x_ij log( x_ij / (FG^T)_ij ) − x_ij + (FG^T)_ij ]

Probabilistic LSI (Hofmann, 1999) is a latent variable model for clustering:

J_PLSI = Σ_{i=1}^{m} Σ_{j=1}^{n} x_ij log p(w_i, d_j),   p(w_i, d_j) = Σ_k p(w_i | z_k) p(z_k) p(d_j | z_k)

We can show:   J_PLSI = −J_NMF-KL + constant   (Ding, Li, Peng, AAAI 2006)
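A quick check of the KL objective (a sanity check, not from the talk): it vanishes exactly when X = FG^T and is positive otherwise:

```python
import numpy as np

def j_kl(X, F, G, eps=1e-12):
    """KL-divergence NMF objective:
    J = sum_ij [ x_ij log(x_ij / (FG^T)_ij) - x_ij + (FG^T)_ij ]."""
    R = F @ G.T
    return np.sum(X * np.log((X + eps) / (R + eps)) - X + R)

rng = np.random.default_rng(4)
F = rng.random((6, 2))
G = rng.random((9, 2))
X = F @ G.T               # by construction, an exact factorization
```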
Summary

• NMF is doing K-means clustering (or PLSI)
• Interpretability is key to motivating new NMF-like factorizations
  – Semi-NMF, Convex-NMF, Kernel-NMF, Tri-NMF
• NMF-like algorithms consistently outperform K-means clustering in our experiments
• Advantage: hard/soft clustering
• Convex-NMF enforces the notion of cluster centroids and is naturally sparse

NMF: a new, rich paradigm for unsupervised learning
References

• Chris Ding, Xiaofeng He, Horst Simon. On the Equivalence of Nonnegative Matrix Factorization and K-means/Spectral Clustering. SDM 2005.
• Chris Ding, Tao Li, Michael Jordan. Convex and Semi-Nonnegative Matrix Factorization. Submitted.
• Chris Ding, Tao Li, Wei Peng, Haesun Park. Orthogonal Non-negative Matrix Tri-Factorization for Clustering. KDD 2006.
• Chris Ding, Tao Li, Wei Peng. Nonnegative Matrix Factorization and Probabilistic Latent Semantic Indexing: Equivalence, Chi-square and a Hybrid Algorithm. AAAI 2006.
Data Clustering: NMF and PCA

min_{G ≥ 0, G^T G = I} ||X_± − F_± G_+^T||²

F = (f_1, f_2, ..., f_k)  => cluster centroids
G = (g_1, g_2, ..., g_k)  => cluster indicators

NMF is useful due to nonnegativity; the clustering equivalence needs only G-orthogonality and nonnegativity.
What happens if we ignore nonnegativity?
K-means clustering ⟺ PCA

Ignore nonnegativity => allow an orthonormal transform R:

min_{G ≥ 0, G^T G = I} ||X_± − (F_± R)(G_+ R)^T||²

Equivalent to   max_{GR} Tr[(GR)^T X^T X (GR)]

The solution is given by the SVD:   X = UΣV^T,   FR = U,   GR = V

Centroid subspace projection:    FF^T = (FR)(FR)^T = UU^T
Cluster indicator projection:    GG^T = (GR)(GR)^T = VV^T

PCA/SVD is automatically doing K-means clustering.   (Ding & He, ICML 2004)
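The claim can be sanity-checked for K = 2: the sign pattern of the top right singular vector of the centered data reproduces the 2-means partition on well-separated clusters (a constructed example, not from the talk):

```python
import numpy as np

# Two well-separated clusters in 2-D, points as columns.
rng = np.random.default_rng(5)
A = rng.normal(size=(2, 10)) * 0.1          # cluster near the origin
B = rng.normal(size=(2, 10)) * 0.1 + 8.0    # cluster near (8, 8)
X = np.hstack([A, B])
Xc = X - X.mean(axis=1, keepdims=True)      # center the data

# Relaxed 2-means indicator = top right singular vector of centered X;
# its sign splits the two clusters (up to an overall sign flip).
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
split = Vt[0] > 0
```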
C. Ding, NMF => Unsupervised Clustering 42
NMF = Spectral Clustering (Normalized Cut)
Normalized Cut ⇒
cluster indicators:
))~((
)~()~(),,( 111
YWIY
yWIyyWIyyyJT
kTk
Tk
−=
−++−=
TrNcut LL
Re-write:}
||||/)00,11,00( 2/12/1k
Tn
k hDDyk
LLL=
IYYYW T =tosubject :Optimize ),~YTY
Tr(max
2/12/1~ −−= WDDW
2
0,||~||min T
HIHHHHW
T−
≥=
kTk
kTk
T
T
k DhhhWDh
DhhhWDhhhJ
)()(),,(11
111
−++−= LLNcut
(Gu , et al, 2001)
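The normalized-cut relaxation can be sketched directly: form W̃ = D^{-1/2} W D^{-1/2} and split on its second eigenvector. The small graph below is a constructed example, and the spectral shortcut stands in for the symmetric-NMF solver:

```python
import numpy as np

# A small graph with two communities: dense within, one weak link across.
W = np.zeros((6, 6))
W[:3, :3] = 1.0
W[3:, 3:] = 1.0
np.fill_diagonal(W, 0.0)
W[2, 3] = W[3, 2] = 0.1                     # weak cross-community edge

d_isqrt = 1.0 / np.sqrt(W.sum(axis=1))
W_tilde = d_isqrt[:, None] * W * d_isqrt[None, :]   # D^{-1/2} W D^{-1/2}

# Relaxed Ncut: top eigenvectors of W_tilde; the second one splits the graph.
vals, vecs = np.linalg.eigh(W_tilde)        # eigh sorts eigenvalues ascending
split = vecs[:, -2] > 0
```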