the multilayer perceptron - Örebro...

Self-organizing systems

Upload: vanthien

Post on 01-Apr-2018


Page 1: The multilayer perceptron - Örebro Universityaass.oru.se/~lilien/ml/lectures/ML_2007_Lecture_6.pdf · Matlab code available at ... Hierarchical clustering • Agglomerative: Start

Self-organizing systems

Self-organizing

• No supervisor available.
• Instead, try to find order/structure in the environment.
  – Clusters (i.e. data is not homogeneously distributed)
  – Directions (i.e. projections that carry more information than others)


Finding projections


Auto-encoder MLP

Output = input

Input

The auto-encoder is trained to reproduce its input at the output through a “bottleneck”, so it must find an efficient coding. Train using standard training algorithms.

Will lead to ~ principal components.

2-1-2 MLP with linear output but nonlinear input, trained to reproduce the input data.

The line shows the direction of w, the weight vector for the hidden unit.

Principal components (Karhunen-Loève transform)

Task: Find a linear recoding of the data that preserves as much information as possible (”information” = variance in the signal)

⇒ Principal components

Principal components

$$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_D \end{pmatrix} = x_1 \begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} + x_2 \begin{pmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{pmatrix} + \cdots \equiv x_1 \mathbf{e}_1 + x_2 \mathbf{e}_2 + \cdots + x_D \mathbf{e}_D$$

Find a new ON (orthonormal) basis Q with M < D

$$\mathbf{x} \approx \hat{\mathbf{x}} = z_1 \mathbf{q}_1 + z_2 \mathbf{q}_2 + \cdots + z_M \mathbf{q}_M, \qquad \mathbf{z} = \begin{pmatrix} z_1 \\ z_2 \\ \vdots \\ z_M \end{pmatrix}$$

such that

$$\sum_n \left[ \mathbf{x}(n) - \hat{\mathbf{x}}(n) \right]^2$$

is minimized.

Reminder on change of basis

$$z_i = \mathbf{x}^T \mathbf{q}_i$$

The coefficient z_i is given by the scalar product of x and the basis vector q_i.
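
A minimal sketch of the change of basis, assuming a hypothetical 2-D orthonormal basis (the standard basis rotated by 30 degrees) and an arbitrary example vector; neither comes from the slides. It computes each coefficient as a scalar product and checks that keeping all basis vectors reproduces x.

```python
import math

def dot(a, b):
    """Scalar product of two equal-length vectors."""
    return sum(ai * bi for ai, bi in zip(a, b))

# Hypothetical ON basis: the standard 2-D basis rotated by 30 degrees.
theta = math.radians(30.0)
q1 = (math.cos(theta), math.sin(theta))
q2 = (-math.sin(theta), math.cos(theta))

x = (2.0, 1.0)                   # arbitrary example vector
z1, z2 = dot(x, q1), dot(x, q2)  # z_i = x^T q_i

# With all D = 2 basis vectors kept, the expansion reproduces x.
x_rec = tuple(z1 * a + z2 * b for a, b in zip(q1, q2))
print(z1, z2, x_rec)
```

Dropping one of the two terms in `x_rec` gives the approximation with M = 1.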

Suppose M = D-1

$$\sum_n \left[\mathbf{x}(n) - \hat{\mathbf{x}}(n)\right]^2 = \sum_n \left[z_D(n)\,\mathbf{q}_D\right]^2 = \sum_n z_D^2(n) = \sum_n \left[\mathbf{x}^T(n)\,\mathbf{q}_D\right]^2$$

$$= \sum_n \mathbf{q}_D^T\,\mathbf{x}(n)\,\mathbf{x}^T(n)\,\mathbf{q}_D = \mathbf{q}_D^T \left[\sum_n \mathbf{x}(n)\,\mathbf{x}^T(n)\right] \mathbf{q}_D$$

$$= \mathbf{q}_D^T \left[\sum_n \begin{pmatrix} x_1^2(n) & x_1(n)x_2(n) & \cdots & x_1(n)x_D(n) \\ x_2(n)x_1(n) & x_2^2(n) & \cdots & x_2(n)x_D(n) \\ \vdots & \vdots & \ddots & \vdots \\ x_D(n)x_1(n) & x_D(n)x_2(n) & \cdots & x_D^2(n) \end{pmatrix}\right] \mathbf{q}_D$$

$$\sum_n \left[\mathbf{x}(n) - \hat{\mathbf{x}}(n)\right]^2 = N\,\mathbf{q}_D^T\,\mathbf{R}\,\mathbf{q}_D$$

where

$$\mathbf{R} = \frac{1}{N}\sum_n \begin{pmatrix} x_1^2(n) & x_1(n)x_2(n) & \cdots & x_1(n)x_D(n) \\ x_2(n)x_1(n) & x_2^2(n) & \cdots & x_2(n)x_D(n) \\ \vdots & \vdots & \ddots & \vdots \\ x_D(n)x_1(n) & x_D(n)x_2(n) & \cdots & x_D^2(n) \end{pmatrix}$$

is the correlation matrix

We can guarantee minimum loss if we choose q_D to be the eigenvector of R with minimum eigenvalue

$$\mathbf{R}\,\mathbf{q}_D = \lambda_D\,\mathbf{q}_D$$

$$\sum_n \left[\mathbf{x}(n) - \hat{\mathbf{x}}(n)\right]^2 = N\,\mathbf{q}_D^T\,\mathbf{R}\,\mathbf{q}_D = N\lambda_D$$

A zero eigenvalue means no loss.
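
The relation between the reconstruction loss and the smallest eigenvalue can be checked numerically. The sketch below is an illustration under stated assumptions: the six 2-D points are made up, and `eig_sym_2x2` is a hypothetical helper that uses the closed-form eigen-decomposition of a symmetric 2x2 matrix. It builds R, drops the direction with the smallest eigenvalue, and compares the summed squared error with N times that eigenvalue.

```python
import math

def eig_sym_2x2(a, b, c):
    """Eigenpairs of [[a, b], [b, c]], smallest eigenvalue first."""
    if abs(b) < 1e-15:
        # Diagonal matrix: the eigenvectors are the coordinate axes.
        return sorted([(a, (1.0, 0.0)), (c, (0.0, 1.0))])
    mid = 0.5 * (a + c)
    rad = math.hypot(0.5 * (a - c), b)
    pairs = []
    for lam in (mid - rad, mid + rad):
        norm = math.hypot(b, lam - a)      # (b, lam - a) solves (A - lam I) v = 0
        pairs.append((lam, (b / norm, (lam - a) / norm)))
    return pairs

# Made-up example data (assumption, not from the slides).
data = [(2.0, 1.9), (-1.0, -1.2), (0.5, 0.4), (-1.5, -1.1), (3.0, 3.2), (-3.0, -3.2)]
N = len(data)

# Correlation matrix R = (1/N) sum_n x(n) x(n)^T
r11 = sum(x1 * x1 for x1, x2 in data) / N
r12 = sum(x1 * x2 for x1, x2 in data) / N
r22 = sum(x2 * x2 for x1, x2 in data) / N

(lam_min, q_min), (lam_max, q_max) = eig_sym_2x2(r11, r12, r22)

# Keep only q_max (M = D - 1 = 1) and sum the squared reconstruction error.
loss = 0.0
for x1, x2 in data:
    z = x1 * q_max[0] + x2 * q_max[1]
    loss += (x1 - z * q_max[0]) ** 2 + (x2 - z * q_max[1]) ** 2

print(loss, N * lam_min)   # the two numbers should agree up to rounding
```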


Principal components

• Choose new basis Q of ON eigenvectors of the correlation matrix R.

• Discard basis vectors in increasing order of their eigenvalues (i.e. throw away smallest eigenvalues first)

• Can also be done with the eigenvectors of the covariance matrix Σ. (Identical to the correlation matrix if data has zero mean.)
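
The recipe above could be sketched as follows for data of any dimension. This is not the lecture's code: it uses the covariance matrix, and it finds eigenvectors by power iteration with deflation, which is a technique swapped in here to stay within the Python standard library. The function name `pca`, the start vector, and the iteration count are all assumptions.

```python
def pca(data, m, iters=1000):
    """Return (mean, m leading eigenvectors, their eigenvalues) of the covariance."""
    n, d = len(data), len(data[0])
    mu = [sum(row[i] for row in data) / n for i in range(d)]
    centered = [[row[i] - mu[i] for i in range(d)] for row in data]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    components, eigenvalues = [], []
    for _ in range(m):
        q = [1.0 / (i + 1) for i in range(d)]   # arbitrary start vector (assumption)
        for _ in range(iters):
            v = [sum(cov[i][j] * q[j] for j in range(d)) for i in range(d)]
            norm = sum(vi * vi for vi in v) ** 0.5
            if norm == 0.0:
                break
            q = [vi / norm for vi in v]
        lam = sum(q[i] * cov[i][j] * q[j] for i in range(d) for j in range(d))
        components.append(q)
        eigenvalues.append(lam)
        # Deflation: remove the direction just found, so the next pass
        # converges to the eigenvector with the next largest eigenvalue.
        cov = [[cov[i][j] - lam * q[i] * q[j] for j in range(d)] for i in range(d)]
    return mu, components, eigenvalues

# Made-up example data (assumption, not from the slides).
data = [(2.0, 1.9), (-1.0, -1.2), (0.5, 0.4), (-1.5, -1.1), (3.0, 3.2), (-3.0, -3.2)]
mu, components, eigenvalues = pca(data, 2)
print(components, eigenvalues)
```

Power iteration can converge slowly when two eigenvalues are close, and it fails if the start vector happens to be orthogonal to the wanted eigenvector; a library eigen-solver avoids both issues.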

Covariance matrix

$$\boldsymbol{\Sigma} = \begin{pmatrix} \sigma_{11} & \sigma_{12} & \cdots & \sigma_{1D} \\ \sigma_{21} & \sigma_{22} & \cdots & \sigma_{2D} \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{D1} & \sigma_{D2} & \cdots & \sigma_{DD} \end{pmatrix}$$

$$\sigma_{ij} = \frac{1}{N-1}\sum_n \left[x_i(n) - \mu_i\right]\left[x_j(n) - \mu_j\right]$$

$$\mu_i = \frac{1}{N}\sum_n x_i(n), \qquad \sigma_{ii} = \sigma_i^2 = \frac{1}{N-1}\sum_n \left[x_i(n) - \mu_i\right]^2$$

Covariance matrix

$$\sigma_{ij} = \frac{1}{N-1}\sum_n \left[x_i(n) - \mu_i\right]\left[x_j(n) - \mu_j\right]$$

$$= \frac{1}{N-1}\sum_n x_i(n)\,x_j(n) - \frac{2N}{N-1}\,\mu_i\mu_j + \frac{N}{N-1}\,\mu_i\mu_j$$

$$= \frac{1}{N-1}\sum_n x_i(n)\,x_j(n) - \frac{N}{N-1}\,\mu_i\mu_j$$

$$\boldsymbol{\Sigma} \approx \mathbf{R} - \boldsymbol{\mu}\boldsymbol{\mu}^T$$
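
As a worked check of this relation, the sketch below uses four made-up 2-D points (an assumption for illustration). It computes the covariance from its definition and compares it with R − µµᵀ; the two differ only by the factor N/(N−1), which is why the slide writes ≈.

```python
# Made-up example data (assumption, not from the slides).
data = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 4.0)]
N, D = len(data), len(data[0])

mu = [sum(x[i] for x in data) / N for i in range(D)]

# Correlation matrix R = (1/N) sum_n x(n) x(n)^T
R = [[sum(x[i] * x[j] for x in data) / N for j in range(D)] for i in range(D)]

# Covariance from the definition, with the 1/(N-1) normalisation.
cov = [[sum((x[i] - mu[i]) * (x[j] - mu[j]) for x in data) / (N - 1)
        for j in range(D)] for i in range(D)]

# R - mu mu^T, and the same quantity rescaled by N/(N-1).
approx = [[R[i][j] - mu[i] * mu[j] for j in range(D)] for i in range(D)]
exact = [[N / (N - 1) * approx[i][j] for j in range(D)] for i in range(D)]

print(cov)
print(approx)
print(exact)
```

For large N the factor N/(N−1) approaches 1 and the approximation becomes tight.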

Principal components

$$\boldsymbol{\Sigma}\,\mathbf{q}_i = \lambda_i\,\mathbf{q}_i$$

Express the data in the new basis Q, with basis vectors that are eigenvectors of the data covariance matrix.

[Figure: scatter plot of the data in the (x1, x2) plane. The red lines show the directions of the two eigenvectors.]

Variance along eigendirection (zero-mean data)

$$\sigma_k^2 = \frac{1}{N-1}\sum_n \left[\mathbf{x}^T(n)\,\mathbf{q}_k - \mu_k\right]^2 = \frac{1}{N-1}\sum_n \left[\mathbf{x}^T(n)\,\mathbf{q}_k\right]^2 = \frac{N}{N-1}\,\lambda_k$$

[Figure: scatter plot of the data in the (x1, x2) plane.]

PCA example: NIR spectra of meat

[Figure: four panels, each with a horizontal axis from 0 to 100: "20 NIR spectra", "Same 20 demeaned", "...and rescaled", "Grouped in fat%".]

Each curve is a point in 100-dimensional space.

NIR: The 9 leading eigenvectors

[Figure: nine panels, one per eigenvector, each with a horizontal axis from 0 to 100.]

λi = 2.4308, 0.9372, 0.0489, 0.0256, 0.0108, 0.0023, 0.0014, 0.0002, 0.0001
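
The listed eigenvalues show how quickly the variance concentrates in the first few directions. The sketch below computes the cumulative share captured by the first k components, assuming the nine listed values account for essentially all of the variance (the slide lists only the leading nine, so the remaining ones are treated as negligible here).

```python
# Eigenvalues as listed on the slide.
eigenvalues = [2.4308, 0.9372, 0.0489, 0.0256, 0.0108,
               0.0023, 0.0014, 0.0002, 0.0001]

total = sum(eigenvalues)   # assumption: the unlisted eigenvalues are negligible

running = 0.0
fractions = []
for lam in eigenvalues:
    running += lam
    fractions.append(running / total)

for k, frac in enumerate(fractions, start=1):
    print(f"first {k} PCs: {100 * frac:.2f}% of the listed variance")
```

Under that assumption the first component carries roughly 70% of the variance and the first two together about 97%.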

NIR reconstruction with PCA

[Figure: nine panels, each with a horizontal axis from 0 to 100. The first panel is untitled; the others are titled "1:st PC", "1,2 PC", "1,2,3 PC", "1,2,3,4 PC", "1-5 PC", "1-6 PC", "1-7 PC", "1-8 PC".]

z = (2.64, 0.01, 2.35, -9.24, 0.66, -0.23, 0.71, -0.34, 0.06)

The first eigenvector for the Lego data covariance matrix.

The line shows the direction of the eigenvector.

= The first principal direction.

In this case, the first principal direction is good for doing the classification

2-1-2 MLP autoencoder with linear output but nonlinear input, trained to reproduce the input data.

The line shows the direction of w, the weight vector for the hidden unit.


PCA application: image compression

Original image

PCA (KL) basis estimated from 12x12 patches (144 dim).


Recoded image using 10% of PCA basis for each 12x12 patch.


Recoded image using 50% of PCA basis for each 12x12 patch.


Original image

PCA application: Eigenfaces

• Images are high-dimensional data with high correlation (faces look quite similar after all: the eyes are located above the nose, the mouth below the nose, hair on top, etc.)

• Reduce the dimensionality of the face image database by using PCA.

• Requires that the face is centered in the image and that the individual is looking into the camera (i.e. same pose all the time).

M. Turk, A. Pentland, "Eigenfaces for Recognition", Journal of Cognitive Neuroscience, vol. 3, no. 1, pp. 71-86, 1991.


Images from the ORL database (http://www.cam-orl.co.uk/facedatabase.html)

Large λ

Medium λ

Small λ

Eigenvectors (”eigenfaces”) when different subsets of 200 face images are used to compute PCA


• You need only 10-20 eigenfaces to do a reliable identification.

• Compare with dimension of original image.

http://cnx.org/


PCA not always an optimal projection for classification


Auto-encoder applications

Output = input

Input

• Induction motor failure detection (Siemens). Input: Power spectrum of electrical current.

• Failure prediction in helicopter gear boxes (US Navy). Input: Vibration spectrum of gear box.

• Bank note rejection (and acceptance) at automatic vending machines (U. Firenze). Input: Reflected and transmitted light along bank note.


PCA ≠ Autoencoder

The PCA basis can represent data in a subspace that extends infinitely.

The MLP autoencoder reliably represents data in a lower-dimensional subspace and in a limited region. This is due to sigmoid functions that saturate.


Nonlinear autoencoder

Output = input

Has been very difficult to train. Now ”solved” by using smart ”pretraining” (a ”Boltzmann machine”).

Matlab code available at http://www.cs.toronto.edu/~hinton/MatlabForSciencePaper.html

Input

Hinton & Salakhutdinov, Science, 313, pp. 504-507, 2006

Nonlinear autoencoder

Original

6-dim nonlin. autoencoder

6-dim lin. autoencoder

6-dim linear PCA

Original

30-dim nonlin. autoencoder

30-dim PCA

Hinton & Salakhutdinov, Science, 313, pp. 504-507, 2006

Visualization of newswire stories

2D nonlinear autoenc.

2D latent semantic analysis

Hinton & Salakhutdinov, Science, 313, pp. 504-507, 2006

PCA with kernels (cf. SVM)

Map to a high-dimensional space and compute PCA there.

Can be done with kernels.

Figure from ftp://ftp.research.microsoft.com/users/mtipping/skpca_nips.ps.gz


ICA

• ICA = Independent component analysis

• PCA computes eigenvectors of covariance matrix (2nd-order statistics)

• ICA looks at higher-order statistics and finds ”independent” components.


Clustering

k-means clustering

For K “cluster vectors” minimize

$$E = \frac{1}{2}\sum_{n=1}^{N}\sum_{k=1}^{K} \Lambda_k[\mathbf{x}(n)]\,\|\mathbf{x}(n) - \mathbf{w}_k\|^2$$

The ”distortion”

$$\Lambda_k[\mathbf{x}(n)] = \begin{cases} 1 & \text{if } \mathbf{w}_k \text{ is closest to } \mathbf{x}(n) \\ 0 & \text{otherwise} \end{cases}$$

Λ is an ”assignment function”
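
A small sketch of the distortion and the assignment function, with made-up points and cluster vectors (assumptions for illustration). The helper `closest` plays the role of Λ: it returns the index k for which Λ_k = 1.

```python
def sq_dist(x, w):
    """Squared Euclidean distance ||x - w||^2."""
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w))

def closest(x, centers):
    """Index k of the cluster vector closest to x (where Lambda_k = 1)."""
    return min(range(len(centers)), key=lambda k: sq_dist(x, centers[k]))

def distortion(data, centers):
    """E = 1/2 * sum_n ||x(n) - w_closest||^2."""
    return 0.5 * sum(sq_dist(x, centers[closest(x, centers)]) for x in data)

# Made-up example (assumption, not from the slides).
data = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
centers = [(0.0, 1.0), (10.0, 1.0)]

E = distortion(data, centers)
print(E)
```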

k-means update

Batch:

$$\mathbf{w}_k(t+1) = \mathbf{w}_k(t) + \eta \sum_{n=1}^{N} \Lambda_k[\mathbf{x}(n)]\,\left[\mathbf{x}(n) - \mathbf{w}_k(t)\right]$$

On-line:

$$\mathbf{w}_k(t+1) = \begin{cases} (1-\eta)\,\mathbf{w}_k(t) + \eta\,\mathbf{x}(n) & \text{for closest } \mathbf{w}_k \\ \mathbf{w}_k(t) & \text{otherwise} \end{cases}$$

k-means can be done in batch and on-line mode. Often on-line.
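
Both update modes could be sketched as below. The function names, the example points, and the step size are assumptions; the update rules follow the batch and on-line formulas above.

```python
def sq_dist(x, w):
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w))

def closest(x, centers):
    return min(range(len(centers)), key=lambda k: sq_dist(x, centers[k]))

def online_step(centers, x, eta):
    """Move only the closest cluster vector: w <- (1 - eta) w + eta x."""
    j = closest(x, centers)
    centers[j] = [(1.0 - eta) * wi + eta * xi for wi, xi in zip(centers[j], x)]
    return j

def batch_step(centers, data, eta):
    """w_k <- w_k + eta * sum of (x - w_k) over the points assigned to k."""
    d = len(centers[0])
    delta = [[0.0] * d for _ in centers]
    for x in data:
        j = closest(x, centers)            # assignments use the old centers
        for i in range(d):
            delta[j][i] += x[i] - centers[j][i]
    return [[w[i] + eta * dl[i] for i in range(d)] for w, dl in zip(centers, delta)]

# Made-up example (assumption, not from the slides).
centers = [[0.0, 0.0], [10.0, 10.0]]
moved = online_step(centers, (1.0, 1.0), eta=0.5)

batch = batch_step([[0.0, 0.0], [10.0, 10.0]],
                   [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0)], eta=0.5)
print(moved, centers, batch)
```

In the batch example two points are assigned to the first cluster vector, so η = 1/2 moves it exactly onto their mean.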

Train E = 0.56%
Test E = 0.80%

But the alg. wasn’t told about red & green.

Takes a long time to get vectors w to converge into region of interest.

“Better” to pick initial points randomly from data.
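
One way to do this, sketched with Python's standard `random` module: draw K distinct data points and copy them as the initial cluster vectors. The function name `init_centers`, the seed, and the example data are assumptions.

```python
import random

def init_centers(data, k, seed=0):
    """Pick k distinct data points as the initial cluster vectors."""
    rng = random.Random(seed)          # fixed seed only for reproducibility
    return [list(x) for x in rng.sample(data, k)]

# Made-up example data (assumption, not from the slides).
data = [(0.1, 0.2), (0.0, 0.3), (5.0, 5.1), (5.2, 4.9), (9.8, 0.1)]
centers = init_centers(data, 3)
print(centers)
```

Because every initial vector coincides with a data point, each one starts inside the region of interest.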

Train E = 0.55%
Test E = 0.52%

But the alg. needs to know about red & green.

k-means problem

• How to select the number of centers?

• Common to minimize Schwarz criterion:

$$E = \underbrace{\frac{1}{2}\sum_{n=1}^{N}\sum_{k=1}^{K} \Lambda_k[\mathbf{x}(n)]\,d[\mathbf{x}(n), \mathbf{w}_k]}_{\text{Distortion}} + \underbrace{\lambda\,D\,K\,\log(N)}_{\text{Complexity cost}}$$
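
A sketch of the criterion, assuming d is the squared Euclidean distance and λ is a user-chosen weight (both assumptions; the slide leaves them open). The example scores three hand-picked sets of centers on four made-up points.

```python
import math

def sq_dist(x, w):
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w))

def schwarz(data, centers, lam=1.0):
    """Distortion plus complexity cost lam * D * K * log(N)."""
    n, d, k = len(data), len(data[0]), len(centers)
    distortion = 0.5 * sum(min(sq_dist(x, w) for w in centers) for x in data)
    return distortion + lam * d * k * math.log(n)

# Made-up example (assumption, not from the slides).
data = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
candidates = {
    1: [(5.0, 1.0)],
    2: [(0.0, 1.0), (10.0, 1.0)],
    4: [tuple(x) for x in data],
}
scores = {k: schwarz(data, c) for k, c in candidates.items()}
print(scores)
```

Here K = 1 has a large distortion and K = 4 a large complexity cost, so K = 2 gives the smallest score.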

Learning vector quantization

For correctly classified patterns – move closer:

$$\mathbf{w}_k(t+1) = \begin{cases} (1-\eta)\,\mathbf{w}_k(t) + \eta\,\mathbf{x}(n) & \text{for closest } \mathbf{w}_k \\ \mathbf{w}_k(t) & \text{otherwise} \end{cases}$$

For incorrectly classified patterns – move away:

$$\mathbf{w}_k(t+1) = \begin{cases} (1+\eta)\,\mathbf{w}_k(t) - \eta\,\mathbf{x}(n) & \text{for closest } \mathbf{w}_k \\ \mathbf{w}_k(t) & \text{otherwise} \end{cases}$$
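
A minimal sketch of one learning vector quantization step, assuming each prototype carries a class label and that a pattern counts as correctly classified when the closest prototype has the same label. The function name and the example values are hypothetical.

```python
def sq_dist(x, w):
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w))

def lvq_step(prototypes, proto_labels, x, label, eta):
    """Move the closest prototype toward x if labels match, else away from x."""
    j = min(range(len(prototypes)), key=lambda k: sq_dist(x, prototypes[k]))
    sign = 1.0 if proto_labels[j] == label else -1.0
    prototypes[j] = [w + sign * eta * (xi - w) for w, xi in zip(prototypes[j], x)]
    return j

# Made-up example (assumption, not from the slides).
prototypes = [[0.0, 0.0], [4.0, 4.0]]
proto_labels = ["red", "green"]

lvq_step(prototypes, proto_labels, (1.0, 1.0), "red", eta=0.5)
after_attract = [list(p) for p in prototypes]   # closest prototype moved closer

lvq_step(prototypes, proto_labels, (1.0, 1.0), "green", eta=0.5)
after_repel = [list(p) for p in prototypes]     # same prototype pushed away
print(after_attract, after_repel)
```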


Self-organizing maps

• Impose a topology among the “neurons”, i.e. define neighborhood relationships.

• Update neighbors along with closest unit.

• Will encode the data in a 2D or 3D submanifold.

A 2D square lattice topology

Every neuron has 4 near neighbors.

A 2D hexagonal lattice topology

Every neuron has 6 near neighbors.

SOM maps

For K “cluster vectors” (neurons) minimize

$$E = \frac{1}{2}\sum_{n=1}^{N}\sum_{k=1}^{K} \Lambda_k[\mathbf{x}(n)]\,\|\mathbf{x}(n) - \mathbf{w}_k\|^2$$

Example of switch:

$$\Lambda_k[\mathbf{x}(n)] = \begin{cases} 1 & \text{if } \mathbf{w}_k \text{ is closest to } \mathbf{x}(n) \text{ or if } \mathbf{w}_k \text{ is neighbor to the closest unit} \\ 0 & \text{otherwise} \end{cases}$$

SOM update

Let the closest unit to x(n) be called unit j.

$$\mathbf{w}_k(t+1) = (1 - \eta\,\Lambda_{jk})\,\mathbf{w}_k(t) + \eta\,\Lambda_{jk}\,\mathbf{x}(n)$$

$$\Lambda_{jk} = \exp\left[-\frac{d_{jk}^2}{\sigma^2}\right]$$

d_jk is the distance in the lattice. σ is decreased with time.
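
The update could be sketched as below for a hypothetical one-dimensional lattice of three units; the lattice, the weights, and the values of η and σ are assumptions for illustration. Every unit moves toward x(n), scaled by its lattice distance to the winning unit j.

```python
import math

def sq_dist(a, b):
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def som_step(weights, coords, x, eta, sigma):
    """One on-line SOM update; coords[k] is the lattice position of unit k."""
    j = min(range(len(weights)), key=lambda k: sq_dist(x, weights[k]))
    for k in range(len(weights)):
        d2 = sq_dist(coords[k], coords[j])          # squared lattice distance
        lam = math.exp(-d2 / sigma ** 2)            # neighborhood function
        weights[k] = [(1.0 - eta * lam) * w + eta * lam * xi
                      for w, xi in zip(weights[k], x)]
    return j

# Made-up example (assumption, not from the slides): 3 units on a line.
coords = [(0.0,), (1.0,), (2.0,)]
weights = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]

winner = som_step(weights, coords, (-1.0, -1.0), eta=0.5, sigma=1.0)
print(winner, weights)
```

Shrinking `sigma` over the course of training reproduces the schedule of big, then smaller, then no neighborhood.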

First, big neighborhood

Then, smaller neighborhood

Then, no neighborhood

Page 73: The multilayer perceptron - Örebro Universityaass.oru.se/~lilien/ml/lectures/ML_2007_Lecture_6.pdf · Matlab code available at ... Hierarchical clustering • Agglomerative: Start
Page 74: The multilayer perceptron - Örebro Universityaass.oru.se/~lilien/ml/lectures/ML_2007_Lecture_6.pdf · Matlab code available at ... Hierarchical clustering • Agglomerative: Start
Page 75: The multilayer perceptron - Örebro Universityaass.oru.se/~lilien/ml/lectures/ML_2007_Lecture_6.pdf · Matlab code available at ... Hierarchical clustering • Agglomerative: Start
Page 76: The multilayer perceptron - Örebro Universityaass.oru.se/~lilien/ml/lectures/ML_2007_Lecture_6.pdf · Matlab code available at ... Hierarchical clustering • Agglomerative: Start
Page 77: The multilayer perceptron - Örebro Universityaass.oru.se/~lilien/ml/lectures/ML_2007_Lecture_6.pdf · Matlab code available at ... Hierarchical clustering • Agglomerative: Start
Page 78: The multilayer perceptron - Örebro Universityaass.oru.se/~lilien/ml/lectures/ML_2007_Lecture_6.pdf · Matlab code available at ... Hierarchical clustering • Agglomerative: Start
Page 79: The multilayer perceptron - Örebro Universityaass.oru.se/~lilien/ml/lectures/ML_2007_Lecture_6.pdf · Matlab code available at ... Hierarchical clustering • Agglomerative: Start
Page 80: The multilayer perceptron - Örebro Universityaass.oru.se/~lilien/ml/lectures/ML_2007_Lecture_6.pdf · Matlab code available at ... Hierarchical clustering • Agglomerative: Start
Page 81: The multilayer perceptron - Örebro Universityaass.oru.se/~lilien/ml/lectures/ML_2007_Lecture_6.pdf · Matlab code available at ... Hierarchical clustering • Agglomerative: Start
Page 82: The multilayer perceptron - Örebro Universityaass.oru.se/~lilien/ml/lectures/ML_2007_Lecture_6.pdf · Matlab code available at ... Hierarchical clustering • Agglomerative: Start
Page 83: The multilayer perceptron - Örebro Universityaass.oru.se/~lilien/ml/lectures/ML_2007_Lecture_6.pdf · Matlab code available at ... Hierarchical clustering • Agglomerative: Start
Page 84: The multilayer perceptron - Örebro Universityaass.oru.se/~lilien/ml/lectures/ML_2007_Lecture_6.pdf · Matlab code available at ... Hierarchical clustering • Agglomerative: Start
Page 85: The multilayer perceptron - Örebro Universityaass.oru.se/~lilien/ml/lectures/ML_2007_Lecture_6.pdf · Matlab code available at ... Hierarchical clustering • Agglomerative: Start
Page 86: The multilayer perceptron - Örebro Universityaass.oru.se/~lilien/ml/lectures/ML_2007_Lecture_6.pdf · Matlab code available at ... Hierarchical clustering • Agglomerative: Start
Page 87: The multilayer perceptron - Örebro Universityaass.oru.se/~lilien/ml/lectures/ML_2007_Lecture_6.pdf · Matlab code available at ... Hierarchical clustering • Agglomerative: Start

Initial


5 epochs


10 epochs


15 epochs


20 epochs


SOM only


Hierarchical clustering

• Agglomerative: Start out with all points as individual clusters. Join closest clusters until you’re satisfied.
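
A naive sketch of agglomerative clustering. The slide does not say how the distance between two clusters is measured; single linkage (the distance between their closest members) is assumed here, and the stopping rule "until you're satisfied" is replaced by a target number of clusters. The recorded merge distances are what a dendrogram would display.

```python
import math

def single_link(a, b):
    """Distance between clusters a and b: their closest pair of points."""
    return min(math.dist(p, q) for p in a for q in b)

def agglomerate(points, n_clusters):
    clusters = [[p] for p in points]       # every point starts as its own cluster
    merges = []                            # (distance, cluster, cluster) per merge
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_link(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((d, clusters[i], clusters[j]))
        merged = clusters[i] + clusters[j]
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return clusters, merges

# Made-up example data (assumption, not from the slides).
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0), (10.0, 0.0)]
clusters, merges = agglomerate(points, 3)
print(clusters)
print([m[0] for m in merges])
```

This brute-force search rescans all cluster pairs at every merge, which is fine for a handful of points but slow for large data sets.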


Hierarchical clustering


Hierarchical clustering


Hierarchical clustering


Hierarchical clustering

Clustering order and distances

Dendrogram


k-means


k-means


k-means

Metrics

Euclidean:

$$d(\mathbf{x}, \mathbf{w}_k) = \|\mathbf{x} - \mathbf{w}_k\| = \left[\sum_i (x_i - w_{ki})^2\right]^{1/2}$$

Minkowski:

$$d_p(\mathbf{x}, \mathbf{w}_k) = \|\mathbf{x} - \mathbf{w}_k\|_p = \left[\sum_i |x_i - w_{ki}|^p\right]^{1/p}$$

Manhattan:

$$d(\mathbf{x}, \mathbf{w}_k) = \sum_i |x_i - w_{ki}|$$

Mahalanobis:

$$d(\mathbf{x}, \mathbf{w}_k) = \|\mathbf{x} - \mathbf{w}_k\|_{\Sigma^{-1}} = (\mathbf{x} - \mathbf{w}_k)^T\,\boldsymbol{\Sigma}^{-1}\,(\mathbf{x} - \mathbf{w}_k)$$

Kernel:

$$d(\mathbf{x}, \mathbf{w}_k) = K(\mathbf{x}, \mathbf{w}_k)$$

etc. (mutations, alignments, ...)
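
The first four metrics could be written as follows. This is a sketch: the function names are assumptions, the Mahalanobis function returns the quadratic form shown above (without a square root), and the caller must supply the inverse covariance matrix.

```python
def minkowski(x, w, p):
    """[sum_i |x_i - w_i|^p]^(1/p)"""
    return sum(abs(xi - wi) ** p for xi, wi in zip(x, w)) ** (1.0 / p)

def euclidean(x, w):
    """Minkowski distance with p = 2."""
    return minkowski(x, w, 2)

def manhattan(x, w):
    """Minkowski distance with p = 1."""
    return sum(abs(xi - wi) for xi, wi in zip(x, w))

def mahalanobis_sq(x, w, cov_inv):
    """Quadratic form (x - w)^T cov_inv (x - w); cov_inv is the inverse covariance."""
    diff = [xi - wi for xi, wi in zip(x, w)]
    n = len(diff)
    return sum(diff[i] * cov_inv[i][j] * diff[j] for i in range(n) for j in range(n))

# Made-up example vectors (assumption, not from the slides).
x, w = (3.0, 4.0), (0.0, 0.0)
identity = [[1.0, 0.0], [0.0, 1.0]]
scaled = [[1.0 / 9.0, 0.0], [0.0, 1.0 / 16.0]]   # inverse of diag(9, 16)

print(euclidean(x, w), manhattan(x, w), minkowski(x, w, 3))
print(mahalanobis_sq(x, w, identity), mahalanobis_sq(x, w, scaled))
```

With the identity as inverse covariance the Mahalanobis form reduces to the squared Euclidean distance; with unequal variances each coordinate is rescaled before it contributes.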