signal classification by matching node connectivitiessaito/confs/ssp09/naoki... · 2009. 9. 15. ·...

53
Signal Classification by Matching Node Connectivities 1 Linh Lieu and Naoki Saito Department of Mathematics University of California Davis, CA 95616 USA IEEE Workshop on Statistical Signal Processing Cardiff, Wales, UK September 1, 2009 1 Partially supported by NSF and ONR grants. [email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 1 / 28

Upload: others

Post on 15-Jun-2021

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Signal Classification by Matching Node Connectivities1

Linh Lieu and Naoki Saito

Department of MathematicsUniversity of CaliforniaDavis, CA 95616 USA

IEEE Workshop on Statistical Signal ProcessingCardiff, Wales, UKSeptember 1, 2009

1Partially supported by NSF and ONR [email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 1 / 28

Page 2: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Outline

Outline

1 IntroductionA classification problem

2 Diffusion FrameworkBasics in the Diffusion FrameworkPractical Considerations

3 Node Connectivities MatchingSet up

4 Numerical Experiments and ResultsSynthetic DataHyperspectral Data

5 Conclusion

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 2 / 28

Page 3: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Introduction

Outline

1 Introduction

2 Diffusion Framework

3 Node Connectivities Matching

4 Numerical Experiments and Results

5 Conclusion

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 3 / 28

Page 4: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Introduction A classification problem

Given

Training Data: X = {x1, · · · , xN1} ⊂ Rn.Each data point xj has a known class label ∈ {C1, · · · ,CK}.

Unlabeled (or Test) Data: Y = {y1, · · · , yN2} ⊂ Rn.

Objective

Find a class label among {C1, · · · ,CK} for each y1, · · · , yN2.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 4 / 28

Page 5: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Introduction A classification problem

Given

Training Data: X = {x1, · · · , xN1} ⊂ Rn.Each data point xj has a known class label ∈ {C1, · · · ,CK}.Unlabeled (or Test) Data: Y = {y1, · · · , yN2

} ⊂ Rn.

Objective

Find a class label among {C1, · · · ,CK} for each y1, · · · , yN2.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 4 / 28

Page 6: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Introduction A classification problem

Given

Training Data: X = {x1, · · · , xN1} ⊂ Rn.Each data point xj has a known class label ∈ {C1, · · · ,CK}.Unlabeled (or Test) Data: Y = {y1, · · · , yN2

} ⊂ Rn.

Objective

Find a class label among {C1, · · · ,CK} for each y1, · · · , yN2.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 4 / 28

Page 7: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Introduction A classification problem

Our Approach

Following the Diffusion Framework:

Construct a similarity graph from the training data X , then expandthe graph to the unlabeled data Y .

For each node, compute a histogram of node connectivity (distributionof its similarity to all the nodes), and let hxj and hyk

denote thehistogram corresponding to xj ∈ X and yk ∈ Y , respectively.

Compare hxj and hykusing appropriate distance measure d(·, ·).

Infer the label of xj∗ to yk if

xj∗ = arg minxj∈X

d(hxj ,hyk)

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 5 / 28

Page 8: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Introduction A classification problem

Our Approach

Following the Diffusion Framework:

Construct a similarity graph from the training data X , then expandthe graph to the unlabeled data Y .

For each node, compute a histogram of node connectivity (distributionof its similarity to all the nodes), and let hxj and hyk

denote thehistogram corresponding to xj ∈ X and yk ∈ Y , respectively.

Compare hxj and hykusing appropriate distance measure d(·, ·).

Infer the label of xj∗ to yk if

xj∗ = arg minxj∈X

d(hxj ,hyk)

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 5 / 28

Page 9: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Introduction A classification problem

Our Approach

Following the Diffusion Framework:

Construct a similarity graph from the training data X , then expandthe graph to the unlabeled data Y .

For each node, compute a histogram of node connectivity (distributionof its similarity to all the nodes), and let hxj and hyk

denote thehistogram corresponding to xj ∈ X and yk ∈ Y , respectively.

Compare hxj and hykusing appropriate distance measure d(·, ·).

Infer the label of xj∗ to yk if

xj∗ = arg minxj∈X

d(hxj ,hyk)

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 5 / 28

Page 10: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Introduction A classification problem

Our Approach

Following the Diffusion Framework:

Construct a similarity graph from the training data X , then expandthe graph to the unlabeled data Y .

For each node, compute a histogram of node connectivity (distributionof its similarity to all the nodes), and let hxj and hyk

denote thehistogram corresponding to xj ∈ X and yk ∈ Y , respectively.

Compare hxj and hykusing appropriate distance measure d(·, ·).

Infer the label of xj∗ to yk if

xj∗ = arg minxj∈X

d(hxj ,hyk)

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 5 / 28

Page 11: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Introduction A classification problem

Our Approach

Following the Diffusion Framework:

Construct a similarity graph from the training data X , then expandthe graph to the unlabeled data Y .

For each node, compute a histogram of node connectivity (distributionof its similarity to all the nodes), and let hxj and hyk

denote thehistogram corresponding to xj ∈ X and yk ∈ Y , respectively.

Compare hxj and hykusing appropriate distance measure d(·, ·).

Infer the label of xj∗ to yk if

xj∗ = arg minxj∈X

d(hxj ,hyk)

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 5 / 28

Page 12: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Diffusion Framework

Outline

1 Introduction

2 Diffusion Framework

3 Node Connectivities Matching

4 Numerical Experiments and Results

5 Conclusion

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 6 / 28

Page 13: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Diffusion Framework Basics in the Diffusion Framework

First, build a connected Similarity Graph from the given data Z = X∪Y .

Γ•z1

•z2 •z3

•z4

W12

������� W13

???????

W23 W34MMMMMM

W14

W24

Gaussian Weights: Wij∆= e−‖zi−zj‖2/ε2

, ε > 0.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 7 / 28

Page 14: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Diffusion Framework Basics in the Diffusion Framework

Next, do Graph-Laplacian normalization to get the diffusion matrix:

Let D be the diagonal matrix:

Dii∆=

N1+N2∑k=1

e−‖zi−zk‖2/ε2, i = 1, 2, · · · ,N1 + N2.

where Dii is the degree of node zi .

The diffusion matrix (size (N1 + N2)× (N1 + N2)) is:

P∆= D−1W

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 8 / 28

Page 15: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Diffusion Framework Basics in the Diffusion Framework

Properties of the diffusion matrix

P is non-negative and row-stochastic (∑

k Pik = 1).

P represents a transition matrix of a Markov process on Γ.Pij = probability of moving from zi to zj in one step.

Spectrum of P: 1 = λ0 > λ1 ≥ λ2 ≥ · · · ≥ λN1+N2 ≥ 0.

P has spectral decomposition:

Pij =∑k

λkφk(i)ψk(j),

whereB {φk} and {ψk} are (orthonormal) left and right eigenvectors,B φk(i) = the ith entry of the vector φk .

Markov process can be forwarded in time t ∈ N withPt

ij =∑

k λtkφk(i)ψk(j) = prob. of moving from zi to zj in t steps.

Markov process has stationary distribution: π∆= D1

1T D1

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 9 / 28

Page 16: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Diffusion Framework Basics in the Diffusion Framework

Properties of the diffusion matrix

P is non-negative and row-stochastic (∑

k Pik = 1).

P represents a transition matrix of a Markov process on Γ.Pij = probability of moving from zi to zj in one step.

Spectrum of P: 1 = λ0 > λ1 ≥ λ2 ≥ · · · ≥ λN1+N2 ≥ 0.

P has spectral decomposition:

Pij =∑k

λkφk(i)ψk(j),

whereB {φk} and {ψk} are (orthonormal) left and right eigenvectors,B φk(i) = the ith entry of the vector φk .

Markov process can be forwarded in time t ∈ N withPt

ij =∑

k λtkφk(i)ψk(j) = prob. of moving from zi to zj in t steps.

Markov process has stationary distribution: π∆= D1

1T D1

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 9 / 28

Page 17: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Diffusion Framework Basics in the Diffusion Framework

Properties of the diffusion matrix

P is non-negative and row-stochastic (∑

k Pik = 1).

P represents a transition matrix of a Markov process on Γ.Pij = probability of moving from zi to zj in one step.

Spectrum of P: 1 = λ0 > λ1 ≥ λ2 ≥ · · · ≥ λN1+N2 ≥ 0.

P has spectral decomposition:

Pij =∑k

λkφk(i)ψk(j),

whereB {φk} and {ψk} are (orthonormal) left and right eigenvectors,B φk(i) = the ith entry of the vector φk .

Markov process can be forwarded in time t ∈ N withPt

ij =∑

k λtkφk(i)ψk(j) = prob. of moving from zi to zj in t steps.

Markov process has stationary distribution: π∆= D1

1T D1

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 9 / 28

Page 18: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Diffusion Framework Basics in the Diffusion Framework

Properties of the diffusion matrix

P is non-negative and row-stochastic (∑

k Pik = 1).

P represents a transition matrix of a Markov process on Γ.Pij = probability of moving from zi to zj in one step.

Spectrum of P: 1 = λ0 > λ1 ≥ λ2 ≥ · · · ≥ λN1+N2 ≥ 0.

P has spectral decomposition:

Pij =∑k

λkφk(i)ψk(j),

whereB {φk} and {ψk} are (orthonormal) left and right eigenvectors,B φk(i) = the ith entry of the vector φk .

Markov process can be forwarded in time t ∈ N withPt

ij =∑

k λtkφk(i)ψk(j) = prob. of moving from zi to zj in t steps.

Markov process has stationary distribution: π∆= D1

1T D1

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 9 / 28

Page 19: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Diffusion Framework Basics in the Diffusion Framework

Properties of the diffusion matrix

P is non-negative and row-stochastic (∑

k Pik = 1).

P represents a transition matrix of a Markov process on Γ.Pij = probability of moving from zi to zj in one step.

Spectrum of P: 1 = λ0 > λ1 ≥ λ2 ≥ · · · ≥ λN1+N2 ≥ 0.

P has spectral decomposition:

Pij =∑k

λkφk(i)ψk(j),

whereB {φk} and {ψk} are (orthonormal) left and right eigenvectors,B φk(i) = the ith entry of the vector φk .

Markov process can be forwarded in time t ∈ N withPt

ij =∑

k λtkφk(i)ψk(j) = prob. of moving from zi to zj in t steps.

Markov process has stationary distribution: π∆= D1

1T D1

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 9 / 28

Page 20: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Diffusion Framework Basics in the Diffusion Framework

Properties of the diffusion matrix

P is non-negative and row-stochastic (∑

k Pik = 1).

P represents a transition matrix of a Markov process on Γ.Pij = probability of moving from zi to zj in one step.

Spectrum of P: 1 = λ0 > λ1 ≥ λ2 ≥ · · · ≥ λN1+N2 ≥ 0.

P has spectral decomposition:

Pij =∑k

λkφk(i)ψk(j),

whereB {φk} and {ψk} are (orthonormal) left and right eigenvectors,B φk(i) = the ith entry of the vector φk .

Markov process can be forwarded in time t ∈ N withPt

ij =∑

k λtkφk(i)ψk(j) = prob. of moving from zi to zj in t steps.

Markov process has stationary distribution: π∆= D1

1T D1

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 9 / 28

Page 21: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Diffusion Framework Basics in the Diffusion Framework

Properties of the diffusion matrix

P is non-negative and row-stochastic (∑

k Pik = 1).

P represents a transition matrix of a Markov process on Γ.Pij = probability of moving from zi to zj in one step.

Spectrum of P: 1 = λ0 > λ1 ≥ λ2 ≥ · · · ≥ λN1+N2 ≥ 0.

P has spectral decomposition:

Pij =∑k

λkφk(i)ψk(j),

whereB {φk} and {ψk} are (orthonormal) left and right eigenvectors,B φk(i) = the ith entry of the vector φk .

Markov process can be forwarded in time t ∈ N withPt

ij =∑

k λtkφk(i)ψk(j) = prob. of moving from zi to zj in t steps.

Markov process has stationary distribution: π∆= D1

1T D1

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 9 / 28

Page 22: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Diffusion Framework Basics in the Diffusion Framework

Properties of the diffusion matrix (cont.)

The ith row Pi · can be viewed as a distribution of connectivity(degree) of zi to all other nodes.

Definition of the diffusion distance

With t ∈ N given a priori, the diffusion distance Dt(zi , zj) at time t isdefined as:

Dt(zi , zj)2 ∆

=∥∥Pt

i · − Ptj ·∥∥2

L2(X , 1π

)

=∑`

(Pt

i` − Ptj`

)2

π(`)

=∑`

λ2t` (ψ`(i)−ψ`(j))2 .

(1)

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 10 / 28

Page 23: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Diffusion Framework Basics in the Diffusion Framework

Properties of the diffusion matrix (cont.)

The ith row Pi · can be viewed as a distribution of connectivity(degree) of zi to all other nodes.

Definition of the diffusion distance

With t ∈ N given a priori, the diffusion distance Dt(zi , zj) at time t isdefined as:

Dt(zi , zj)2 ∆

=∥∥Pt

i · − Ptj ·∥∥2

L2(X , 1π

)

=∑`

(Pt

i` − Ptj`

)2

π(`)

=∑`

λ2t` (ψ`(i)−ψ`(j))2 .

(1)

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 10 / 28

Page 24: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Diffusion Framework Basics in the Diffusion Framework

Advantages of the diffusion distance Dt(zi , zj)

preserves local neighborhood;

measures the difference of how zi and zj are connected to all othernodes in Γ;

takes into account all incidences relating zi and zj ;

is robust to noise.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 11 / 28

Page 25: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Diffusion Framework Basics in the Diffusion Framework

Approximation of the diffusion distance

Since eigenvalues λ` are decreasing to 0, diffusion distance can beapproximated to a chosen accuracy τ > 0:

Dt(zi , zj)2 ≈

s(τ,t)∑`=1

λ2t` (ψ`(i)−ψ`(j))2 ,

for some s(τ, t) ∈ N.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 12 / 28

Page 26: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Diffusion Framework Basics in the Diffusion Framework

Diffusion Map

The Diffusion Map Ψt : Z → Rs(τ,t) is defined by

Ψt : zi 7→(λt

1ψ1(i), λt2ψ2(i), · · · , λt

s(τ,t)ψs(τ,t)(i))T

.

Ψt embeds Z into a low-dimensional diffusion space, s(τ, t)� n.

Dt(zi , zj) ≈ ‖Ψt(zi )−Ψt(zj)‖, diffusion distance is approximated byEuclidean distance within the diffusion space.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 13 / 28

Page 27: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Diffusion Framework Basics in the Diffusion Framework

Diffusion Map

The Diffusion Map Ψt : Z → Rs(τ,t) is defined by

Ψt : zi 7→(λt

1ψ1(i), λt2ψ2(i), · · · , λt

s(τ,t)ψs(τ,t)(i))T

.

Ψt embeds Z into a low-dimensional diffusion space, s(τ, t)� n.

Dt(zi , zj) ≈ ‖Ψt(zi )−Ψt(zj)‖, diffusion distance is approximated byEuclidean distance within the diffusion space.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 13 / 28

Page 28: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Diffusion Framework Basics in the Diffusion Framework

Diffusion Map

The Diffusion Map Ψt : Z → Rs(τ,t) is defined by

Ψt : zi 7→(λt

1ψ1(i), λt2ψ2(i), · · · , λt

s(τ,t)ψs(τ,t)(i))T

.

Ψt embeds Z into a low-dimensional diffusion space, s(τ, t)� n.

Dt(zi , zj) ≈ ‖Ψt(zi )−Ψt(zj)‖, diffusion distance is approximated byEuclidean distance within the diffusion space.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 13 / 28

Page 29: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Diffusion Framework Practical Considerations

In practice for signal classification problems

We do not compute diffusion maps on Z = X ∪ Y .

B Compute the similarity graph Γ only from the training data X .⇒ diffusion maps are defined only for X , Ψt : X → Rs(τ,t).

B Extend Ψt to Y (out-of-sample extension) using

Geometric harmonics multiscale extension scheme(Lafon-Keller-Coifman, 2006); orNystrom extension (Fowlkes-Belongie-Chung-Malik, 2004).

⇒ After which Ψt : X ∪ Y → Rs(τ,t).

B Diffusion distance between xi ∈ X and yj ∈ Y is approximatelyDt(xi , yj) ≈ ‖Ψt(xi )−Ψt(yj)‖.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 14 / 28

Page 30: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Node Connectivities Matching

Outline

1 Introduction

2 Diffusion Framework

3 Node Connectivities Matching

4 Numerical Experiments and Results

5 Conclusion

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 15 / 28

Page 31: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Node Connectivities Matching

Our Proposed Method: Node Connectivities Matching

A modification to the diffusion distance approach.

No computation of eigenvalues/eigenvectors.

Bypass the out-of-sample extension step, hence avoid error admittedduring the extension process.

Still close to the diffusion distance, hence inherits nicelocal-neighborhood preserving property from the diffusion distance.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 16 / 28

Page 32: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Node Connectivities Matching

Our Proposed Method: Node Connectivities Matching

A modification to the diffusion distance approach.

No computation of eigenvalues/eigenvectors.

Bypass the out-of-sample extension step, hence avoid error admittedduring the extension process.

Still close to the diffusion distance, hence inherits nicelocal-neighborhood preserving property from the diffusion distance.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 16 / 28

Page 33: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Node Connectivities Matching

Our Proposed Method: Node Connectivities Matching

A modification to the diffusion distance approach.

No computation of eigenvalues/eigenvectors.

Bypass the out-of-sample extension step, hence avoid error admittedduring the extension process.

Still close to the diffusion distance, hence inherits nicelocal-neighborhood preserving property from the diffusion distance.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 16 / 28

Page 34: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Node Connectivities Matching

Our Proposed Method: Node Connectivities Matching

A modification to the diffusion distance approach.

No computation of eigenvalues/eigenvectors.

Bypass the out-of-sample extension step, hence avoid error admittedduring the extension process.

Still close to the diffusion distance, hence inherits nicelocal-neighborhood preserving property from the diffusion distance.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 16 / 28

Page 35: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Node Connectivities Matching Set up

Set up a connected Similarity Graph Γ from training data X .

Γ •x1

•x2

•x3

W12

W23WWWWWWW

W13

/////////

Let hxj

∆= 1PN1

`=1 Wj`

(Wj1, · · · ,WjN1) ∈ R1×N1 , the degree distribution or

histogram of connectivities of node xj to all other nodes in Γ.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 17 / 28

Page 36: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Node Connectivities Matching Set up

Add nodes corresponding to unlabeled points in Y . Only add edgesconnecting between X and Y with weights wj` = e−‖yj−x`‖2/ε2

.

Γ

•x1

•x2

•x3

•y1

•y2

WWWWWWW

/////////

w11

w13

�����������

w12

w22

w23

OOOOOOO

???????????????????

Let hyj

∆= 1PN1

`=1 wj`

(wj1, · · · ,wjN1) ∈ R1×N1 , the degree distribution or

histogram of connectivities of node yj to all x` nodes in Γ.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 17 / 28

Page 37: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Node Connectivities Matching Set up

Γ differs from the original fully connected similarity graph Γ onZ = X ∪ Y only in the absence of the edges (yj , yk).

Γ

•x1

•x2

•x3

•y1

•y2

WWWWWWW

/////////

w11

w13

�����������

w12

w22

w23

OOOOOOO

??????????????????? �O�O�O�O�O�O�O�O�O

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 18 / 28

Page 38: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Node Connectivities Matching Set up

Matching Node Connectivities

Discriminate the histogram hyjfrom hxk

using various measures:

L2 measure: L2(hyj,hxk

) =√∑N1

`=1 |hyj(`)− hxk

(`)|2.

Jeffreys divergence:

dJ(hyj,hxk

) =∑N1

`=1

(hyj

(`) loghyj (`)

hxk(`) + hxk

(`) loghxk

(`)

hyj (`)

).

Hellinger distance: dH(hyj,hxk

) =∑N1

`=1

(√hyj

(`)−√

hxk(`))2

.

χ2 Statistics: χ2(hyj,hxk

) =∑N1

`=1

“hyj (`)−m(`)

”2

m(`) ,

where m(`) = 12

(hyj

(`) + hxk(`))

.

Earth Mover’s Distance See defn of EMD .

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 19 / 28

Page 39: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Numerical Experiments and Results

Outline

1 Introduction

2 Diffusion Framework

3 Node Connectivities Matching

4 Numerical Experiments and Results

5 Conclusion

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 20 / 28

Page 40: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Numerical Experiments and Results Synthetic Data

Triangular Waveform Classification

Triangular Waveforms

Figure: Five samples of three triangular waveform classes.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 21 / 28

Page 41: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Numerical Experiments and Results Synthetic Data

Triangular Waveform Classification

Triangular Waveform Data Generation

Three classes of signals generated via:

x (1)(j) = uh1(j) + (1− u)h2(j) + ε(j).

x (2)(j) = uh1(j) + (1− u)h3(j) + ε(j).

x (3)(j) = uh2(j) + (1− u)h3(j) + ε(j).

where

j = 1, · · · , 32.

h1(j) = max{6− |j − 7|, 0}; h2 = h1(j − 8); h3(j) = h1(j − 4).

u is a uniform random variable on interval (0, 1).

ε is a standard normal variate.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 21 / 28

Page 42: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Numerical Experiments and Results Synthetic Data

Triangular Waveform Classification

Experimental Procedure

Generate 100 training signals/class and 1000 test signals/class.

Repeat the procedure 10 times to get average misclassification rates.

Figure: A 2D projection of the triangular waveform dataset.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 21 / 28

Page 43: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Numerical Experiments and Results Synthetic Data

Triangular Waveform Classification

Results (The Bayes rate is ∼ 14%)

Misclassification rates (average over 10 simulations)NCM in Error rate (%)

L2 Distance 20.07Jeffreys Divergence 19.47Hellinger Distance 19.45χ2 Statistics 19.43EMD 16.43

Classification by Nearest Neighbor MethodError rate (%)

Diff Maps extended by GHME: 19.21Diff Maps computed on Z = X ∪ Y 18.05No Diff Maps (the original coordinates) 21.21

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 21 / 28

Page 44: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Numerical Experiments and Results Hyperspectral Data

Hyperspectral Images of Natural Scenes

Each pixel is a vector of 43 reflectance values at various wavelengths.

Reference:D. L. Ruderman, Statistics of cone responses to natural images:implications for visual coding, J. Opt. Soc. Am., vol. 15, no. 8,pp. 2036–2045, August 1998.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 22 / 28

Page 45: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Numerical Experiments and Results Hyperspectral Data

Recognition of Pixel Type

Data preparation

Extract a window around each pixel

1× 1

∈ R43

3× 3

∈ R9×43

5× 5

∈ R25×43

=⇒ the data point associated with a pixel is x ∈ R43, R9×43, or R25×43.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 23 / 28

Page 46: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Numerical Experiments and Results Hyperspectral Data

Recognition of Pixel Type

Recognition of pixel types from given samples

Three selected regions input as seeds Recognition result

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 23 / 28

Page 47: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Numerical Experiments and Results Hyperspectral Data

Quantitatively Controlled Recognition of Pixel Type

Data Preparation

Hand segment regions of leaves, rocks, trunks from different images.

Green: leaves; Blue: rocks; Red: trunks.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 24 / 28

Page 48: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Numerical Experiments and Results Hyperspectral Data

Quantitatively Controlled Recognition of Pixel Type

Experimental Procedure

Three-class recognition problem: Leaf, Rock, and Trunk pixels.

Randomly select ≈ 400 pixels per class for training and ≈ 1200 pixelsper class for test.

Training and test data come from different images.

Repeat procedure 20 times to get average misclassification rates.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 24 / 28

Page 49: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Numerical Experiments and Results Hyperspectral Data

Quantitatively Controlled Recognition of Pixel Type

ResultsClassification via NCM

Misclassification rates (average over 20 simulations)1× 1 3× 3 5× 5

Measure Error (%)

L2 42.63Jeffreys 29.48Hellinger 27.13χ2 28.89EMD 46.03

Measure Error (%)

L2 20.00Jeffreys 23.77Hellinger 21.92χ2 27.39EMD 30.76

Measure Error (%)

L2 20.06Jeffreys 21.76Hellinger 20.00χ2 20.85EMD 22.93

Classification via nearest neighbor in

Measure Error (%)

L2 dist. 31.83Diff. dist. 24.85

Measure Error (%)

L2 dist. 31.26Diff. dist. 57.23

Measure Error (%)

L2 dist. 30.05Diff. dist. 50.92

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 24 / 28

Page 50: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Conclusion

Outline

1 Introduction

2 Diffusion Framework

3 Node Connectivities Matching

4 Numerical Experiments and Results

5 Conclusion

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 25 / 28

Page 51: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

Conclusion

Summary

Node Connectivity Matching (NCM)

Is derived from diffusion distance;

Bypasses computation of eigenvalues/eigenvectors of diffusionoperator;

Avoids out-of-sample extension.

=⇒ Admits less error !

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 26 / 28

Page 52: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

References

References for Diffusion Distance and Diffusion Framework:

R. R. Coifman, S. Lafon, Diffusion maps, Appl. Comput.Harmon. Anal., 21:5–30, July 2006.

S. Lafon, Y. Keller, R.R. Coifman, Data fusion and multicuedata matching by diffusion maps, IEEE Trans. Pattern Anal. MachineIntell., 28(11):1784–1797, 2006.

C. Fowlkes, S. Belongie, F. Chung, J. Malik, Spectralgrouping using the Nystrom method, IEEE Trans. Pattern Anal.Machine Intell., 26(2):214–225, 2004.

L. Lieu, N. Saito, High-dimensional pattern recognition usinglow-dimensional embedding and Earth Mover’s Distance, submittedfor publication, 2009. Available athttp://www.math.ucdavis.edu/~saito/publications/.

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 27 / 28

Page 53: Signal Classification by Matching Node Connectivitiessaito/confs/SSP09/naoki... · 2009. 9. 15. · Signal Classi cation by Matching Node Connectivities1 Linh Lieu and Naoki Saito

References

Definition of Earth Mover’s Distance (EMD)

Suppose p = (p1, · · · , pn1 ) and q = (q1, · · · , qn2 ) are two histograms or two discretedistributions. The Earth Mover’s Distance (EMD) is defined by

EMD(p, q)∆=

Pi,j gijdijP

i,j gij

where

dij , i = 1, · · · , n1, j = 1, · · · , n2: ground distance; the dissimilarity between bins iand j ; the cost of moving one unit of feature in the feature space between the ithand jth feature,

gij ≥ 0, i = 1, · · · , n1, j = 1, · · · , n2: the optimal flow between two histograms

that minimizes the total costP

i,j gijdij , subject to the following constraints

B∑

i gij ≤ qj ,B∑

j gij ≤ pi ,B∑

ij gij = min{∑

i pi ,∑

j qj}

Y. Rubner, C. Tomasi, L. J. Guibas, The Earth Mover’s Distance as a Metric for Image Retrieval, International Journal of

Computer Vision, 40(2): 99–121, 2000.

back to prev slide

[email protected] (UCD Math Dept.) Matching Node Connectivities SSP09 28 / 28