Outlier Selection and One-Class Classification by Jeroen Janssens (@jeroenhjanssens)

Upload: hakka-labs

Post on 27-Jan-2015


DESCRIPTION

I present a novel algorithm called Stochastic Outlier Selection (SOS). The SOS algorithm computes for each data point an outlier probability. These probabilities are more intuitive than the unbounded outlier scores computed by existing outlier-selection algorithms. I have evaluated SOS on a variety of real-world and synthetic datasets, and compared it to four state-of-the-art outlier-selection algorithms. The results show that SOS has a superior performance while being more robust to data perturbations and parameter settings.

TRANSCRIPT

Page 1: Outlier Selection and One Class Classification by Jeroen Janssens

Outlier Selection and One-Class Classification

Jeroen Janssens @jeroenhjanssens

Page 2: Outlier Selection and One Class Classification by Jeroen Janssens


Overview

• Anomalies and outliers

• Stochastic Outlier Selection

• Experiments and results

Page 3: Outlier Selection and One Class Classification by Jeroen Janssens

Anomalies and outliers

Page 7: Outlier Selection and One Class Classification by Jeroen Janssens

Definition (Anomaly)

An anomaly is an observation or event that deviates qualitatively from what is considered to be normal, according to a domain expert.

Page 8: Outlier Selection and One Class Classification by Jeroen Janssens

Detecting anomalies is important

• Expensive

• Dangerous

• Mess up your model

Page 10: Outlier Selection and One Class Classification by Jeroen Janssens

Human anomaly detection may suffer from

• Fatigue

• Information overload

• Emotional bias

Page 11: Outlier Selection and One Class Classification by Jeroen Janssens

Computers work with numbers

[Figure: three panels titled "Observations", "Data points", and "Visualisation". The "Visualisation" panel is a scatter plot with axes width and height and a legend distinguishing apple from orange.]

Data points

width   height   label
8.4     7.3      apple
6.7     7.1      orange
8.0     6.8      apple
7.4     7.2      apple
9.6     9.2      orange
…       …        …
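As a small illustration of how observations become data points, the rows of the table can be held as a numeric feature matrix plus a list of labels. This is only a sketch: the variable names `X`, `labels`, `n`, and `m` are assumptions made for this example, and only the five rows visible on the slide are used.

```python
import numpy as np

# The five rows visible in the slide's table: one row per data point,
# one column per feature (width, height).
X = np.array([
    [8.4, 7.3],
    [6.7, 7.1],
    [8.0, 6.8],
    [7.4, 7.2],
    [9.6, 9.2],
])

# The label column is kept separately because it is not numeric.
labels = ["apple", "orange", "apple", "apple", "orange"]

# n data points, each described by m features.
n, m = X.shape
```

Keeping the labels apart from the matrix mirrors the unsupervised setting discussed later, where an algorithm only sees the numeric features.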

Page 14: Outlier Selection and One Class Classification by Jeroen Janssens

From anomaly to outlier

[Figure: scatter plot with axes speed over ground and rate of turn; legend: anomalous vessel, normal vessel.]

Page 15: Outlier Selection and One Class Classification by Jeroen Janssens

Definition (Outlier)

An outlier is a data point that deviates quantitatively from the majority of the data points, according to an outlier-selection algorithm.

Page 16: Outlier Selection and One Class Classification by Jeroen Janssens

Three standard deviations

[Figure: distribution over x with axis ticks at −3σ, −2σ, −1σ, 1σ, 2σ, and 3σ; a 3σ annotation; legend: outlier, inlier.]
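The three-standard-deviations rule named on this slide can be sketched in a few lines. This is an illustrative sketch rather than code from the talk: the function name `three_sigma_outliers` and the sample readings are assumptions, and the mean and standard deviation are estimated from the same data that is being tested.

```python
import numpy as np

def three_sigma_outliers(x):
    """Return a boolean mask: True where a value lies more than
    three standard deviations away from the mean of x."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    sigma = x.std()  # population standard deviation
    return np.abs(x - mu) > 3 * sigma

# Hypothetical one-dimensional readings: twenty values near 10 and one at 25.
values = [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 10.3, 9.7, 10.1, 9.9,
          10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 25.0]
mask = three_sigma_outliers(values)
```

Note that the extreme value inflates both the mean and the standard deviation it is compared against, so the rule can miss outliers in small samples.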

Page 17: Outlier Selection and One Class Classification by Jeroen Janssens

[Figure: Euler diagrams (a) to (f), each paired with a set legend and set notation.]

(a) real-world observations: $X$

(b) labeled by expert as anomalous: $A$; labeled by expert as normal: $X \cap \overline{A}$

(c) data points: $D$; unrecorded: $X \cap \overline{D}$

(d) anomalies represented as data points: $C_A = D \cap A$; normalities represented as data points: $C_N = D \cap \overline{A}$

(e) classified by algorithm as an outlier: $C_O$; classified by algorithm as an inlier: $C_I = D \cap \overline{C_O}$

(f) hits: $H = C_A \cap C_O$; misses: $M = C_A \cap C_I$; false alarms: $FA = C_N \cap C_O$; correct rejects: $CR = C_N \cap C_I$
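The set definitions in panels (a) to (f) map directly onto Python's built-in set operations. The sketch below uses made-up integer identifiers for observations; the concrete members of `D`, `A`, and `C_O` are assumptions chosen only to exercise each of the four outcomes, and a set difference such as `D - A` stands in for intersecting with a complement.

```python
# Hypothetical identifiers: D holds the recorded data points,
# A the observations an expert labels as anomalous (9 was never recorded),
# C_O the data points an algorithm classifies as outliers.
D = {1, 2, 3, 4, 5, 6, 7, 8}
A = {2, 5, 9}
C_O = {2, 3}

C_A = D & A      # anomalies represented as data points
C_N = D - A      # normalities represented as data points
C_I = D - C_O    # data points classified as inliers

H = C_A & C_O    # hits
M = C_A & C_I    # misses
FA = C_N & C_O   # false alarms
CR = C_N & C_I   # correct rejects
```

The four outcome sets are pairwise disjoint and together cover every recorded data point, which is what lets them be counted in a two-by-two table.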

Page 23: Outlier Selection and One Class Classification by Jeroen Janssens

Confusion matrix

Columns: Expert labels the observation as a(n) ...
Rows: Algorithm classifies the data point as a(n) ...

                  Anomaly ($C_A$)    Normality ($C_N$)
Outlier ($C_O$)   hit ($H$)          false alarm ($FA$)
Inlier ($C_I$)    miss ($M$)         correct reject ($CR$)


Page 24: Outlier Selection and One Class Classification by Jeroen Janssens

[Figure: four panels. "Data set" (legend: data point); "Labels by the expert" (legend: anomaly, normality); "Classifications by the algorithm" (legend: inlier, outlier); "Outcome" (legend: hit, false alarm, miss, correct reject).]

Page 25: Outlier Selection and One Class Classification by Jeroen Janssens


Stochastic Outlier Selection

Page 26: Outlier Selection and One Class Classification by Jeroen Janssens

Stochastic Outlier Selection

• Unsupervised outlier selection algorithm

• Employs concept of affinity

• Computes outlier probabilities

• One parameter: perplexity

• Inspired by t-SNE

Page 27: Outlier Selection and One Class Classification by Jeroen Janssens

A data point is selected as an outlier when all the other data points have insufficient affinity with it.

Page 29: Outlier Selection and One Class Classification by Jeroen Janssens

From input to output

[Figure: heat maps of the successive matrices for a six-point example: feature matrix X (n × m), dissimilarity matrix D (n × n), affinity matrix A (n × n), binding probability matrix B (n × n), and outlier probabilities Φ (n × 1). The steps between them are labeled with Subsections 4.2.1, 4.2.2, 4.2.3, and 4.2.4 – 6.]

Page 30: Outlier Selection and One Class Classification by Jeroen Janssens

Demo: SOS on the command line

(see http://sos.jeroenjanssens.com)

Page 31: Outlier Selection and One Class Classification by Jeroen Janssens

From feature matrix to dissimilarity matrix

[Figure: scatter plot of the six data points x1, ..., x6, with axes first feature $x_1$ and second feature $x_2$ and the distance $d_{2,6} \equiv d_{6,2} = 5.056$ annotated; heat maps of the feature matrix X and the dissimilarity matrix D.]

Equation 2.1

$$d_{ij} = \sqrt{\sum_{k=1}^{m} (x_{jk} - x_{ik})^2}$$
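Equation 2.1 is the Euclidean distance between every pair of rows of the feature matrix. A minimal NumPy sketch follows; the function name `dissimilarity_matrix` and the three example points are assumptions for illustration and are not the six points shown on the slide.

```python
import numpy as np

def dissimilarity_matrix(X):
    """Pairwise Euclidean distances between the rows of X.

    X has shape (n, m); the result has shape (n, n), is symmetric,
    and has zeros on its diagonal.
    """
    X = np.asarray(X, dtype=float)
    # Broadcasting yields an (n, n, m) array of per-feature differences.
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical feature matrix with n = 3 data points and m = 2 features.
X = np.array([[0.0, 0.0],
              [3.0, 4.0],
              [6.0, 8.0]])
D = dissimilarity_matrix(X)
```

The broadcasting approach needs memory proportional to n × n × m, which is fine for small examples like this one but may call for a different approach on large datasets.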

Page 33: Outlier Selection and One Class Classification by Jeroen Janssens

Affinity between data points

[Figure: affinity $a_{ij}$ plotted against dissimilarity $d_{ij}$ for variances $\sigma_i^2 = 0.1$, $\sigma_i^2 = 1$, and $\sigma_i^2 = 10$; heat maps of the dissimilarity matrix D and the affinity matrix A.]

Equation 4.2

$$a_{ij} = \begin{cases} \exp\left(-d_{ij}^2 / 2\sigma_i^2\right) & \text{if } i \neq j \\ 0 & \text{if } i = j \end{cases}$$
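Equation 4.2 can be sketched as follows, assuming the per-point variances are already known. The function name `affinity_matrix` and the example values are assumptions for illustration; how each variance is chosen from the perplexity parameter is not covered here.

```python
import numpy as np

def affinity_matrix(D, variances):
    """Affinity a_ij = exp(-d_ij^2 / (2 * sigma_i^2)) for i != j, and 0 for i == j.

    D is an (n, n) dissimilarity matrix; variances holds sigma_i^2 for each
    data point i, so row i of the result uses the variance of point i.
    """
    D = np.asarray(D, dtype=float)
    variances = np.asarray(variances, dtype=float)
    A = np.exp(-(D ** 2) / (2.0 * variances[:, None]))
    np.fill_diagonal(A, 0.0)  # a data point has no affinity with itself
    return A

# Hypothetical two-point example with different variances per point.
D = np.array([[0.0, 1.0],
              [1.0, 0.0]])
A = affinity_matrix(D, variances=[0.5, 2.0])
```

Because each row uses its own variance, the affinity matrix is generally not symmetric even though the dissimilarity matrix is.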

Page 35: Outlier Selection and One Class Classification by Jeroen Janssens

Smooth neighborhoods

[Figure: the six data points x1, ..., x6 plotted with axes first feature $x_1$ and second feature $x_2$; legend: $\sigma_1^2$, $\sigma_2^2$, $\sigma_3^2$, $\sigma_4^2$, $\sigma_5^2$, $\sigma_6^2$; the variance $\sigma_6^2$ is annotated.]

Page 36: Outlier Selection and One Class Classification by Jeroen Janssens

From affinity to binding probability

[Figure: heat maps of the affinity matrix A and the binding probability matrix B.]

Equation 4.5 / 4.6

$$b_{ij} = p(i \to j \in E_G) \propto a_{ij}$$

$$b_{ij} = \frac{a_{ij}}{\sum_{k=1}^{n} a_{ik}}$$
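Equation 4.6 normalizes each row of the affinity matrix so that it sums to one. A minimal sketch, assuming every row of the affinity matrix has at least one non-zero entry; the function name `binding_probabilities` and the example matrix are hypothetical.

```python
import numpy as np

def binding_probabilities(A):
    """Row-normalize an affinity matrix: b_ij = a_ij / sum_k a_ik.

    Assumes every row of A has a positive sum; a row of zeros would
    lead to a division by zero.
    """
    A = np.asarray(A, dtype=float)
    row_sums = A.sum(axis=1, keepdims=True)
    return A / row_sums

# Hypothetical affinity matrix for three data points (zero diagonal).
A = np.array([[0.0, 1.0, 3.0],
              [2.0, 0.0, 2.0],
              [1.0, 1.0, 0.0]])
B = binding_probabilities(A)
```

Each row of `B` can then be read as a probability distribution over the other data points, while the columns need not sum to one.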

Page 38: Outlier Selection and One Class Classification by Jeroen Janssens

Binding probabilities

[Figure: four panels, (a) to (d), each showing a graph on the vertices v1, ..., v6 next to a heat map of the binding probability matrix B. The edge labels recovered from each panel are:]

(a) .04, .21, .24, .25, .25

(b) .10, .10, .21, .29, .30

(c) .10, .18, .22, .23, .24

(d) .04, .04, .04, .05, .05

Outlier Selection and One-Class Classification Jeroen Janssens

Page 39: Outlier Selection and One Class Classification by Jeroen Janssens

. . . . . . . . . . . .Anomalies and outliers

. . . . . . . . . . . . . . . . . . . . . . .Stochastic Outlier Selection

. . . . . . . . . . .Experiments and results

Binding probabilities

.

.

..

(a)

.v1.v2.

v3

.

v4

.

v5

.

v6

. .04.

.21

..24

.

.25

.

.25

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

.. .. 0.0.05

.

0.1

.

0.15

.

0.2

.

0.25

...

(b)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.10

.

.10

.

.21

.

.29

.

.30

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.

0.1

.

0.2

.

0.3

..

.

(c)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.10

.

.18

.

.22

.

.23

.

.24

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.0.05

.

0.1

.

0.15

.

0.2

.

0.25

..

.

(d)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.04

.

.04

.

.04

.

.05

.

.05

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.0.01

.

0.02

.

0.03

.

0.04

.

0.05

..

.

..

(a)

.v1.v2.

v3

.

v4

.

v5

.

v6

. .04.

.21

..24

.

.25

.

.25

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

.. .. 0.0.05

.

0.1

.

0.15

.

0.2

.

0.25

...

(b)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.10

.

.10

.

.21

.

.29

.

.30

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.

0.1

.

0.2

.

0.3

..

.

(c)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.10

.

.18

.

.22

.

.23

.

.24

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.0.05

.

0.1

.

0.15

.

0.2

.

0.25

..

.

(d)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.04

.

.04

.

.04

.

.05

.

.05

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.0.01

.

0.02

.

0.03

.

0.04

.

0.05

..

.

..

(a)

.v1.v2.

v3

.

v4

.

v5

.

v6

. .04.

.21

..24

.

.25

.

.25

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

.. .. 0.0.05

.

0.1

.

0.15

.

0.2

.

0.25

...

(b)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.10

.

.10

.

.21

.

.29

.

.30

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.

0.1

.

0.2

.

0.3

..

.

(c)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.10

.

.18

.

.22

.

.23

.

.24

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.0.05

.

0.1

.

0.15

.

0.2

.

0.25

..

.

(d)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.04

.

.04

.

.04

.

.05

.

.05

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.0.01

.

0.02

.

0.03

.

0.04

.

0.05

..

.

..

(a)

.v1.v2.

v3

.

v4

.

v5

.

v6

. .04.

.21

..24

.

.25

.

.25

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

.. .. 0.0.05

.

0.1

.

0.15

.

0.2

.

0.25

...

(b)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.10

.

.10

.

.21

.

.29

.

.30

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.

0.1

.

0.2

.

0.3

..

.

(c)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.10

.

.18

.

.22

.

.23

.

.24

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.0.05

.

0.1

.

0.15

.

0.2

.

0.25

..

.

(d)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.04

.

.04

.

.04

.

.05

.

.05

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.0.01

.

0.02

.

0.03

.

0.04

.

0.05

..

Outlier Selection and One-Class Classification Jeroen Janssens

Page 40: Outlier Selection and One Class Classification by Jeroen Janssens

. . . . . . . . . . . .Anomalies and outliers

. . . . . . . . . . . . . . . . . . . . . . .Stochastic Outlier Selection

. . . . . . . . . . .Experiments and results

Binding probabilities

.

.

..

(a)

.v1.v2.

v3

.

v4

.

v5

.

v6

. .04.

.21

..24

.

.25

.

.25

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

.. .. 0.0.05

.

0.1

.

0.15

.

0.2

.

0.25

...

(b)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.10

.

.10

.

.21

.

.29

.

.30

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.

0.1

.

0.2

.

0.3

..

.

(c)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.10

.

.18

.

.22

.

.23

.

.24

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.0.05

.

0.1

.

0.15

.

0.2

.

0.25

..

.

(d)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.04

.

.04

.

.04

.

.05

.

.05

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.0.01

.

0.02

.

0.03

.

0.04

.

0.05

..

.

..

(a)

.v1.v2.

v3

.

v4

.

v5

.

v6

. .04.

.21

..24

.

.25

.

.25

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

.. .. 0.0.05

.

0.1

.

0.15

.

0.2

.

0.25

...

(b)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.10

.

.10

.

.21

.

.29

.

.30

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.

0.1

.

0.2

.

0.3

..

.

(c)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.10

.

.18

.

.22

.

.23

.

.24

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.0.05

.

0.1

.

0.15

.

0.2

.

0.25

..

.

(d)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.04

.

.04

.

.04

.

.05

.

.05

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.0.01

.

0.02

.

0.03

.

0.04

.

0.05

..

.

..

(a)

.v1.v2.

v3

.

v4

.

v5

.

v6

. .04.

.21

..24

.

.25

.

.25

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

.. .. 0.0.05

.

0.1

.

0.15

.

0.2

.

0.25

...

(b)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.10

.

.10

.

.21

.

.29

.

.30

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.

0.1

.

0.2

.

0.3

..

.

(c)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.10

.

.18

.

.22

.

.23

.

.24

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.0.05

.

0.1

.

0.15

.

0.2

.

0.25

..

.

(d)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.04

.

.04

.

.04

.

.05

.

.05

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.0.01

.

0.02

.

0.03

.

0.04

.

0.05

..

.

..

(a)

.v1.v2.

v3

.

v4

.

v5

.

v6

. .04.

.21

..24

.

.25

.

.25

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

.. .. 0.0.05

.

0.1

.

0.15

.

0.2

.

0.25

...

(b)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.10

.

.10

.

.21

.

.29

.

.30

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.

0.1

.

0.2

.

0.3

..

.

(c)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.10

.

.18

.

.22

.

.23

.

.24

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.0.05

.

0.1

.

0.15

.

0.2

.

0.25

..

.

(d)

.

v1

.

v2

.

v3

.

v4

.

v5

.

v6

.

.04

.

.04

.

.04

.

.05

.

.05

..

..1. 2. 3. 4. 5. 6.1.

2

.

3

.

4

.

5

.

6

...B

..

...

..

.. 0.0.01

.

0.02

.

0.03

.

0.04

.

0.05

..

Outlier Selection and One-Class Classification Jeroen Janssens

Page 41: Outlier Selection and One Class Classification by Jeroen Janssens

. . . . . . . . . . . .Anomalies and outliers

. . . . . . . . . . . . . . . . . . . . . . .Stochastic Outlier Selection

. . . . . . . . . . .Experiments and results

Binding probabilities

[Figure: four panels (a) to (d). Each shows the vertices v1 to v6 with binding probabilities marked on the graph, next to the 6×6 binding matrix B and its colour scale. Values shown: (a) .04, .21, .24, .25, .25; (b) .10, .10, .21, .29, .30; (c) .10, .18, .22, .23, .24; (d) .04, .04, .04, .05, .05.]

Page 42: Outlier Selection and One Class Classification by Jeroen Janssens

Stochastic Neighbor Graph

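$$G = (V, E_G)$$

$$p(G) = \prod_{i \to j \,\in\, E_G} b_{ij}$$

$$\begin{aligned}
C_O \mid G &= \{x_i \in X \mid \deg^-_G(v_i) = 0\} \\
&= \{x_i \in X \mid \nexists\, v_j \in V : j \to i \in E_G\} \\
&= \{x_i \in X \mid \forall\, v_j \in V : j \to i \notin E_G\}
\end{aligned}$$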
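The definitions on this slide can be illustrated with a minimal Python sketch. It assumes that every vertex draws exactly one out-neighbour from its row of the binding matrix; the 4×4 matrix `B` and the function names `sample_sng`, `graph_probability` and `outlier_class` are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical binding matrix for four points: B[i][j] is the probability
# that point i binds to point j. Rows sum to 1 and the diagonal is 0.
B = [
    [0.0, 0.6, 0.3, 0.1],
    [0.5, 0.0, 0.4, 0.1],
    [0.3, 0.6, 0.0, 0.1],
    [0.3, 0.4, 0.3, 0.0],
]

def sample_sng(B, rng):
    """Draw one graph: every vertex i picks a single out-neighbour j ~ b_i."""
    n = len(B)
    return [(i, rng.choices(range(n), weights=B[i])[0]) for i in range(n)]

def graph_probability(B, edges):
    """p(G): product of b_ij over the directed edges i -> j of G."""
    p = 1.0
    for i, j in edges:
        p *= B[i][j]
    return p

def outlier_class(n, edges):
    """C_O | G: indices of the vertices with in-degree zero."""
    targets = {j for _, j in edges}
    return {i for i in range(n) if i not in targets}

rng = random.Random(0)
edges = sample_sng(B, rng)
print(edges, graph_probability(B, edges), outlier_class(len(B), edges))
```

A point belongs to the outlier class of a given graph only if no other point chose it as a neighbour, which is why the in-degree is the quantity being tested.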


Page 45: Outlier Selection and One Class Classification by Jeroen Janssens

[Figure: three example stochastic neighbor graphs over the vertices v1 to v6.]

(Ga) $p(G_a) = 3.931 \cdot 10^{-4}$, $C_O \mid G_a = \{x_1, x_4, x_6\}$

(Gb) $p(G_b) = 4.562 \cdot 10^{-5}$, $C_O \mid G_b = \{x_5, x_6\}$

(Gc) $p(G_c) = 5.950 \cdot 10^{-7}$, $C_O \mid G_c = \{x_1, x_3\}$

Page 46: Outlier Selection and One Class Classification by Jeroen Janssens

Set of all SNGs

[Figure: probability mass (scale 0 to 10 ⋅ 10^-4) and cumulative probability mass (scale 0 to 1) over the set of all SNGs, with Ga, Gb and Gc marked; one curve each for h = 4.0, h = 4.5 and h = 5.0.]

Page 47: Outlier Selection and One Class Classification by Jeroen Janssens

Approximating outlier probabilities by sampling SNGs

[Figure: estimated outlier probability (0 to 1) against sampling iteration (10^0 to 10^6, log scale), one curve each for x1 to x6.]

$$p(x_i \in C_O) = \lim_{S \to \infty} \frac{1}{S} \sum_{s=1}^{S} \mathbb{I}\{x_i \in C_O \mid G^{(s)}\}, \qquad G^{(s)} \sim P(G)$$
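The sampling estimate above can be sketched in a few lines of Python: draw many graphs, and for each point count the fraction of graphs in which it has no incoming edge. The binding matrix `B` and the function name `estimate_outlier_probabilities` are hypothetical, and the sketch assumes each point picks one neighbour per graph.

```python
import random

# Hypothetical binding matrix: rows sum to 1, diagonal is 0.
B = [
    [0.0, 0.6, 0.3, 0.1],
    [0.5, 0.0, 0.4, 0.1],
    [0.3, 0.6, 0.0, 0.1],
    [0.3, 0.4, 0.3, 0.0],
]

def estimate_outlier_probabilities(B, num_samples, seed=0):
    """Monte Carlo estimate of p(x_i in C_O) from num_samples sampled graphs."""
    rng = random.Random(seed)
    n = len(B)
    counts = [0] * n
    for _ in range(num_samples):
        has_incoming = [False] * n
        for i in range(n):
            j = rng.choices(range(n), weights=B[i])[0]  # edge i -> j
            has_incoming[j] = True
        for i in range(n):
            if not has_incoming[i]:
                counts[i] += 1  # x_i is in the outlier class of this graph
    return [c / num_samples for c in counts]

print(estimate_outlier_probabilities(B, 10_000))
```

The estimate only converges as the number of samples grows, which is the motivation for the exact computations on the following slides.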

Page 48: Outlier Selection and One Class Classification by Jeroen Janssens

Demo: Sampling SNGs in CoffeeScript and D3 (see http://sos.jeroenjanssens.com)

Page 49: Outlier Selection and One Class Classification by Jeroen Janssens

Computing outlier probabilities through marginalisation

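$$\begin{aligned}
p(x_i \in C_O) &= \sum_{G \in \mathcal{G}} \mathbb{I}\{x_i \in C_O \mid G\} \cdot p(G) \\
&= \sum_{G \in \mathcal{G}} \mathbb{I}\{x_i \in C_O \mid G\} \cdot \prod_{q \to r \,\in\, E_G} b_{qr}
\end{aligned}$$

$$|\mathcal{G}| = (n - 1)^n$$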
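For a very small data set the marginalisation can be carried out literally by enumerating every graph. The following Python sketch does so under the assumption that each of the n vertices chooses one of the other n − 1 vertices; the matrix `B` and the name `marginal_outlier_probabilities` are hypothetical.

```python
from itertools import product

# Hypothetical binding matrix: rows sum to 1, diagonal is 0.
B = [
    [0.0, 0.6, 0.3, 0.1],
    [0.5, 0.0, 0.4, 0.1],
    [0.3, 0.6, 0.0, 0.1],
    [0.3, 0.4, 0.3, 0.0],
]

def marginal_outlier_probabilities(B):
    """Sum p(G) over all graphs G in which x_i has in-degree zero."""
    n = len(B)
    # Vertex i may point to any vertex except itself.
    choices = [[j for j in range(n) if j != i] for i in range(n)]
    probs = [0.0] * n
    num_graphs = 0
    for targets in product(*choices):  # targets[i] is the neighbour of i
        num_graphs += 1
        p = 1.0
        for i, j in enumerate(targets):
            p *= B[i][j]
        chosen = set(targets)
        for i in range(n):
            if i not in chosen:
                probs[i] += p
    return probs, num_graphs

probs, num_graphs = marginal_outlier_probabilities(B)
print(num_graphs, probs)
```

The number of graphs grows as (n − 1)^n, so this enumeration is only practical for a handful of points.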

Page 50: Outlier Selection and One Class Classification by Jeroen Janssens

Computing outlier probabilities in closed form

[Figure: scatter plot of the data points x1 to x6 (first feature x1 against second feature x2), the binding matrix B (colour scale 0 to 0.3) and the resulting outlier probabilities Φ (colour scale 0 to 1).]

Equation 4.14:

$$p(x_i \in C_O) = \prod_{j \neq i} (1 - b_{ji})$$
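Equation 4.14 translates directly into code: for each point, multiply one minus the binding probability of every other point towards it. The sketch below uses a hypothetical binding matrix `B` and a hypothetical function name `outlier_probabilities`.

```python
# Hypothetical binding matrix: B[j][i] is the probability that point j
# binds to point i. Rows sum to 1 and the diagonal is 0.
B = [
    [0.0, 0.6, 0.3, 0.1],
    [0.5, 0.0, 0.4, 0.1],
    [0.3, 0.6, 0.0, 0.1],
    [0.3, 0.4, 0.3, 0.0],
]

def outlier_probabilities(B):
    """p(x_i in C_O) = product over j != i of (1 - b_ji)."""
    n = len(B)
    result = []
    for i in range(n):
        p = 1.0
        for j in range(n):
            if j != i:
                p *= 1.0 - B[j][i]  # probability that j does not bind to i
        result.append(p)
    return result

print(outlier_probabilities(B))
```

In this toy matrix the last point receives only small binding probabilities from the others, so it ends up with the highest outlier probability.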

Page 51: Outlier Selection and One Class Classification by Jeroen Janssens

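\begin{align}
x_i \in C_O \mid G &\iff \deg^-_G(v_i) = 0 \tag{1} \\
p(x_i \in C_O) &= \mathbb{E}_G\left[\mathbb{I}\{\deg^-_G(v_i) = 0\}\right] \tag{2} \\
p(x_i \in C_O) &= \mathbb{E}_G\Big[\prod_{j \neq i} \mathbb{I}\{j \to i \notin E_G\}\Big] \tag{3} \\
p(x_i \in C_O) &= \mathbb{E}_G\Big[\prod_{j \neq i} \left(1 - \mathbb{I}\{j \to i \in E_G\}\right)\Big] \tag{4} \\
p(x_i \in C_O) &= \prod_{j \neq i} \left(1 - \mathbb{E}_G\left[\mathbb{I}\{j \to i \in E_G\}\right]\right) \tag{5} \\
p(x_i \in C_O) &= \prod_{j \neq i} \left(1 - p(j \to i \in E_G)\right) \tag{6} \\
p(x_i \in C_O) &= \prod_{j \neq i} (1 - b_{ji}) \tag{7}
\end{align}

Step (5) moves the expectation inside the product because each data point chooses its neighbour independently of the others.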

Page 52: Outlier Selection and One Class Classification by Jeroen Janssens

Selecting outliers

[Figure: scatter plot of the data points x1 to x6 (first feature x1 against second feature x2), each marked as outlier or inlier.]

$$f(x) = \begin{cases}
\text{outlier} & \text{if } p(x \in C_O) > \theta, \\
\text{inlier} & \text{if } p(x \in C_O) \le \theta.
\end{cases}$$
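The selection rule is a plain threshold on the outlier probability. A minimal sketch, with a hypothetical function name `select_outliers`, hypothetical example probabilities and an arbitrary threshold of 0.5:

```python
def select_outliers(probabilities, theta=0.5):
    """Label a point 'outlier' if its probability exceeds theta, else 'inlier'."""
    return ["outlier" if p > theta else "inlier" for p in probabilities]

# Hypothetical outlier probabilities for four points.
print(select_outliers([0.245, 0.096, 0.294, 0.729], theta=0.5))
```

A point whose probability equals the threshold exactly is treated as an inlier, matching the "≤" in the definition.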

Page 53: Outlier Selection and One Class Classification by Jeroen Janssens

Adaptive variances via the perplexity parameter

[Figure: perplexity h(b_i) (1 to 5) against variance σ_i² (10^-1 to 10^2, log scale), one curve each for x1 to x6, with the level h = 4.5 marked; a second panel shows the corresponding variances σ_i² (axis ticks 5, 10, 15).]

$$h(b_i) = 2^{H(b_i)}, \qquad H(b_i) = -\sum_{\substack{j=1 \\ j \neq i}}^{n} b_{ij} \log_2(b_{ij})$$
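The perplexity of a row of binding probabilities can be computed directly from its Shannon entropy. In this sketch the function name `perplexity` is hypothetical, and zero entries are skipped on the usual convention that 0 · log 0 = 0.

```python
import math

def perplexity(b_row):
    """h(b_i) = 2 ** H(b_i), with H the Shannon entropy in bits."""
    entropy = -sum(b * math.log2(b) for b in b_row if b > 0.0)
    return 2.0 ** entropy

print(perplexity([0.25, 0.25, 0.25, 0.25]))  # uniform over four neighbours
print(perplexity([0.0, 1.0, 0.0]))           # all mass on one neighbour
```

Perplexity can be read as a smooth count of effective neighbours: a uniform distribution over k neighbours has perplexity k, while a distribution concentrated on one neighbour has perplexity 1.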

Page 54: Outlier Selection and One Class Classification by Jeroen Janssens

Continuous binary search

[Figure: perplexity h(b_i) (axis 4.2 to 4.8) against binary search iteration (0 to 10), one curve each for x1 to x6, together with the variances σ_i² (axis ticks 5, 10, 15).]
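The search for a variance that yields a desired perplexity can be sketched as a bisection. This sketch assumes Gaussian affinities of the form exp(−d² / (2σ²)) normalised into binding probabilities, which is not shown on this slide; the function names, the search bounds and the example distances are all hypothetical.

```python
import math

def binding_row(sq_dists, variance):
    """Binding probabilities b_i from squared distances to the other points.

    Assumes Gaussian affinities exp(-d^2 / (2 * variance)). Subtracting the
    smallest distance avoids underflow and cancels after normalisation.
    """
    d_min = min(sq_dists)
    a = [math.exp(-(d - d_min) / (2.0 * variance)) for d in sq_dists]
    total = sum(a)
    return [x / total for x in a]

def perplexity(b_row):
    entropy = -sum(b * math.log2(b) for b in b_row if b > 0.0)
    return 2.0 ** entropy

def find_variance(sq_dists, target, lo=1e-3, hi=1e3, iterations=60):
    """Bisect on a log scale until perplexity(b_i) is close to the target."""
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)
        if perplexity(binding_row(sq_dists, mid)) < target:
            lo = mid  # distribution too peaked: increase the variance
        else:
            hi = mid  # distribution too flat: decrease the variance
    return math.sqrt(lo * hi)

sq_dists = [1.0, 4.0, 9.0, 16.0, 25.0]  # hypothetical squared distances
variance = find_variance(sq_dists, target=4.5)
print(variance, perplexity(binding_row(sq_dists, variance)))
```

Bisection works here because the perplexity increases monotonically with the variance: a larger variance spreads the binding probabilities more evenly over the neighbours.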

Page 55: Outlier Selection and One Class Classification by Jeroen Janssens

Perplexity influences outlier probabilities

[Figure: outlier probability (0 to 1) against perplexity h (axis ticks 0 to 6), one curve each for x1 to x6.]

Page 56: Outlier Selection and One Class Classification by Jeroen Janssens

Experiments and results

Page 57: Outlier Selection and One Class Classification by Jeroen Janssens

[Figure: results of KNNDD (k=10), LOF (k=10), LOCI, SOS (h=10) and LSOD on the Banana, Densities and Ring datasets, with labels A to J marking parts of the plots.]

Page 58: Outlier Selection and One Class Classification by Jeroen Janssens

One-class datasets

[Figure: scatter plot of the multi-class Iris flower data set D_M (sepal length against petal width) with classes C1 = Setosa, C2 = Versicolor and C3 = Virginica, and the three one-class datasets derived from it.]

D1: C_A = Versicolor ∪ Virginica, C_N = Setosa

D2: C_A = Setosa ∪ Virginica, C_N = Versicolor

D3: C_A = Setosa ∪ Versicolor, C_N = Virginica
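The construction of D1 to D3 follows one pattern: pick one class as the normal class C_N and treat the union of the remaining classes as the anomalous class C_A. A minimal sketch, with a hypothetical function name `one_class_datasets` and a made-up list of labels:

```python
def one_class_datasets(labels):
    """For each class c, relabel c as normal (C_N) and all other classes as anomaly (C_A)."""
    datasets = {}
    for c in sorted(set(labels)):
        datasets[c] = ["normal" if y == c else "anomaly" for y in labels]
    return datasets

# Hypothetical class labels for four flowers.
labels = ["Setosa", "Versicolor", "Virginica", "Setosa"]
for normal_class, relabelled in one_class_datasets(labels).items():
    print(normal_class, relabelled)
```

A multi-class data set with m classes therefore yields m one-class datasets, three in the case of Iris.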

Page 59: Outlier Selection and One Class Classification by Jeroen Janssens

Weighted AUC

[Figure: outlier scores (0 to 1) of the data points in C_A and C_N with a threshold θ′ separating C_O from C_I; each point is a hit, false alarm, miss or correct reject. A second panel plots hit rate against false alarm rate (both 0 to 1), with θ′ = 0.57 marked.]
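The quantities in the figure can be sketched in Python. The code below computes the hit rate and false alarm rate at one threshold, and the plain (unweighted) AUC as the fraction of anomaly/normal pairs that are ranked correctly. The weighting scheme behind the slide title is not spelled out here, so it is left out; the function names and example scores are hypothetical.

```python
def rates(scores, is_anomaly, theta):
    """Hit rate and false alarm rate when points with score > theta are selected."""
    hits = sum(1 for s, a in zip(scores, is_anomaly) if a and s > theta)
    false_alarms = sum(1 for s, a in zip(scores, is_anomaly) if not a and s > theta)
    n_anomalies = sum(1 for a in is_anomaly if a)
    n_normals = len(is_anomaly) - n_anomalies
    return hits / n_anomalies, false_alarms / n_normals

def auc(scores, is_anomaly):
    """Area under the ROC curve via pairwise comparison; ties count as half."""
    pos = [s for s, a in zip(scores, is_anomaly) if a]
    neg = [s for s, a in zip(scores, is_anomaly) if not a]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical outlier scores and ground truth.
scores = [0.9, 0.5, 0.6, 0.2, 0.1]
is_anomaly = [True, True, False, False, False]
print(rates(scores, is_anomaly, theta=0.57), auc(scores, is_anomaly))
```

Sweeping the threshold from 1 down to 0 traces out the curve of hit rate against false alarm rate; the AUC summarises that curve in a single number.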

Page 60: Outlier Selection and One Class Classification by Jeroen Janssens

Real-world datasets

[Figure: weighted AUC (axis 0.5 to 1) of SOS, KNNDD, LOF, LOCI and LSOD on each real-world dataset: Iris, Ecoli, Breast W. Original, Delft Pump, Waveform Generator, Wine Recognition, Colon Gene, BioMed, Vehicle Silhouettes, Glass Identification, Boston Housing, Heart Disease, Arrhythmia, Haberman's Survival, Breast W. New, Liver Disorders, SPECTF Heart, Hepatitis.]

Page 61: Outlier Selection and One Class Classification by Jeroen Janssens

Synthetic datasets

Parameter λ

| Data set | Determines | λstart | step size | λend |
|---|---|---|---|---|
| (a) | Radius of ring | 5 | 0.1 | 2 |
| (b) | Cardinality of cluster and ring | 100 | 5 | 5 |
| (c) | Distance between clusters | 4 | 0.1 | 0 |
| (d) | Cardinality of one cluster and ring | 0 | 5 | 45 |
| (e) | Density of one cluster and ring | 1 | 0.05 | 0 |
| (f) | Radius of square ring | 2 | 0.05 | 0.8 |
| (g) | Radius of square ring | 2 | 0.05 | 0.8 |
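Each row of the table describes a sweep of λ from λstart to λend in fixed steps, sometimes ascending and sometimes descending. One way to generate such a sweep is sketched below; the function name `lambda_values` is hypothetical, and the sketch assumes that both endpoints are included.

```python
def lambda_values(start, step, end):
    """Values from start to end (inclusive) in increments of step, in either direction."""
    count = round(abs(end - start) / step)
    direction = 1 if end >= start else -1
    # Multiplying by k avoids accumulating floating-point error step by step.
    return [start + direction * k * step for k in range(count + 1)]

print(lambda_values(0, 5, 45))          # ascending sweep, as in row (d)
print(len(lambda_values(5, 0.1, 2)))    # descending sweep, as in row (a)
```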

Page 62: Outlier Selection and One Class Classification by Jeroen Janssens

[Figure: the synthetic datasets (a) to (g), each shown at λstart, λinter and λend, with points marked as anomaly or normality.]

Page 63: Outlier Selection and One Class Classification by Jeroen Janssens

Results on synthetic datasets

[Figure: AUC (axis 0.6 to 1) against λ for the synthetic datasets (a) to (g), one curve each for SOS, KNNDD, LOF, LOCI and LSOD.]

Page 65: Outlier Selection and One Class Classification by Jeroen Janssens

SOS performs significantly better

[Figure: two critical difference (cd) diagrams on a rank axis from 1 to 5. At p < .05 the algorithm labels appear in the order SOS, LOF, LOCI, KNNDD, LSOD; at p < .01 in the order SOS, LOCI, LSOD, LOF, KNNDD.]

Page 66: Outlier Selection and One Class Classification by Jeroen Janssens

Conclusion

• Outlier selection can support the detection of anomalies

• SOS is an intuitive and probabilistic algorithm to select outliers

• SOS performs very well on the real-world and synthetic datasets tested

• No free lunch

Page 67: Outlier Selection and One Class Classification by Jeroen Janssens

Outlier Selection and One-Class Classification

Jeroen Janssens @jeroenhjanssens