icdm'07 1 depth-based novelty detection yixin chen dept. of computer and information science...

Post on 15-Jan-2016


ICDM'07 1

Depth-Based Novelty Detection

Yixin Chen

Dept. of Computer and Information Science

University of Mississippi

http://www.cs.olemiss.edu/~ychen

Joint work with Henry Bart, Xin Dang, and Hanxiang Peng

ICDM'07 2

Outline

Novelty detection

Motivations

Kernelized spatial depth (KSD)

Bounds on the false alarm probability

Empirical studies

Discussions

ICDM'07 3

Outlier Detection

Missing label problem

One-class learning

ICDM'07 4

A Simple Outlier Detector

1-d example

Sensitivity

Threshold

Structure of the data

[Figure: 1-d data points (X) with the mean and the median marked, plus a query point (?)]

ICDM'07 5

Median

The sign function

Median is
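The formulas on this slide did not survive in the transcript. A hedged reconstruction from the standard definitions, assuming a continuous distribution F and using s, x, X, and F as illustrative notation (not recovered from the slide):

```latex
s(x) =
\begin{cases}
  1,  & x > 0, \\
  0,  & x = 0, \\
  -1, & x < 0,
\end{cases}
\qquad
\text{a median of } F \text{ is any } x \text{ with } \mathbb{E}\, s(x - X) = 0, \quad X \sim F .
```

Read this way, the median balances the observations to its left against those to its right, which is the property the spatial median generalizes.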

ICDM'07 6

Spatial Median

The spatial sign function

The spatial median is
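The multivariate formulas are also missing from the transcript. Under the usual definitions, and with S, x, X, and F again as assumed notation, the spatial sign and spatial median can be written as:

```latex
S(x) =
\begin{cases}
  \dfrac{x}{\lVert x \rVert}, & x \neq 0, \\[1ex]
  0, & x = 0,
\end{cases}
\qquad
\text{the spatial median of } F \text{ is the } x \text{ with } \mathbb{E}\, S(x - X) = 0, \quad X \sim F .
```

In one dimension S reduces to the sign function, so this condition reduces to the ordinary median condition.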

ICDM'07 7

Spatial Depth

Spatial Depth

Sample version

The expectation of the unit vector starting from x
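The depth formula itself is not in the transcript. A minimal sketch of a sample version, assuming the common form "one minus the norm of the average unit vector starting from x" and assuming that sample points equal to x are skipped (the function name `spatial_depth` is illustrative):

```python
import numpy as np

def spatial_depth(x, sample):
    """Sample spatial depth of x with respect to the rows of `sample`.

    Assumed form: 1 - || mean of unit vectors pointing from x to each
    sample point ||. Points equal to x contribute no direction and are
    skipped (an assumption of this sketch).
    """
    x = np.asarray(x, dtype=float)
    sample = np.asarray(sample, dtype=float)
    diffs = sample - x                       # vectors starting from x
    norms = np.linalg.norm(diffs, axis=1)
    keep = norms > 0                         # drop zero-length vectors
    if not np.any(keep):
        return 1.0                           # x coincides with every sample point
    units = diffs[keep] / norms[keep, None]  # unit vectors starting from x
    return 1.0 - np.linalg.norm(units.mean(axis=0))

# Usage: a central point has depth near 1, a far-away point depth near 0.
cross = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
center_depth = spatial_depth([0.0, 0.0], cross)
far_depth = spatial_depth([100.0, 0.0], cross)
```

When x sits in the middle of the data the unit vectors cancel and the depth is high; when x is far away they all point the same way and the depth drops toward zero.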

ICDM'07 8

Spatial Depth and Outlier Detection

outlier

ICDM'07 9

Example: Half-Moon Data

FAR = 70%

ICDM'07 10

Example: Ring Data

FAR = 100%

ICDM'07 11

Kernelized Spatial Depth (KSD)

σ → ∞: KSD converges to SD

σ → 0: KSD → 0.293
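The KSD formula is not in the transcript. The sketch below shows one way to kernelize the sample spatial depth: every inner product in the squared norm of the summed unit vectors is replaced by a kernel evaluation. The Gaussian kernel form exp(-||a - b||^2 / σ^2), the function names, and the handling of points equal to x are assumptions of this sketch, not details taken from the slides.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Gaussian kernel matrix between rows of a and rows of b (assumed form)."""
    sq = np.sum((a[:, None, :] - b[None, :, :]) ** 2, axis=2)
    return np.exp(-sq / sigma**2)

def kernelized_spatial_depth(x, sample, sigma):
    """Sketch of a kernelized sample spatial depth of x w.r.t. rows of `sample`."""
    x = np.atleast_2d(np.asarray(x, dtype=float))
    sample = np.asarray(sample, dtype=float)
    k_xx = gaussian_kernel(x, x, sigma)[0, 0]      # equals 1 for this kernel
    k_xy = gaussian_kernel(x, sample, sigma)[0]    # shape (n,)
    k_yz = gaussian_kernel(sample, sample, sigma)  # shape (n, n)
    # Squared feature-space distance between x and each sample point.
    d2 = k_xx + np.diag(k_yz) - 2.0 * k_xy
    keep = d2 > 0                                  # skip points equal to x
    if not np.any(keep):
        return 1.0
    delta = np.sqrt(d2[keep])
    k_xy = k_xy[keep]
    k_yz = k_yz[np.ix_(keep, keep)]
    # Inner products of feature-space difference vectors, then normalized.
    num = k_xx + k_yz - k_xy[:, None] - k_xy[None, :]
    total = np.sum(num / np.outer(delta, delta))
    return 1.0 - np.sqrt(max(total, 0.0)) / keep.sum()
```

Under these assumptions both limits on the slide can be checked numerically. For very large σ the value approaches the ordinary spatial depth. For very small σ and an x outside a sample of n distinct points, it equals 1 - sqrt((n + 1) / (2n)), which tends to 1 - 1/√2 ≈ 0.293 as n grows.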

ICDM'07 12

Example: Half-Moon Data

0.2495

ICDM'07 13

Example: Ring Data

0.2651

ICDM'07 14

KSD Outlier Detector

outliers

normal observations

b is margin

How should we decide the threshold t?

ICDM'07 15

Threshold Selection

Largest threshold such that upper bound on FAP ≤ desired level
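The selection rule can be sketched as follows. The talk's own training set and test set bounds are not in the transcript, so this sketch swaps in a different, standard bound: the Dvoretzky-Kiefer-Wolfowitz (DKW) inequality applied to depths of held-out normal observations. It also assumes an observation is flagged as an outlier when its depth is at most t, and it searches only over observed depth values. The function names and the confidence parameter `delta` are hypothetical.

```python
import math
import numpy as np

def dkw_fap_bound(heldout_depths, t, delta=0.05):
    """Stand-in upper bound on the false alarm probability of 'depth <= t'.

    Empirical false alarm rate on held-out normal observations plus the
    DKW deviation term; holds for all t at once with probability at
    least 1 - delta. This is NOT the bound from the talk.
    """
    depths = np.asarray(heldout_depths, dtype=float)
    empirical = np.mean(depths <= t)
    return empirical + math.sqrt(math.log(2.0 / delta) / (2.0 * len(depths)))

def select_threshold(heldout_depths, desired_level, delta=0.05):
    """Largest observed depth whose bound stays <= desired_level, else None."""
    best = None
    for t in np.sort(np.asarray(heldout_depths, dtype=float)):
        if dkw_fap_bound(heldout_depths, t, delta) <= desired_level:
            best = float(t)
        else:
            break  # the bound is non-decreasing in t
    return best

# Usage with synthetic depth values (illustrative data only).
depths = np.linspace(0.01, 1.0, 1000)
t = select_threshold(depths, desired_level=0.1)
```

Because the DKW term holds uniformly over t, picking the threshold on the same held-out depths does not invalidate the bound. With too few held-out observations the deviation term alone exceeds the desired level and no threshold qualifies.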

ICDM'07 16

Bounds on the False Alarm Probability

A training set bound

A test set bound

ICDM'07 17

Empirical Study 1

10 species under the order Cypriniformes

989 specimens from Tulane University Museum of Natural History

ICDM'07 18

Empirical Study 1

Masking Effect

ICDM'07 19

Empirical Study 2

ICDM'07 20

Discussions

KSD outlier detection and density based approaches

[Figure: two plots over the same observations (horizontal axis, 0 to 12): kernelized spatial depth (vertical axis, 0 to 0.45) and estimated probability density (vertical axis, 0 to 0.5)]

ICDM'07 21

Acknowledgment

Kory P. Northrop, Tulane University

Huimin Chen, University of New Orleans

University of Mississippi

National Science Foundation
