learning target pattern-of-life for wide-area anomaly detection

Page 1: Learning target Pattern-of-Life for wide-area Anomaly Detection

Introduction Methodology Results Conclusions and Future Work Bibliography

LEARNING TARGET PATTERN-OF-LIFE FOR WIDE-AREA ANOMALY DETECTION

Tatiana López Guevara

June 2015

Page 2: Learning target Pattern-of-Life for wide-area Anomaly Detection

Participants

Supervisors

Dr. Rolf Baxter

Dr. Neil Robertson

Page 3: Learning target Pattern-of-Life for wide-area Anomaly Detection

Contents

1 Introduction

2 Methodology

3 Results

4 Conclusions and Future Work

5 Bibliography

Page 4: Learning target Pattern-of-Life for wide-area Anomaly Detection

Section 1

Introduction

Page 5: Learning target Pattern-of-Life for wide-area Anomaly Detection

Definitions

What is Anomaly Detection?

Chandola et al. [1]: "Patterns in data that do not conform to a well-defined notion of normal behaviour"

Well defined notion?

Same Size?

Same Type?

Page 9: Learning target Pattern-of-Life for wide-area Anomaly Detection

Definitions

What is Anomaly Detection?

Hawkins [2]: "An observation which deviates so much from other observations as to arouse suspicions that it was generated by a different mechanism."

Page 11: Learning target Pattern-of-Life for wide-area Anomaly Detection

Definitions

Pattern-of-Life

Learn preferred behaviour from the target’s daily interaction with its environment

Page 15: Learning target Pattern-of-Life for wide-area Anomaly Detection

Definitions

Wide-Area

Not limited to a single/fixed scenario

Page 17: Learning target Pattern-of-Life for wide-area Anomaly Detection

Definitions

What kind of behaviour?

Human movement ⇒ Trajectories

Page 18: Learning target Pattern-of-Life for wide-area Anomaly Detection

Definitions

Anomaly Detection

Detect behaviour not represented by the model ⇒ General indicator of an interesting event!

Page 19: Learning target Pattern-of-Life for wide-area Anomaly Detection

Our observation

What other information could be useful?

Periodic modulations that characterise human nature

Page 20: Learning target Pattern-of-Life for wide-area Anomaly Detection

Why is it useful?

POL as a prior

Enhanced Tracking

Personalized/Proactive Systems

Anomalies detected

Raise alarms ⇒ elderly/cognitively impaired people

Other domains

Change single target’s traces ⇒ ships/cars/pedestrians

Other types of human behaviour

Indoor high level activities

Page 21: Learning target Pattern-of-Life for wide-area Anomaly Detection

Why is it challenging?

POL characteristics: must have

1 Unsupervised on-line learning

2 Partially observed trajectories

3 No external dependencies

4 Few ad-hoc thresholds / Low False Positive Rate (FPR)

No prior work uses time-dependent POL for anomaly detection!

Page 23: Learning target Pattern-of-Life for wide-area Anomaly Detection

Summary of Objectives

Learn behaviour from movement

Single person’s GPS Tracks

Include temporal dependency

Time of the day

Day of the week

Detect Anomalies

Spatial

Spatio-Temporal

Page 24: Learning target Pattern-of-Life for wide-area Anomaly Detection

Section 2

Methodology

Page 25: Learning target Pattern-of-Life for wide-area Anomaly Detection

Hierarchical Model Learning

Temporal layer: Preferred schedules

Spatial layer: Preferred routes

[Figure: diagram with two labelled levels, Spatial Layer and Temporal Layer]

Page 26: Learning target Pattern-of-Life for wide-area Anomaly Detection

Overview of Proposed Methodology

[Figure: flow diagram of the proposed methodology. Block labels: Preprocessing; Trajectory Point; Spatial Layer (Spatial Anomaly?, Update Spatial Model); Temporal Layer (Temporal Anomaly?, Update Temporal Model); Anomaly Detection (Point Anomaly Logger, Anomaly Processor)]

Page 27: Learning target Pattern-of-Life for wide-area Anomaly Detection

Spatial Layer

Page 28: Learning target Pattern-of-Life for wide-area Anomaly Detection

Spatial Layer: Model Learning

Adaptation of the on-line method proposed by Piciarelli et al. [4] to work with wide-area data

[Figure: six numbered panels (1 to 6) showing the cluster set growing step by step from c1 to c4 up to c1 to c6]

(Images adapted from [4]).

Page 29: Learning target Pattern-of-Life for wide-area Anomaly Detection

Temporal Layer

Two methods

1 Kernel Density Estimation (KDE)

2 Conformal Anomaly Detection (CAD)

Page 30: Learning target Pattern-of-Life for wide-area Anomaly Detection

Temporal Layer - Kernel Density Estimator (KDE)

KDE Definition

$$\hat{f}(x; h) = \frac{1}{n} \sum_{i=1}^{n} K_h(x - x_i) \qquad (1)$$

Which Kernel?

Circular data ⇒ von Mises kernel

Advantages

Non-parametric

Parameter-light
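
A minimal sketch of Eq. (1) with a von Mises kernel, assuming time of day is mapped to an angle on a 24-hour circle. The names `hours_to_angle` and `von_mises_kde` and the concentration value `kappa` are illustrative assumptions, not taken from the slides, and no bandwidth selection is attempted here.

```python
import numpy as np


def hours_to_angle(hours):
    """Map time of day in hours (0 to 24) to an angle in [0, 2*pi)."""
    return 2.0 * np.pi * (np.asarray(hours, dtype=float) % 24.0) / 24.0


def von_mises_kde(theta, samples, kappa):
    """Circular KDE: average of von Mises kernels centred on each sample.

    theta   : angle(s) in radians where the density is evaluated
    samples : observed angles in radians
    kappa   : concentration (assumed value; acts like an inverse bandwidth)
    """
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    samples = np.asarray(samples, dtype=float)
    # Pairwise angular differences, shape (len(theta), len(samples)).
    diffs = theta[:, None] - samples[None, :]
    kernel_sum = np.exp(kappa * np.cos(diffs)).sum(axis=1)
    # np.i0 is the modified Bessel function of the first kind, order 0.
    return kernel_sum / (samples.size * 2.0 * np.pi * np.i0(kappa))


# Hypothetical observations: the target passes a place around 08:00.
observed = hours_to_angle([7.5, 8.0, 8.5, 9.0])
density_morning = von_mises_kde(hours_to_angle(8.0), observed, kappa=4.0)[0]
density_evening = von_mises_kde(hours_to_angle(20.0), observed, kappa=4.0)[0]
```

Because the kernel depends only on the cosine of the angular difference, 23:30 and 00:30 are treated as close, which a kernel on the real line would not do. For very large `kappa` the exponential may overflow, so a production version would need a numerically safer form.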

Page 34: Learning target Pattern-of-Life for wide-area Anomaly Detection

Temporal Layer - Conformal Anomaly Detector (CAD)

Proposed by Laxhammar and Falkman [3]

Advantages

Based on the theory of confidence intervals

Completely on-line

Parameter-light: ε is directly tied to the FPR

Page 35: Learning target Pattern-of-Life for wide-area Anomaly Detection

Temporal Layer - CAD - Method

Input

Previous observations: $B = \{\vec{z}_1, \ldots, \vec{z}_{n-1}\}$

New observation: $\vec{z}_n$

Output

$p_{z_n}$: ratio of samples in B that are at least as different as $\vec{z}_n$

Nonconformity Measure (NCM)

Sum of the distances to the K nearest neighbours
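
The method above can be sketched as follows, assuming observations are times of day in hours and distances wrap around the 24-hour circle. The function names and the choice `k=2` are hypothetical, and the p-value follows the usual conformal convention of counting the new observation itself, which may differ in detail from the implementation behind these slides.

```python
import numpy as np


def circular_distance(a, b, period=24.0):
    """Shortest distance between times on a circle of length `period`."""
    d = np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float)) % period
    return np.minimum(d, period - d)


def knn_ncm(index, values, k):
    """NCM: sum of circular distances from values[index] to its k nearest others."""
    others = np.delete(values, index)
    d = circular_distance(values[index], others)
    return np.sort(d)[:k].sum()


def cad_p_value(history, new_obs, k=2):
    """Fraction of observations at least as nonconforming as `new_obs`."""
    values = np.append(np.asarray(history, dtype=float), new_obs)
    scores = np.array([knn_ncm(i, values, k) for i in range(values.size)])
    # The last score belongs to the new observation and is included in the count.
    return float(np.mean(scores >= scores[-1]))


# Hypothetical history: nine visits between 08:00 and 08:48.
history = [8.0, 8.1, 8.2, 8.3, 8.4, 8.5, 8.6, 8.7, 8.8]
p_usual = cad_p_value(history, 8.45)   # typical time, large p-value
p_unusual = cad_p_value(history, 20.0) # evening visit, small p-value
```

A small p-value means few past observations look as unusual as the new one. Recomputing every score on each call is quadratic in the history size, which is acceptable for a sketch but not for long on-line runs.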

Page 40: Learning target Pattern-of-Life for wide-area Anomaly Detection

Anomaly Detection

Spatial

No cluster match found

Match to a low density cluster < thrT

Temporal - KDE Method

Low density regions: less than 95% of the total density

Temporal - CAD Method

Fraction less than parameter: $p_{z_n} < \varepsilon$
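
The three decision rules can be sketched as below. This is an illustrative reading rather than the author's implementation: the KDE rule is interpreted here as flagging points outside the highest-density region that holds 95% of the probability mass, and every function name, as well as the `coverage` parameter, is hypothetical.

```python
import numpy as np


def is_spatial_anomaly(matched_density, thr_t):
    """Spatial rule: no cluster matched (None) or the matched cluster is too sparse."""
    return matched_density is None or matched_density < thr_t


def hdr_threshold(density, coverage=0.95):
    """Density level whose super-level set holds `coverage` of the total mass.

    `density` is sampled on a uniform grid, so every sample has equal weight.
    """
    d = np.sort(np.asarray(density, dtype=float))[::-1]  # highest density first
    mass = np.cumsum(d) / d.sum()                        # cumulative share of mass
    idx = int(np.searchsorted(mass, coverage))           # first index reaching coverage
    return float(d[min(idx, d.size - 1)])


def is_temporal_anomaly_kde(density_at_point, threshold):
    """KDE rule (assumed reading): the point lies in a low-density region."""
    return density_at_point < threshold


def is_temporal_anomaly_cad(p_value, epsilon):
    """CAD rule: the conformal p-value falls below epsilon."""
    return p_value < epsilon


# Toy density values on a four-point grid, with a 60% coverage level.
level = hdr_threshold([4.0, 3.0, 2.0, 1.0], coverage=0.6)
```

Only the CAD rule has a parameter with a direct statistical meaning. The spatial threshold and the KDE coverage level remain tuning choices.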

Page 41: Learning target Pattern-of-Life for wide-area Anomaly Detection

Section 3

Results

Page 42: Learning target Pattern-of-Life for wide-area Anomaly Detection

Datasets

Heriot-Watt Dataset

Period: 7 months

Dates: Oct 2014 - Apr 2015

Page 43: Learning target Pattern-of-Life for wide-area Anomaly Detection

Qualitative Results - Spatial Layer

Zoom in: Most Transited Area

Page 44: Learning target Pattern-of-Life for wide-area Anomaly Detection

Qualitative Results - Spatial Layer

Zoom out: Overall view

Page 45: Learning target Pattern-of-Life for wide-area Anomaly Detection

Quantitative Results - Spatial Layer

Quantitative result of spatial anomalies detected by the Spatial layer

Page 46: Learning target Pattern-of-Life for wide-area Anomaly Detection

Qualitative Results - Temporal Layer

[Figure: KDE and CAD panels. A 24-hour clock face and time-of-day plots (x-axis 0 to 24) for Track 758, Cluster 1839 [-1] and Track 715, Cluster 2139 [-1]; legend: KDE Circular, Anomalies, Observations]

Page 48: Learning target Pattern-of-Life for wide-area Anomaly Detection

Quantitative Results - Temporal Layer

KDE CAD

Quantitative result of spatial anomalies detected by the Temporal layer

Page 49: Learning target Pattern-of-Life for wide-area Anomaly Detection

Datasets

Geolife Dataset

Microsoft Research

Area: 75 km²

Period: 71 days

Dates: Feb 9 - Apr 27, 2009

Page 50: Learning target Pattern-of-Life for wide-area Anomaly Detection

Qualitative Results - Spatial Layer

GeoLife: Each color represents one cluster

Page 51: Learning target Pattern-of-Life for wide-area Anomaly Detection

Qualitative Results - Temporal Layer

[Figure: KDE and CAD panels for Track 405, Cluster 239 [−1]. KDE panel: density over time of day (x-axis 0 to 24); legend: KDE Circular, Anomalies, Observations. CAD panels: NCM (y-axis Alpha) and CAD Probability (y-axis p), both plotted against Temporal Resolution]

The CAD method creates narrower normality regions (∼30 min) than KDE

Page 52: Learning target Pattern-of-Life for wide-area Anomaly Detection

Section 4

Conclusions and Future Work

Page 53: Learning target Pattern-of-Life for wide-area Anomaly Detection

Conclusions

1 A new hierarchical model incorporating time-dependence was proposed

2 Two methods for modelling temporal information were implemented and compared

3 A CAD NCM using circular distance for time was proposed

4 The KDE method showed an over-smoothing effect due to the bandwidth selection method

5 Spatial and spatio-temporal anomalies were quantitatively and qualitatively assessed on 2 datasets

Page 54: Learning target Pattern-of-Life for wide-area Anomaly Detection

Future Work

1 Proper way to forget!

2 Test other CAD NCMs: Entropy-based / Local Outlier Factor (LOF)

3 Efficient on-line Kernel Density Estimation

4 Anomaly prediction using Long Short-Term Memory Networks

Page 55: Learning target Pattern-of-Life for wide-area Anomaly Detection

Thank you. Any questions?

Page 56: Learning target Pattern-of-Life for wide-area Anomaly Detection

Section 5

Bibliography

Page 57: Learning target Pattern-of-Life for wide-area Anomaly Detection

Bibliography

[1] Varun Chandola, Arindam Banerjee, and Vipin Kumar. Anomaly detection. ACM Computing Surveys, 41(3):1–58, 2009.

[2] Douglas M. Hawkins. Identification of Outliers, volume 11. Springer, 1980.

[3] Rikard Laxhammar and Goran Falkman. Online learning and sequential anomaly detection in trajectories. IEEE Transactions on Pattern Analysis and Machine Intelligence, 36(6):1158–1173, 2014.

[4] C. Piciarelli and G. L. Foresti. On-line trajectory clustering for anomalous events detection. Pattern Recognition Letters, 27:1835–1842, 2006.

Page 58: Learning target Pattern-of-Life for wide-area Anomaly Detection

Spatial Model Learning

[Figure: two flow charts. Left chart nodes: Matching, Match Found?, Create Cluster, Update Cluster, Exiting Cluster?, Near End?, Split (with yes/no branches). Right chart nodes: Concatenate Roots, Prune Dead Clusters, Merge Clusters]

Model Learning: The left image shows the cluster building process (modified from [4]), executed every time a new trajectory point is observed. The right image shows the maintenance process, executed in batch. Developed in the VisionLab.

Distance Function

$$d(\vec{z}_i, C) = \min_j \left( \frac{\mathrm{dist}(\vec{z}_i, c_j)}{\sqrt{\sigma_j^2}} \right) \quad \forall j \in \{1, \ldots, M\} \qquad (2)$$
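
A small sketch of Eq. (2), assuming `dist` is the Euclidean distance in a planar coordinate system and that each cluster element $c_j$ carries its own variance. Both are assumptions made for illustration: the slides do not state the metric, and for raw GPS coordinates a geodesic distance may be more appropriate. The function name is hypothetical.

```python
import numpy as np


def point_to_cluster_distance(z, centres, variances):
    """Smallest variance-normalised distance from point z to any cluster element.

    z         : point coordinates, shape (2,)
    centres   : cluster element positions c_j, shape (M, 2)
    variances : per-element variances sigma_j^2, shape (M,)
    """
    z = np.asarray(z, dtype=float)
    centres = np.asarray(centres, dtype=float)
    sigma = np.sqrt(np.asarray(variances, dtype=float))
    dist = np.linalg.norm(centres - z, axis=1)  # Euclidean dist(z, c_j) for every j
    return float(np.min(dist / sigma))


# Toy cluster with two elements: 5/5 = 1.0 and 10/2 = 5.0, so the minimum is 1.0.
d = point_to_cluster_distance([0.0, 0.0], [[3.0, 4.0], [10.0, 0.0]], [25.0, 4.0])
```

Dividing by the standard deviation lets a wide cluster element accept points that a narrow one would reject at the same raw distance.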

Page 59: Learning target Pattern-of-Life for wide-area Anomaly Detection

Temporal Layer - Kernel Density Estimator

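KDE Definition

$$\hat{f}(x; h) = \frac{1}{n} \sum_{i=1}^{n} K_h(x - x_i) \qquad (3)$$

KDE using von Mises Kernel

$$\hat{f}(\theta; \nu) = \frac{1}{n (2\pi) I_0(\nu)} \sum_{i=1}^{n} e^{\nu \cos(\theta - \theta_i)} \qquad (4)$$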

[Figure: two 24-hour clock-face plots, hours 0/24 to 23]