"building anomaly detection for large scale analytics", yonatan ben shimon, anodot...

Building Anomaly Detection For Large Scale Analytics, Yonatan Ben-Simhon, Anodot Architect, 16th May, 2016

Upload: dataconomy-media

Post on 16-Apr-2017


Category:

Data & Analytics



TRANSCRIPT

Page 1: "Building Anomaly Detection For Large Scale Analytics", Yonatan Ben Shimon, Anodot Architect


Building Anomaly Detection For Large Scale Analytics

Yonatan Ben-Simhon, Anodot Architect

16th May, 2016


Outline

What is anomaly detection?

Design principles for Anomaly Detection

Anomaly detection? Why do I need it?

Anomaly Detection Methods

The Anodot System


What is Anomaly Detection?


Find the Anomaly


Anomaly Detection in Time Series Signals

An unexpected change in the temporal pattern of one or more time series signals.


Why Anomaly Detection?


Detecting the Unknowns Saves Time + Money

• Industrial IoT: proactive maintenance; detecting issues in factories/machines

• Web Services: detecting business incidents + unknown business opportunities

• Machine Learning: closing the “Machine Learning” loop; tracking and detecting “unknowns” not modeled during training

• Security: detection of unknown breach/attack patterns


Detecting Business Incidents: Metric Driven Detection

• Business: e.g., purchases per product, conversions per campaign…

• Business Generation (leads, visitors, usage, engagements): per geo, user segment, page, browser, device…

• App (performance, errors, usability): per class, method, feature…

• Infra utilization/state (middleware, network, system): per host, database, switch…


Detecting Business Incidents: Metric Driven Detection

• Drop in # of visitors

• Decrease in ad conversion on Android

• Price glitch – increase in purchases / decrease in revenue


Manual Detection of Business Incidents

• Dashboards

• Setting alerts with thresholds


Anomaly Detection: Design Principles


Anomaly Detection: Design Considerations

• Timeliness: real time vs. batch detection

• Scale: 100’s vs. millions of metrics

• Rate of change: adaptive vs. offline learning

• Conciseness: univariate vs. multivariate methods


Timeliness: Real time vs. Batch Detection

Real time detection

• Online learning – cannot iterate over the data

• More prone to False Positives

• Scales more easily

Batch detection

• Batch learning – can iterate over the data

• Easier to reduce False Positives

• Harder to scale
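To make the "cannot iterate over the data" constraint concrete, here is a minimal sketch of online learning of a mean and variance. It uses Welford's single-pass algorithm, which is an illustrative choice and not something the slides name; the `RunningStats` class and the sample values are hypothetical. Each sample is seen once and then discarded, yet the result matches the batch computation that has the whole dataset available.

```python
import statistics


class RunningStats:
    """Welford's online algorithm: mean and variance in a single pass.

    Hypothetical helper for illustration; not taken from the slides.
    """

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        # Consume one sample; nothing about earlier samples is stored
        # except the three running values above.
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def variance(self):
        # Sample variance (n - 1 denominator); 0.0 until two samples arrive.
        return self._m2 / (self.n - 1) if self.n > 1 else 0.0


samples = [10.0, 12.0, 11.0, 13.0, 9.0, 11.5]  # made-up values

# Online: one pass, constant memory per metric.
online = RunningStats()
for x in samples:
    online.update(x)

# Batch: needs the full list in memory and may iterate over it again.
batch_mean = statistics.mean(samples)
batch_var = statistics.variance(samples)

print(online.mean, batch_mean)
print(online.variance, batch_var)
```

The constant per-metric state is what lets the online variant scale to many metrics, at the cost of never being able to revisit old samples to correct an early bad estimate.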


Rate of change

Fast changes

• Most common case

• Requires adaptive algorithms

Slow changes

• “Closed” systems – e.g., airplanes, large machinery

• Learn once and apply the model for a long time


Conciseness of Anomalies

Univariate Anomaly Detection

• Learn normal model for each metric

• Anomaly detection at the metric level

• Easier to scale

• Easier to model many types of behaviors

• Causes anomaly storms

Multivariate Anomaly Detection

• Learn single model for all metrics

• Anomaly detection of complete incident

• Hard to scale

• Hard to interpret the anomaly

• Often requires metric behaviour to be homogeneous

Hybrid approach

• Learn normal model for each metric

• Combine anomalies to single incidents if metrics are related

• Scalable

• Can combine multiple types of metric behaviours
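The "combine anomalies to single incidents" step of the hybrid approach could be sketched as follows. This is only an illustration under stated assumptions: the slides do not say how relatedness is determined, so the `related` set, the metric names, and the rule "related metrics whose anomalies overlap in time belong together" are all hypothetical. Grouping is done with a small union-find so that chains of related metrics end up in one incident.

```python
from collections import defaultdict


def group_into_incidents(anomalies, related):
    """Merge per-metric anomalies into incidents.

    anomalies: list of (metric, start, end) tuples from univariate detectors.
    related:   set of frozenset({metric_a, metric_b}) pairs (assumed given).
    Two anomalies are linked when their metrics are related and their
    time intervals overlap; linked anomalies form one incident.
    """
    parent = list(range(len(anomalies)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (m1, s1, e1) in enumerate(anomalies):
        for j in range(i + 1, len(anomalies)):
            m2, s2, e2 = anomalies[j]
            overlap = s1 <= e2 and s2 <= e1
            if overlap and frozenset((m1, m2)) in related:
                parent[find(i)] = find(j)

    incidents = defaultdict(list)
    for i, anomaly in enumerate(anomalies):
        incidents[find(i)].append(anomaly)
    return list(incidents.values())


# Hypothetical metric names and timestamps, for illustration only.
anomalies = [
    ("checkout.errors", 100, 130),
    ("checkout.latency", 105, 140),
    ("ads.clicks", 110, 120),
    ("checkout.revenue", 125, 150),
]
related = {
    frozenset(("checkout.errors", "checkout.latency")),
    frozenset(("checkout.latency", "checkout.revenue")),
}

incidents = group_into_incidents(anomalies, related)
for incident in incidents:
    print(incident)
```

Here the three checkout anomalies collapse into one incident while the unrelated `ads.clicks` anomaly stays on its own, which is how the hybrid approach avoids the anomaly storms of the purely univariate one.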


Anomaly Detection Methods


Unsupervised Anomaly Detection

General scheme

Step 1: Model the normal behavior of the metric(s) using a statistical model.

Step 2: Devise a statistical test to determine if samples are explained by the model.

Step 3: Apply the test to each sample. Flag a sample as an anomaly if it does not pass the test.


Very Simple Model

[Figure: Normal distribution centered at μ; 68% of samples fall within ±1σ, 95.4% within ±2σ, and 99.7% within ±3σ]

Assume normal behavior follows the Normal distribution

Estimate the average and standard deviation over all samples

Test: any sample x with |x − average| > 3 × standard deviation is abnormal
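The three-sigma rule above can be sketched in a few lines of Python. The function name `three_sigma_anomalies`, the multiplier parameter `k`, and the sample data are illustrative assumptions, not part of the talk.

```python
import statistics


def three_sigma_anomalies(samples, k=3.0):
    """Return indices of samples more than k standard deviations from the mean.

    The mean and (population) standard deviation are estimated over all
    samples, as on the slide, so this is a batch test.
    """
    mean = statistics.mean(samples)
    std = statistics.pstdev(samples)
    return [i for i, x in enumerate(samples) if abs(x - mean) > k * std]


# Made-up data: 20 values near 10, followed by one obvious spike.
data = [10.0, 10.2, 9.8, 10.1, 9.9] * 4 + [25.0]

flagged = three_sigma_anomalies(data)
print(flagged)
```

Note that the spike itself is included when estimating the mean and standard deviation, which inflates both. With only a few samples a single large outlier can hide itself this way, one of the reasons this model is called "very simple".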


A single model does not fit them all!

[Figure: example signal types]

• Smooth (stationary)

• Irregular sampling

• Multi Modal

• Sparse

• Discrete

• “Step”


Example Online Models/Algorithms

• Simple Moving Average

• Double/Triple exponential (Holt-Winters)

• Kalman Filters + ARIMA and variations

• Single exponential forgetting
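As an illustration of the simplest of these online models, the sketch below applies single exponential forgetting to both the mean and the variance of a metric and tests each new sample against the model before learning from it. The class name, the forgetting factor `alpha`, the threshold `k`, and the `warmup` length are all hypothetical choices made for this example; the talk does not specify them.

```python
import math


class ExpForgettingDetector:
    """Online detector using exponentially weighted mean and variance.

    Hypothetical sketch: alpha controls how quickly old samples are
    forgotten, k is the threshold in standard deviations, and no sample
    is flagged until `warmup` samples have been seen.
    """

    def __init__(self, alpha=0.1, k=3.0, warmup=10):
        self.alpha = alpha
        self.k = k
        self.warmup = warmup
        self.n = 0
        self.mean = 0.0
        self.var = 0.0

    def update(self, x):
        """Test x against the current model, then fold x into the model."""
        self.n += 1
        if self.n == 1:
            self.mean = x  # first sample initialises the model
            return False
        deviation = x - self.mean
        is_anomaly = (
            self.n > self.warmup
            and abs(deviation) > self.k * math.sqrt(self.var)
        )
        # Incremental exponentially weighted mean and variance update.
        self.mean += self.alpha * deviation
        self.var = (1 - self.alpha) * (self.var + self.alpha * deviation ** 2)
        return is_anomaly


# Made-up stream: 30 values oscillating around 10, then a spike.
stream = [10.0, 10.5, 9.5, 10.2, 9.8] * 6 + [30.0]

detector = ExpForgettingDetector()
flags = [i for i, x in enumerate(stream) if detector.update(x)]
print(flags)
```

Unlike the batch three-sigma rule, the model here is updated one sample at a time and adapts as the metric drifts. This sketch still learns from anomalous samples, so a large spike temporarily widens the variance; a production system would likely down-weight flagged samples.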


Example: The importance of modeling seasonality

Single seasonal pattern


Example Methods to detect seasonality

Finding maximums in the auto-correlation of the signal

• Computationally expensive

• More robust to gaps

Finding maximum(s) in the Fourier transform of the signal

• Challenging to detect low frequency seasons

• Challenging to discover multiple seasons

• Sensitive to missing data

Exhaustive search based on a cost function

• Computationally expensive

• Robust to gaps

• Challenging to discover multiple seasons
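The auto-correlation method can be sketched as follows: compute the auto-correlation for every candidate lag, keep the lags that are local maxima, and report the strongest one as the season length. The function names and the synthetic 24-sample sine wave are assumptions made for this example, and the sketch assumes evenly sampled data with no gaps.

```python
import math


def autocorrelation(signal, lag):
    """Auto-correlation of the signal with itself shifted by `lag` samples."""
    n = len(signal)
    mean = sum(signal) / n
    var = sum((x - mean) ** 2 for x in signal)
    cov = sum(
        (signal[i] - mean) * (signal[i + lag] - mean) for i in range(n - lag)
    )
    return cov / var


def detect_season(signal, max_lag=None):
    """Return the lag of the strongest local maximum in the auto-correlation.

    Returns None when no local maximum exists (no apparent seasonality).
    """
    max_lag = max_lag or len(signal) // 2
    acf = [autocorrelation(signal, lag) for lag in range(max_lag + 2)]
    peaks = [
        lag
        for lag in range(1, max_lag + 1)
        if acf[lag] > acf[lag - 1] and acf[lag] >= acf[lag + 1]
    ]
    return max(peaks, key=lambda lag: acf[lag]) if peaks else None


# Synthetic signal: ten repetitions of a 24-sample cycle (e.g. hourly data
# with a daily pattern).
signal = [math.sin(2 * math.pi * t / 24) for t in range(24 * 10)]

period = detect_season(signal)
print(period)
```

Every lag requires a full pass over the signal, so the cost grows with the signal length times the number of candidate lags, which is the "computationally expensive" drawback listed above.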


Large scale anomaly detection – the Anodot system


Automatic Anomaly Detection in five Steps: The Anodot Way

Step 1: Metrics Collection – Universal, scale to millions

Step 2: Normal behavior learning

Step 3: Abnormal behavior learning

Step 4: Behavioral Topology Learning

Step 5: Feedback Based Learning


Webinar


Thank you