anomaly detection in deep learning

Anomaly Detection in Deep Learning Adam Gibson Skymind - Reactive Meetup 2016 @ Google Tokyo

Upload: adam-gibson

Post on 11-Jan-2017

TRANSCRIPT

Page 1: Anomaly detection in deep learning

Anomaly Detection in Deep Learning

Adam Gibson Skymind - Reactive Meetup 2016 @ Google Tokyo

Page 2: Anomaly detection in deep learning

What’s an “Anomaly”?

● Abnormal patterns in data

● Fraud detection: “bad” credit card transactions

● Also fraud detection: detecting fake locations with call detail records

● Network intrusion: abnormal activity in a network

● Broken computers in a data center

Page 3: Anomaly detection in deep learning

Brief Case Studies (e.g. why am I up here?)

● Telco: http://blogs.wsj.com/cio/2016/03/14/orange-tests-deep-learning-software-to-identify-fraud/

● Network Infrastructure: https://insights.ubuntu.com/2016/04/25/making-deep-learning-accessible-on-openstack/

Page 4: Anomaly detection in deep learning

Network Infra: save time and money by automatically migrating workloads before they break

Page 5: Anomaly detection in deep learning

Why Deep Learning?

● Learns well from lots of data

● Learns its own feature representation: robust to noise and able to learn cross-domain patterns

● Already applied in ads: Google itself invests heavily in this same kind of pattern recognition (targeting/relevance)

Page 6: Anomaly detection in deep learning

Techniques

● Unsupervised: use autoencoder reconstruction error with moving averages, and use dropout with a set time window

● Supervised: RNNs learn from a set of yes/no labels in a time series. They can learn from a series of time steps and predict when an anomaly is about to occur.

● Use streaming/minibatches (all neural nets can learn like this)
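
The unsupervised bullet above can be sketched as follows. This is a minimal illustration, not the speaker's implementation: it assumes the per-record reconstruction errors have already been produced by some trained autoencoder, and the function name `flag_anomalies`, the window size, and the `k` multiplier are all hypothetical choices. A record is flagged when its error exceeds the moving average of the previous window by more than `k` moving standard deviations.

```python
from collections import deque
from statistics import mean, pstdev


def flag_anomalies(errors, window=5, k=3.0):
    """Return indices whose error exceeds moving mean + k * moving std.

    `errors` is assumed to be a sequence of reconstruction errors from
    an already-trained autoencoder (not shown here).
    """
    history = deque(maxlen=window)  # most recent non-anomalous errors
    flagged = []
    for i, err in enumerate(errors):
        if len(history) == window:
            mu = mean(history)
            sigma = max(pstdev(history), 1e-8)  # avoid a zero-width band
            if err > mu + k * sigma:
                flagged.append(i)
                continue  # keep the anomaly out of the baseline
        history.append(err)
    return flagged


# Hypothetical error series with one obvious spike at index 6.
errors = [0.10, 0.12, 0.11, 0.09, 0.10, 0.11, 0.95, 0.10, 0.11]
print(flag_anomalies(errors))
```

Skipping flagged points when updating the window is one possible design choice; it keeps a burst of anomalies from raising the baseline and hiding later ones.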

Page 7: Anomaly detection in deep learning

Some definitions

● Reconstruction Error: autoencoders can learn from unsupervised pretraining how to reconstruct data. Minimize KL divergence (the difference between two probability distributions).

● RNN/Time Series: see http://deeplearning4j.org/usingrnns
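
To make the two quantities in the first definition concrete, here is a small sketch in plain Python. It assumes discrete probability distributions given as lists and uses mean squared error as one common choice of reconstruction error; the function names and the `eps` smoothing constant are illustrative, and the reconstruction `x_hat` is assumed to come from an autoencoder that is not shown.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """KL(P || Q) = sum_x P(x) * log(P(x) / Q(x)) for discrete P, Q."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0  # terms with P(x) = 0 contribute nothing
    )


def reconstruction_error(x, x_hat):
    """Mean squared error between an input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


p = [0.5, 0.5]
q = [0.9, 0.1]
print(kl_divergence(p, p))   # identical distributions give ~0
print(kl_divergence(p, q))   # grows as Q drifts away from P
print(reconstruction_error([1.0, 2.0], [1.0, 4.0]))
```

Note that KL divergence is not symmetric: `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ.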

Page 8: Anomaly detection in deep learning

Production

● Kafka/Spark Streaming/Flink/Apex

● The neural net works as a consumer of streaming updates

● Data? Mostly log ingestion; could be video
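
The consumer pattern above might look roughly like this. The sketch deliberately avoids any real Kafka, Spark, Flink, or Apex API: the stream is a plain Python iterable, and `score_fn` is a hypothetical stand-in for a trained network that returns an anomaly score per record. Only the minibatching and consume loop are shown.

```python
from itertools import islice


def minibatches(stream, batch_size):
    """Group an arbitrary (possibly unbounded) iterable into lists."""
    it = iter(stream)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def consume(stream, score_fn, batch_size=4, threshold=0.5):
    """Score each minibatch and return stream offsets above threshold.

    `score_fn` is an assumed placeholder for the model's scoring call.
    """
    alerts = []
    offset = 0
    for batch in minibatches(stream, batch_size):
        for j, record in enumerate(batch):
            if score_fn(record) > threshold:
                alerts.append(offset + j)
        offset += len(batch)
    return alerts


# Hypothetical stream where each record is already a score in [0, 1].
stream = [0.1, 0.2, 0.1, 0.9, 0.2, 0.1, 0.8]
print(consume(stream, score_fn=lambda r: r))
```

In a real deployment the iterable would be replaced by the streaming system's own consumer, and the same loop could also feed each minibatch to the network for incremental training.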

Page 9: Anomaly detection in deep learning

Questions?

Email: [email protected]

Twitter: agibsonccc

Github: agibsonccc

Page 10: Anomaly detection in deep learning

Upcoming talks

Hadoop Summit: San Jose http://hadoopsummit.org/san-jose/ourspeakers/