smartdata webinar: applying neocortical research to streaming analytics
TRANSCRIPT
APPLYING NEOCORTICAL ALGORITHMS TO STREAMING ANALYTICS
SmartData Webinar September 10, 2015
Subutai Ahmad [email protected]
Revenue Forecasting Customer Story
10 pm: Team of 10 analysts
5 am: “Dear CEO, today’s revenue forecast is $63.4M.”
Objectives for next generation:
- Generate predictions every 15 minutes
- Track all product categories and geographies (hundreds of thousands)
- React rapidly to changes
Problems:
- Cumbersome data infrastructure
- Algorithm approach completely unclear
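The 15-minute objective above implies rolling a stream of revenue events up into fixed windows, one series per category and geography. A minimal sketch of that step, assuming a hypothetical event layout of (timestamp, category, geography, amount); none of these names come from the talk:

```python
from collections import defaultdict
from datetime import datetime

WINDOW_MINUTES = 15  # forecast cadence named in the objectives


def window_start(ts: datetime) -> datetime:
    """Floor a timestamp to the start of its 15-minute window."""
    minute = (ts.minute // WINDOW_MINUTES) * WINDOW_MINUTES
    return ts.replace(minute=minute, second=0, microsecond=0)


def aggregate(events):
    """Sum revenue per (window, category, geography) key.

    `events` is an iterable of (timestamp, category, geography, amount)
    tuples; this layout is an assumption made for illustration.
    """
    totals = defaultdict(float)
    for ts, category, geography, amount in events:
        totals[(window_start(ts), category, geography)] += amount
    return dict(totals)


events = [
    (datetime(2015, 9, 10, 22, 3), "books", "US", 120.0),
    (datetime(2015, 9, 10, 22, 14), "books", "US", 80.0),
    (datetime(2015, 9, 10, 22, 16), "books", "US", 50.0),
    (datetime(2015, 9, 10, 22, 5), "toys", "DE", 30.0),
]
totals = aggregate(events)
```

Each key in `totals` is one point in one metric stream, so hundreds of thousands of category and geography pairs become hundreds of thousands of independent streams.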
“How Machine Learning Is Done”
Data Prep
Craft Input Features
Training Methodology
Choose Algorithm
Test & Validate
Deploy
The Future of Data Analytics
Streaming data
- Automated model creation
- Continuous learning
- Temporal inference
Predictions, anomalies, actions
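The contrast with the batch workflow can be sketched as a loop in which every record is first predicted and then learned from, with no separate training phase. The learner below is an exponentially weighted running mean, used only as a simple stand-in; it is not HTM, and the class name, `alpha`, and `threshold` are hypothetical:

```python
class OnlineModel:
    """Exponentially weighted running mean: a stand-in learner, not HTM."""

    def __init__(self, alpha=0.5):
        self.alpha = alpha  # learning rate (hypothetical value)
        self.mean = None    # current estimate of "normal"

    def predict(self):
        return self.mean

    def learn(self, value):
        if self.mean is None:
            self.mean = float(value)
        else:
            self.mean += self.alpha * (value - self.mean)


def run_stream(values, model, threshold):
    """For each record: predict, compare, then learn from that same record."""
    flagged = []
    for step, value in enumerate(values):
        prediction = model.predict()
        if prediction is not None and abs(value - prediction) > threshold:
            flagged.append(step)
        model.learn(value)  # learning never stops
    return flagged


stream = [10, 10, 10, 50, 50, 50, 50, 50]
flagged = run_stream(stream, OnlineModel(alpha=0.5), threshold=15)
print(flagged)  # [3, 4]
```

The jump to 50 is flagged at steps 3 and 4, after which the model has adapted and treats 50 as the new normal.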
Solution:
Streaming data infrastructure
New algorithm approach
Numenta History
2005 – 2009
§ Hierarchical Temporal Memory theory
§ First generation algorithms
§ Vision Toolkit
2009 – 2014
§ 2nd generation HTM algorithms
§ Sequence & continuous learning
§ Streaming data applications
§ HTM open source project
§ Grok 1.0 for anomaly detection
2014 – Today
§ Streaming applications
§ Grok for Stocks
§ Research on 3rd generation algorithms
§ Sensorimotor
§ Feedback
Other timeline markers: 2002, 2004, 2005
Properties Of The Neocortex
1) Hierarchy of nearly identical regions - common algorithm
2) Sparse Distributed Representations - common data structure
3) Regions are mostly sequence memory - inference - motor
4) Every region is continually learning - fully automated
Diagram labels: retina, cochlea, somatic, data stream (inputs); motor control (output)
“Hierarchical Temporal Memory” (HTM)
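A Sparse Distributed Representation (SDR) can be pictured as a large bit vector with only a small fraction of bits active, where similarity is the number of shared active bits. The sketch below illustrates that idea only; the sizes (2048 bits, 40 active) and helper names are assumptions, not values from the talk:

```python
import random

N_BITS = 2048   # total bits (illustrative size)
N_ACTIVE = 40   # roughly 2% of bits active


def random_sdr(rng):
    """Return an SDR as a frozenset of active bit indices."""
    return frozenset(rng.sample(range(N_BITS), N_ACTIVE))


def overlap(a, b):
    """Similarity between two SDRs: the count of shared active bits."""
    return len(a & b)


def add_noise(sdr, n_flip, rng):
    """Move n_flip active bits to positions that were previously inactive."""
    kept = rng.sample(sorted(sdr), len(sdr) - n_flip)
    unused = [i for i in range(N_BITS) if i not in sdr]
    return frozenset(kept) | frozenset(rng.sample(unused, n_flip))


rng = random.Random(42)
a = random_sdr(rng)
b = random_sdr(rng)
noisy_a = add_noise(a, 10, rng)
```

Here `overlap(a, noisy_a)` is 30 even though a quarter of the bits moved, while two unrelated random SDRs such as `a` and `b` share very few bits, which is why a simple overlap threshold can still match a noisy pattern.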
HTM Learning Algorithm
Models a small slice of cortex
1) High capacity memory-based system
2) Models complex high-order temporal sequences
3) Makes predictions and detects anomalies
4) Continuously learning
5) No sensitive parameters
6) Runs in real time on a laptop
Basic building block of neocortex and Machine Intelligence
Whitepaper and full source code available: github.com/numenta
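"High-order" in item 2 means that a prediction depends on more than the most recent input. A toy illustration with two sequences that share a middle, A-B-C-D and X-B-C-Y: a first-order model cannot tell what follows C, while a model that keeps more context can. The n-gram table below is a hypothetical stand-in used only to show the distinction; it is not the HTM algorithm:

```python
from collections import defaultdict


class ContextPredictor:
    """Predict the next symbol from the last `order` symbols (n-gram table)."""

    def __init__(self, order):
        self.order = order
        self.table = defaultdict(set)

    def learn(self, sequence):
        for i in range(self.order, len(sequence)):
            context = tuple(sequence[i - self.order:i])
            self.table[context].add(sequence[i])

    def predict(self, recent):
        context = tuple(recent[-self.order:])
        return set(self.table.get(context, set()))


first_order = ContextPredictor(order=1)
high_order = ContextPredictor(order=3)
for model in (first_order, high_order):
    model.learn("ABCD")
    model.learn("XBCY")

ambiguous = first_order.predict("ABC")   # only sees "C"
resolved = high_order.predict("ABC")     # sees "A", "B", "C"
```

With one symbol of context the prediction after C is both D and Y; with three symbols it is D after A-B-C and Y after X-B-C.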
HTM Engine For Streaming Analytics
Metric(s) -> Encoder -> SDR -> HTM -> Prediction -> Point anomaly -> Time average -> Historical comparison -> Anomaly score
(the same pipeline repeats for each metric, through Metric N)
System Anomaly Scores & Predictions
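Three stages of the pipeline above can be sketched in a few lines: encoding a metric value as an SDR, scoring a point anomaly as the fraction of active bits that were not predicted, and smoothing scores with a time average. This is a simplified sketch under stated assumptions: the function names, bit counts, and window size are illustrative, and the "predicted" SDR is supplied by hand where the real pipeline would take it from the HTM:

```python
from collections import deque


def encode_scalar(value, min_val, max_val, n_bits=100, width=11):
    """Encode a number as a contiguous run of `width` active bits.

    Nearby values share bits, so overlap reflects numeric similarity.
    """
    value = min(max(value, min_val), max_val)  # clip to the encoder range
    span = n_bits - width
    start = int(round((value - min_val) / (max_val - min_val) * span))
    return frozenset(range(start, start + width))


def raw_anomaly_score(predicted, active):
    """Fraction of active bits that were not predicted (0 expected, 1 novel)."""
    if not active:
        return 0.0
    return 1.0 - len(predicted & active) / len(active)


class MovingAverage:
    """Time average over the most recent `window` scores."""

    def __init__(self, window):
        self.values = deque(maxlen=window)

    def update(self, x):
        self.values.append(x)
        return sum(self.values) / len(self.values)


predicted = encode_scalar(0, 0, 100)  # stand-in for an HTM prediction
smoother = MovingAverage(window=2)
expected_score = raw_anomaly_score(predicted, encode_scalar(0, 0, 100))
novel_score = raw_anomaly_score(predicted, encode_scalar(100, 0, 100))
averaged = [smoother.update(expected_score), smoother.update(novel_score)]
```

A value that matches the prediction scores 0.0, a value at the other end of the range scores 1.0, and the time average damps a single spike to 0.5.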
Applications Of The HTM Engine
- GROK Server anomalies
- Rogue human behavior
- Geospatial tracking
- Stock & market anomalies
- Social media anomalies (Twitter)
Grok: Anomaly Detection For Amazon Web Services
§ Unique value of HTM algorithms
§ Automated model creation: configure hundreds of models in minutes
§ Continuously learning: automatically adapts to changes
§ Detects sophisticated temporal anomalies
Chart labels: continuous learning, unpredictable data, temporal anomalies
HTM for Stocks: Detecting Unusual Market Behavior
Companies sorted by unusual behavior
Stock price, stock volume, Twitter chatter
Tweets reveal cause
Anomaly Detection in Geospatial Tracking Data
GPS + Velocity -> Encoder -> SDRs -> HTM -> Prediction, Anomaly Detection, Classification
Trick: convert GPS coordinates into an SDR
After the input is encoded as an SDR, the learning algorithm is agnostic to the type of data
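One way to picture the trick is to divide the map into grid cells and give each cell a stable bit, so that a position activates the bits of its surrounding cells and nearby positions share most of them. The sketch below is a toy version under stated assumptions: the grid size, radius, bit count, and function names are hypothetical, and it ignores the velocity input shown in the diagram:

```python
import hashlib


def cell_bit(cell, n_bits):
    """Map a grid cell (ix, iy) to a stable bit index via hashing."""
    digest = hashlib.sha256(f"{cell[0]},{cell[1]}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % n_bits


def encode_position(lat, lon, cell_deg=0.001, radius=2, n_bits=1024):
    """Encode a coordinate as the set of bits for its nearby grid cells."""
    ix = int(round(lon / cell_deg))
    iy = int(round(lat / cell_deg))
    cells = [(ix + dx, iy + dy)
             for dx in range(-radius, radius + 1)
             for dy in range(-radius, radius + 1)]
    return frozenset(cell_bit(c, n_bits) for c in cells)


here = encode_position(37.775, -122.419)
one_cell_east = encode_position(37.775, -122.418)
far_away = encode_position(37.900, -122.000)

near_overlap = len(here & one_cell_east)
far_overlap = len(here & far_away)
```

Positions one cell apart share 20 of their 25 grid cells and therefore most of their bits, while distant positions share bits only by hash collision, so the same overlap-based learning used for other inputs applies unchanged.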
Learning Normal Behavior
Geospatial Anomalies
- Deviation in path
- Change in direction
- Multiple paths are OK
- Unusual change in speed
These HTM Applications Use Exact Same Code Base
- HTM learning algorithms
- Identical learning parameters
- Wide applicability across sensor types
Applications shown: GROK Server anomalies, rogue human behavior, geospatial tracking, stock & market anomalies, social media anomalies (Twitter)
Benchmarking Streaming Anomaly Detection
Traditional benchmarks don’t apply:
– Don’t incorporate time, e.g. don’t favor early detection over later detection
– Usually batch format
– Very few benchmarks with real world data
Numenta Anomaly Benchmark (NAB)
– Scoring methodology favors early detection
– Incorporates continuous learning (learning a new normal baseline)
– Labeled real world data streams
– Different “application profiles”
HTM tested against 3 algorithms
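A scoring rule that favors early detection can be sketched as a weight that decreases with the position of a detection inside a labeled anomaly window. The sigmoid shape and the constant 5.0 below are assumptions chosen for illustration, and the function names are hypothetical; the NAB repository documents the scoring that the benchmark actually uses:

```python
import math


def detection_weight(relative_position):
    """Weight for a detection relative to an anomaly window.

    -1.0 is the start of the window, 0.0 is its end, and positive values
    fall after the window. Earlier detections get weights closer to +1;
    detections after the window move toward -1.
    """
    return 2.0 / (1.0 + math.exp(5.0 * relative_position)) - 1.0


def score_first_detection(window_start, window_end, detection_times):
    """Score only the earliest detection inside the window, else 0.0.

    Assumes window_end > window_start. A missed window simply scores 0.0
    here; a full benchmark would also penalize misses and false positives.
    """
    inside = [t for t in detection_times if window_start <= t <= window_end]
    if not inside:
        return 0.0
    length = window_end - window_start
    relative = (min(inside) - window_end) / length  # in [-1, 0]
    return detection_weight(relative)


early = score_first_detection(100, 200, [110, 190])
late = score_first_detection(100, 200, [190])
missed = score_first_detection(100, 200, [250])
```

A detector that fires at time 110 scores higher than one that waits until 190, even though both detect the same anomaly.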
Chart annotations: HTM detects anomaly earlier; other algorithms
Numenta Community & Partnerships
- NuPIC
- Open source community at numenta.org
- > 3,000 GitHub followers, > 160 contributors
- Cortical.io - Natural Language Processing
- IBM - Core HTM research
- Novel hardware architectures for HTMs
- Avik Partners - HTM Grok anomaly detection and analytics for IT
- grokstream.com
Future of Data is Streaming Data
- High velocity sensory streams with rapidly changing statistics
- Massive number of models
- Problem: existing batch algorithms cannot scale
Cortical Algorithms Show The Way
- Proof that systems can:
  - Automatically create models
  - Continuously learn
  - Model sophisticated temporal streams
- HTM learning algorithms implement cortical principles
- Can demonstrate working applications today
Thank you!
Contact info: [email protected]