![Page 1: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/1.jpg)
Analytics on Time Series Data
Jure LeskovecStanford UniversityChan Zuckerberg BiohubPinterest
![Page 2: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/2.jpg)
2Jure Leskovec
![Page 3: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/3.jpg)
Sensors are Everywhere
§ Sequences of time stamped observations
Jure Leskovec, Stanford 3
Sensors are everywhere
I In many applications, we generate large sequences of timestampedobservations
– “Sensors” have a broad definition
Introduction 2
![Page 4: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/4.jpg)
Sensor Data: Time Series§ Sensors generate lots of time-series
data
Jure Leskovec, Stanford 4
Network inference from time series data
I Convert a sequence of timestamped sensor observations into atime-varying network
Introduction 5
![Page 5: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/5.jpg)
Sensors à Time Series
§ But such time series data are:§ High-dimensional§ Unlabeled§ High-velocity§ Dynamic§ Heterogeneous
Jure Leskovec (@jure), Stanford University 5
Network inference from time series data
I Convert a sequence of timestamped sensor observations into atime-varying network
Introduction 5
![Page 6: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/6.jpg)
Sensors: Much data, no insights§ It is hard to obtain insights:
§ Sensor readings are interdependent and correlated
§ As the environment changes dependencies might change
§ Sensors might fail§ Readings might be asynchronous
Jure Leskovec, Stanford 6
Raw Data Structured Data Learning Algorithm Model
Downstream prediction task
Feature Engineering
Automatically learn the features
![Page 7: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/7.jpg)
Sensors: Much data, no insights
§ Sensor data is hard to work with§ The algorithms must be:
§ Scalable: Large amounts of raw data over long time series
§ Robust: Must apply to lots of different applications
Jure Leskovec, Stanford 7
![Page 8: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/8.jpg)
Challenge: ML & Insights
8
Raw Data Structured Data Learning Algorithm Model
Downstream prediction task
Feature Engineering
Automatically learn the features
§ (Supervised) Machine Learning Lifecycle: This feature, that feature. Every single time!
Jure Leskovec (@jure), Stanford University
![Page 9: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/9.jpg)
9
How do we describe the structure of the time series so we can obtain insights and make predictions?
![Page 10: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/10.jpg)
Discovering Structure§ Value in “breaking down” the time series
into a sequence of states:§ Allows us to draw interpretable conclusions
from the data
10
![Page 11: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/11.jpg)
Segmentation and Clustering§ In general, the “states” are not predefined
§ We do know what they are, nor what they refer to…
§ Instead, we need to discover these states in an unsupervised way, while simultaneously segmenting the time series into the states!
11
![Page 12: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/12.jpg)
How to Describe StatesNetworks are great to encode
structure and dependencies in data
Jure Leskovec, Stanford 12
Network inference from time series data
I Convert a sequence of timestamped sensor observations into atime-varying network
Introduction 5
Sens
ors
Depe
nden
cies
Network Inference via the Time-Varying Graphical Lasso. D. Hallac, Y. Park, S. Boyd, J. Leskovec.ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2017.
![Page 13: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/13.jpg)
Our Approach§ Our approach: Given a set of sensor
data, we learn temporal dependency networks between different sensors
§ The network is not static but can change over time§ Allows us to gain insights into the
process and detect anomalies
Jure Leskovec, Stanford 13
![Page 14: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/14.jpg)
States and Dependencies
14
State
Sensor Dependency network of state A, B
![Page 15: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/15.jpg)
TICC Problem Setup§ Formal definition:
where,
15
Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data. D. Hallac, S. Vare, S. Boyd, J. Leskovec. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2017
![Page 16: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/16.jpg)
TICC: Scalability
§ Can scale to problems with tens of millions of observations!
Jure Leskovec, Stanford 16
Scalability
I ADMM can solve for millions of variables in minutes!
– Centralized solver (CVXPY) explodes computationally
Number of Unknowns TVGL Interior-Point100 0.9 37.9360 0.9 5362.7200,000 48.3 -5 million 706.4 -
Results 18
CVXPYSnapVX
SnapVX: A Network-Based Convex Optimization Solver. D. Hallac, C. Wong, S. Diamond, A. Sharang, R. Sosič, S. Boyd, J. Leskovec. Journal of Machine Learning Research (JMLR), 18(4):1−5, 2017.
![Page 17: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/17.jpg)
Case Study: Cars§ Car driving sessions:
§ 36,000 samples @ 10Hz
§ We observed 7 sensors:§ Brake pedal position§ Forward (X-)acceleration§ Lateral (Y-)acceleration§ Steering wheel angle§ Vehicle velocity§ Engine RPM§ Gas pedal position
17
![Page 18: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/18.jpg)
Cars – “Turning” State
18
![Page 19: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/19.jpg)
Cars– “Stopping” State
19
![Page 20: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/20.jpg)
TICC Results: Car Data§ We run TICC with K=5 states
(selected via BIC)§ The betweenness centrality score of
each node in each state/network:
20
![Page 21: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/21.jpg)
Resulting StatesGreen = straight, White = slowing down, Red = turning, Blue = speeding upResults are very consistent across the data!
26
![Page 22: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/22.jpg)
22
Predicting the Future(but without feature engineering)
![Page 23: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/23.jpg)
Problem Setup§ Dataset: Automobile data containing
1,400 sensors recording at 10 Hz.
§ Goal: Predict driver actions 1 sec before they occur§ Left/Right blinker§ Accelerate (gas pedal > threshold)§ Hard braking (brake pedal < threshold)
23
Driver Identification Using Automobile Sensor Data from a Single Turn. D. Hallac, A. Sharang, R. Stahlmann, A. Lamprecht, M. Huber, M. Roehder, R. Sosic, J. Leskovec IEEE International Conference on Intelligent Transportation Systems (ITSC), 2016.
![Page 24: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/24.jpg)
Neural Networks§ Multi-layer RNN architecture:
§ The output of the first LSTM is passed as input to a second LSTM)
§ 3,000h of driving, 1400 sensors, at 10Hz=150 billion datapoints
24
![Page 25: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/25.jpg)
Time-Series Analysis
2/21/17 25
SparseTSV SNAPBinaryFormat
AsynchronousSignalArray
Task-SpecificData
ProcessResults
Space-EfficientDataConversion
DataFiltering
Analysis
![Page 26: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/26.jpg)
Results – Brake Pedal
26
0.934 Test Set AUC (0.935 Training AUC)
§ AUC of predicting whether the driver will press the break pedal:
![Page 27: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/27.jpg)
Predicting Driver Actions
27
![Page 28: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/28.jpg)
Predicting Driver Actions
28
![Page 29: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/29.jpg)
Predicting Driver Actions
29
![Page 30: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/30.jpg)
Failure Prediction§ Dataset: Boiler data containing 100
sensors at 1Hz for 6-12 months§ Goal: Predict component failures 7
days before they occur
30
![Page 31: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/31.jpg)
Preliminary Results§ Very promising results across various
deep learning models!
31
![Page 32: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/32.jpg)
Conclusion§ Complex engineered
systems§ High-dimensional unlabeled
time series data collected in real-time
§ We need tools to understand these data as well as to make accurate predictions
Jure Leskovec, Stanford University 32
Sensors are everywhere
I In many applications, we generate large sequences of timestampedobservations
– “Sensors” have a broad definition
Introduction 2
![Page 33: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/33.jpg)
Thanks!§ Joint work with D. Halla,c Y. Park, S. Boyd, R.
Sosic, S. Vare, A. Sharang, C. Wong, S. Diamond.
§ Papers:§ Network Inference via the Time-Varying Graphical Lasso. D. Hallac, Y. Park, S. Boyd, J. Leskovec.ACM SIGKDD
International Conference on Knowledge Discovery and Data Mining (KDD), 2017.
§ Toeplitz Inverse Covariance-Based Clustering of Multivariate Time Series Data. D. Hallac, S. Vare, S. Boyd, J. Leskovec. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2017.
§ Learning the Network Structure of Heterogeneous Data via Pairwise Exponential Markov Random Fields. Y. Park, D. Hallac, S. Boyd, J. Leskovec. Artificial Intelligence and Statistics Conference (AISTATS), 2017.
§ SnapVX: A Network-Based Convex Optimization Solver. D. Hallac, C. Wong, S. Diamond, A. Sharang, R. Sosič, S. Boyd, J. Leskovec. Journal of Machine Learning Research (JMLR), 18(4):1−5, 2017.
§ Driver Identification Using Automobile Sensor Data from a Single Turn. D. Hallac, A. Sharang, R. Stahlmann, A. Lamprecht, M. Huber, M. Roehder, R. Sosic, J. Leskovec IEEE International Conference on Intelligent Transportation Systems (ITSC), 2016.
§ Network Lasso: Clustering and Optimization in Large Graphs. D. Hallac, J. Leskovec, S. Boyd. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2015.
Jure Leskovec, Stanford University 33
![Page 34: Analytics on Time Series Data · 2017-11-09 · §Goal:Predict driver actions 1 sec before they occur §Left/Right blinker §Accelerate (gas pedal > threshold) §Hard braking (brake](https://reader034.vdocument.in/reader034/viewer/2022042321/5f0b5a8b7e708231d430198b/html5/thumbnails/34.jpg)
http://snap.stanford.edu@jure
THANKS!
Jure Leskovec, Stanford University 34