Predictive analytics with streaming sensor data

PRACTICAL MACHINE LEARNING PIPELINE USING STREAMING IOT SENSOR DATA
Mathieu Dumoulin - MapR, Mateusz Dymczyk - H2O.ai
February 7, 2017 @ Tokyo Big Data Analytics 2017
© 2017 MapR Technologies. MapR Confidential.

Upload: mathieu-dumoulin

Post on 14-Feb-2017

TRANSCRIPT

Page 1: Predictive analytics with streaming sensor data

PRACTICAL MACHINE LEARNING PIPELINE USING STREAMING IOT SENSOR DATA

Mathieu Dumoulin - MapR, Mateusz Dymczyk - H2O.ai

February 7, 2017 @ Tokyo Big Data Analytics 2017

Page 2: Predictive analytics with streaming sensor data

Mathieu Dumoulin and Mateusz Dymczyk

Mathieu Dumoulin
• Data Engineer @ MapR Technologies
• Previously data scientist and DS team manager, search, NLP and ML engineer

Mateusz Dymczyk
• Software Engineer @ H2O.ai
• Previously ML/NLP @ Fujitsu Laboratories and en-japan inc
• Sommelier in training

Page 3: Predictive analytics with streaming sensor data

The Time for IoT is NOW

Page 4: Predictive analytics with streaming sensor data

IoT and Industry 4.0: Predictive Maintenance

Page 5: Predictive analytics with streaming sensor data

Problem Statement and Business Value

Situation: The goal is to more than double production capacity. We want a real-time view that lets factory staff see which robots are likely to fail before they actually fail.

More Requirements:
• Deal with a variety of robots (age, maker, function)
• Scale to 100-10,000 robots in real time, across multiple factories
• Ensure data reliability
• Factory staff have a low level of IT knowledge

Page 6: Predictive analytics with streaming sensor data

Demo Pipeline

Page 7: Predictive analytics with streaming sensor data

Demo Pipeline – Normal State

Page 8: Predictive analytics with streaming sensor data

Demo Pipeline – Predict Failure

Page 9: Predictive analytics with streaming sensor data

LP-RESEARCH Motion Sensor
• Tokyo-based startup
• Hardware R&D for Industry 4.0 applications
• Founded by Waseda University Ph.D. grads
• www.lp-research.com

Page 10: Predictive analytics with streaming sensor data

Our Demo – Real Time Robot Failure Prediction… with AR Visualization

Page 11: Predictive analytics with streaming sensor data

How we Made our Demo
1. Machine learning modeling
2. Data input
   1. Sensor to backend analysis
3. Backend data analysis
   1. MapR Converged Data Platform
   2. Streaming Architecture, MapR Streams (Apache Kafka)
4. Data output: visualizing predictions
   1. Augmented Reality Headset

github.com/mdymczyk/iot-pipeline
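The four stages in this outline can be pictured as messages flowing through two topics. The sketch below is only an illustration under stated assumptions: the `Topic` class is an in-memory stand-in rather than the MapR Streams or Kafka API, the `score` function is a hypothetical placeholder for the trained model, and the robot name and readings are made up. The actual implementation is in the project repository.

```python
from collections import deque


class Topic:
    """In-memory stand-in for a stream topic. This is NOT the Kafka API."""

    def __init__(self):
        self._messages = deque()

    def publish(self, message):
        self._messages.append(message)

    def poll(self):
        """Return the oldest message, or None when the topic is empty."""
        return self._messages.popleft() if self._messages else None


def score(reading, limit=2.0):
    """Hypothetical placeholder model: flag readings above a fixed limit."""
    return "PREDICT FAILURE" if abs(reading["accel"]) > limit else "OK"


def run_backend(sensor_topic, prediction_topic):
    """Consume every pending sensor reading and publish one prediction each."""
    while True:
        reading = sensor_topic.poll()
        if reading is None:
            break
        prediction_topic.publish(
            {"robot": reading["robot"], "state": score(reading)}
        )


# Stage 2, data input: the sensor publishes readings (values are made up).
sensor_topic, prediction_topic = Topic(), Topic()
for accel in (0.3, 0.5, 3.1):
    sensor_topic.publish({"robot": "arm-1", "accel": accel})

# Stage 3, backend analysis, then stage 4, output for a display to read.
run_backend(sensor_topic, prediction_topic)
states = []
while True:
    prediction = prediction_topic.poll()
    if prediction is None:
        break
    states.append(prediction["state"])
```

Keeping the sensor side and the display side decoupled through topics means either end can be replaced (a different sensor, a different headset) without touching the scoring step.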

Page 12: Predictive analytics with streaming sensor data

Machine Learning Modeling
1. Set the Machine Learning goal
   1. Detect abnormal events with > 90% accuracy
   2. Avoid false positives
   3. Decide output
2. How to reach the goal
   1. Supervised vs. unsupervised
   2. Choose algorithm
   3. Initial dataset exploration
   4. Data cleaning and feature extraction
   5. Deal with real-time and large scale
3. Deploy to production
   1. Use MapR CDP and custom software
   2. H2O’s export to POJO function

[Figure labels: Normal State (OK!) / PREDICT FAILURE]

Page 13: Predictive analytics with streaming sensor data

ML – Looking at the data

Page 14: Predictive analytics with streaming sensor data

ML – Anomaly Detection
• Unsupervised learning
• Anomaly detection
• H2O uses an autoencoder algorithm (deep learning)
• H2O’s R API for modeling
• Very productive API
• Good graphs
• Parameter tuning of models
• See H2O’s training-book on GitHub
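The idea behind autoencoder-based anomaly detection is that a model trained only on normal data reconstructs normal inputs well and abnormal inputs poorly, so the reconstruction error serves as an anomaly score. The sketch below illustrates that scoring idea only. It is not H2O's deep learning autoencoder and does not use the H2O API: it swaps in a rank-1 linear projection (the first principal direction, found by power iteration) as a stand-in encoder and decoder, and all data values are made up.

```python
import math


def fit_rank1(rows, iters=100):
    """Fit the column means and the first principal direction of the rows."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    v = [1.0 / math.sqrt(d)] * d  # initial unit vector
    for _ in range(iters):
        # One power-iteration step: w = X^T (X v)
        proj = [sum(c[j] * v[j] for j in range(d)) for c in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return mean, v


def reconstruction_error(row, mean, v):
    """Mean squared error between a row and its rank-1 reconstruction."""
    c = [x - m for x, m in zip(row, mean)]
    code = sum(a * b for a, b in zip(c, v))  # encode: one number
    recon = [code * b for b in v]            # decode: back to d numbers
    return sum((a - b) ** 2 for a, b in zip(c, recon)) / len(row)


# Made-up "normal" data: three channels that move together, plus tiny noise.
ts = [i / 10.0 - 1.0 for i in range(21)]
normal = [[t, 2 * t + 0.01 * ((i % 3) - 1), -t] for i, t in enumerate(ts)]

mean, v = fit_rank1(normal)
normal_errors = [reconstruction_error(r, mean, v) for r in normal]

# A reading that breaks the usual relationship between the channels.
anomaly_error = reconstruction_error([1.0, -2.0, 1.0], mean, v)
```

Here `anomaly_error` comes out several orders of magnitude larger than any value in `normal_errors`, which is the property a threshold on the score relies on.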

Page 15: Predictive analytics with streaming sensor data

ML – Results

Note: Time window: 200 ms, Threshold: 1 SD (standard deviation)
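The slide gives only the two parameters, so the following is a sketch of one plausible reading, not the authors' method. It assumes that per-sample anomaly scores (for example reconstruction errors) are averaged over fixed 200 ms windows, and that a window is flagged when its average exceeds the mean plus 1 SD of scores observed during normal operation. The function names and all sample values are hypothetical.

```python
from statistics import mean, stdev

WINDOW_MS = 200  # time window from the slide


def window_means(samples, window_ms=WINDOW_MS):
    """Group (timestamp_ms, score) pairs into fixed windows.

    Returns {window_start_ms: mean score}, ordered by window start.
    """
    buckets = {}
    for ts, score in samples:
        start = (ts // window_ms) * window_ms
        buckets.setdefault(start, []).append(score)
    return {start: mean(scores) for start, scores in sorted(buckets.items())}


def fit_threshold(normal_scores, n_sd=1.0):
    """Assumed rule: mean + n_sd standard deviations of normal-state scores."""
    return mean(normal_scores) + n_sd * stdev(normal_scores)


def flag_windows(samples, threshold):
    """Return the start times of windows whose mean score exceeds the threshold."""
    return [start for start, m in window_means(samples).items() if m > threshold]


# Made-up scores recorded while the robot was known to be healthy.
threshold = fit_threshold([0.10, 0.12, 0.09, 0.11, 0.10, 0.08])

# Made-up live stream, one sample every 50 ms.
stream = [
    (0, 0.10), (50, 0.11), (100, 0.09), (150, 0.10),
    (200, 0.10), (250, 0.30), (300, 0.35), (350, 0.40),
    (400, 0.09), (450, 0.10),
]
flagged = flag_windows(stream, threshold)
```

Averaging over a window before thresholding is one way to reduce false positives from single noisy samples, at the cost of up to one window of added detection delay.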

Page 16: Predictive analytics with streaming sensor data

ML – Deploy to Production – Real-time Data

Page 17: Predictive analytics with streaming sensor data

MapR Converged Data Platform

[Architecture diagram. Recoverable labels:]
• Open Source Engines & Tools; Commercial Engines & Applications; Custom Apps; Cloud and Managed Services; Search and Others
• Data Processing
• APIs: HDFS API; POSIX, NFS; HBase API; JSON API; Kafka API
• Enterprise-Grade Platform Services: Web-Scale Storage (MapR-FS), Database (MapR-DB), Event Streaming (MapR Streams)
• Real Time, Unified Security, Multi-tenancy, Disaster Recovery, Global Namespace, High Availability
• Unified Management and Monitoring

Page 18: Predictive analytics with streaming sensor data

Data Output – Making Predictions

Page 19: Predictive analytics with streaming sensor data

Data Output – Making Predictions

Page 20: Predictive analytics with streaming sensor data

Conclusion
• Getting a good enough model on some data was less than 10% of the total work.
• Team members need to have ALL expertise for this kind of project: hardware, software, big data, ML.
• MapR, H2O and LP-RESEARCH’s sensor were all essential parts of the project's success.
   – The MapR platform worked perfectly; the H2O model is high quality and fast.
• The hardware expertise of LP-RESEARCH was critical.

Page 21: Predictive analytics with streaming sensor data

Q & A

Engage with us!

PROJECT GITHUB: github.com/mdymczyk/iot-pipeline

Mathieu Dumoulin, [email protected] @Lordxar

• Blog: https://www.mapr.com/blog/author/mathieu-dumoulin

Mateusz Dymczyk, H2O.ai, [email protected] @mdymczyk

Klaus Petersen, [email protected]

• LP-RESEARCH: www.lp-research.com

Page 22: Predictive analytics with streaming sensor data

Thank you to LP-RESEARCH!

Hardware design and production

Expertise in motion sensors

Gyroscope

Accelerometer

Magnetometer

Sensor fusion algorithm development

Multi-platform application development

See all our products: https://www.lp-research.com/products/

Products shown: LPMS-B2, LPMS-CU2, LPMS-CANAL2, LPMS-USBAL2

OEM also available!