h2o machine learning and kalman filters for machine prognostics
TRANSCRIPT
H2OMachine Learning andKalman Fi lters for Machine Learning
HankRoark@[email protected]
WHOAMI ILead,[email protected]
JohnDeere:Research,SoftwareProductDevelopment, HighTechVenturesLotsoftimedealingwithdataoffofmachines,equipment, satellites, weather,radar,handsampled, andon.Geospatial, temporal/timeseries dataalmostall fromsensors.Previouslyatstartupsandconsulting (RedSkyInteractive,Nuforia,NetExplorer, PerotSystems,afewofmyown)
Engineering&ManagementMITPhysicsGeorgiaTech
[email protected]@hankroarkhttps://www.linkedin.com/in/hankroark
IF YOUARE INTO DATA, THE IOTHAS IT
WHYTHIS EXAMPLE?
GETREADY FORBRONTOBYTES!!
WOW,HOWBIG ISABRONTOBYTE?
Image courtesy http://www.telecom-cloud.net/wp-content/uploads/2015/05/Screen-Shot-2015-05-27-at-3.51.47-PM.png
This much data wil l requirea fastOODA loop
EXAMPLE FROMTHE IOTDomain:PrognosticsandHealthManagementMachine:TurbofanJetEnginesDataSet:A.Saxena andK.Goebel(2008)."TurbofanEngineDegradationSimulationDataSet",NASAAmesPrognosticsDataRepository
PredictRemainingUseful LifefromPartialLifeRuns
Sixoperatingmodes,twofailuremodes,manufacturingvariability
Training:249jetengines runtofailureTest: 248jetengines
LOADING DATA
PYTHON (ANDR) OBJECTS ARE PROXIES FOR BIGDATA
STEP 1
Python user
h2o_df = h2o.import_file(“hdfs://path/to/data.csv”)
H2O
H2O
H2O
data.csv
HTTP REST API request to H2Ohas HDFS path
H2O ClusterInitiate distributed ingest
HDFSNFSS3
Request data from HDFS
STEP 2
2.2
2.3
2.4
Python
h2o.import_file()
2.1Python function
call
PYTHON(ANDR ) OB JECTSARE PROXIES FOR B IG DATA
H2O
H2O
H2O
Python
HDFSNFSS3
STEP 3
Cluster IPCluster Port
Pointer to Data
Return pointer to data in REST API JSON
Response
HDFS provides data
3.3
3.43.1h2o_df object created
in Python
data.csv
h2o_dfH2O
Frame
3.2Distributed H2OFrame in DKV
H2O Cluster
PYTHON(ANDR ) OB JECTSARE PROXIES FOR B IG DATA
SUMMARY STATISTICS
CREATETHETARGETVARIABLE
Calculate TotalCyclesForEachUnit
CREATETHETARGETVARIABLE
AppendToOriginalFrame
CREATETHETARGETVARIABLE
CreateNewFeatureofCycles
Remaining
EXPLORATORY DATA ANALYSISBooleanIndexing
EXPLORATORY DATA ANALYSISSamplethedatatolocalmemory
EXPLORATORY DATA ANALYSIS
Useyourfavorite
visualizationtools
(Seaborn!)
Ugh,wherearetrends
overtimeTime
ZeroRemainingUsefulLife
MODEL BASEDDATAENRICHMENTSensor
measurementsappearinclusters
Correspondingtooperating
mode!
FEATUREENGINEERING
UseH2Ok-meanstofindcluster
centers
FEATUREENGINEERING
Enrichexisting datawithoperatingmode
membership
MOREFEATUREENGINEERINGFornon-constant
sensormeasurements
withinanoperatingmode,
Standardizeeachsensormeasurementbyoperatingmode
Basedonthetrainingdata
TRENDSOVER TIME!
BeforeH2ODataPreparation
ReadyforH2OLearning
Time Time
MODELING - SIMPLE
ConfigureanEstimator
MODELING - SIMPLE
Train anEstimator
PYTHONSCRIPTSTARTING H2OGLM
HTTP
REST/JSON
POST/3/ModelBuilders/glm
glm_estimator.train()
Pythonscript
StandardPythonprocess
TCP/IP
HTTP
REST/JSON
/3/ModelBuilders/glm endpoint
Job
GLMalgorithm
GLMtasks
Fork/Joinframework
K/Vstoreframework
H2Oprocess
Networklayer
RESTlayer
H2O- algos
H2O- core
Userprocess
H2Oprocess
Legend
HTTP
REST/JSON
GET /3/Models/model_id
glm_estimator.show()
Pythonscript
StandardPythonprocess
TCP/IP
HTTP
REST/JSON
/3/Modelsendpoint
Fork/Joinframework
K/Vstoreframework
H2Oprocess
Networklayer
RESTlayer
H2O- algos
H2O- core
Userprocess
H2Oprocess
Legend
PYTHON SCRIPT STARTING H2OGLM
MODEL EVALUATIONEvaluate Performance
ataglanceinPython
MODEL EVALUATIONEvaluate Performance
ataglanceinH2OFlow
MODEL EVALUATIONEvaluate Performance
ataglancegraphically inPython
CROSS VALIDATION
SetupHyperparameterSearchOptions
CROSS VALIDATION
Configurefullfull
gridsearch
CROSS VALIDATION
Executegridsearch
CROSS VALIDATION
Evaluate results&modelselection
OPEN– SCIKITPIPELINES
Create Pipelines
Hyper-parameter Options
Crossvalidationstrategy
Hyper-parameterSearchStrategy
Fit
KALMAN - POSTPROCESSING
RESOURCES
• Downloadandgo:http://www.h2o.ai/download• Documentation:http://docs.h2o.ai/• Booklets,Datasheet:http://www.h2o.ai/resources/• Github:http://github.com/h2oai/• Training:http://learn.h2o.ai/• ThispresentationandassociatedJupyter notebook
(lookin2016_01_21_MachineLearningAndKalmanFiltersForMachineHealth):https://github.com/h2oai/h2o-meetups/
THANK YOU