H2O World - Welcome to H2O World with Arno Candel
TRANSCRIPT
![Page 1: H2O World - Welcome to H2O World with Arno Candel](https://reader031.vdocument.in/reader031/viewer/2022021815/586f78e11a28ab10258b6d83/html5/thumbnails/1.jpg)
Welcome to H2O World
Sri & H2O Team
![Page 2: H2O World - Welcome to H2O World with Arno Candel](https://reader031.vdocument.in/reader031/viewer/2022021815/586f78e11a28ab10258b6d83/html5/thumbnails/2.jpg)
Data Science is a Team Sport!
Culture Matters!
![Page 3: H2O World - Welcome to H2O World with Arno Candel](https://reader031.vdocument.in/reader031/viewer/2022021815/586f78e11a28ab10258b6d83/html5/thumbnails/3.jpg)
Open Source Breeds Courage!
Community Matters!
Every generation needs to make its own history!
![Page 4: H2O World - Welcome to H2O World with Arno Candel](https://reader031.vdocument.in/reader031/viewer/2022021815/586f78e11a28ab10258b6d83/html5/thumbnails/4.jpg)
Code is conversation with Customer!
Great Product Matters!
![Page 5: H2O World - Welcome to H2O World with Arno Candel](https://reader031.vdocument.in/reader031/viewer/2022021815/586f78e11a28ab10258b6d83/html5/thumbnails/5.jpg)
Accuracy with Speed and Scale
HDFS
S3
SQL
NoSQL
CLASSIFICATION, REGRESSION
FEATURE ENGINEERING
IN-MEMORY
MAP REDUCE / FORK JOIN
COLUMNAR COMPRESSION
DEEP LEARNING
PCA, GLM, COX
RANDOM FOREST / GBM ENSEMBLES
FAST MODELING ENGINE
STREAMING
NANO FAST JAVA SCORING ENGINES
MATRIX FACTORIZATION, CLUSTERING
MUNGING
![Page 6: H2O World - Welcome to H2O World with Arno Candel](https://reader031.vdocument.in/reader031/viewer/2022021815/586f78e11a28ab10258b6d83/html5/thumbnails/6.jpg)
What's New in H2O-3
H2O-3 vs H2O-2:
• Total rewrite of the core in Java: built for data scientists AND developers!
• Unique Flow GUI (Notebook and more)
• REST Schemas for self-describing API for all methods/algos
• New R client: cleaner, faster
• Sparkling Water: H2O is the Killer App on Spark
• Fully featured Python client (incl. Pipelines, scikit-learn look & feel)
• New expression parser & backend execution engine for R, Py, Flow
• New Algo: GLRM - Generalized Low Rank Modeling (unifies PCA, K-Means, Matrix Factorization, Imputation, etc.)
• New Solvers for GLM: Coordinate Descent and L-BFGS
continued…
![Page 7: H2O World - Welcome to H2O World with Arno Candel](https://reader031.vdocument.in/reader031/viewer/2022021815/586f78e11a28ab10258b6d83/html5/thumbnails/7.jpg)
What's New in H2O-3
Additional New Features:
• Grid Search for all Algorithms (R/Py/Flow)
• N-fold Cross-Validation for all Algorithms
• Early Stopping (check for convergence) for GBM/DRF/DL
• Stochastic GBM (row/col sampling)
• Distributions (Gaussian, Laplace, Poisson, Gamma, Tweedie) for GBM/DL
• Improved sparse data handling for DL
• Multi-node auto-tuning for DL
• Multinomial GLM
• Scalable Scatter Plots for numeric and categorical data
• Big-Big Joins ("distributed data.table") - in QA
…and many more!
![Page 8: H2O World - Welcome to H2O World with Arno Candel](https://reader031.vdocument.in/reader031/viewer/2022021815/586f78e11a28ab10258b6d83/html5/thumbnails/8.jpg)
Convergence-Based Early Stopping in H2O
Before: training runs too long, but at least overwrite_with_best_model=true prevents overfitting (it returns the model with the lowest validation error).
Now: specify an additional convergence criterion, e.g. stopping_rounds=5, stopping_metric="MSE", stopping_tolerance=1e-3, to stop as soon as the moving average (length 5) of the validation MSE does not improve by at least 0.1% for 5 consecutive scoring events.
[Figure: Deep Learning with Higgs data. Plots of training error and validation error vs. training time / epochs, with the Best Model (overwrite_with_best_model=true) marked.]
Use Flow to inspect the model
Early stopping saves tons of time
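The convergence criterion on this slide can be illustrated with a minimal, library-free Python sketch. It is not H2O's actual implementation: the function names (`moving_average`, `should_stop`, `scoring_events_until_stop`) are hypothetical, and the exact comparison rule (best of the last `stopping_rounds` moving averages against the moving average from `stopping_rounds` scoring events earlier) is an assumption made for illustration. It also assumes a metric such as MSE where lower is better.

```python
from typing import List


def moving_average(values: List[float], window: int) -> List[float]:
    """Simple moving averages of length `window` over `values`."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def should_stop(
    history: List[float],
    stopping_rounds: int = 5,
    stopping_tolerance: float = 1e-3,
) -> bool:
    """Decide whether to stop, given one validation error per scoring event.

    Assumed rule (illustrative only): stop when the best of the last
    `stopping_rounds` moving averages has not improved on the moving
    average from `stopping_rounds` events earlier by at least the
    relative `stopping_tolerance`.
    """
    k = stopping_rounds
    if k <= 0 or len(history) < 2 * k:
        return False  # early stopping disabled, or not enough scoring events
    averages = moving_average(history, k)
    reference = averages[-(k + 1)]   # moving average k scoring events ago
    recent_best = min(averages[-k:])  # best of the k most recent averages
    if reference == 0:
        return True  # a perfect score cannot improve any further
    return recent_best > reference * (1 - stopping_tolerance)


def scoring_events_until_stop(
    validation_errors: List[float],
    stopping_rounds: int,
    stopping_tolerance: float,
) -> int:
    """Number of scoring events consumed before the rule triggers."""
    for n in range(1, len(validation_errors) + 1):
        if should_stop(validation_errors[:n], stopping_rounds, stopping_tolerance):
            return n
    return len(validation_errors)


if __name__ == "__main__":
    # Made-up validation errors that plateau after three scoring events.
    errors = [1.0, 0.5, 0.25, 0.25, 0.25, 0.25, 0.25, 0.25]
    print(scoring_events_until_stop(errors, stopping_rounds=2, stopping_tolerance=0.1))
```

With the made-up error curve above, the sketch stops after 6 of the 8 scoring events, which is the sense in which early stopping saves training time once the validation error has flattened out.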
![Page 9: H2O World - Welcome to H2O World with Arno Candel](https://reader031.vdocument.in/reader031/viewer/2022021815/586f78e11a28ab10258b6d83/html5/thumbnails/9.jpg)
What do these stickers mean?
I have H2O installed
I have Python installed
I have R installed
I have the H2O World data sets
Pick up stickers or get install help at the information booth