machine learning 101 dkom 2017
![Page 1: Machine learning 101 dkom 2017](https://reader038.vdocument.in/reader038/viewer/2022102821/587d86b91a28abcd648b5437/html5/thumbnails/1.jpg)
Machine Learning 101
Fred Verheul
![Page 2: Machine learning 101 dkom 2017](https://reader038.vdocument.in/reader038/viewer/2022102821/587d86b91a28abcd648b5437/html5/thumbnails/2.jpg)
Machine Learning
“Field of study that gives computers the ability to learn without being explicitly programmed” (Arthur Samuel, 1959)
![Page 3: Machine learning 101 dkom 2017](https://reader038.vdocument.in/reader038/viewer/2022102821/587d86b91a28abcd648b5437/html5/thumbnails/3.jpg)
What is Machine Learning?
Traditional Programming: Data + Program → Computer → Output
Machine Learning: Data + Output → Computer → Program
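A rough sketch of this contrast: first a rule written by hand (traditional programming), then the same kind of rule derived from data plus desired outputs (machine learning). The spam scenario, the function names and the brute-force threshold search are illustrative assumptions, not part of the talk.

```python
# Traditional programming: a human writes the program (the rule).
def is_spam_handwritten(num_links):
    return num_links > 3  # threshold chosen by the programmer


# Machine learning: data + desired outputs go in, the program (rule) comes out.
def learn_threshold(data, outputs):
    """Pick the threshold that misclassifies the fewest examples."""
    def errors(threshold):
        return sum((x > threshold) != y for x, y in zip(data, outputs))
    return min(sorted(set(data)), key=errors)


# Hypothetical examples: number of links per e-mail, and whether it was spam.
links = [0, 1, 2, 5, 7, 9]
spam = [False, False, False, True, True, True]

threshold = learn_threshold(links, spam)


def is_spam_learned(num_links):
    return num_links > threshold
```

Both functions have the same shape; the difference is where the threshold comes from.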
![Page 4: Machine learning 101 dkom 2017](https://reader038.vdocument.in/reader038/viewer/2022102821/587d86b91a28abcd648b5437/html5/thumbnails/4.jpg)
Prediction is hard…
![Page 5: Machine learning 101 dkom 2017](https://reader038.vdocument.in/reader038/viewer/2022102821/587d86b91a28abcd648b5437/html5/thumbnails/5.jpg)
Sweet spot for Machine Learning
• It’s impossible to write down the rules in code:
  • Too many rules
  • Too many factors influencing the rules
  • Too finely tuned
  • We just don’t know the rules (image recognition)
• Lots of labeled data (examples) available (e.g. historical data)
![Page 6: Machine learning 101 dkom 2017](https://reader038.vdocument.in/reader038/viewer/2022102821/587d86b91a28abcd648b5437/html5/thumbnails/6.jpg)
Basic Machine Learning ‘workflow’
Training Phase: Training data → Feature Vectors + Labels → Machine Learning Algorithm → Predictive Model
Operational Phase: New data → Feature Vectors → Predictive Model → Prediction
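The two phases can be sketched in a few lines. This is a minimal example that assumes a nearest-centroid classifier as the "Machine Learning Algorithm"; the functions `train` and `predict` and the cat/dog data are hypothetical and only serve to show where feature vectors, labels, model and prediction fit.

```python
from collections import defaultdict
from math import dist


def train(feature_vectors, labels):
    """Training phase: build a predictive model from labelled feature vectors.

    The 'model' here is simply one centroid (mean vector) per label.
    """
    grouped = defaultdict(list)
    for vector, label in zip(feature_vectors, labels):
        grouped[label].append(vector)
    return {
        label: tuple(sum(column) / len(vectors) for column in zip(*vectors))
        for label, vectors in grouped.items()
    }


def predict(model, feature_vector):
    """Operational phase: label new data with the nearest centroid."""
    return min(model, key=lambda label: dist(model[label], feature_vector))


# Hypothetical training data: (height in cm, weight in kg) with labels.
training_vectors = [(30, 4), (35, 5), (60, 25), (65, 30)]
training_labels = ["cat", "cat", "dog", "dog"]

model = train(training_vectors, training_labels)
prediction = predict(model, (33, 6))  # new data -> prediction
```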
![Page 7: Machine learning 101 dkom 2017](https://reader038.vdocument.in/reader038/viewer/2022102821/587d86b91a28abcd648b5437/html5/thumbnails/7.jpg)
Training Phase in more detail
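• Raw data → Data preparation → Feature Vectors
• Data preparation: data cleansing, data transformation, normalization, feature extraction
• Feature Vectors → Training Data and Test data
• Training Data → Model Building (by ML algorithm), aka ‘learning’
• Model Building + Test data → Model Evaluation → Predictive Model
• Feedback loop: Model Evaluation feeds back into the earlier steps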
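A minimal sketch of these steps in plain Python, assuming min-max normalization, a random train/test split and a 1-nearest-neighbour model. All function names and the data are hypothetical; real projects would typically use a library such as scikit-learn for each step.

```python
import random


def clean(rows):
    """Data cleansing: drop rows with missing values."""
    return [row for row in rows if None not in row]


def normalize(vectors):
    """Min-max normalization: rescale every feature to the range [0, 1]."""
    columns = list(zip(*vectors))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(vector, lows, highs))
        for vector in vectors
    ]


def split(examples, test_fraction=0.25, seed=0):
    """Shuffle the examples and hold out a fraction as test data."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def build_model(training_data):
    """'Learning': a 1-nearest-neighbour model just stores the training data."""
    def squared_distance(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    def model(vector):
        nearest = min(training_data, key=lambda ex: squared_distance(ex[0], vector))
        return nearest[1]
    return model


def evaluate(model, test_data):
    """Model evaluation: fraction of test examples predicted correctly."""
    return sum(model(v) == label for v, label in test_data) / len(test_data)


# Hypothetical raw data: (feature 1, feature 2, label); one row is incomplete.
raw = [(1.0, 200.0, "a"), (2.0, 220.0, "a"), (1.5, None, "a"), (1.2, 210.0, "a"),
       (8.0, 900.0, "b"), (9.0, 950.0, "b"), (8.5, 880.0, "b"), (9.5, 990.0, "b")]

rows = clean(raw)
vectors = normalize([row[:2] for row in rows])
examples = list(zip(vectors, [row[2] for row in rows]))
training_data, test_data = split(examples)
model = build_model(training_data)
accuracy = evaluate(model, test_data)
```

If `accuracy` on the test data is too low, the feedback loop means going back to change the data preparation or the algorithm and trying again.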
![Page 8: Machine learning 101 dkom 2017](https://reader038.vdocument.in/reader038/viewer/2022102821/587d86b91a28abcd648b5437/html5/thumbnails/8.jpg)
Examples of ML tasks
Supervised learning
• Regression: target is numeric
• Classification: target is categorical
Unsupervised learning
• Clustering
• Dimensionality reduction
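As a small example of a regression task, the sketch below fits a straight line to numeric targets with ordinary least squares. The function `fit_line` and the housing data are assumptions made for illustration. A classification task would look the same except that the target would be a category (e.g. "cheap" / "expensive") instead of a number.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


# Hypothetical data: floor area in square metres -> price in thousands.
areas = [50, 60, 80, 100]
prices = [150, 180, 240, 300]

slope, intercept = fit_line(areas, prices)
predicted_price = slope * 70 + intercept  # the prediction is a number
```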
![Page 9: Machine learning 101 dkom 2017](https://reader038.vdocument.in/reader038/viewer/2022102821/587d86b91a28abcd648b5437/html5/thumbnails/9.jpg)
Modeling: so many algorithms…
![Page 10: Machine learning 101 dkom 2017](https://reader038.vdocument.in/reader038/viewer/2022102821/587d86b91a28abcd648b5437/html5/thumbnails/10.jpg)
ML Algorithms: by Representation
Representation: collection of candidate models/programs, aka hypothesis space
Decision trees
Instance-based
Neural networks
Model ensembles
![Page 11: Machine learning 101 dkom 2017](https://reader038.vdocument.in/reader038/viewer/2022102821/587d86b91a28abcd648b5437/html5/thumbnails/11.jpg)
ML Algorithms: by Evaluation
Evaluation: Quality measure for a model
Regression
• Example metric: Root Mean Squared Error
• $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}$, with $y_i$ the actual and $\hat{y}_i$ the predicted value of example $i$
Binary classification: confusion matrix
• Example: medical test for a disease
• Accuracy: (8 + 971) / 1000 -> 97.9%
• Better evaluation metrics:
  • Precision: 8 / (8 + 19)
  • Recall: 8 / (8 + 2)
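The metrics on this slide can be worked out directly. The sketch below assumes the slide's numbers are the four confusion-matrix cells: 8 true positives, 19 false positives, 2 false negatives and 971 true negatives (1000 tests in total). The variable names are illustrative.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root Mean Squared Error for a regression model."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Confusion-matrix counts implied by the slide's numbers (an assumption).
true_pos, false_pos, false_neg, true_neg = 8, 19, 2, 971
total = true_pos + false_pos + false_neg + true_neg

accuracy = (true_pos + true_neg) / total
precision = true_pos / (true_pos + false_pos)  # how many positive tests are real
recall = true_pos / (true_pos + false_neg)     # how many real cases are found

print(f"accuracy={accuracy:.1%} precision={precision:.1%} recall={recall:.1%}")
# prints: accuracy=97.9% precision=29.6% recall=80.0%
```

Accuracy looks excellent only because the disease is rare; precision shows that most positive test results are false alarms.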
![Page 12: Machine learning 101 dkom 2017](https://reader038.vdocument.in/reader038/viewer/2022102821/587d86b91a28abcd648b5437/html5/thumbnails/12.jpg)
ML Algorithms: by Optimization
Optimization: how the algorithm ‘learns’; depends on representation and evaluation
• Greedy Search (an example of combinatorial optimization)
• Gradient Descent (or in general: Convex Optimization)
• Linear Programming (or in general: Constrained/Nonlinear Optimization)
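To show how the three components fit together, here is a minimal gradient descent sketch. It assumes the representation y = w * x, mean squared error as the evaluation measure, and a fixed learning rate; the function name, data and hyperparameters are illustrative choices.

```python
def fit_slope(xs, ys, learning_rate=0.01, steps=500):
    """Fit y = w * x by gradient descent on the mean squared error."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Derivative of (1/n) * sum((w*x - y)^2) with respect to w.
        gradient = 2 / n * sum((w * x - y) * x for x, y in zip(xs, ys))
        w -= learning_rate * gradient  # step against the gradient
    return w


# Hypothetical data generated by y = 2 * x.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

w = fit_slope(xs, ys)  # converges towards 2.0
```

A learning rate that is too large would make the updates overshoot and diverge; the value here is small enough for this data.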
![Page 13: Machine learning 101 dkom 2017](https://reader038.vdocument.in/reader038/viewer/2022102821/587d86b91a28abcd648b5437/html5/thumbnails/13.jpg)
Training error vs test error
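A small sketch of why the two errors differ. It compares a model that memorizes the training data (1-nearest-neighbour) with a much simpler threshold model. The data is made up, and one training example is deliberately mislabelled to play the role of noise.

```python
def error_rate(model, examples):
    """Fraction of examples the model gets wrong."""
    return sum(model(x) != label for x, label in examples) / len(examples)


def fit_memorizer(training_data):
    """1-nearest-neighbour: reproduces the training data exactly, noise included."""
    return lambda x: min(training_data, key=lambda ex: abs(ex[0] - x))[1]


def fit_threshold(training_data):
    """A simpler model: predict 1 when x exceeds a single learned threshold."""
    def training_errors(t):
        return sum((x > t) != bool(label) for x, label in training_data)
    best = min(sorted(x for x, _ in training_data), key=training_errors)
    return lambda x: int(x > best)


# Hypothetical data. The underlying rule is "label 1 when x >= 5";
# the training example (2, 1) is deliberately mislabelled noise.
training_data = [(1, 0), (2, 1), (3, 0), (4, 0), (6, 1), (7, 1), (8, 1), (9, 1)]
test_data = [(1.6, 0), (2.2, 0), (2.4, 0), (3.5, 0), (5.5, 1), (6.5, 1), (8.5, 1)]

for name, fit in [("memorizer", fit_memorizer), ("threshold", fit_threshold)]:
    model = fit(training_data)
    print(name, error_rate(model, training_data), error_rate(model, test_data))
```

The memorizer makes no mistakes on the training data but gets 3 of the 7 test examples wrong, because it has also learned the noise. The threshold model gets 1 of the 8 training examples wrong and all test examples right.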
![Page 14: Machine learning 101 dkom 2017](https://reader038.vdocument.in/reader038/viewer/2022102821/587d86b91a28abcd648b5437/html5/thumbnails/14.jpg)
Data Science for Business
• Focuses more on general principles than specific algorithms
• Not math-heavy, but does contain some math
• O’Reilly link: http://shop.oreilly.com/product/0636920028918.do
• Book website: http://data-science-for-biz.com/DSB/Home.html
![Page 15: Machine learning 101 dkom 2017](https://reader038.vdocument.in/reader038/viewer/2022102821/587d86b91a28abcd648b5437/html5/thumbnails/15.jpg)
What has NOT been covered (1)
• Deep learning / Neural Networks
  • Covered in other presentations at DKOM
  • Also recommended for further reading (deep dive): http://neuralnetworksanddeeplearning.com/index.html
• Specifics of ML algorithms
  • All over the internet… e.g. at http://machinelearningmastery.com/
![Page 16: Machine learning 101 dkom 2017](https://reader038.vdocument.in/reader038/viewer/2022102821/587d86b91a28abcd648b5437/html5/thumbnails/16.jpg)
What has NOT been covered (2)
• Libraries (examples):
  • TensorFlow, Caffe, Theano, Keras
  • SciPy & scikit-learn
  • Spark MLlib (Scala/Java/Python)
• Programming languages:
![Page 17: Machine learning 101 dkom 2017](https://reader038.vdocument.in/reader038/viewer/2022102821/587d86b91a28abcd648b5437/html5/thumbnails/17.jpg)
What has NOT been covered (3)
• SAP products:
• SAP HANA, SAP HANA Vora, SAP BO Predictive Analytics(!), HCP Predictive Services
• New machine learning platform
• Hardware
• Nvidia talk about GPUs
![Page 18: Machine learning 101 dkom 2017](https://reader038.vdocument.in/reader038/viewer/2022102821/587d86b91a28abcd648b5437/html5/thumbnails/18.jpg)
What has NOT been covered (4)
• Ethics and algorithmic transparency:
![Page 19: Machine learning 101 dkom 2017](https://reader038.vdocument.in/reader038/viewer/2022102821/587d86b91a28abcd648b5437/html5/thumbnails/19.jpg)
What has NOT been covered (5)
• The Data Science & Data Mining Process:
![Page 20: Machine learning 101 dkom 2017](https://reader038.vdocument.in/reader038/viewer/2022102821/587d86b91a28abcd648b5437/html5/thumbnails/20.jpg)
What has NOT been covered (6)
• How to integrate ML into your business application
• I hope SAP is figuring that out as we speak ;-)
• Have a look at SAP Predictive Analytics Integrator
• https://help.sap.com/pai
![Page 21: Machine learning 101 dkom 2017](https://reader038.vdocument.in/reader038/viewer/2022102821/587d86b91a28abcd648b5437/html5/thumbnails/21.jpg)
Take-aways
• Goal of ML: generalize from training data (not optimization!!)
• No magic! Just some clever algorithms…
• Increasingly important non-technical aspects:
  • Ethics
  • Algorithmic transparency