ml_big_picture-2.0.pptx
TRANSCRIPT
Machine Learning Big Picture
Francis Pieraut – Oct 2016
My startup bias about efficiency
Plan
1. My path in ML
2. AI big picture (expert systems to ML)
3. ML trends over time 1980-2008+
4. Types of ML (supervised vs unsupervised)
5. Relationship: Data-mining vs ML
6. Training process
7. Regularization techniques
8. ML research big picture
9. What is this Deep Learning Revolution?
10. ML in practice -> feature engineering
11. Importance of the cost function
12. Data importance -> NIPS 2009
13. The tagging nightmare
14. ML & Optimization
15. Adversarial Examples
Francis's Evolution in ML
• 1999 – Decision Tree expert (Samy Bengio)
• 2001-2003 – Research with Bengio (huge networks) -> flayers
• 2003 – Idilia -> importance of a good tagged dataset, features & overfitting
• 2005-2006 – Dakis -> KISS (ML not required; importance of comprehensive knowledge) – Expert System
• 2006-2009 – Data-Mining (understand first & feature extraction)… MLboost
• 2010-2013 – QMining -> big-data mining
• 2013-2016 – Nuance -> Data Maturity & data-driven design
Data Maturity model reminder
AI big picture
Types of ML
• Parametric vs Non-Parametric
• Reinforcement
ML trends over time 1980-2008+
http://fraka6.blogspot.com/2013/10/deep-learning-history-and-most.html
10 main ML algorithms
• Naïve Bayes Classifier
• K-Means Clustering
• Support Vector Machine
• Apriori
• Linear Regression
• Logistic Regression
• Artificial Neural Networks (gradient)
• Random Forests
• Decision Trees (info theory)
• K Nearest Neighbors
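A quick, hedged sketch running two of these algorithms on the same toy data via scikit-learn; the dataset and settings are illustrative assumptions, not from the talk:

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Synthetic two-class problem (stand-in for real data)
X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Naïve Bayes and k-NN from the list above, same train/test split
for model in (GaussianNB(), KNeighborsClassifier(n_neighbors=5)):
    acc = model.fit(X_tr, y_tr).score(X_te, y_te)
    print(type(model).__name__, "accuracy:", round(acc, 3))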
Machine learning: dangerous hype
Traps and Pitfalls
Data-Mining vs Machine Learning
Training Process
Classification error over time
Training
Regularization techniques
• Regularization is a technique used to address the overfitting [1] problem in statistical models.
• Examples (early stopping and the L2 penalty are sketched below):
– Early stopping
– Decrease constant
– Dropout
– Mini-batch
– Better cost function (e.g. margin vs MSE)
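A minimal sketch of two techniques from this list, early stopping plus an L2 penalty, using scikit-learn's SGDClassifier; the dataset, loss name ("log_loss", per recent scikit-learn) and hyperparameters are illustrative assumptions:

from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Early stopping: hold out 10% of the training data and stop once the
# validation score stops improving for 5 consecutive epochs.
clf = SGDClassifier(
    loss="log_loss",            # logistic regression trained by SGD
    penalty="l2", alpha=1e-4,   # weight-decay-style regularization
    early_stopping=True, validation_fraction=0.1, n_iter_no_change=5,
    random_state=0,
)
clf.fit(X, y)
print("stopped after", clf.n_iter_, "epochs; train acc:", clf.score(X, y))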
What is tough about ML
• More parameters = more examples required (sketched below)
• Tagged data is hard to create compared to untagged data
• There is no magic -> feature engineering
• Better features -> fewer examples -> less of a capacity problem
• Getting a good example sampling (don't introduce bias)
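A small sketch of the first point, more parameters = more examples required: a 10-parameter polynomial fit to only 10 noisy points drives training error to roughly zero while error on fresh samples blows up. All values are synthetic:

import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 10)                        # only 10 training points
y = np.sin(3 * x) + rng.normal(scale=0.1, size=10)

coeffs = np.polyfit(x, y, deg=9)                  # 10 parameters, 10 examples
x_new = rng.uniform(-1, 1, 100)                   # fresh samples
y_new = np.sin(3 * x_new)

train_mse = ((np.polyval(coeffs, x) - y) ** 2).mean()
test_mse = ((np.polyval(coeffs, x_new) - y_new) ** 2).mean()
print("train MSE:", train_mse)    # ~0 (memorized)
print("test MSE:", test_mse)      # typically much larger (overfit)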
Example feature engineering
What is this Deep Learning Revolution?
• Deep architectures are more powerful than shallow architectures
• Before 2006 we couldn't train deep architectures
• Revolution:
– Convolutional NNs
– Train generative models (auto-encoders) -> learn the data constraints… unsupervised learning… (better parameter initialization; sketched below)
– Standard training
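A hedged numpy sketch of the "train a generative model first" idea: a tiny tied-weight auto-encoder learns to reconstruct the data, and its weights can then seed the first layer of a supervised net. Sizes, learning rate and the synthetic data are all illustrative assumptions:

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))                  # stand-in for real inputs

n_in, n_hid = X.shape[1], 8
W = rng.normal(scale=0.1, size=(n_in, n_hid))   # encoder weights (tied decoder)
b, c = np.zeros(n_hid), np.zeros(n_in)          # hidden / output biases

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.01
for epoch in range(200):
    H = sigmoid(X @ W + b)                      # encode
    X_hat = H @ W.T + c                         # decode with tied weights
    err = X_hat - X                             # gradient of 1/2 * MSE
    dH = (err @ W) * H * (1 - H)                # backprop through the encoder
    W -= lr * (X.T @ dH + err.T @ H) / len(X)   # encoder + decoder terms
    b -= lr * dH.mean(axis=0)
    c -= lr * err.mean(axis=0)

print("final reconstruction MSE:", (err ** 2).mean())
# W and b can now initialize the first layer before standard supervised training.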
Example of deep learning in images
ML in practice
• Black box = recipe for disaster
• 90% feature engineering (sketched below)
• ML = automatic tuning
• Garbage in = garbage out
• Tagging is a pain… manual work
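A hedged sketch of why feature engineering dominates: the same linear model barely beats the majority class on raw coordinates, but one hand-crafted feature (the squared radius) makes the problem linearly separable. Data is synthetic:

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(1000, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 < 0.5).astype(int)   # circular class boundary

raw = LogisticRegression().fit(X, y)
print("raw features acc:", raw.score(X, y))           # ~majority-class rate

X_eng = np.c_[X, X[:, 0] ** 2 + X[:, 1] ** 2]         # engineered feature
eng = LogisticRegression().fit(X_eng, y)
print("engineered feature acc:", eng.score(X_eng, y)) # close to 1.0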
Importance of the cost function
• Neural network cost functions (backprop)
– MSE & log-softmax (compared in the sketch below)
– Example: Netflix & recommendation
• Optimization
– SVM = maximize margin
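A small numpy sketch contrasting the two neural-network costs named above on one 3-class softmax output; the logits and target are made up for illustration:

import numpy as np

logits = np.array([2.0, 0.5, -1.0])
probs = np.exp(logits) / np.exp(logits).sum()   # softmax
target = np.array([1.0, 0.0, 0.0])              # one-hot true class

mse = ((probs - target) ** 2).mean()            # MSE on probabilities
nll = -np.log(probs[0])                         # log-softmax / negative log-likelihood

print("MSE:", mse)                # can saturate gradients on confident outputs
print("log-softmax loss:", nll)   # stays well-scaled for classification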
Data Importance – NIPS 2009
• Google's message: enough of
– parameter optimization and kernel tweaking (SVM)
– models with more parameters than # of examples
• Simpler model + more data = what works (sketched below)
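A quick sketch of "simpler model + more data = what works": the same logistic regression, nothing tweaked, improves steadily as the training set grows. Dataset and sizes are illustrative assumptions:

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=20000, n_features=30,
                           n_informative=10, random_state=0)
X_test, y_test = X[15000:], y[15000:]           # held-out evaluation set

for n in (100, 1000, 15000):
    clf = LogisticRegression(max_iter=1000).fit(X[:n], y[:n])
    print(n, "examples -> test acc:", round(clf.score(X_test, y_test), 3))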
The tagging nightmare
• You still need tagged data
• Tagging is hard to automate + error prone
• Tagged data is error prone (garbage in, garbage out)
– Idilia use case
– Nuance use case
The lie about ML
• Machine learning != Optimization
• Machine learning != Statistics
• Machine learning = an optimization problem with constraints to generalize (regularization); see the formula below
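One standard way to write that last point down (regularized empirical risk; the notation is mine, not from the slides):

\min_{\theta}\ \frac{1}{n}\sum_{i=1}^{n} L\big(f(x_i;\theta),\,y_i\big) \;+\; \lambda\,\Omega(\theta)

The first term is the pure optimization part (fit the data); the penalty \Omega(\theta), e.g. \|\theta\|^2, is the constraint that pushes the solution to generalize.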
Adversarial examples – Ian Goodfellow (now at OpenAI)
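A hedged numpy sketch of the fast gradient sign method (FGSM) idea behind adversarial examples, applied to a plain logistic-regression model; the weights, input and step size are illustrative assumptions:

import numpy as np

w = np.array([1.5, -2.0, 0.5])           # "trained" weights (assumed)
x = np.array([0.2, -0.1, 0.8])           # a correctly classified input
y = 1.0                                  # its true label

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

p = sigmoid(w @ x)
grad_x = (p - y) * w                     # d(log-loss)/dx for this model
x_adv = x + 0.25 * np.sign(grad_x)       # small worst-case perturbation

print("clean score:", sigmoid(w @ x))            # ~0.71 -> class 1, correct
print("adversarial score:", sigmoid(w @ x_adv))  # ~0.48 -> flipped to class 0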
Conclusion
• ML is quite a mature field
• ML != Deep Learning
– Deep Learning = major breakthrough, hype phase, not mature
• NN = optimization problem with constraints
• SP operates more like expert systems
• An algo is only as good as its inputs -> feature engineering