Machine Learning Engineering
Anatoly Levenchuk
Copyright © 2016 by Anatoly Levenchuk. Permission granted to DeepHack and INCOSE to publish and use.
What is machine learning as a human activity?
• Ontological question (Aristotelian definition: via class-subclass specialization)
• Why is it important?
– How to pay? [grants, investments, charity]
– How to teach? [science, engineering, arts/crafts]
– How to name and distinguish in communication (hiring, participation in division of labor)?
What do you call yourself to colleagues when hacking on a machine learning system?
Machine learning is a…
• Science! MSc. in Machine Learning [BigData]
• Research?
• Engineering?
• Art?
Programming is a…
• Science? Computer science, MSc.
• Research? Computer science, MSc.
• Engineering? Software engr., MSc. and MSE!
• Art? Master of Art in Mathematics!
http://www.computer.org/web/education/professional-competency-certifications
https://www.kaggle.com/competitions
Test: Why is my program not working?
• You need to know why?
– To repair the compiler? Software engineer (systems)
– To advance theory? Computer scientist
• You need the program to work properly? Software engineer (application)
Science
Resulting in models, descriptions (theories), ontologies:
• M0 – manufacturing (not science!). Programmers are engineers: software is a physical system!
• M1 – design/applied research (Edison)
• M2 – basic research (Einstein)
• M3 – philosophical logic/mathematics
• There are multiple meta-levels.
• Scientists produce these meta-descriptions
Engineering
• Engineering – discipline, art, skill and profession of acquiring and applying scientific, mathematical, economic, social, and practical knowledge, in order to design and build structures, machines, devices, systems, materials and processes that safely realize improvements to the lives of people.
• Engineering is the application of mathematics, empirical evidence and scientific, economic, social, and practical knowledge in order to invent, innovate, design, build, maintain, research, and improve structures, machines, tools, systems, components, materials, and processes.
https://en.wikipedia.org/wiki/Engineering
https://en.wikipedia.org/wiki/Outline_of_engineering
Data scientists – ML Engineers
Model/Theory [metamodel]
Engineering/Applied Research
Reality/Data/Model
Science/Basic Research
If it is not about budgeting and social status, there is no need to distinguish science from engineering! Practice both of them!
Engineering for science
http://blogs.nvidia.com/blog/2016/01/12/accelerating-ai-artificial-intelligence-gpus/
Scientists are mere owner-operators of instruments. Who built the Large Hadron Collider? Experiments are ordered by scientists, built and carried out by engineers, and interpreted by scientists.
The sunset of the professions, not jobs!
Profession:
• Life-long
• Special education
• No other professions in the mix
Competency:
• Several years long
• Additional training
• One competence in the mix
Machine learning engineering is not a profession. It is a competency!
Machine learning (systems) engineering
What about jobs?
• Control (systems) engineering
• Machine Learning (systems) engineering ?
• Systems Engineer (IT)
• Cognitive/Machine Intelligence Systems Engineer ?
http://www.payscale.com/research/US/Job=Controls_Engineer/Salary
Algorithms + Data Structures = Programs (Niklaus Wirth)
Scientist is not an engineer, data is not a system
Kinds of engineering
• Mechanical engineering
• Agricultural engineering
• Aerospace engineering – aircraft architecture
• Systems engineering
• System of systems engineering
• …
• Software engineering
• Control [systems] engineering – control [system] architecture
• Knowledge engineering -- architecture
• Machine learning [system] engineering
• …
• Neural engineering
• Neural network engineering -- neural [network] architecture
• Feature engineering -- ???
Systems, Software, Machine Learning Engineerings
• Systems engineering [Bell Labs in 1940s, boosted as a profession by NCOSE 1990]
• Software engineering [term appeared in 1965, boosted by NATO as a profession in 1968]
• Machine learning engineering [term appeared in 2011]
https://www.google.com/trends/explore#q=machine%20learning%20engineering&cmpt=q&tz=Etc%2FGMT-3
Convergence of engineerings and disruption of engineerings
[Diagram: Systems Engineering, Control Engineering, Software Engineering + Machine Learning Engineering = ???]
Janos Sztipanovits. Convergence: Model-Based Software, Systems And Control Engineering
http://www.infoq.com/presentations/Model-Based-Design-Janos-Sztipanovits
Léon Bottou: «Machine Learning disrupts software engineering»
http://leon.bottou.org/slides/2challenges/2challenges.pdf
We can add:
• Machine learning disrupts systems engineering
• Machine learning disrupts control engineering
• …
• Machine learning disrupts contemporary engineering
Can we use systems and software engineering wisdom in MLE?
Léon Bottou, http://leon.bottou.org/slides/2challenges/2challenges.pdf
• Models as modules: problematic due to weak contracts (models behave differently on different input data)
• Learning algorithms as modules: problematic because the output depends on the training data, which itself depends on every other module
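The weak-contract problem from the first bullet can be illustrated with a small sketch. This is a hypothetical example, not taken from Bottou's slides: the helper names (`fit_line`, `mean_abs_error`) and the quadratic target are assumptions chosen only for illustration. A straight line fitted on inputs from [0, 1] behaves like a dependable module there, but the same module gives far larger errors on inputs from [2, 3].

```python
# Minimal sketch (illustrative assumption): a learned model carries no
# contract that holds outside the data it was trained on.

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    a = cov / var
    b = mean_y - a * mean_x
    return a, b


def mean_abs_error(model, xs, target):
    """Average absolute difference between the line and the true target."""
    a, b = model
    return sum(abs(a * x + b - target(x)) for x in xs) / len(xs)


def target(x):
    # Hypothetical "reality" the model is trying to approximate
    return x * x


train_xs = [i / 10 for i in range(11)]        # inputs 0.0 .. 1.0
shifted_xs = [2 + i / 10 for i in range(11)]  # inputs 2.0 .. 3.0

model = fit_line(train_xs, [target(x) for x in train_xs])
error_in_range = mean_abs_error(model, train_xs, target)
error_shifted = mean_abs_error(model, shifted_xs, target)
print(error_in_range, error_shifted)
```

A classical software module would either honor its interface on the shifted inputs or reject them; the fitted model silently returns poor answers, which is what makes modular composition hard.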
Engineering is not only about modularity and modular synthesis! What about other aspects?!
• More attention to the left part of the V-diagram
• Optimizations later
• …
• What else?
Technical Debt
Machine Learning: The High-Interest Credit Card of Technical Debt
http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43146.pdf
Hidden Technical Debt in Machine Learning Systems
http://papers.nips.cc/paper/5656-hidden-technical-debt-in-machine-learning-systems.pdf
• Hack now, pay later (with interest, of course!).
• Based on heuristics from software engineering (the same approach as ours: applying software and systems engineering wisdom to machine learning engineering).
• Set of domain-specific heuristics for machine learning
Bionics and machine learning systems engineering
• In short: the brain is only an inspiration, not a model for reproducing!
• There are other “learning systems engineerings”: e.g. neural engineering (https://en.wikipedia.org/wiki/Neural_engineering).
• AGI (artificial general intelligence) is a distant goal, but a magnet for freaks of all sorts. Better not to mention it.
• Biologically plausible machine learning is about science, not engineering.
Knowledge engineering
• Ontology engineering (manually)
• Solutions are (manually) programmed.
• Example: robot-«butterfly», https://youtu.be/kyvW5sOcZHU, https://youtu.be/V30e77x8BQA
– Every type of movement has to be programmed anew
– Not adaptable to changes of environment and device
– The best science available to date!
– Perfect if a CPS (cyber-physical system) performs only one or two movements. Not for robots, definitely!
• No learning!
Tribes
Shallow Learning
Big Data
Deep Learning
Neuroevolution
Bayes Army
Symbolic
Our definition of complexity
Complex system: one that does not fit in a single engineer’s head, so team collaboration and automation of knowledge work are mandatory.
E.g.:
• Aircraft
• Programming-in-the-small vs. programming-in-the-large
• VLSI (very large scale integration): more than 1000 transistors on a single chip (now the transistor count exceeds 20 billion: FPGA Virtex-Ultrascale XCVU440)
• Artificial neural network: 16 billion parameters.
Complexity
• Systems Engineering • Machine learning
Complex system: its development does not fit into one (or a hundred) heads. Stellarators, tokamaks, LHC, aerospace and VLSI engineering.
IBM Watson (up to 2011): team of 40. Still not very complex from an engineering point of view.
http://josephpcohen.com/w/visualizing-cnn-architectures-side-by-side-with-mxnet/
http://787updates.newairplane.com/787-Suppliers/World-Class-Supplier-Quality
CNN architecture/complexity growth
[Figure: CNN architectures side by side. Dates shown: 1998, 2012, 9/2014, 9/2014, 2/2015, 12/2015. Networks shown: LeNet 28x28, VGG 224x224, GoogLeNet 224x224, Inception BN 224x224, Inception V3 299x299]
http://josephpcohen.com/w/visualizing-cnn-architectures-side-by-side-with-mxnet/
AutoML
• Generative design/architecturing of networks
• Bayesian convergence
• Neuroevolution
• Dynamic neural description languages (e.g. Chainer)
Automation of machine learning, CAMLE (computer-aided machine learning engineering), is the main trend of today and tomorrow!
Master Algorithm
Pedro Domingos [module/construction]
• Symbolic
• Evolution
• Connectionist
• Bayesian
• Analogy
[No free lunch!]
Sarath Chandar [component/function]
• multi-task learning
• transfer learning
• zero-shot/one-shot learning
• multi-modal learning
• reinforcement learning
http://apsarath.github.io/2016/01/19/agi/ http://www.amazon.com/dp/0465065708/
Intellect-stack is only about one aspect of a whole intellect system.
Intellect-stack is about Platforms (modules) = «how to make it»
Based on Fig.3 ISO 81346-1
-Modules
=Components
+Allocations
Modules and interfaces: platforms/layers
Stack
Platform
• This is the module viewpoint («how to make»)
• A platform is a technology stack layer
• Cohesive set of modules with a published API
• Can be built on top of another platform
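The platform bullets above can be sketched in code. This is a hypothetical illustration: the class names (`ComputationalLibrary`, `LearningAlgorithmPlatform`) and their methods are invented for this example and do not correspond to any real framework. Each platform exposes a small published API, and the upper platform relies only on the published API of the lower one.

```python
class ComputationalLibrary:
    """Lower platform (hypothetical): its published API is dot() only."""

    def dot(self, u, v):
        return sum(a * b for a, b in zip(u, v))


class LearningAlgorithmPlatform:
    """Upper platform (hypothetical), built on the lower platform's API."""

    def __init__(self, library):
        self._library = library
        self._weights = None

    def fit(self, rows, targets, epochs=200, rate=0.1):
        # Plain stochastic gradient descent for a linear model without bias
        self._weights = [0.0] * len(rows[0])
        for _ in range(epochs):
            for row, target in zip(rows, targets):
                error = self._library.dot(self._weights, row) - target
                self._weights = [
                    w - rate * error * x for w, x in zip(self._weights, row)
                ]
        return self

    def predict(self, row):
        return self._library.dot(self._weights, row)


# The training data is consistent with weights (2, 3)
platform = LearningAlgorithmPlatform(ComputationalLibrary())
platform.fit([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [2.0, 3.0, 5.0])
prediction = platform.predict([2.0, 1.0])
print(prediction)  # approximately 7
```

Swapping `ComputationalLibrary` for a faster implementation with the same `dot()` signature would not require touching the upper layer, which is the point of a published API between stack layers.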
Intelligence Platform Stack and machine learning engineering in it
Application (domain) Platform
Cognitive Architecture Platform
Learning Algorithm Platform
Computational library
General Computer Language
CPU
GPU/FPGA/Physical computation Drivers
GPU/FPGA/Physical computation Accelerator
Neurocompiler
Neuromorphic driver
Neuromorphic chip
Disruption enablers
Disruption demand
Thanks to computer gamers, whose disruption demand gave us a disruption enabler such as the GPU!
Alternative deep learning stack (as viewed by GPU hardware people)
http://www.nextplatform.com/2015/12/07/gpu-platforms-emerge-for-longer-deep-learning-reach/
• No cognitive and application levels
• Languages unimportant
• Chassis, backplane, blades important (separate layer)
• No neuromorphic processing
Hardware Acceleration (except GPU)
Is this machine learning engineering? No! But…
• Algorithm-dependent
• Need compilation (drivers)
• Speed rules
• Power rules
• Scale rules

• GPU
• FPGA
• ASIC
• Neuromorphic chips
• Physical computing
http://lighton.io/
• Approximating kernels at the speed of light
http://arxiv.org/abs/1510.06664
Analog, optical device, that performs the random projections literally at the speed of light without having to store any matrix in memory. This is achieved using the physical properties of multiple coherent scattering of coherent light in random media.
• Towards Trainable Media: Using Waves for Neural Network-style Learning http://arxiv.org/abs/1510.03776
• Bitwise Neural Networks http://arxiv.org/abs/1601.06071
• Conversion of Artificial Recurrent Neural Networks to Spiking Neural Networks for Low-power Neuromorphic Hardware http://arxiv.org/abs/1601.04187
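The random projections mentioned for the optical device above can be imitated numerically. The sketch below is a software stand-in under stated assumptions (a dense Gaussian matrix and the hypothetical helper names `random_projection_matrix` and `project`); it says nothing about how the optical hardware works internally. It only shows that a random projection roughly preserves distances between vectors, which is why such projections are a useful cheap preprocessing step.

```python
import math
import random


def random_projection_matrix(out_dim, in_dim, rng):
    """Gaussian matrix scaled so squared norms are preserved on average."""
    scale = 1.0 / math.sqrt(out_dim)
    return [
        [rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
        for _ in range(out_dim)
    ]


def project(matrix, vector):
    """Multiply the projection matrix by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


rng = random.Random(0)  # fixed seed so the run is repeatable
in_dim, out_dim = 1000, 256
u = [rng.random() for _ in range(in_dim)]
v = [rng.random() for _ in range(in_dim)]

matrix = random_projection_matrix(out_dim, in_dim, rng)
ratio = distance(project(matrix, u), project(matrix, v)) / distance(u, v)
print(ratio)  # projected distance relative to the original distance
```

Unlike the analog device described above, this software version has to store the whole matrix in memory and pay for every multiplication, which is the cost the optical approach avoids.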
General Computer Language
Computer science + Software engineering
• Important! Separate layer in intellect-stack!
• Two-language problem
– Experiment and production, like deep learning frameworks (speed)
– «Wrappers» in libraries (thresholds in understanding of the full stack down to the hardware bottom)
• My preference: Julia (http://julialang.org/)
– Scientific computing is a design goal of Julia, MATLAB-like syntax
– Two-language problem solved (speed of computation as in C, speed of writing code as in Python)
– Extensive mathematical function library, Base library and external packages in native Julia
– Parallel computing supported (GPU supported too)
– Not object-oriented, uses multiple dispatch as a solution to the expression problem (good modularity)
– Version 0.4.3 now (1.0 expected in about a year)
– Caution: slightly more complex than Python, should not be your first computer language…
– MXNet deep learning framework has a Julia wrapper
• A DSL for deep learning is not a General Computer Language
– Probabilistic programming languages -- http://probabilistic-programming.org/wiki/Home
– DNN description languages, like in CNTK -- https://github.com/Microsoft/CNTK
Computation libraries/frameworks/platforms
Not machine learning engineering!
• Computation libraries, Drivers + Hardware (GPU, clusters)
• Linear algebra, optimization, autodiff, symbolic computations, etc.
• Can be a standalone platform, and thus differ from machine learning libraries (general algorithms for multiple purposes: bioinformatics, physics, astronomy, engineering, machine learning etc.)
• Deep learning frameworks often include such a library (Torch, Theano, …).
• Scikit (NumPy, SciPy, and matplotlib)
• Nd4j (n-dimensional arrays for Java)
• Julia packages
• …
• Non-open-source: Mathematica, Maple…
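Automatic differentiation, one of the services listed above, fits in a few lines. This is a minimal forward-mode sketch using dual numbers; the `Dual` class and the `derivative` helper are hypothetical names for this example, and real libraries such as Theano or Torch use their own, far more elaborate implementations.

```python
class Dual:
    """Number of the form value + deriv * eps, where eps**2 == 0."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def _lift(other):
        # Treat plain numbers as constants (derivative zero)
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = self._lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = self._lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(
            self.value * other.value,
            self.deriv * other.value + self.value * other.deriv,
        )

    __rmul__ = __mul__


def derivative(f, x):
    """Derivative of f at x via dual numbers (no finite differences)."""
    return f(Dual(x, 1.0)).deriv


def f(x):
    return 3 * x * x + 2 * x + 1


slope = derivative(f, 2.0)
print(slope)  # 14.0, since f'(x) = 6x + 2
```

Only addition and multiplication are overloaded here; a usable library would also cover division, powers, and elementary functions, and would typically add reverse mode for functions with many inputs.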
For them, machine learning is “yet another set of domain modules and a DSL”!
Learning algorithm frameworks (not systems)!
Machine learning engineering!
• Gentleman's set of algorithms (CNN, RNN, …)
• Updated at the rhythm of arxiv.org papers!
• Network description language: DSL for machine learning engineering
• Experiments and production (scalable!)
• Extensibility (on the basis of a general computer language and a scientific computing library, i.e. on the basis of another platform layer in the intellect-stack)
• Presented as The Machine Learning Platform (including all lower levels assembled and tuned)
• There are hundreds of them: no fewer than the «web frameworks» of the early web
http://www.slideshare.net/yutakashino/ss-56291783
• Google
• Facebook
• Microsoft
• Baidu
• IBM
• Samsung
• …
+ standard datasets for comparisons and benchmarking
+ other tribes' platforms
Construction (type of modules) in machine learning
• Deep learning classics (DSL in deep learning frameworks)
• Probabilistic languages http://probabilistic-programming.org/, https://probmods.org/
• Deep learning and Bayesian convergence -- http://www.nextplatform.com/2015/09/24/deep-learning-and-a-new-bayesian-golden-age/, http://blog.shakirm.com/2015/10/bayesian-reasoning-and-deep-learning/, http://arxiv.org/abs/1512.05287
• Differentiable languages and datatypes http://colah.github.io/posts/2015-09-NN-Types-FP/, http://www.blackboxworkshop.org/pdf/nips2015blackbox_zenna.pdf, http://arxiv.org/abs/1506.02516
• …
• Blends and hybrids of many other learning architectures
Varieties in representations: in deep learning, abstraction is architecturally layered; in other approaches it is different!
Algorithm platform + Hardware platform = Algorithm platform (hardware is not visible to a platform user, but it still matters!)
http://blogs.microsoft.com/next/2016/01/25/microsoft-releases-cntk-its-open-source-deep-learning-toolkit-on-github/
Cognitive systems/architectures
Learning, communications, reasoning, planning
• Cognitive = knowledge processing. Knowledge is information that is useful in a variety of situations.
• A cognitive architecture/system is a platform for multiple application systems.
• Ensembles of learning algorithms: this is close to cognitive systems engineering
• Cognitive systems engineering is machine learning systems engineering plus something else
• Something else: e.g. knowledge engineering: manual coding (formalization) of knowledge.
• Machine learning systems engineering is not cognitive systems engineering, it is smaller!
Machine Learning and Cognitive Level
• «deep learning research is likely to continue its expansion from traditional pattern recognition jobs to full-scale AI tasks involving symbolic manipulation, memory, planning and reasoning. This will be important for reaching to full understanding of natural language and dialogue with humans (i.e., pass the Turing test). Similarly, we are seeing deep learning expanding into the territories of reinforcement learning, control and robotics and that is just the beginning» -- Yoshua Bengio
https://www.quora.com/Where-is-deep-learning-research-headed
If we can learn to reason, plan, model, act – then machine learning engineering will be cognitive systems engineering!
Machine intelligence vs. artificial intelligence
Example: MANIC: A Minimal Architecture for General Cognition (http://arxiv.org/abs/1508.00019)
• Keywords: action, planning, observation, decisions, knowledge, …
• Are these keywords for learning systems engineering?
Application level of intellect-stack
• Killer application for learning systems is here!
• Domain specificity and data are here!
• End users and money are here!
• Systems engineering is here!
This chart is only about the enterprise AI systems market.
https://www.tractica.com/newsroom/press-releases/artificial-intelligence-for-enterprise-applications-to-reach-11-1-billion-in-market-value-by-2024/
If you have no application of interest, there will be no data, no money, no developments, no engineering.
Most machine learning engineering is applied. Only a small part is machine learning platform development.
Application level: systems engineering
• Strategizing and Conceptual design
• Requirements engineering
• System Architecture
• V&V
• Configuration management
• A machine learning engineer is one of many engineers participating in a cyber-physical system project team.
[Figure: cyber-physical system with Sensors, Consoles, Actuators, Monitors]
http://www.nist.gov/el/nist-releases-draft-framework-cyber-physical-systems-developers.cfm
Life cycle stages dictionary
• Conception
• Design
• Manufacturing
• Integration
• Validation and verification
• Operation
Machine learning | Systems engineering
Conception and requirements | Conception and requirements
Architecture and Design | Architecture and Design
Training | Manufacturing
Transfer learning, ensembling | Integration
Validation and verification | Validation and verification
Inference | Operation
Stakeholders concerns
Domain-specific concerns:
• Expressivity
• Computational efficiency
• Trainability
• Good generalization (not overfitting)
Traditional concerns:
• Composability – layering, ensembling
• Compositionality – transfer learning
• Resilience
Intellect-stack and machine learning (systems) engineering
• Machine learning (systems) engineering now covers only a small part of the whole intellect-stack but interacts with all levels.
• No single Googbookdu can develop all levels of the intellect-stack platforms (from hardware accelerators at the bottom up to applications at the top) by itself. Maybe except IBM, which can span from TrueNorth to IBM Watson applications ;-)
• Interfaces from supporting platforms will be stabilizing and… in constant update (as with software engineering APIs: everything changes once every 5 years).
• Technology disruption starts at the low (enabling) levels of a stack, while demand comes from the upper levels, so nobody in the middle can ignore developments in other platform layers.
Intellect-Stack
Application (domain) Platform
Cognitive Architecture Platform
Learning Algorithm Platform
Computational library
General Computer Language
CPU
GPU/FPGA/Physical computation Drivers
GPU/FPGA/Physical computation Accelerator
Neurocompiler
Neuromorphic driver
Neuromorphic chip
Disruption enablers
Disruption demand
Where are you now? Where are you tomorrow?
Thank you!
Anatoly Levenchuk
TechInvestLab, president
INCOSE Russian chapter, research director
https://ru.linkedin.com/in/[email protected]
Blog in Russian: http://ailev.ru