![Page 1: Democratizing access to Machine Learning with Pre-Trained ......IBM Watson Studio Paperspace H20 - DriverlessAI. But there is a problem. ... • Just like we were taught with software,](https://reader035.vdocument.in/reader035/viewer/2022070821/5f1f8f588a71d777c41b1c2f/html5/thumbnails/1.jpg)
Democratizing access to Machine Learning with Pre-Trained Models and Cloud Services
Jay Bartot – CTO, Madrona Venture Labs
Who Am I?
• CTO of Madrona Venture Labs
• Serial Technology Entrepreneur
• Machine Learning Enthusiast
• Started learning about ML in 2001
• Lots of startups and acquisitions over the last 20 years
• Classification, Forecasting, Text-mining, Computer vision
• eCommerce, Online Advertising, Travel, Medical informatics, Consumer video
What I work on
• We look for technology startup ideas that are aligned with Madrona
Venture Group’s investment thesis
• Our critical work: Carefully vetting the ideas for market, customer,
and technology fit
• Most ideas ultimately don’t pass muster – so we kill them!
• Main focus: Vertical AI/ML companies
Data science (DS) and Machine learning (ML) Landscape
Many once extremely challenging problems now have robust solutions due to:
• Big data
• Compute power
• Reemergence of neural network learning algorithms
DS/ML Landscape
• Explosion of open source toolkits and platforms – phenomenal!
• Amazing how quickly new ML techniques and tools become commoditized
• Lots of new educational material and resources
• But DS/ML is still complex, with a significant learning curve
DS/ML Landscape
• Big learning curve
• Somewhat of a black art
• Critical work:
• Data cleaning, normalizing
• Feature engineering
• Hyperparameter tuning
• Testing/Validation
• Hosting models for inference
• Time consuming and expensive!
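To make a few of the steps listed above concrete, here is a deliberately tiny sketch in plain Python: min-max normalization, a k-nearest-neighbor classifier, and tuning the hyperparameter k against a validation set. The function names and the choice of k-NN are illustrative assumptions, not something taken from the talk.

```python
def min_max_normalize(rows):
    """Scale each column to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [[(v - lo) / span if span else 0.0
             for v, lo, span in zip(row, lows, spans)]
            for row in rows]


def knn_predict(train_x, train_y, x, k):
    """Majority vote among the k training points closest to x."""
    order = sorted(range(len(train_x)),
                   key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], x)))
    votes = [train_y[i] for i in order[:k]]
    return max(set(votes), key=votes.count)


def tune_k(train_x, train_y, val_x, val_y, candidates):
    """Return the candidate k with the best validation accuracy.

    Ties go to the earliest candidate in the list.
    """
    def accuracy(k):
        hits = sum(knn_predict(train_x, train_y, x, k) == y
                   for x, y in zip(val_x, val_y))
        return hits / len(val_x)
    return max(candidates, key=accuracy)


# Hypothetical one-feature dataset, already scaled to [0, 1].
train_x = [[0.0], [0.1], [0.9], [1.0]]
train_y = [0, 0, 1, 1]
best_k = tune_k(train_x, train_y, [[0.05], [0.95]], [0, 1], candidates=[1, 3])
```

Even at this toy scale, each step is a separate decision (how to scale, which model, which k, how to validate), which is where much of the time and cost comes from.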
Machine learning pipeline – Phase 1: Data Prep
Machine learning pipeline – Phase 2: Model Training
Machine learning pipeline – Phase 3: Inference
Full-featured DS/ML Cloud platforms
Amazon SageMaker
Google ML Engine
Azure ML Studio
IBM Watson Studio
Paperspace
H2O - Driverless AI
But there is a problem
Democratizing DS/ML: Empowering the Rest of Us
• Proliferation of pretrained models and services
• From files/libraries to cloud services
• Can the paradigm of software reusability and
encapsulated components apply to data science and ML?
• Just like we were taught with software, avoid reinventing the wheel!
Categories of Pretrained Models (PTMs)
• Computer vision
• Text analysis (NLP, NLU)
• Speech-to-text, Text-to-speech
• Language translation
• Anomaly detection
• Chatbot foundations (e.g. intent mapping)
Top Pretrained Model Vendors/Products
Pretrained Models - Computer Vision
• Game changer: Convolutional Neural Networks (CNNs)
• Image labeling, scene detection, object detection
• Face detection, recognition
• Facial key-points, emotion, gender, age, etc.
• Text recognition, OCR
• Objectionable content detection
• Fashion recognition
• Image Style Transfer
• Video analysis
Google Cloud Vision
Google Cloud Vision
Clarifai – Specialized models
AWS Rekognition - Face Metadata
Deep Style Transfer
YOLO
Pretrained Models - Text Analysis
• Named Entity Recognition (NER)
• Dependency parsers
• Sentiment/Tone Analysis
• Speech-to-text, Text-to-speech
• Language Translation
• Word embeddings
• Language detection
• Summarization
• Content Moderation
Word embedding space
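A word embedding space places each word at a point in a vector space so that related words end up close together. The sketch below illustrates the idea with cosine similarity. The three-dimensional vectors are invented for illustration; real pretrained embeddings typically have hundreds of dimensions and are loaded from a published model file.

```python
import math

# Toy vectors, invented for illustration only.
EMBEDDINGS = {
    "king":  [0.80, 0.65, 0.10],
    "queen": [0.78, 0.70, 0.12],
    "apple": [0.05, 0.10, 0.90],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word, embeddings=EMBEDDINGS):
    """Return the other word whose vector is most similar to `word`."""
    target = embeddings[word]
    others = (w for w in embeddings if w != word)
    return max(others, key=lambda w: cosine_similarity(target, embeddings[w]))
```

With these toy vectors, `nearest("king")` returns `"queen"`, because the two vectors point in nearly the same direction while `"apple"` points elsewhere.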
Microsoft Azure Cognitive Services
Google Cloud Natural Language
Sentiment Analysis
Google Grammatical Parsing
IBM Watson Tone Analyzer
{"utterance_text": "Well, nothing is working :(", "tones": [
  {"score": 0.997149, "tone_id": "sad", "tone_name": "sad"}]}
{"utterance_text": "Sorry to hear that.", "tones": [
  {"score": 0.689109, "tone_id": "polite", "tone_name": "Polite"},
  {"score": 0.663203, "tone_id": "sympathetic", "tone_name": "Sympathetic"}]}
{"utterance_text": "Hello, I'm having a problem with your product.", "tones": [
  {"score": 0.718352, "tone_id": "polite", "tone_name": "Polite"}]}
Label text with classifications such as: anger, disgust, fear, joy, sadness, analytical, confident, tentative, sad, frustrated, satisfied, excited, polite, impolite, and sympathetic.
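As a rough illustration of how an app might consume output like the excerpt on this slide, the sketch below picks the dominant tone per utterance and filters utterances by tone. The utterance texts and scores are copied from the slide; wrapping them in a Python list, the helper names, and the 0.5 threshold are assumptions made for this example, not part of the service's API.

```python
# Values copied from the slide; the enclosing list is an assumption
# about how an application would hold the parsed response.
utterances = [
    {"utterance_text": "Well, nothing is working :(",
     "tones": [{"score": 0.997149, "tone_id": "sad", "tone_name": "sad"}]},
    {"utterance_text": "Sorry to hear that.",
     "tones": [{"score": 0.689109, "tone_id": "polite", "tone_name": "Polite"},
               {"score": 0.663203, "tone_id": "sympathetic", "tone_name": "Sympathetic"}]},
    {"utterance_text": "Hello, I'm having a problem with your product.",
     "tones": [{"score": 0.718352, "tone_id": "polite", "tone_name": "Polite"}]},
]


def dominant_tone(utterance):
    """Return (tone_id, score) for the highest-scoring tone, or None if there is none."""
    tones = utterance.get("tones", [])
    if not tones:
        return None
    best = max(tones, key=lambda t: t["score"])
    return best["tone_id"], best["score"]


def utterances_with_tone(items, tone_id, threshold=0.5):
    """Texts of the utterances where `tone_id` scores at or above `threshold`."""
    return [u["utterance_text"] for u in items
            if any(t["tone_id"] == tone_id and t["score"] >= threshold
                   for t in u.get("tones", []))]
```

For example, `utterances_with_tone(utterances, "sad")` returns only the "nothing is working" message, which is the kind of signal a support tool might use to prioritize a conversation.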
How Do Competing Offerings Compare?
• Hard not to notice, many of the big guys (and little guys) are
competing with very similar offerings.
• Obvious question: which one should I choose for my application?
• How about using multiple offerings at once?
• We built a simple empirical comparison tool to look at model
performance side by side
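The tool itself is not shown in the slides, but the general shape of a side-by-side comparison can be sketched as follows. Everything here is hypothetical: `vendor_a` and `vendor_b` are stand-in functions, where a real tool would wrap each vendor's SDK or REST call behind the same one-argument signature.

```python
def compare_side_by_side(providers, inputs):
    """Run every provider on every input and collect one result row per input.

    `providers` maps a display name to a callable taking a single input.
    A failing provider is recorded as an error string so that one outage
    does not abort the whole comparison.
    """
    rows = []
    for item in inputs:
        row = {"input": item}
        for name, call in providers.items():
            try:
                row[name] = call(item)
            except Exception as exc:
                row[name] = f"ERROR: {exc}"
        rows.append(row)
    return rows


def disagreements(rows):
    """Keep only the rows where the providers did not all return the same result."""
    return [r for r in rows
            if len({str(v) for k, v in r.items() if k != "input"}) > 1]


# Hypothetical stand-ins for two vendors' sentiment services.
def vendor_a(text):
    return "negative" if "not" in text.lower() else "positive"


def vendor_b(text):
    return "negative" if ":(" in text else "positive"


rows = compare_side_by_side(
    {"vendor_a": vendor_a, "vendor_b": vendor_b},
    ["Great product!", "This is not working :(", "Not what I expected"],
)
```

Looking only at `disagreements(rows)` is one way to focus attention: the inputs where offerings differ are usually the most informative when choosing among them, or when deciding whether to combine several.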
Image OCR Comparisons
Image OCR Comparisons
Image OCR Comparisons
Content Moderation
Sentiment Analysis Comparisons
Speech-to-Text Comparisons
More Speech-to-Text Comparisons
Language Translation Comparisons
Face Detection Comparisons
Face Detection Comparisons
Case Studies
• User product reviews over voice: sentiment tied to keyword tagging
• Chat communications conversation analysis: sentiment, emotional analysis of conversations
• Home décor design: pretrained CNNs power furniture image similarity engine
• Text recognition, OCR: use paper (EOBs, transcripts, bills) as input to app
• Meeting analysis: speech-to-text
Challenges with PTMs
• Lack of interpretability of DNNs – Black boxes
• Measuring accuracy – How good are these models?
• Report N-fold cross validation metrics?
• Establish cross-vendor standardized test sets
• Measuring Cultural Bias
• E.g. Face detection
• CNNs can be vulnerable to so-called adversarial examples
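For readers unfamiliar with the N-fold cross validation mentioned above, here is a minimal sketch of how the folds are formed and scored. The majority-class "model" and the six labeled samples are toy stand-ins chosen only to keep the example self-contained; a vendor reporting such metrics would use its own models and data.

```python
from collections import Counter


def k_fold_indices(n_samples, n_folds):
    """Yield (train_indices, test_indices) for each of n_folds contiguous folds."""
    if not 2 <= n_folds <= n_samples:
        raise ValueError("need 2 <= n_folds <= n_samples")
    # Spread any remainder over the first folds so sizes differ by at most one.
    sizes = [n_samples // n_folds + (1 if i < n_samples % n_folds else 0)
             for i in range(n_folds)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_validate(fit, score, samples, n_folds):
    """Fit on each training split and score on the matching held-out split."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(samples), n_folds):
        model = fit([samples[i] for i in train_idx])
        scores.append(score(model, [samples[i] for i in test_idx]))
    return scores


# Toy stand-ins: the "model" is simply the most common training label.
def fit_majority(train):
    return Counter(label for _, label in train).most_common(1)[0][0]


def accuracy(model, test):
    return sum(label == model for _, label in test) / len(test)


samples = [("a", 1), ("b", 1), ("c", 0), ("d", 1), ("e", 1), ("f", 1)]
scores = cross_validate(fit_majority, accuracy, samples, n_folds=3)
```

Here `scores` is `[1.0, 0.5, 1.0]`: the spread across folds, not just the mean, is part of what a published metric would need to convey.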
Transfer learning
• Idea: Leverage the ‘layered’ architectures of DNNs
• Fine-tune an existing PTM for a specific classification task
• Typically only requires a small amount of training data (yay!)
• Right now works great for CNNs, but can it work for text?
• Google “AutoML Vision” and Microsoft “Custom Vision” services
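The freeze-and-retrain idea can be shown without any deep learning framework. In the sketch below, a fixed function stands in for the frozen early layers of a pretrained network, and only a small logistic-regression "head" is trained on a handful of task-specific examples. The weights, the data, and the function names are all invented for illustration; a real workflow would load a published CNN and use a framework's own training loop.

```python
import math

# Stand-in for the frozen layers of a pretrained network. These weights
# are invented for illustration and are never updated during training.
FROZEN_WEIGHTS = [[1.0, -1.0], [0.5, 0.5]]


def frozen_features(x):
    """Apply the frozen layer: a linear map followed by ReLU."""
    return [max(0.0, sum(w * xi for w, xi in zip(row, x)))
            for row in FROZEN_WEIGHTS]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_head(samples, labels, epochs=500, lr=0.5):
    """Train only a new logistic-regression head on top of the frozen features."""
    feats = [frozen_features(x) for x in samples]  # computed once; extractor is frozen
    weights, bias = [0.0] * len(feats[0]), 0.0
    for _ in range(epochs):
        for f, y in zip(feats, labels):
            error = sigmoid(sum(w * fi for w, fi in zip(weights, f)) + bias) - y
            weights = [w - lr * error * fi for w, fi in zip(weights, f)]
            bias -= lr * error
    return weights, bias


def predict(x, head):
    weights, bias = head
    f = frozen_features(x)
    return 1 if sigmoid(sum(w * fi for w, fi in zip(weights, f)) + bias) >= 0.5 else 0


# Small hypothetical training set: label 1 when x[0] > x[1].
samples = [[2.0, 0.0], [1.5, 0.5], [3.0, 1.0], [0.0, 2.0], [0.5, 1.5], [1.0, 3.0]]
labels = [1, 1, 1, 0, 0, 0]
head = train_head(samples, labels)
```

Only three numbers are learned here (two weights and a bias), which is why a small amount of training data can be enough: the frozen part already supplies useful features.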
Simple CNN classification
Simple CNN classification (Freeze / Retrain)
Conclusions
• Data scientists fear not: Manual machine learning model building is not going away
• Question: What is the depth of product use cases PTMs can support/satisfy?
• As devs catch on, will there be demand for more specific/granular models?
• What other types of common machine learning problems can be generalized and
benefit from customizable base models (e.g. churn)?
• How can we overcome the challenges with PTMs?
• Make your apps smarter - Try some of these PTMs out!
Thank you!
Jay Bartot – CTO, Madrona Venture Labs
Appendix
Google’s AutoML
• Machine learning models are often painstakingly designed by a team
of engineers and scientists
• Manually designing machine learning models is difficult because the
search space of all possible models can be combinatorially large
• AutoML: Automate the design of machine learning models
• Transfer learning capabilities
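To give a feel for why the search space grows so quickly, and for what automating the search means, the sketch below counts the configurations in a small hypothetical space and runs a plain random search over it. Random search is a much simpler stand-in used here for illustration; it is not a description of how Google's AutoML works. The search space and the scoring function are invented.

```python
import random

# Hypothetical search space; real systems search far richer spaces.
SEARCH_SPACE = {
    "num_layers": [2, 4, 8, 16],
    "units_per_layer": [32, 64, 128, 256],
    "activation": ["relu", "tanh"],
    "learning_rate": [0.1, 0.01, 0.001],
}


def search_space_size(space):
    """Number of distinct configurations: the product of the choice counts."""
    size = 1
    for choices in space.values():
        size *= len(choices)
    return size


def random_search(space, evaluate, n_trials, seed=0):
    """Sample configurations at random and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(n_trials):
        config = {name: rng.choice(choices) for name, choices in space.items()}
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


def toy_score(config):
    """Placeholder objective; a real run would train and validate a model here."""
    return -abs(config["num_layers"] - 4) - abs(config["units_per_layer"] - 128) / 64


best_config, best_score = random_search(SEARCH_SPACE, toy_score, n_trials=20)
```

Four small design choices already give `search_space_size(SEARCH_SPACE) == 96` configurations, and each real evaluation means training a model, so every added choice multiplies the cost.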
GoogLeNet Architecture
High-level machine learning steps
VGG Architecture
My First PTM - Viola-Jones Face Detector
• A seminal approach to real-time face (and object) detection
• Training is slow but detection is very fast
• Key ideas:
• Integral images for fast feature computation
• AdaBoost for feature selection
• Attentional cascade for fast rejection of non-face windows
• C/C++ version added to OpenCV in 2008(?)
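The first key idea, integral images, is compact enough to sketch directly. After one pass over the image, the sum of any rectangle costs four table lookups, which is what makes evaluating many Haar-like features per window cheap. The function names and the tiny 3x4 image below are illustrative; this is not OpenCV's implementation.

```python
def integral_image(img):
    """Build a summed-area table with an extra zero row and column.

    ii[r][c] holds the sum of all pixels in rows < r and columns < c.
    """
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def two_rect_feature(ii, top, left, height, width):
    """Haar-like feature: left half minus right half of a window (even width)."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
ii = integral_image(image)
```

For this image, `rect_sum(ii, 1, 1, 2, 2)` is 34 (6 + 7 + 10 + 11) and `two_rect_feature(ii, 0, 0, 3, 4)` is -12, regardless of how large the rectangle is the cost of each query stays constant.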
Deep Dream Image
Machine Learning Appliances
• https://techcrunch.com/2017/07/06/h2o-ais-driverless-ai-automates-machine-learning-for-businesses/
• https://ai.googleblog.com/2017/05/using-machine-learning-to-explore.html