Democratizing access to Machine Learning with Pre-Trained Models
and Cloud Services
Jay Bartot – CTO, Madrona Venture Labs
Who Am I?
• CTO of Madrona Venture Labs
• Serial Technology Entrepreneur
• Machine Learning Enthusiast
• Started learning about ML in 2001
• Lots of startups and acquisitions over last 20 years
• Classification, Forecasting, Text-mining, Computer vision
• eCommerce, Online Advertising, Travel, Medical informatics, Consumer video
What I work on
• We look for technology startup ideas that are aligned with Madrona
Venture Group’s investment thesis
• Our critical work: Carefully vetting the ideas for market, customer,
technology fit
• Most ideas ultimately don’t pass muster – so we kill them!
• Main focus: Vertical AI/ML companies
Data science (DS) and Machine learning (ML) Landscape
Many once extremely challenging problems now have robust solutions due to:
• Big data
• Compute power
• Reemergence of neural network learning algorithms
DS/ML Landscape
• Explosion of open source toolkits and platforms – phenomenal!
• Amazing how quickly new ML techniques and tools become commoditized
• Lots of new educational material and resources
• But DS/ML is still complex, with a significant learning curve
DS/ML Landscape
• Big learning curve
• Somewhat of a black art
• Critical work:
• Data cleaning, normalizing
• Feature engineering
• Hyperparameter tuning
• Testing/Validation
• Hosting models for inference
• Time consuming and expensive!
Machine learning pipeline – Phase 1: Data Prep
Machine learning pipeline – Phase 2: Model Training
Machine learning pipeline – Phase 3: Inference
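To make the three phases concrete, here is a minimal sketch in plain Python. It is not how any of the platforms in this talk implement them: the toy data, the z-score scaling, and the nearest-centroid "model" are all assumptions chosen to keep the example short.

```python
from statistics import mean, pstdev


def prepare(rows):
    """Phase 1 (data prep): z-score each feature column."""
    columns = list(zip(*rows))
    # Fall back to 1.0 when a column is constant to avoid dividing by zero.
    stats = [(mean(col), pstdev(col) or 1.0) for col in columns]
    scaled = [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]
    return scaled, stats


def train(features, labels):
    """Phase 2 (model training): store one centroid per class."""
    centroids = {}
    for label in set(labels):
        members = [f for f, l in zip(features, labels) if l == label]
        centroids[label] = [mean(col) for col in zip(*members)]
    return centroids


def predict(centroids, stats, raw_row):
    """Phase 3 (inference): scale with the training stats, pick the nearest centroid."""
    x = [(v - m) / s for v, (m, s) in zip(raw_row, stats)]

    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(x, centroid))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Toy data (made up for illustration): two features, two classes.
rows = [[1.0, 200.0], [1.2, 220.0], [3.0, 900.0], [3.2, 950.0]]
labels = ["small", "small", "large", "large"]

features, stats = prepare(rows)
model = train(features, labels)
print(predict(model, stats, [1.1, 210.0]))
```

Note that inference reuses the scaling statistics computed during data prep; keeping the phases consistent in this way is part of what makes hosting models for inference nontrivial.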
Full featured DS/ML Cloud platforms
Amazon SageMaker
Google ML Engine
Azure ML Studio
IBM Watson Studio
Paperspace
H2O Driverless AI
But there is a problem
Democratizing DS/ML: Empowering the Rest of Us
• Proliferation of pretrained models and services
• From files/libraries to cloud services
• Can the paradigm of software reusability and
encapsulated components apply to data science and ML?
• Just like we were taught with software, avoid reinventing the wheel!
Pre-Trained Models (PTMs)
Categories of Pretrained Models (PTMs)
• Computer vision
• Text analysis (NLP, NLU)
• Speech-to-text, Text-to-speech
• Language translation
• Anomaly detection
• Chatbot foundations (e.g. intent mapping)
(Pre)Trained Models
Pre-Trained Models (PTMs)
Top Pretrained Model Vendors/Products
Pretrained Models - Computer Vision
• Game changer: Convolutional Neural Networks (CNNs)
• Image labeling, scene detection, object detection
• Face detection, recognition
• Facial key-points, emotion, gender, age, etc.
• Text recognition, OCR
• Objectionable content detection
• Fashion recognition
• Image Style Transfer
• Video analysis
Google Cloud Vision
Clarifai – Specialized models
AWS Rekognition - Face Metadata
Deep Style Transfer
YOLO
Pretrained Models - Text Analysis
• Named Entity Recognition (NER)
• Dependency parsers
• Sentiment/Tone Analysis
• Speech-to-text, Text-to-speech
• Language Translation
• Word embeddings
• Language detection
• Summarization
• Content Moderation
Word embedding space
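A word embedding space places related words near each other, so similarity becomes a geometric question. The sketch below assumes tiny made-up 3-dimensional vectors (real pretrained embeddings have hundreds of dimensions and come from a model, not from hand-written numbers) and uses cosine similarity to find the nearest neighbor of a word.

```python
import math

# Toy vectors invented for illustration only; not taken from any real model.
EMBEDDINGS = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.2, 0.1],
    "car": [0.1, 0.9, 0.3],
    "truck": [0.0, 0.8, 0.4],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(word, embeddings):
    """Return the other word whose vector is most similar to `word`."""
    target = embeddings[word]
    candidates = (w for w in embeddings if w != word)
    return max(candidates, key=lambda w: cosine_similarity(target, embeddings[w]))


print(nearest("cat", EMBEDDINGS))
print(nearest("car", EMBEDDINGS))
```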
Microsoft Azure Cognitive Services
Google Cloud Natural Language
Sentiment Analysis
Google Grammatical Parsing
IBM Watson Tone Analyzer
{
  "utterance_text" : "Well, nothing is working :(",
  "tones" : [
    { "score" : 0.997149, "tone_id" : "sad", "tone_name" : "sad" }
  ]
}
{
  "utterance_text" : "Sorry to hear that.",
  "tones" : [
    { "score" : 0.689109, "tone_id" : "polite", "tone_name" : "Polite" },
    { "score" : 0.663203, "tone_id" : "sympathetic", "tone_name" : "Sympathetic" }
  ]
}
{
  "utterance_text" : "Hello, I'm having a problem with your product.",
  "tones" : [
    { "score" : 0.718352, "tone_id" : "polite", "tone_name" : "Polite" }
  ]
}
Label text with classifications such as: anger, disgust, fear, joy, sadness, analytical, confident, tentative, sad, frustrated, satisfied, excited, polite, impolite, and sympathetic.
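Output like the Tone Analyzer sample above is easy to consume from application code. The following sketch assumes the utterances arrive as a JSON list of objects with the `utterance_text`, `tones`, `score`, and `tone_id` fields shown on the slide; the surrounding list and the `dominant_tones` helper are assumptions for illustration, not part of the IBM API.

```python
import json

# Sample values copied from the slide; the enclosing list is an assumption.
RESPONSE = """
[
  {"utterance_text": "Well, nothing is working :(",
   "tones": [{"score": 0.997149, "tone_id": "sad", "tone_name": "sad"}]},
  {"utterance_text": "Sorry to hear that.",
   "tones": [{"score": 0.689109, "tone_id": "polite", "tone_name": "Polite"},
             {"score": 0.663203, "tone_id": "sympathetic", "tone_name": "Sympathetic"}]},
  {"utterance_text": "Hello, I'm having a problem with your product.",
   "tones": [{"score": 0.718352, "tone_id": "polite", "tone_name": "Polite"}]}
]
"""


def dominant_tones(utterances):
    """Pair each utterance with its highest-scoring tone_id (None if no tones)."""
    result = []
    for utterance in utterances:
        tones = utterance.get("tones", [])
        top = max(tones, key=lambda t: t["score"]) if tones else None
        result.append((utterance["utterance_text"], top["tone_id"] if top else None))
    return result


for text, tone in dominant_tones(json.loads(RESPONSE)):
    print(f"{tone}: {text}")
```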
How Do Competing Offerings Compare?
• Hard not to notice: many of the big guys (and little guys) are
competing with very similar offerings.
• Obvious question: which one should I choose for my application?
• How about using multiple offerings at once?
• We built a simple empirical comparison tool to look at model
performances side-by-side
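The idea behind a side-by-side comparison can be sketched in a few lines. This is not the tool shown in the talk: `vendor_a`, `vendor_b`, and `vendor_c` are hypothetical local stand-ins for real cloud sentiment APIs, and the labeled samples are made up. The sketch also adds a majority vote as one simple way of using multiple offerings at once.

```python
from collections import Counter


# Hypothetical stand-ins for vendor sentiment APIs (keyword rules, not real services).
def vendor_a(text):
    return "negative" if "not" in text.lower() else "positive"


def vendor_b(text):
    lowered = text.lower()
    return "negative" if ("bad" in lowered or "broken" in lowered) else "positive"


def vendor_c(text):
    lowered = text.lower()
    return "negative" if ("not" in lowered or "broken" in lowered) else "positive"


VENDORS = {"vendor_a": vendor_a, "vendor_b": vendor_b, "vendor_c": vendor_c}

# Made-up labeled samples: (text, expected sentiment).
SAMPLES = [
    ("I love this product", "positive"),
    ("This is not working", "negative"),
    ("The screen arrived broken", "negative"),
    ("Great value", "positive"),
]


def compare(samples, vendors):
    """Run every vendor on every sample; return per-sample outputs and accuracies."""
    rows = []
    correct = Counter()
    for text, truth in samples:
        outputs = {name: fn(text) for name, fn in vendors.items()}
        # Combine the vendors: take the most common label across them.
        outputs["majority_vote"] = Counter(outputs.values()).most_common(1)[0][0]
        for name, label in outputs.items():
            correct[name] += int(label == truth)
        rows.append((text, truth, outputs))
    accuracy = {name: hits / len(samples) for name, hits in correct.items()}
    return rows, accuracy


rows, accuracy = compare(SAMPLES, VENDORS)
for name, score in accuracy.items():
    print(f"{name}: {score:.2f}")
```

In a real comparison each stand-in would wrap a network call to one service, and the sample set would need to be far larger before the accuracies mean anything.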
Image OCR Comparisons
Content Moderation
Sentiment Analysis Comparisons
Speech-to-Text Comparisons
More Speech-to-Text Comparisons
Language Translations Comparisons
Face Detection Comparisons
Case Studies
• User product reviews over voice: sentiment tied to keyword tagging
• Chat communications conversation analysis: sentiment, emotional analysis of conversations
• Home décor design: pretrained CNNs power furniture image similarity engine
• Text recognition, OCR: use paper (EOBs, transcripts, bills) as input to app
• Meeting analysis: speech-to-text
Challenges with PTMs
• Lack of interpretability of DNNs – Black boxes
• Measuring accuracy – How good are these models?
• Report N-fold cross validation metrics?
• Establish cross-vendor standardized test sets
• Measuring Cultural Bias
• E.g. Face detection
• CNNs can be vulnerable to so-called adversarial examples
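For readers unfamiliar with the N-fold cross validation mentioned in the list above, here is a minimal sketch of the procedure. The threshold "model" and the toy data are assumptions for illustration; a vendor reporting such metrics would use its own model and a much larger, shuffled data set.

```python
def k_fold_indices(n_samples, n_folds):
    """Yield (train_indices, test_indices) for contiguous, unshuffled folds."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_validate(samples, labels, n_folds, fit, predict):
    """Train on N-1 folds, score accuracy on the held-out fold, repeat N times."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(samples), n_folds):
        model = fit([samples[i] for i in train_idx], [labels[i] for i in train_idx])
        hits = sum(predict(model, samples[i]) == labels[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return scores


# Toy one-feature classifier (an assumption): threshold halfway between class means.
def fit_threshold(xs, ys):
    low = [x for x, y in zip(xs, ys) if y == "low"]
    high = [x for x, y in zip(xs, ys) if y == "high"]
    return (sum(low) / len(low) + sum(high) / len(high)) / 2


def predict_threshold(threshold, x):
    return "high" if x > threshold else "low"


samples = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
labels = ["low"] * 5 + ["high"] * 5
scores = cross_validate(samples, labels, 5, fit_threshold, predict_threshold)
print(scores, sum(scores) / len(scores))
```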
Transfer learning
• Idea: Leverage the ‘layered’ architectures of DNNs
• Fine-tune an existing PTM for a specific classification task
• Typically only requires a small amount of training data (yay!)
• Right now works great for CNNs, but can it work for text?
• Google's AutoML Vision and Microsoft's Custom Vision Service
[Figure: a simple CNN used for classification, shown twice; in the second diagram the network is split into a "Freeze" part and a "Retrain" part]
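The freeze/retrain idea can be illustrated without a deep learning framework. In this sketch `pretrained_features` is a hypothetical stand-in for the frozen layers of a pretrained network: it is a fixed function that training never modifies. Only the small logistic-regression "head" on top is trained, using a handful of made-up examples. A real fine-tuning setup would freeze actual CNN layers instead.

```python
import math


def pretrained_features(x):
    """Stand-in for frozen pretrained layers (hypothetical); never updated."""
    return [x[0] + x[1], x[0] - x[1], 1.0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def retrain_head(samples, labels, epochs=200, learning_rate=0.1):
    """Train only the final layer's weights with stochastic gradient descent."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            feats = pretrained_features(x)  # frozen part: forward pass only
            p = sigmoid(sum(w * f for w, f in zip(weights, feats)))
            # Gradient step on the log loss, applied to the head weights only.
            weights = [w + learning_rate * (y - p) * f for w, f in zip(weights, feats)]
    return weights


def classify(weights, x):
    feats = pretrained_features(x)
    return 1 if sigmoid(sum(w * f for w, f in zip(weights, feats))) >= 0.5 else 0


# Small made-up training set: label 1 when the first value exceeds the second.
samples = [(2.0, 1.0), (3.0, 0.0), (1.0, 2.0), (0.0, 3.0)]
labels = [1, 1, 0, 0]

head = retrain_head(samples, labels)
print([classify(head, x) for x in samples])
```

Because the frozen part already produces a useful feature (here, the difference of the two inputs), four examples are enough to train the head, which mirrors the small-data benefit noted on the slide.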
Conclusions
• Data scientists, fear not: manual machine learning model building is not going away
• Question: What is the depth of product use cases PTMs can support/satisfy?
• As devs catch on, will there be demand for more specific/granular models?
• What other types of common machine learning problems can be generalized and
benefit from customizable base models (e.g. churn)?
• How can we overcome the challenges with PTMs?
• Make your apps smarter - Try some of these PTMs out!
Thank you!
Jay Bartot – CTO, Madrona Venture Labs
Appendix
Google’s AutoML
• Machine learning models are often painstakingly designed by a team
of engineers and scientists
• Manually designing machine learning models is difficult because the
search space of all possible models can be combinatorially large
• AutoML: Automate the design of machine learning models
• Transfer learning capabilities
GoogLeNet Architecture
High-level machine learning steps
VGG Architecture
My First PTM - Viola-Jones Face Detector
• A seminal approach to real-time face (and object) detection
• Training is slow but detection is very fast
• Key ideas:
• Integral images for fast feature computation
• AdaBoost for feature selection
• Attentional cascade for fast rejection of non-face windows
• C/C++ version added to OpenCV in 2008(?)
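The integral image from the key ideas above is simple to sketch. After one pass over the image, the sum of any rectangle takes four lookups regardless of its size, which is what makes the rectangle-based features fast to evaluate. The code below is an illustrative plain-Python version with a made-up 3x3 image, not the OpenCV implementation.

```python
def integral_image(img):
    """Build a (h+1) x (w+1) table where ii[y][x] is the sum of img[:y][:x]."""
    height, width = len(img), len(img[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of pixels in rows top..bottom-1 and columns left..right-1, in O(1)."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


# Toy 3x3 "image" for illustration.
image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ii = integral_image(image)
print(box_sum(ii, 0, 0, 3, 3))  # whole image
print(box_sum(ii, 1, 1, 3, 3))  # bottom-right 2x2 block
```

A two-rectangle feature is then just the difference of two such `box_sum` calls over adjacent regions.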
Deep Dream Image
Machine Learning Appliances
• https://techcrunch.com/2017/07/06/h2o-ais-driverless-ai-automates-machine-learning-for-businesses/
• https://ai.googleblog.com/2017/05/using-machine-learning-to-explore.html