IBM Watson
Introduction + Visual Recognition
Computer Vision Meetup, Vienna, 26.03.2017
Florian Rosenberg, Senior Technical Staff Member
Manuel Fuchs, Cognitive Computing Consultant
© 2017 International Business Machines Corporation
IBM is a pioneer of intelligent and cognitive systems
• 1997, Deep Blue: learning from experience
• 2011, Jeopardy!: understanding natural language and providing exact answers to complex questions
• 2014, NY Genome Center: analyzing big data volumes to assist physicians in medical research and cancer diagnosis
The Watson Platform is Growing
• Watson’s APIs are the cognitive building blocks that harness our data.
Conversation, Speech to Text, Text to Speech, Retrieve and Rank, Discovery Service, Tone Analyzer, Natural Language Classifier, Language Translator, Language Detection, Tradeoff Analytics, Personality Insights, Document Conversion, Visual Recognition, AlchemyData News, AlchemyAPI
© 2015 IBM Corporation © Copyright IBM Corporation 2017
Unlocking Data in Plain Sight with Visual Recognition
Florian Rosenberg
@ponifaax
Senior Technical Staff Member, IBM
April 25, 2017
All credits go to Matthew Hill (Sr. Software Engineer, IBM Research) for letting me use his slides from the IBM Technical Leadership Exchange Meeting.
Objectives
By the end of this session, you should understand:
– The capabilities offered by Visual Recognition
– What it is not designed to do
– Different ways of measuring “accuracy”
– What it has in common with a weather forecast
– The basics of how Visual Recognition works
Introduction
Visual Recognition brings structure to data in images
– People communicate naturally with images
• But we need machine learning for software to understand images
Visual Recognition Services
1. General Tagging
• What in the world is this image?
2. Custom classification
• Based on labelled examples I gave, what is this image?
3. Similarity Search
• Which images from those I indexed are most similar to this image?
4. Face detection
• Where are the faces in this image?
Capability: General Tagging – What in the world is this image?
• Our general model works well on “consumer photo” types of images
• See our blog post here
• The output is a set of tags, each with a score between 0 and 1
• Tens of thousands of visual tags
• It applies some hierarchical reasoning, such as a Chihuahua “is a” dog, when computing scores
Capability: Custom Training – Based on examples I gave, what is this image?
“You Teach Watson”
[Diagram labels: Custom Learning, Visual Recognition, Unknown images]
Custom Training, Classification and Retraining
Demo: https://visual-recognition-demo.mybluemix.net/train
Capability: Similarity Search (beta) – Which images that I indexed are most similar to this image?
1. Create index
2. Search by example
• Similarity is “baked-in”
• Customizable to your collection
• The output is a list of images ranked by similarity
• Demo here: https://similarity-search-demo.mybluemix.net/
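The two steps above (create an index, then search by example) can be illustrated with a minimal sketch. The slides do not say how the service measures similarity, so cosine similarity over feature vectors is an assumption here, and the image names, vectors, and function names are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def create_index(named_vectors):
    """Step 1: store one feature vector per image name."""
    return dict(named_vectors)


def search_by_example(index, query_vector):
    """Step 2: rank indexed images by similarity to the query, most similar first."""
    scored = [(name, cosine_similarity(vec, query_vector)) for name, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical feature vectors; a real system would compute these from the images.
index = create_index({
    "beach.jpg": [0.9, 0.1, 0.0],
    "forest.jpg": [0.1, 0.9, 0.1],
    "dunes.jpg": [0.8, 0.2, 0.1],
})
ranking = search_by_example(index, [1.0, 0.0, 0.0])
ranked_names = [name for name, _ in ranking]
print(ranked_names)
```

The output mirrors what the slide describes: a list of indexed images ordered by similarity to the query.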
Capability: Face Detection – Where are the faces?
• Faces are a special class of things in machine vision
• Visual Recognition provides
  • detection (where),
  • not recognition (who)
• The output is a bounding box for each face
What is outside the scope of Visual Recognition?
§ For now:
– Face Recognition
– Scene Text / OCR
– Object Localization
– Emotion Analysis
– Some of these can be tackled with custom classifiers, but your results may vary widely with the context
How “Accurate” is it?
§ What is this image?
§ What is the gold standard?
– In academic papers, top-1 and top-5 accuracy (typically over 1000 classes)
– For actual developers?
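For readers unfamiliar with the academic metric, top-k accuracy counts a prediction as correct when the true label appears among the k highest-scored labels. The sketch below uses made-up labels and rankings, and k=2 in place of 5 only because the toy example has three classes.

```python
def top_k_accuracy(ranked_predictions, true_labels, k):
    """Fraction of examples whose true label is among the k highest-scored labels."""
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


# Hypothetical data: for each image, labels sorted from highest to lowest score.
ranked_predictions = [
    ["beagle", "retriever", "chihuahua"],
    ["retriever", "beagle", "chihuahua"],
    ["chihuahua", "retriever", "beagle"],
    ["retriever", "chihuahua", "beagle"],
]
true_labels = ["beagle", "beagle", "beagle", "retriever"]

top1 = top_k_accuracy(ranked_predictions, true_labels, k=1)
top2 = top_k_accuracy(ranked_predictions, true_labels, k=2)
print(top1, top2)
```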
How we validate the general tagging model
§ Which set of tags do you prefer?
§ A: living room, table, indoors
§ B: couch, fireplace, chair, dog, carpet, window
§ A vs. B test of tag sets by user preference – judged by AMT (Amazon Mechanical Turk) workers or by experts?
Measuring Quality in Classification Systems
§ I’m looking for Beagles in this set!
– System gives me a score for each image
– User picks a decision threshold
§ Precision
– Of all the images above my threshold, what fraction are really beagles?
§ Recall
– Of all the actual beagles in the set, what fraction scored above the threshold?
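The two definitions above translate directly into a few lines of code. This is a minimal sketch with invented scores and ground truth. Treating a score exactly equal to the threshold as "above" is a choice made here, not something the slides specify.

```python
def precision_recall(scores, is_beagle, threshold):
    """Precision and recall when every image scoring at or above the threshold is called a beagle."""
    predicted = [score >= threshold for score in scores]
    true_positives = sum(p and t for p, t in zip(predicted, is_beagle))
    predicted_positives = sum(predicted)
    actual_positives = sum(is_beagle)
    # Convention assumed here: an empty denominator yields 1.0.
    precision = true_positives / predicted_positives if predicted_positives else 1.0
    recall = true_positives / actual_positives if actual_positives else 1.0
    return precision, recall


# Hypothetical classifier scores and ground truth for six images.
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
is_beagle = [True, True, False, True, False, False]

precision, recall = precision_recall(scores, is_beagle, threshold=0.5)
strict_precision, strict_recall = precision_recall(scores, is_beagle, threshold=0.9)
print(precision, recall, strict_precision, strict_recall)
```

Raising the threshold from 0.5 to 0.9 in this toy set improves precision at the cost of recall, which is the trade-off the user controls by picking the threshold.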
Measuring Quality in Classification Systems
§ Is it more important to be right or to be not wrong?
– TSA airport screening (no false negatives)
• Vs.
– Red light camera (false positives are costly, people complain)
– Car alarms (owner wants: no false negatives; neighbor wants: no false positives)
– Medical tests: diagnostic vs. screening
• Aka sensitivity and specificity
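To make the trade-off concrete, the sketch below computes sensitivity (the fraction of true positives that are flagged) and specificity (the fraction of true negatives that are left alone) at two thresholds. The scores, labels, and threshold values are invented for illustration, and the code assumes both classes are present in the data.

```python
def confusion_counts(scores, labels, threshold):
    """Count true/false positives and negatives when flagging scores at or above the threshold."""
    tp = fp = tn = fn = 0
    for score, positive in zip(scores, labels):
        flagged = score >= threshold
        if flagged and positive:
            tp += 1
        elif flagged and not positive:
            fp += 1
        elif not flagged and positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp, fp, tn, fn = confusion_counts(scores, labels, threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical scores; True marks a real positive case.
scores = [0.9, 0.7, 0.55, 0.45, 0.3, 0.2, 0.1, 0.05]
labels = [True, True, False, True, False, False, True, False]

# Low threshold: screening style, no positives missed but many false alarms.
screening = sensitivity_specificity(scores, labels, threshold=0.08)
# High threshold: red-light-camera style, no false alarms but positives missed.
strict = sensitivity_specificity(scores, labels, threshold=0.6)
print(screening, strict)
```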
What do I get as an “answer” from a custom classifier?
{"images": [{
  "image": "woof.jpg",
  "classifiers": [{
    "classifier_id": "Breeds_785901324",
    "name": "Breeds",
    "classes": [
      {"class": "Beagle", "score": 0.2501731},
      {"class": "Retriever", "score": 0.98246}
    ]}]
}]}
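A response shaped like the one on this slide can be walked with the standard `json` module. The sketch below retypes the slide's payload with plain double quotes and keeps only classes at or above a threshold. The helper name `classes_above` and the threshold of 0.5 are illustrative assumptions, not part of the service.

```python
import json

# The slide's example response, retyped with plain double quotes.
response_text = """
{"images": [{
  "image": "woof.jpg",
  "classifiers": [{
    "classifier_id": "Breeds_785901324",
    "name": "Breeds",
    "classes": [
      {"class": "Beagle", "score": 0.2501731},
      {"class": "Retriever", "score": 0.98246}
    ]}]
}]}
"""


def classes_above(response, threshold):
    """Return (image, classifier name, class, score) for every score at or above the threshold."""
    matches = []
    for image in response["images"]:
        for classifier in image["classifiers"]:
            for cls in classifier["classes"]:
                if cls["score"] >= threshold:
                    matches.append(
                        (image["image"], classifier["name"], cls["class"], cls["score"])
                    )
    return matches


response = json.loads(response_text)
matches = classes_above(response, threshold=0.5)
print(matches)
```

Note that the scores for the classes need not sum to 1; each class is scored on its own, so the threshold is applied per class.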
What does Visual Reco have in common with a weather forecast?
Think about the action you will take (or not take) based on the information provided.
• What does a degree (Celsius or Fahrenheit) mean?
• What is your decision threshold for a classifier score?
• Details in our docs
How does it work?
IBM Research has had a long-running interest in image understanding.
What is new about this?
• Watson Developer Cloud platform
• Deep learning and convolutional neural nets
• Trained on tens of millions of images to extract great “features”
• Used as a basis for learning new labels from just a few dozen examples
• Want more details? For IBMers, our code is all in GitHub!
• Contact: [email protected] or @ponifaax on Twitter
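The idea of reusing pre-learned features to learn new labels from a handful of examples can be sketched as follows. The slides do not say which classifier sits on top of the features, so a nearest-centroid classifier is used here purely as a stand-in. The feature vectors are invented, where a real system would obtain them from the trained network.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def train(examples_by_label):
    """Learn one centroid per label from a few example feature vectors."""
    return {label: centroid(vectors) for label, vectors in examples_by_label.items()}


def classify(model, features):
    """Return the label whose centroid is closest (Euclidean distance) to the features."""
    return min(model, key=lambda label: math.dist(model[label], features))


# Hypothetical 2-D features; in practice these would come from a pretrained network.
examples_by_label = {
    "beagle": [[0.9, 0.1], [0.8, 0.2]],
    "retriever": [[0.1, 0.9], [0.2, 0.7]],
}
model = train(examples_by_label)
first = classify(model, [0.85, 0.1])
second = classify(model, [0.2, 0.9])
print(first, second)
```

Because the expensive part (feature extraction) is already learned, only the small per-label summary has to be fitted, which is why a few dozen examples per label can be enough.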
Example App Built on VR API: DarkVision Video Analysis
Hands On – DIY Resources
API Walkthroughs from World of Watson – a .zip file with README and example images inside for each
Custom Classifier Training - https://ibm.box.com/v/wow-vr
Similarity Search - https://ibm.box.com/v/simsearch
Conclusion
We have outlined:
– The capabilities offered by Visual Recognition
– What it is not designed to do
– Different ways of measuring “accuracy”
– The basics of how Visual Recognition works
– But don’t take my word for it; take it from a startup! A nice blog post:
• https://blog.recast.ai/image-recognition-api/
…any final questions?
What will you do with Watson?