computer vision an overview - public.ostfalia.de
Post on 13-Nov-2021
14.5.2019
Computer Vision – An Overview
Jonny Karlsson, B. Eng. (IT), PhD
Degree Programme Director / Senior Lecturer in Information Technology
Arcada University of Applied Sciences
Campus in Arabianranta, Helsinki
• A modern university of applied sciences in Helsinki, Finland
• 2700 students
• Modern campus with an international atmosphere
• more than 40 nationalities are represented
• more than 10% of all students come from abroad
BACHELOR’S DEGREE PROGRAMMES IN SWEDISH
• Business Administration
• Cultural Management
• Media
• Environmental and Energy Engineering
• Materials Processing Technology
• Information Technology
• Sports and Health Promotion
• Occupational Therapy
• Physiotherapy
• Social Services
• Emergency Care
• Public Health
• Midwifery
• Nursing
BACHELOR’S DEGREE PROGRAMMES IN ENGLISH
• International Business
• Materials Processing Technology
• Nursing
MASTER’S DEGREE PROGRAMMES
• Big Data Analytics
• International Business Management
• Media Management
• Global Health Care
• Rehabilitation (in Swedish)
• Advanced Clinical Care (in Swedish)
• Health promotion (in Swedish)
• Social Services (in Swedish)
arcada.fi #arcadauas
What will we cover today?
1. What is Computer Vision?
2. Low-level vision
– Smoothing (removing noise)
– Sobel edge detection
3. Middle-level vision
– Separating foreground objects from the background
4. High-level vision
– Face/object recognition with the Viola-Jones algorithm
What is Computer Vision?
• Input: Image or video sequence
• Output: Some description/interpretation of the image or video
– Many different levels of description/interpretation are possible.
Computer Vision Applications
• Security
– Face recognition
• Medical imaging
– Reconstruction/visualization of the inner body
– Diagnosis
• Autonomous cars
• E-Rehabilitation
Computer Vision Activities at Arcada
• Teaching
– A course in computer vision for 3rd year IT students (bachelor)
– Exchange students from Ostfalia have participated in planning the course content!
Nils-Peter Töpfer (7.2.2017). Development of a modern course for the information analytics curriculum: Programming exercises for image processing and computer vision algorithms
Karola Tabea Isensee (7.2.2017). Development of a modern course for the information analytics curriculum: Theory and state-of-the-art analysis of computer vision
Computer Vision Activities at Arcada
• A student project from the computer vision course
Computer Vision Activities at Arcada
• Research
– Computer vision based real-time motion analysis in health and well-being.
Three Stages in Computer Vision
1. Low-level
– Input: Image Output: Image
2. Middle-level
– Input: Image Output: Features
3. High-level
– Input: Image Output: Recognition/Interpret.
2. Middle-level Vision
Segmentation and the grouping of pixels/features
[Figure: foreground object separated from background]
Low-to-Middle Example
[Figure: edge detection turns the original image into an edge image; middle-level processing then finds structures in the data, such as circular arcs, line segments and corners, as input to object recognition]
High-to-Low-level Vision: Example
[Figure: edge image (low-level), consistent lines and corners (middle-level), building recognition (high-level)]
Low-level Vision: What is Image Filtering?
• Some common image filtering techniques:
– Low-pass filters (smoothing)
– Edge detection
Low-level Vision: How does filtering work?
• Let’s first make sure we all know what a digital image is
What is a digital image?
• Can be seen as a matrix of pixels
• Each pixel represents a color value
• In a grayscale image, the color value
– 0 means black
– 255 means white
Digital Images: Color Channels
• In a color image, each pixel holds a value for each of three different color channels:
– (R)ed: 0 to 255
– (G)reen: 0 to 255
– (B)lue: 0 to 255
Digital Images: Color Channels
• Computer vision algorithms typically operate on grayscale images
• Easier: one channel to process instead of three!
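To make the "matrix of pixels" idea concrete, here is a minimal sketch that stores a tiny color image as a NumPy array and reduces it to one grayscale channel. The luminance weights (0.299, 0.587, 0.114) are a common convention and an assumption here; the slides do not say which conversion is used, and the function name `to_grayscale` is made up for this example.

```python
import numpy as np

def to_grayscale(rgb):
    """Convert an H x W x 3 uint8 RGB image to an H x W uint8 grayscale image."""
    weights = np.array([0.299, 0.587, 0.114])  # assumed luminance weights (R, G, B)
    gray = rgb.astype(np.float64) @ weights    # weighted sum over the channel axis
    return np.clip(np.rint(gray), 0, 255).astype(np.uint8)

# A 1 x 4 color image: black, white, pure red, pure green
rgb = np.array([[[0, 0, 0], [255, 255, 255], [255, 0, 0], [0, 255, 0]]],
               dtype=np.uint8)
gray = to_grayscale(rgb)
print(gray.shape)  # (1, 4): one value per pixel instead of three
print(gray)        # [[  0 255  76 150]]
```

Black stays 0 and white stays 255, matching the grayscale range described above.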
Low-level Vision: Image Filtering
• Modifies pixels by applying an operator to a localised neighbourhood of pixels
• So the output pixel value (8) depends upon the corresponding input pixel (7) and its neighbours
[Figure: a 3x3 local image neighbourhood around the input pixel (7) is passed through some operator to give the output pixel (8) in the modified image]
Image Filtering – Pixel Mask / Kernel
• Splits an image into smaller sub-images for processing
• Operations only apply to masked pixels
• Operations are element-wise (pixel by pixel)
• A pixel mask/kernel is typically a square (3x3, 5x5, ...) with a center pixel
Image Filtering – Example Pixel Mask
Mask #1 covers the pixels
p00 p01 p02
p10 p11 p12
p20 p21 p22
[Figure: image with x and y axes; the mask moves one pixel right at each step, from Mask #1 to Mask #N]
Image Filtering – Element-wise Operations
[Figure: a 3x3 window of the input image is multiplied element-wise with the 3x3 mask weights, and the products are summed (dot product) to give one output pixel, 86 in the slide's example]
$$ g(x, y) = f(x, y) * h(x, y) = \sum_i \sum_j f(i, j) \, h(x - i, y - j) $$
where f is the input image, h the mask and g the output image.
Image Filtering
• As a result of filtering, each new pixel is obtained by a dot product (sum of element-wise products) over the mask
• Changing the mask coefficients (the numbers in the mask) gives different filter functions
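As a rough illustration of the two points above, the sketch below computes a single output pixel as a sum of element-wise products, and then swaps the mask to get a different filter. The 3x3 pixel values and the name `filter_pixel` are made up for this example; they are not taken from the slides.

```python
import numpy as np

def filter_pixel(image, mask, row, col):
    """Return the filtered value at (row, col): the sum of element-wise
    products between the mask and the neighbourhood centred on that pixel."""
    r = mask.shape[0] // 2  # mask radius (1 for a 3x3 mask)
    patch = image[row - r:row + r + 1, col - r:col + r + 1]
    return float(np.sum(patch * mask))

# Made-up 3x3 neighbourhood (not from the slides)
image = np.array([[10, 20, 30],
                  [40, 95, 60],
                  [70, 80, 90]], dtype=np.float64)

identity = np.array([[0, 0, 0],
                     [0, 1, 0],
                     [0, 0, 0]], dtype=np.float64)  # keeps the centre pixel
average = np.ones((3, 3)) / 9.0                     # mean of the neighbourhood

print(filter_pixel(image, identity, 1, 1))  # 95.0, the pixel is unchanged
print(filter_pixel(image, average, 1, 1))   # about 55, the mean of the nine values
```

Only the numbers in the mask changed between the two calls; the operation itself stayed the same.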
Low-Pass Filtering (Smoothing)
Averaging filter, 3x3 mask:
1 1 1
1 1 1
1 1 1
[Figure: original image and blurred result (effect of averaging)]
Low-Pass Filtering: Averaging Filter
[Figure: a 3x3 window of the input image is multiplied element-wise with a 3x3 mask of ones; output = sum of the products / 9, which is 63 in the slide's example]
Averaging filter – Simple Example
• Let’s apply an averaging filter to the following image!
1 1 100 1 1
1 1 100 1 1
1 1 100 1 1
1 1 100 1 1
1 1 100 1 1
3x3 mask:
1 1 1
1 1 1
1 1 1
Output after the first mask position (the nine pixels under the mask sum to 3 x (1 + 1 + 100) = 306, and 306 / 9 = 34):
1 1 100 1 1
1 34 100 1 1
1 1 100 1 1
1 1 100 1 1
1 1 100 1 1
Output after the mask has moved one pixel right:
1 1 100 1 1
1 34 34 1 1
1 1 100 1 1
1 1 100 1 1
1 1 100 1 1
Output after the mask has moved one more pixel right:
1 1 100 1 1
1 34 34 34 1
1 1 100 1 1
1 1 100 1 1
1 1 100 1 1
Finally, after applying the mask on the whole image:
1 34 34 34 1
1 34 34 34 1
1 34 34 34 1
1 34 34 34 1
1 34 34 34 1
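The walkthrough above can be reproduced with a few lines of NumPy. This is a minimal sketch, not the course's own code: the function name `mean_filter_3x3` is made up, and border pixels are handled by repeating the edge values, which is an assumption (the slides do not state a border rule, but this choice gives the same final image).

```python
import numpy as np

def mean_filter_3x3(image):
    """Replace each pixel with the mean of its 3x3 neighbourhood.
    Border pixels are handled by repeating the edge values (an assumption)."""
    padded = np.pad(image.astype(np.float64), 1, mode="edge")
    out = np.zeros(image.shape, dtype=np.float64)
    rows, cols = image.shape
    for y in range(rows):
        for x in range(cols):
            # Sum of the nine pixels under the mask of ones, divided by 9
            out[y, x] = padded[y:y + 3, x:x + 3].sum() / 9.0
    return np.rint(out).astype(int)

# The 5x5 example image: a bright column (100) on a dark background (1)
image = np.tile(np.array([1, 1, 100, 1, 1]), (5, 1))
print(mean_filter_3x3(image))
# Every row becomes [ 1 34 34 34  1]
```

The bright column is smeared over three columns, which is the blurring effect of a low-pass filter.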
Image Filtering: Edge Detection
• The Sobel filter is well known in computer vision for producing an image that emphasises the edges
Vertical Sobel mask/filter:
1 0 -1
2 0 -2
1 0 -1
* (applied to)
Input image:
70 70 70 70 70 10 10 10
70 70 70 70 70 10 10 10
70 70 70 70 70 10 10 10
70 70 70 70 70 10 10 10
70 70 70 70 70 10 10 10
Output image (edge map) after the first mask position:
70 70 70 70 70 10 10 10
70 0 70 70 70 10 10 10
70 70 70 70 70 10 10 10
70 70 70 70 70 10 10 10
70 70 70 70 70 10 10 10
After four mask positions (across the edge the result is (70 - 10) x (1 + 2 + 1) = 240):
70 70 70 70 70 10 10 10
70 0 0 0 240 10 10 10
70 70 70 70 70 10 10 10
70 70 70 70 70 10 10 10
70 70 70 70 70 10 10 10
After five mask positions:
70 70 70 70 70 10 10 10
70 0 0 0 240 240 10 10
70 70 70 70 70 10 10 10
70 70 70 70 70 10 10 10
70 70 70 70 70 10 10 10
After six mask positions:
70 70 70 70 70 10 10 10
70 0 0 0 240 240 0 10
70 70 70 70 70 10 10 10
70 70 70 70 70 10 10 10
70 70 70 70 70 10 10 10
Final output image (edge map):
0 0 0 0 240 240 0 0
0 0 0 0 240 240 0 0
0 0 0 0 240 240 0 0
0 0 0 0 240 240 0 0
0 0 0 0 240 240 0 0
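The edge map above can be reproduced with the sketch below. Two assumptions are worth flagging: the mask is applied as written, without flipping it (strictly a correlation), because that is what yields +240 rather than -240, and border pixels are handled by repeating the edge values. The helper name `correlate_3x3` is made up for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def correlate_3x3(image, mask):
    """Slide a 3x3 mask over the image without flipping it.
    Border pixels are handled by repeating the edge values (an assumption)."""
    padded = np.pad(image.astype(np.int64), 1, mode="edge")
    windows = sliding_window_view(padded, (3, 3))  # shape: rows x cols x 3 x 3
    # For every window: sum of element-wise products with the mask
    return np.einsum("ijkl,kl->ij", windows, mask)

sobel_vertical = np.array([[1, 0, -1],
                           [2, 0, -2],
                           [1, 0, -1]])

# The example image: a step from 70 to 10 between columns 5 and 6
image = np.tile(np.array([70, 70, 70, 70, 70, 10, 10, 10]), (5, 1))
edge_map = correlate_3x3(image, sobel_vertical)
print(edge_map)
# Every row becomes [  0   0   0   0 240 240   0   0]
```

Flat regions give 0 because the positive and negative mask columns cancel; only the two columns next to the step respond.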
Image Filtering: Edge Detection
Horizontal Sobel mask/filter:
1 2 1
0 0 0
-1 -2 -1
• Note that to be able to find horizontal edges we need to apply a horizontal Sobel mask/filter
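A small sketch of why both masks are needed: on a made-up image with a single horizontal edge (not from the slides), the horizontal mask responds while the vertical mask sees nothing. As before, the mask is applied without flipping and borders repeat the edge values; both are assumptions, and `correlate_3x3` is a hypothetical helper.

```python
import numpy as np

def correlate_3x3(image, mask):
    """Slide a 3x3 mask over the image (no flipping), repeating edge pixels."""
    padded = np.pad(image, 1, mode="edge")
    out = np.zeros(image.shape, dtype=np.int64)
    for y in range(image.shape[0]):
        for x in range(image.shape[1]):
            out[y, x] = np.sum(padded[y:y + 3, x:x + 3] * mask)
    return out

sobel_horizontal = np.array([[1, 2, 1],
                             [0, 0, 0],
                             [-1, -2, -1]])
sobel_vertical = np.array([[1, 0, -1],
                           [2, 0, -2],
                           [1, 0, -1]])

# Made-up image with one horizontal edge: bright rows above, dark rows below
image = np.array([[70, 70, 70, 70],
                  [70, 70, 70, 70],
                  [10, 10, 10, 10],
                  [10, 10, 10, 10]])

horizontal_edges = correlate_3x3(image, sobel_horizontal)
vertical_edges = correlate_3x3(image, sobel_vertical)
print(horizontal_edges)  # rows 2 and 3 are 240, the others 0
print(vertical_edges)    # all zeros: no vertical edge in this image
```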
Image Filtering: Edge Detection
• So, the result of the edge detection process is an edge map, typically thresholded into a so-called “binary image”, in which the edges are highlighted
[Figure: edge detection turns the original image into an edge image / edge map]
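One common way to get from filter responses to a binary image is a simple threshold, sketched below. The threshold value of 100 and the function name `to_binary_edge_map` are assumptions chosen for illustration; the slides do not specify how the binary image is produced.

```python
import numpy as np

def to_binary_edge_map(gradient, threshold):
    """Mark a pixel as edge (1) when the absolute filter response
    reaches the threshold, and as background (0) otherwise."""
    return (np.abs(gradient) >= threshold).astype(np.uint8)

# Filter responses like those of the vertical Sobel example (three rows shown)
gradient = np.array([[0, 0, 0, 0, 240, 240, 0, 0]] * 3)
binary = to_binary_edge_map(gradient, threshold=100)  # threshold is an assumed value
print(binary)
# Every row becomes [0 0 0 0 1 1 0 0]
```

Taking the absolute value means that dark-to-bright and bright-to-dark edges are both kept.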
So far we have covered
1. Low-level
– Input: Image Output: Image
2. Middle-level
– Input: Image Output: Features
3. High-level
– Input: Image Output: Recognition/Interpret.
Middle-level Vision: Separation of foreground objects from the background
• We will not go into this in detail, but in short:
[Figure: structures in the data, such as circular arcs, line segments and corners, are extracted as input to object recognition]
So far we have covered
1. Low-level
– Input: Image Output: Image
2. Middle-level
– Input: Image Output: Features
3. High-level
– Input: Image Output: Recognition/Interpret.
High-level Vision example: Face detection
• The Viola-Jones algorithm is a classic and widely used algorithm for face detection, and it can also detect other types of objects
• Viola-Jones consists of 4 different “elements”:
– Haar features
– Integral image
– Adaboost
– Cascading
The Viola-Jones Algorithm
Haar features | Integral image | Adaboost | Cascading
• The problem with Haar features is that we need to calculate the average of a given region multiple times => High complexity! (O(N^2))
• Why?
– We have to use Haar features with all possible sizes and locations
[Figure: same mask, but different sizes!!]
The Viola-Jones Algorithm
• In an integral image, the value at pixel (x, y) is the sum of all pixels above and to the left of (x, y), including (x, y) itself
Input image:
1 1 1
1 1 1
1 1 1
Integral image:
1 2 3
2 4 6
3 6 9
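The integral image of the 3x3 example can be computed with two cumulative sums, one along each axis. This is a minimal sketch; the function name `integral_image` is made up for this example.

```python
import numpy as np

def integral_image(image):
    """Each output value is the sum of all pixels above and to the left
    of that position, including the pixel itself."""
    return image.cumsum(axis=0).cumsum(axis=1)

ones = np.ones((3, 3), dtype=int)
print(integral_image(ones))
# [[1 2 3]
#  [2 4 6]
#  [3 6 9]]
```

The bottom-right value (9) is the sum of the whole image.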
The Viola-Jones Algorithm
• Much lower complexity!! (O(1)): the sum over any rectangle needs only four lookups in the integral image
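The constant cost comes from the fact that the sum of any rectangle can be read from four corner values of the integral image. The sketch below shows this, together with a simple two-rectangle Haar-like feature (left half minus right half). The function names, the extra zero row and column used to avoid border special cases, and the 4x4 test image are all assumptions made for illustration.

```python
import numpy as np

def integral_image(image):
    """Integral image with an extra leading row and column of zeros,
    so that lookups at the image border need no special case."""
    ii = np.zeros((image.shape[0] + 1, image.shape[1] + 1), dtype=np.int64)
    ii[1:, 1:] = image.cumsum(axis=0).cumsum(axis=1)
    return ii

def region_sum(ii, top, left, height, width):
    """Sum of a rectangle using four lookups, whatever its size."""
    bottom, right = top + height, left + width
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

def haar_left_minus_right(ii, top, left, height, width):
    """Two-rectangle Haar-like feature: left half minus right half.
    The width is assumed to be even."""
    half = width // 2
    return (region_sum(ii, top, left, height, half)
            - region_sum(ii, top, left + half, height, half))

# Made-up 4x4 image: bright left half, dark right half
image = np.tile(np.array([70, 70, 10, 10]), (4, 1))
ii = integral_image(image)

print(region_sum(ii, 0, 0, 4, 4))             # 640, the whole image
print(region_sum(ii, 1, 1, 2, 2))             # 160, a 2x2 block in the middle
print(haar_left_minus_right(ii, 0, 0, 4, 4))  # 480 = 560 - 80
```

Evaluating the same feature at another size or location only changes the four indices, not the amount of work.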
The Viola-Jones Algorithm
• With a suitable scale of the Haar line feature placed as
in the image below we would get a quite high match!
• The problem is that with the same Haar feature we
would also get matches in other places of the image!
The Viola-Jones Algorithm
• One feature alone, or in other words a single classifier, is not accurate enough!
– Called a weak classifier
– The idea of Haar-cascading is to combine a series of weak classifiers (those that are barely better than “random guessing”) to achieve a strong classifier!
The Viola-Jones Algorithm
• Each of the figures below represents a general feature of a face (Believe it or not!)
The Viola-Jones Algorithm
• If we combine all these features together… see the point?
The Viola-Jones Algorithm
• Ok, this is a simplified example, but this combination of Haar features is
unlikely to be found elsewhere than in a face!
The Viola-Jones Algorithm
• We want to find a series of weak classifiers that all match in the region of a face when they are applied in a certain order.
• If any classifier fails, then everything fails! => not a face!
Haar Feature 1 -> Pass -> Haar Feature 2 -> Pass -> Haar Feature 3
(a Fail at any of the three features rejects the region)
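The pass/fail chain can be sketched as a loop that stops at the first failing stage. Everything specific here is hypothetical: the stage names, the threshold of 0.5, and the dictionaries standing in for feature responses of an image window are made up to show the control flow only.

```python
def make_stage(name, threshold, log):
    """Build a hypothetical weak-classifier stage that passes when the
    named feature response reaches its threshold."""
    def stage(window):
        log.append(name)  # record that this stage was evaluated
        return window[name] >= threshold
    return stage

def cascade_accepts(window, stages):
    """Run the stages in order and reject as soon as one of them fails."""
    for stage in stages:
        if not stage(window):
            return False  # any failure means: not a face
    return True

evaluated = []
stages = [make_stage("feature_1", 0.5, evaluated),
          make_stage("feature_2", 0.5, evaluated),
          make_stage("feature_3", 0.5, evaluated)]

# Made-up feature responses for two image windows
face_like = {"feature_1": 0.9, "feature_2": 0.8, "feature_3": 0.7}
background = {"feature_1": 0.9, "feature_2": 0.1, "feature_3": 0.7}

print(cascade_accepts(face_like, stages))   # True
evaluated.clear()
print(cascade_accepts(background, stages))  # False
print(evaluated)                            # ['feature_1', 'feature_2']
```

The background window is rejected at the second stage, so the third stage is never evaluated. This early exit is what keeps the cascade cheap on the many windows that contain no face.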
The Viola-Jones Algorithm
• How do we combine a series of weak classifiers into a strong classifier?
– Adaboost tries out multiple weak classifiers over several rounds
– The best weak classifier in each round is selected
– Finally the best weak classifiers are combined
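The three steps above can be sketched with a small AdaBoost loop. This is an illustration under stated assumptions, not the Viola-Jones training procedure itself: simple threshold classifiers ("stumps") on one scalar value stand in for Haar-feature responses, the ten data points are a made-up toy set, and all function names are hypothetical.

```python
import numpy as np

def stump_predict(X, feature, threshold, polarity):
    """Weak classifier: +1 on one side of the threshold, -1 on the other."""
    return np.where(polarity * X[:, feature] >= polarity * threshold, 1, -1)

def best_stump(X, y, w):
    """Try every (feature, threshold, polarity) stump and return the one
    with the lowest weighted error."""
    best = None
    for feature in range(X.shape[1]):
        for threshold in np.unique(X[:, feature]):
            for polarity in (1, -1):
                pred = stump_predict(X, feature, threshold, polarity)
                err = np.sum(w[pred != y])
                if best is None or err < best[0]:
                    best = (err, feature, threshold, polarity)
    return best

def adaboost_train(X, y, rounds):
    w = np.full(len(y), 1.0 / len(y))  # start with equal sample weights
    model = []
    for _ in range(rounds):
        err, feature, threshold, polarity = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid division by zero
        alpha = 0.5 * np.log((1 - err) / err)  # vote of this weak classifier
        pred = stump_predict(X, feature, threshold, polarity)
        w = w * np.exp(-alpha * y * pred)      # raise weights of mistakes
        w = w / w.sum()
        model.append((alpha, feature, threshold, polarity))
    return model

def adaboost_predict(model, X):
    """Strong classifier: sign of the weighted vote of the weak classifiers."""
    score = sum(a * stump_predict(X, f, t, p) for a, f, t, p in model)
    return np.where(score >= 0, 1, -1)

# Made-up toy data: no single threshold separates the two classes
X = np.arange(10, dtype=float).reshape(-1, 1)
y = np.array([1, 1, 1, -1, -1, -1, 1, 1, 1, -1])

for rounds in (1, 3):
    model = adaboost_train(X, y, rounds)
    accuracy = np.mean(adaboost_predict(model, X) == y)
    print(rounds, accuracy)  # accuracy on the training data
```

On this toy set a single stump gets 7 of 10 points right, while the weighted vote of three stumps classifies all ten correctly.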
The Viola-Jones Algorithm
• In other words, the Viola-Jones face detection algorithm is based on machine learning!
– Before we can detect a face we need to train the algorithm to find the “strong classifier”
– The training phase is done by testing all the weak classifiers on a set of both positive (images containing a face) and negative (images not containing a face) training data.
So far we have covered
1. Low-level
– Input: Image Output: Image
2. Middle-level
– Input: Image Output: Features
3. High-level
– Input: Image Output: Recognition/Interpret.
No you don’t!!!!
• BUT hopefully you have got an insight into the different levels of computer vision and you have seen some examples of operations at each level:
Low-level vision
o Blurring, sharpening, edge detection
Middle-level vision
o Extracting features / separating foreground objects from the background
High-level vision
o Recognition and interpretation, e.g. face detection
Summary
• Typically, computer vision software involves all 3 levels, for example:
1. Apply sharpening filter for strengthening the edges and Sobel edge
detection filter for emphasising the edges
2. Extract the foreground objects from the background, e.g. find contours,
lines, corners etc. Output typically a binary image (such as an edge
map)
3. On the binary image apply some algorithms for performing
interpretation of the image or recognizing a specific object in the image