computer vision an overview - public.ostfalia.de

14.5.2019

Computer Vision – An Overview

Jonny Karlsson, B. Eng. (IT), PhD

Degree Programme Director / Senior Lecturer in Information Technology

arcada.fi #arcadauas

Arcada University of Applied Sciences

Campus in Arabianranta, Helsinki

• A modern university of applied sciences in Helsinki, Finland

• 2700 students

• Modern campus with international atmosphere:

• more than 40 nationalities are represented

• more than 10% of all students come from abroad

BACHELOR’S DEGREE PROGRAMMES IN SWEDISH

• Business Administration

• Cultural Management

• Media

• Environmental and Energy Engineering

• Materials Processing Technology

• Information Technology

• Sports and Health Promotion

• Occupational Therapy

• Physiotherapy

• Social Services

• Emergency Care

• Public Health

• Midwifery

• Nursing

BACHELOR’S DEGREE PROGRAMMES IN ENGLISH

• International Business

• Materials Processing Technology

• Nursing

MASTER’S DEGREE PROGRAMMES

• Big Data Analytics

• International Business Management

• Media Management

• Global Health Care

• Rehabilitation (in Swedish)

• Advanced Clinical Care (in Swedish)

• Health promotion (in Swedish)

• Social Services (in Swedish)

What will we cover today?

1. What is Computer Vision?

2. Low-level vision

– Smoothing (removing noise)

– Sobel edge detection

3. Middle-level vision

– Separating foreground objects from the background

4. High-level vision

– Face/object recognition with the Viola-Jones algorithm

What is Computer Vision?

• Input: Image or video sequence

• Output: Some description/interpretation of the image or video

– Many different possible levels of description/interpretation.

Computer Vision Applications

• Security

– Face recognition

• Medical imaging

– Reconstruction/visualization of inner body

– Diagnosis

• Autonomous cars

• E-Rehabilitation

Computer Vision Activities at Arcada

• Teaching

– A course in computer vision for 3rd year IT students (bachelor)

– Exchange students from Ostfalia have participated in planning the course content!

Nils-Peter Töpfer (7.2.2017). Development of a modern course for the information analytics curriculum: Programming exercises for image processing and computer vision algorithms

Karola Tabea Isensee (7.2.2017). Development of a modern course for the information analytics curriculum: Theory and state-of-the-art analysis of computer vision

• A student project from the computer vision course

• Research

– Computer vision based real-time motion analysis in health and well-being.

Elements of Computer Vision

Three Stages in Computer Vision

1. Low-level

– Input: Image, Output: Image

2. Middle-level

– Input: Image, Output: Features

3. High-level

– Input: Image, Output: Recognition/Interpretation

Looks like an edge!

1. Low-level Vision

Considers local image properties

1. Low-level Vision: Example

Blur (smoothing) filter

Sharpening filter

1. Low-level Vision: Example

Edge detection filter

2. Middle-level Vision

Segmentation and the grouping of pixels/features

Foreground object separated from background

Low-to-Middle Example

Edge Detection

Object Recognition

Middle-level

original => edge image

Circular arcs, line segments, corners

Structures in the data

3. High-level Vision

Interpretation/Recognition

It’s a chair!

High-to-Low-level Vision: Example

edge image

consistent lines and corners

Low-level

Middle-level

High-level

Building Recognition

Low-level Vision: What is Image Filtering?

• Some common image filtering techniques:

– Low-pass filters (smoothing)

– Edge detection

Low-level Vision: How does filtering work?

• Let’s first make sure we all know what a digital image is

What is a digital image?

• Can be seen as a matrix of pixels

• Each pixel represents a color value

• In a grayscale image, the color value is a single intensity:

– 0 means black

– 255 means white

Digital Images: Color Channels

• In a color image each pixel represents a color value for three different color channels:

– (R)ed: 0–255

– (G)reen: 0–255

– (B)lue: 0–255

Digital Images: Color Channels

Computer Vision algorithms typically operate on grayscale images

Easier!!
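As a rough illustration of moving from three color channels to one grayscale value per pixel, the sketch below uses the commonly used ITU-R BT.601 luma weights. The slides do not name a conversion formula, so the weights, the function name `to_grayscale` and the tiny test image are assumptions made for this example.

```python
# Convert an RGB image (nested lists of (R, G, B) tuples, each 0-255)
# into a grayscale image (nested lists of single 0-255 intensities).
# Assumption: ITU-R BT.601 luma weights; other weightings are also in use.
def to_grayscale(rgb_image):
    gray = []
    for row in rgb_image:
        gray.append([round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row])
    return gray


# Hypothetical 2x2 image: black, white, pure red, pure blue.
rgb_image = [
    [(0, 0, 0), (255, 255, 255)],
    [(255, 0, 0), (0, 0, 255)],
]
gray_image = to_grayscale(rgb_image)
print(gray_image)
```

Black stays 0 and white stays 255, while pure red and pure blue map to different intensities because the channels are weighted differently.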

Low-level Vision: Image Filtering

• Modifies pixels by applying an operator to a localised neighbourhood of pixels

• So the output pixel value (8) depends upon the corresponding input pixel (7) and its neighbours

Local image

Modified image

Some operator

Image Filtering – Pixel Mask / Kernel

• Splits an image into smaller sub-images for processing

• Operations only apply to masked pixels

• Operations are element-wise (pixel by pixel)

• A pixel mask/kernel is typically a square (3x3, 5x5, ...) with a center

Image Filtering – Example Pixel Mask

[Figure: Mask #1 starts at the first pixels (p00, p01, ...) and the mask moves one pixel to the right at each step, up to Mask #N]

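Image Filtering – Element-wise Operations

[Figure: a 3x3 image neighbourhood is multiplied element-wise by the mask weights; the dot product gives the output pixel]

$$g(x, y) = f(x, y) * h(x, y) = \sum_{i} \sum_{j} f(i, j)\, h(x - i, y - j)$$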

Image Filtering

• As a result of filtering, new pixels are obtained by a dot product (sum of element-wise products) over the mask

• Changing the mask coefficients (the numbers in the mask) gives different filter functions
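The dot product over the mask can be sketched for a single output pixel as follows. The function name `apply_mask_at`, the neighbourhood values and the two masks are hypothetical and chosen only to show that the same pixels give different results when the mask coefficients change.

```python
def apply_mask_at(neighbourhood, mask):
    """Return the sum of element-wise products of two equally sized 2D lists."""
    total = 0
    for pixel_row, weight_row in zip(neighbourhood, mask):
        for pixel, weight in zip(pixel_row, weight_row):
            total += pixel * weight
    return total


# Hypothetical 3x3 neighbourhood around one pixel (center value 50).
neighbourhood = [
    [10, 20, 30],
    [40, 50, 60],
    [70, 80, 90],
]

# Two hypothetical masks: one doubles the center pixel, one sums all nine pixels.
double_center_mask = [[0, 0, 0], [0, 2, 0], [0, 0, 0]]
all_ones_mask = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

doubled = apply_mask_at(neighbourhood, double_center_mask)
summed = apply_mask_at(neighbourhood, all_ones_mask)
print(doubled, summed)
```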

Simple Filter Operations

Original => 3x3 mask => Filtered (no change)

Original => 3x3 mask => Shifted left 1 pixel
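The two simple operations can be reproduced with a small whole-image filter. This is a sketch under stated assumptions: the mask is applied without flipping (a plain dot product, as described earlier in the slides), border pixels are copied unchanged, and the mask values, the test image and the name `filter_image` are illustrative since the original slide images are not available.

```python
def filter_image(image, mask):
    """Apply a 3x3 mask to every interior pixel; border pixels are copied as-is."""
    height, width = len(image), len(image[0])
    output = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            total = 0
            for j in range(3):
                for i in range(3):
                    total += image[y + j - 1][x + i - 1] * mask[j][i]
            output[y][x] = total
    return output


image = [
    [1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10],
    [11, 12, 13, 14, 15],
]

# A single 1 in the center keeps every pixel as it is.
identity_mask = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
# A single 1 to the right of the center copies the right neighbour,
# so the image content moves one pixel to the left.
shift_left_mask = [[0, 0, 0], [0, 0, 1], [0, 0, 0]]

unchanged = filter_image(image, identity_mask)
shifted = filter_image(image, shift_left_mask)
print(unchanged[1])
print(shifted[1])
```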

Low-Pass Filtering (Smoothing)

Averaging Filter

Original => 3x3 mask => Blurred (effect of averaging)

Low-Pass Filtering: Averaging Filter

[Figure: a 3x3 image neighbourhood is multiplied element-wise by a 3x3 mask in which every weight is 1]

Output = Sum of element-wise products / 9

Averaging filter – Simple Example

• Let’s apply an averaging filter to the following image!

Input: 1 1 100 1 1

Output, built up as the mask moves one pixel to the right at a time:

1 34 100 1 1

1 34 34 1 1

1 34 34 34 1

Finally, after applying the mask to the whole image:

1 34 34 34 1
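The numbers in this example can be checked with a short sketch. It assumes the row 1 1 100 1 1 is repeated on three image rows (so a 3x3 mask fits) and that border pixels are left unchanged, which is consistent with the 1s at both ends of the result; the function name `average_filter` is illustrative.

```python
def average_filter(image):
    """3x3 averaging filter; border pixels are copied unchanged."""
    height, width = len(image), len(image[0])
    output = [row[:] for row in image]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            total = sum(
                image[y + dy][x + dx]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            )
            output[y][x] = total // 9  # all nine mask weights are 1
    return output


# Assumption: three identical rows, each holding the row from the slide.
image = [[1, 1, 100, 1, 1] for _ in range(3)]
blurred = average_filter(image)
print(blurred[1])
```

Each interior neighbourhood sums to 3 * (1 + 1 + 100) = 306, and 306 / 9 = 34, which spreads the bright pixel over its neighbours.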

Image Filtering: Edge Detection

• The Sobel filter is well known in computer vision for producing images that emphasise the edges

Vertical Sobel mask/filter:

1 0 -1

2 0 -2

1 0 -1

Input image (three identical rows):

70 70 70 70 70 10 10 10

70 70 70 70 70 10 10 10

70 70 70 70 70 10 10 10

Middle row of the output image (edge map), built up as the mask moves one pixel to the right at a time:

70 0 70 70 70 10 10 10

70 0 0 0 240 10 10 10

70 0 0 0 240 240 10 10

70 0 0 0 240 240 0 10

Output image (edge map) after applying the mask to the whole row:

0 0 0 0 240 240 0 0
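The edge map values can be reproduced with the vertical Sobel mask from the slide. In this sketch the mask is applied as a plain dot product (no flipping) and border pixels are set to 0, which are assumptions that happen to match the final row shown; the function name `sobel_vertical` is illustrative.

```python
SOBEL_VERTICAL = [
    [1, 0, -1],
    [2, 0, -2],
    [1, 0, -1],
]


def sobel_vertical(image):
    """Apply the vertical Sobel mask to interior pixels; borders are set to 0."""
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            edges[y][x] = sum(
                image[y + j - 1][x + i - 1] * SOBEL_VERTICAL[j][i]
                for j in range(3)
                for i in range(3)
            )
    return edges


# Three identical rows with a step from 70 down to 10.
image = [[70, 70, 70, 70, 70, 10, 10, 10] for _ in range(3)]
edge_map = sobel_vertical(image)
print(edge_map[1])
```

In flat regions the left and right columns cancel out; at the step the response is (70 - 10) * (1 + 2 + 1) = 240.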

Horizontal Sobel mask/filter

1 2 1

0 0 0

-1 -2 -1

• Note that to be able to find horizontal edges we need to apply a horizontal Sobel mask/filter

• So, the result of the edge detection process (after thresholding the filter output) is a so-called “binary image” where the edges have been highlighted
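A Sobel response holds a range of values (0 and 240 in the example row), so some thresholding step is needed before the result is strictly binary. The sketch below shows one simple way to do this; the threshold of 100 and the name `to_binary` are assumptions, not something stated in the slides.

```python
def to_binary(edge_map, threshold):
    """Mark a pixel as edge (1) when its absolute response reaches the threshold."""
    return [
        [1 if abs(value) >= threshold else 0 for value in row]
        for row in edge_map
    ]


# Edge map row taken from the vertical Sobel example.
edge_map = [[0, 0, 0, 0, 240, 240, 0, 0]]
binary_image = to_binary(edge_map, threshold=100)  # hypothetical threshold
print(binary_image)
```

Taking the absolute value keeps edges that go from dark to bright, where the Sobel response is negative.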

Edge Detection

original => edge image / edge map

So far we have covered

1. Low-level

2. Middle-level

3. High-level

Middle-level Vision: Separation of foreground objects from the background

• We will not go into this in detail, but in short:

Object Recognition

Circular arcs, line segments, corners

Structures in the data

1. Low-level

2. Middle-level

3. High-level

High-level Vision example: Face detection

• The Viola-Jones algorithm is a classic and widely used algorithm for face detection, but also for detecting other types of objects

• Viola-Jones consists of 4 different “elements”:

Haar features

Integral image

AdaBoost

Cascading

The Viola-Jones Algorithm

Haar features | Integral image | AdaBoost | Cascading

• The problem with Haar features is that we need to calculate the average of a given region multiple times => High complexity! (O(N^2))

• Why?

– We have to use Haar features with all possible sizes and locations

– Same mask, but different sizes!!

• In an integral image the value at pixel (x, y) is the sum of the pixels above and to the left of (x, y), inclusive

Input image => Integral image

• Much lower complexity!! (O(1) per region sum)
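The integral image idea can be sketched in a few lines. After one pass over the image, the sum of any rectangle needs only four lookups, regardless of its size. The extra zero row and column, the function names and the two-rectangle feature layout (left half minus right half) are assumptions made for this illustration.

```python
def integral_image(image):
    """Return a (height+1) x (width+1) table of cumulative sums.

    ii[y+1][x+1] is the sum of all pixels above and to the left of (x, y),
    inclusive. The extra zero row/column avoids special cases at the border.
    """
    height, width = len(image), len(image[0])
    ii = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        for x in range(width):
            ii[y + 1][x + 1] = image[y][x] + ii[y][x + 1] + ii[y + 1][x] - ii[y][x]
    return ii


def region_sum(ii, top, left, bottom, right):
    """Sum of image[top..bottom][left..right] (inclusive) using four lookups."""
    return (ii[bottom + 1][right + 1] - ii[top][right + 1]
            - ii[bottom + 1][left] + ii[top][left])


def two_rectangle_feature(ii, top, left, bottom, right):
    """Hypothetical Haar-like feature: left half minus right half of a region."""
    middle = (left + right) // 2
    return (region_sum(ii, top, left, bottom, middle)
            - region_sum(ii, top, middle + 1, bottom, right))


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ii = integral_image(image)
lower_right = region_sum(ii, 1, 1, 2, 2)  # 5 + 6 + 8 + 9

# Bright left half, dark right half: a strong response for this feature.
stripes = [[9, 9, 1, 1], [9, 9, 1, 1]]
feature_value = two_rectangle_feature(integral_image(stripes), 0, 0, 1, 3)
print(lower_right, feature_value)
```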

• With a suitable scale of the Haar line feature placed as in the image below we would get quite a high match!

• The problem is that with the same Haar feature we would also get matches in other places in the image!

• One feature alone, or in other words a single classifier, is not accurate enough!

Called a weak classifier

The idea of Haar-cascading is to combine a series of weak classifiers (those that are barely better than “random guessing”) to achieve a strong classifier!

• Each of the figures below represents a general feature of a face (believe it or not!)

• If we combine all these features together… see the point?

• OK, this is a simplified example, but this combination of Haar features is unlikely to be found anywhere other than in a face!

• We want to find a series of weak classifiers that all match in the region of a face if you pass them in a certain order.

• If any classifier fails, then everything fails! => not a face!

[Figure: Haar Feature 1 => Pass => Haar Feature 2 => Pass => Haar Feature 3; a Fail at any of the three stages rejects the region]
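The pass/fail chain can be sketched as a loop that stops at the first failing stage. The three stage functions, their thresholds and the feature names below are hypothetical stand-ins for trained Haar-feature classifiers; a real detector would compute these values from the image.

```python
def run_cascade(region, stages):
    """Return True only if every stage accepts the region, checked in order."""
    for stage in stages:
        if not stage(region):
            return False  # first failure rejects the region: not a face
    return True


# Hypothetical stages: each looks at one precomputed feature value.
stages = [
    lambda region: region["eyes_darker_than_cheeks"] > 20,
    lambda region: region["nose_bridge_brighter_than_eyes"] > 10,
    lambda region: region["mouth_darker_than_chin"] > 5,
]

# Hypothetical feature values for two image regions.
face_like_region = {
    "eyes_darker_than_cheeks": 35,
    "nose_bridge_brighter_than_eyes": 18,
    "mouth_darker_than_chin": 9,
}
wall_region = {
    "eyes_darker_than_cheeks": 2,
    "nose_bridge_brighter_than_eyes": 30,
    "mouth_darker_than_chin": 9,
}

is_face = run_cascade(face_like_region, stages)
is_wall_face = run_cascade(wall_region, stages)
print(is_face, is_wall_face)
```

Because most regions of an image are rejected by the first stage or two, later stages run only on a small number of candidates.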

• How do we combine a series of weak classifiers into a strong classifier?

– AdaBoost tries out multiple weak classifiers over several rounds

– The best weak classifier in each round is selected

– Finally the best weak classifiers are combined
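The three steps above can be sketched with the standard discrete AdaBoost update. This is not necessarily the exact variant used in Viola-Jones, and the toy data, the three threshold classifiers and all names are assumptions: each sample is a single number standing in for a Haar feature response, with label +1 (face) or -1 (not a face).

```python
import math


def train_adaboost(samples, labels, weak_classifiers, rounds):
    """Select one weak classifier per round and return [(alpha, classifier), ...]."""
    n = len(samples)
    weights = [1.0 / n] * n
    strong = []
    for _ in range(rounds):
        # Try every weak classifier and keep the one with the lowest weighted error.
        best, best_error = None, float("inf")
        for clf in weak_classifiers:
            error = sum(w for w, s, y in zip(weights, samples, labels) if clf(s) != y)
            if error < best_error:
                best, best_error = clf, error
        best_error = min(max(best_error, 1e-10), 1 - 1e-10)  # avoid log(0)
        alpha = 0.5 * math.log((1 - best_error) / best_error)
        strong.append((alpha, best))
        # Misclassified samples gain weight, so the next round focuses on them.
        weights = [w * math.exp(-alpha * y * best(s))
                   for w, s, y in zip(weights, samples, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return strong


def predict(strong, sample):
    """Weighted vote of the selected weak classifiers."""
    score = sum(alpha * clf(sample) for alpha, clf in strong)
    return 1 if score >= 0 else -1


# Toy data: no single threshold separates the labels.
samples = [1, 2, 3, 4, 5, 6]
labels = [1, 1, -1, -1, 1, 1]
weak_classifiers = [
    lambda x: 1 if x < 2.5 else -1,
    lambda x: 1 if x > 4.5 else -1,
    lambda x: 1,
]

strong = train_adaboost(samples, labels, weak_classifiers, rounds=3)
predictions = [predict(strong, s) for s in samples]
print(predictions)
```

Each weak classifier alone misclassifies two of the six samples, while the weighted vote of the three gets all of them right.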

• In other words, the Viola-Jones face detection algorithm is based on machine learning!

– Before we can detect a face we need to train the algorithm to find the “strong classifier”

– The training phase is done by testing all the weak classifiers on a set of both positive (images containing a face) and negative (images not containing a face) training data.

The Viola-Jones Algorithm: Summary

1. Low-level

2. Middle-level

3. High-level

Now you know all about Computer Vision!

No you don’t!!!!

• BUT hopefully you have got an insight into the different levels of computer vision and you have seen some examples of operations at each level:

Low-level vision

o Blurring, sharpening, edge detection

Middle-level vision

o Extracting features / separating foreground objects from the background

High-level vision

o Recognition and interpretation, e.g. face detection

Summary

• Typically, computer vision software involves all 3 levels, for example:

1. Apply a sharpening filter to strengthen the edges and a Sobel edge detection filter to emphasise them

2. Extract the foreground objects from the background, e.g. find contours, lines, corners etc. The output is typically a binary image (such as an edge map)

3. On the binary image, apply some algorithm for interpreting the image or recognizing a specific object in it

Thank you for listening!

jonny.karlsson@arcada.fi
