real time face detection.pptx

8/14/2019 Real Time Face Detection.pptx

REAL TIME FACE DETECTION

By: SUMEET SAURAV

Wednesday, November 06, 2013

INTRODUCTION

Face detection has been one of the most active research topics in computer vision over the past decade.

It is the core of all facial analysis, e.g., face localization, facial feature detection, face recognition, face authentication, face tracking, and facial expression recognition.

It is a fundamental technique for other applications such as content-based image retrieval, video conferencing, and intelligent human-computer interaction (HCI).



GOAL AND CHALLENGES

The goal of face detection is to determine whether or not there are any faces in the image and, if present, to return the location and the extent of each face.

It is a challenge for computer vision due to variations in scale, location, orientation, pose, facial expression, lighting conditions, and various appearance features (e.g., presence of glasses, facial hair, makeup, etc.).



PERFORMANCE METRICS

Learning time

Execution time

The number of samples required in training

The ratio between the detection rate and the false alarm rate

Some common terms related to face detection:

False positive (needs to be minimized)

True positive (needs to be maximized)

False negative (needs to be minimized)
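The terms above can be turned into numbers once the detections have been counted. The following is a minimal sketch, not part of the original slides: the function name and the counts are hypothetical, and the false alarm rate is left out because it also needs the number of true negatives.

```python
def detection_metrics(true_pos, false_pos, false_neg):
    """Return (detection_rate, precision) from raw detection counts."""
    present = true_pos + false_neg       # all faces actually in the images
    reported = true_pos + false_pos      # all windows the detector flagged
    detection_rate = true_pos / present if present else 0.0
    precision = true_pos / reported if reported else 0.0
    return detection_rate, precision


# Hypothetical counts, chosen only for illustration.
rate, precision = detection_metrics(true_pos=90, false_pos=5, false_neg=10)
print(f"detection rate = {rate:.2f}, precision = {precision:.3f}")
```

Minimizing false negatives raises the detection rate, while minimizing false positives raises the precision.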


DIFFERENT FACE DETECTION APPROACHES

Yang classified face detection approaches into four major categories:

Knowledge-based (depend on a set of rules, based on human knowledge, to detect faces).

Feature invariant (locate faces by extracting structural features of the face using a statistical classifier).

Template matching (use predefined or parameterized face templates to locate and detect faces, by computing the correlation values between the template and the input image).

Appearance-based (depend on a set of representative training face images to learn face models). This category shows the best performance.


VIOLA JONES FACE DETECTION

Robust: very high detection rate (true-positive rate) and very low false-positive rate, always.

Real time: for practical applications at least 2 frames per second must be processed.

Face detection, not recognition: the goal is to distinguish faces from non-faces (face detection is the first step in the identification process).

There are three key contributions:

Introduction of the integral image.

Simple and efficient classifier.

Cascading of classifiers.


INTEGRAL IMAGE

A new image representation that allows for very fast feature evaluation.

A set of features reminiscent of Haar basis functions is used for face detection.

The integral image is used to compute these features at many scales.

It is similar to the summed-area table used in computer graphics.

It can be computed using a few operations per pixel. Once computed, any Haar feature can be calculated at any location and at any scale in constant time.


CLASSIFIER

The second contribution of the Viola Jones face detection framework is the introduction of a simple and efficient classifier based on the AdaBoost algorithm.

The classifier selects a small number of important features from the pool of Haar features (nearly 160,000) within any sub-window.

Feature selection is achieved using the AdaBoost learning algorithm by constraining each weak classifier to depend on only a single feature.

Each stage of the boosting process can be viewed as a feature selection process (it selects a new weak classifier).


CASCADING OF CLASSIFIERS

The cascade structure dramatically increases the speed of the detector by focusing attention on promising regions of the image.

More complex processing is reserved only for these promising regions.

The key measure of such an approach is the false negative rate of the attentional process.

Those sub-windows which are not rejected by the initial classifier are processed by a sequence of classifiers, each slightly more complex than the last.

If any classifier rejects the sub-window, no further processing is performed.
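The early-rejection behaviour described above can be sketched in a few lines. This is an illustration only: the three stage functions are hypothetical toy checks on a list of pixel values, not the trained Haar stages of the real detector.

```python
def run_cascade(stages, window):
    """Return (accepted, stages_evaluated) for one sub-window.

    `stages` is ordered from cheapest to most complex; each stage returns
    True to pass the window on and False to reject it.
    """
    for index, stage in enumerate(stages, start=1):
        if not stage(window):
            return False, index          # rejected: no further processing
    return True, len(stages)


# Hypothetical stages on a toy "window" (a flat list of pixel values).
stages = [
    lambda w: max(w) - min(w) > 10,                           # cheap contrast check
    lambda w: sum(w) / len(w) > 50,                           # slightly more work
    lambda w: sum(w[:len(w) // 2]) < sum(w[len(w) // 2:]),    # most work
]

print(run_cascade(stages, [5, 5, 6, 5]))        # flat region, rejected at stage 1
print(run_cascade(stages, [40, 60, 90, 120]))   # passes all three stages
```

Most sub-windows in a real image behave like the first call, so the average cost per window stays close to the cost of the first stage.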


WHAT IS A FEATURE?

Why features? Why not pixels?

1) Features can act to encode ad-hoc domain knowledge that is difficult to learn using a finite quantity of training data.

2) A feature-based system operates much faster than a pixel-based system.

o The simple features used are reminiscent of Haar basis functions.

o Three kinds of features are used:

1) Two-rectangle feature.

2) Three-rectangle feature.

3) Four-rectangle feature.
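The three kinds of feature can be illustrated with direct pixel sums. The sketch below is an assumption-laden example rather than the detector's own code: the function names, the sign convention (which rectangles are added and which are subtracted) and the toy image are all chosen here for illustration, and only the horizontal variants of the two- and three-rectangle features are shown.

```python
def rect_sum(img, top, left, height, width):
    """Sum of the pixels in a rectangle, computed directly from the image."""
    return sum(sum(row[left:left + width]) for row in img[top:top + height])


def two_rect(img, top, left, height, width):
    """Left half minus right half (width must be even)."""
    half = width // 2
    return (rect_sum(img, top, left, height, half)
            - rect_sum(img, top, left + half, height, half))


def three_rect(img, top, left, height, width):
    """Outer thirds minus the centre third (width divisible by 3)."""
    third = width // 3
    outer = (rect_sum(img, top, left, height, third)
             + rect_sum(img, top, left + 2 * third, height, third))
    return outer - rect_sum(img, top, left + third, height, third)


def four_rect(img, top, left, height, width):
    """One diagonal pair minus the other (height and width must be even)."""
    h, w = height // 2, width // 2
    diag = rect_sum(img, top, left, h, w) + rect_sum(img, top + h, left + w, h, w)
    anti = rect_sum(img, top, left + w, h, w) + rect_sum(img, top + h, left, h, w)
    return diag - anti


# Toy 4x6 image: dark left half, bright right half (a vertical edge).
edge = [[0, 0, 0, 9, 9, 9] for _ in range(4)]
print(two_rect(edge, 0, 0, 4, 6))     # -108: strong response to the edge
print(three_rect(edge, 0, 0, 4, 6))   # 36
print(four_rect(edge, 0, 0, 4, 6))    # 0: no diagonal structure
```

Each feature reduces a whole region to a single number, which is what lets a very simple classifier act on it.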


FEATURES USED

Given that the base resolution of the detector is 24×24, the exhaustive set of rectangle features is quite large: about 160,000.


(Unlike the Haar basis, the set of rectangle features is overcomplete.)
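The size of the feature set can be checked by enumeration. The sketch below assumes five base shapes (two-rectangle horizontal and vertical, three-rectangle horizontal and vertical, and four-rectangle) scaled in whole multiples of the base shape; that choice of shapes is an assumption of this example, and under it the count comes out slightly above the rounded figure of 160,000.

```python
def count_features(window=24,
                   base_shapes=((2, 1), (1, 2), (3, 1), (1, 3), (2, 2))):
    """Count every position and scale of each base shape inside the window."""
    total = 0
    for base_w, base_h in base_shapes:
        # Widths and heights grow in whole multiples of the base shape.
        for w in range(base_w, window + 1, base_w):
            for h in range(base_h, window + 1, base_h):
                total += (window - w + 1) * (window - h + 1)
    return total


print(count_features())   # 162336 for a 24x24 window
```

A 24×24 window has only 576 pixels, so a set of this size is necessarily overcomplete.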


INTEGRAL IMAGE DESCRIPTION

Rectangle features can be computed very rapidly using an intermediate representation for the image, which we call the integral image.

The integral image at location x, y contains the sum of the pixels above and to the left of x, y, inclusive:

$$ii(x, y) = \sum_{x' \le x,\; y' \le y} i(x', y')$$

where ii(x, y) is the integral image and i(x, y) is the original image.


Using the following pair of recurrences:

$$s(x, y) = s(x, y-1) + i(x, y)$$

$$ii(x, y) = ii(x-1, y) + s(x, y)$$

(where s(x, y) is the cumulative row sum, s(x, -1) = 0, and ii(-1, y) = 0) the integral image can be computed in one pass over the original image.

Using the integral image, any rectangular sum can be computed in four array references.
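The one-pass computation and the four-reference rectangle sum can be sketched as follows. This is an illustrative example: the function names and the zero-based, row-major list layout (`ii[y][x]`) are assumptions, and out-of-range references are treated as zero to match the boundary conditions of the recurrences.

```python
def integral_image(img):
    """ii[y][x] = sum of img over rows 0..y and columns 0..x, inclusive."""
    height, width = len(img), len(img[0])
    ii = [[0] * width for _ in range(height)]
    for x in range(width):
        s = 0                                    # s(x, -1) = 0
        for y in range(height):
            s += img[y][x]                       # s(x, y) = s(x, y-1) + i(x, y)
            left = ii[y][x - 1] if x > 0 else 0  # ii(-1, y) = 0
            ii[y][x] = left + s                  # ii(x, y) = ii(x-1, y) + s(x, y)
    return ii


def rect_sum(ii, top, left, bottom, right):
    """Sum over rows top..bottom and columns left..right (four references)."""
    def at(y, x):
        return ii[y][x] if y >= 0 and x >= 0 else 0
    return (at(bottom, right) - at(top - 1, right)
            - at(bottom, left - 1) + at(top - 1, left - 1))


img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
ii = integral_image(img)
print(ii[2][2])                   # 45, the sum of the whole image
print(rect_sum(ii, 1, 1, 2, 2))   # 28 = 5 + 6 + 8 + 9
```

The cost of `rect_sum` does not depend on the rectangle size, which is why features can be evaluated at any scale in constant time.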


LEARNING CLASSIFICATION FUNCTIONS

There are 160,000 rectangle features associated with each image sub-window, a number far larger than the number of pixels.

Computing the complete set is prohibitively expensive.

The working hypothesis, which was borne out in practice, is that a very small number of these features can be combined to form an effective classifier.

The main challenge is to find these features.

The Viola Jones system used AdaBoost to select the features and to train the classifier.


ADABOOST ALGORITHM

The AdaBoost algorithm was developed by Freund and Schapire, and their paper is one of the most cited.

o It comes under the category of ensemble learning systems, where weak learners are boosted into strong learners that can make very accurate predictions.

o Weak learners, or base learners, are those that are only slightly better than random guessing.

o The algorithm was formulated to answer the question posed by Kearns and Valiant: whether two complexity classes, weakly learnable and strongly learnable problems, are equal.


RATIONALE

Imagine the situation where we want to build an email filter that can distinguish spam from non-spam.

The general way we would approach this problem is:

1) Gather as many examples as possible of both spam and non-spam emails.

2) Train a classifier using these examples and their labels.

3) Take the learned classifier, or prediction rule, and use it to filter your mail.

4) The goal is to train a classifier that makes the most accurate predictions possible on new test examples.

But building a highly accurate classifier is a difficult task (we still get spam).


We could probably come up with many quick rules of thumb. These might be only moderately accurate.

An example could be: if the subject line contains "buy now", then classify as spam.

This certainly doesn't cover all spam, but it will be significantly better than random guessing.


BASIC IDEA OF BOOSTING

Boosting refers to a general and provably effective method of producing a very accurate classifier by combining rough and moderately inaccurate rules of thumb.

It is based on the observation that finding many rough rules of thumb can be a lot easier than finding a single, highly accurate classifier.

To begin, we define an algorithm for finding the rules of thumb, which we call a weak learner.

The boosting algorithm repeatedly calls this weak learner, each time feeding it a different distribution over the training data (in AdaBoost).

Each call generates a weak classifier, and we must combine all of these into a single classifier that, hopefully, is much more accurate than any one of the rules.
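The loop described above can be sketched with the spam example. This is a minimal illustration of discrete AdaBoost, assuming a fixed pool of hand-written rules as the weak learner's search space; the subject lines, the rules and the function names are all hypothetical.

```python
import math


def adaboost(examples, labels, rules, rounds):
    """Discrete AdaBoost over a fixed pool of rules (outputs and labels are +1/-1)."""
    n = len(examples)
    weights = [1.0 / n] * n                  # initial distribution over the data
    ensemble = []                            # list of (alpha, rule) pairs

    def weighted_error(rule):
        return sum(w for w, x, y in zip(weights, examples, labels) if rule(x) != y)

    for _ in range(rounds):
        best = min(rules, key=weighted_error)    # the weak learner's choice
        err = weighted_error(best)
        if err >= 0.5:
            break                                # nothing beats random guessing
        err = max(err, 1e-10)                    # guard against log of infinity
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, best))
        # Misclassified examples gain weight, correct ones lose it; then normalise.
        weights = [w * math.exp(-alpha * y * best(x))
                   for w, x, y in zip(weights, examples, labels)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    score = sum(alpha * rule(x) for alpha, rule in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical subject lines: +1 = spam, -1 = not spam.
subjects = ["buy now cheap pills", "free prize inside", "buy now free offer",
            "meeting at noon", "free lunch on friday", "project status"]
labels = [1, 1, 1, -1, -1, -1]
rules = [
    lambda s: 1 if "buy now" in s else -1,
    lambda s: 1 if "free" in s else -1,
    lambda s: 1 if "pills" in s or "prize" in s else -1,
]

ensemble = adaboost(subjects, labels, rules, rounds=3)
print([predict(ensemble, s) for s in subjects])
```

Each rule alone misclassifies at least one subject line, but the three weighted votes together classify all six correctly on this toy set.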


KEY QUESTIONS DEFINING AND ANALYZING BOOSTING

How should the distribution be chosen each round?

How should the weak rules be combined into a

single rule?

How should the weak learner be defined?

How many weak classifiers should we learn?


GETTING STARTED

WEAK LEARNERS AND WEAK CLASSIFIERS

A WL/WC EXAMPLE FOR IMAGES

THE STRONG ADABOOST CLASSIFIER

ILLUSTRATION OF ADABOOST CLASSIFIER

VIOLA JONES APPROACH

The weak learning algorithm is designed to select the single rectangle feature which best separates the positive and negative examples.

For each feature, the weak learner determines the optimal threshold classification function, such that the minimum number of examples are misclassified.

A weak classifier h(x, f, p, θ) thus consists of a feature (f), a threshold (θ) and a polarity (p) indicating the direction of the inequality:

$$h(x, f, p, \theta) = \begin{cases} 1 & \text{if } p\,f(x) < p\,\theta \\ 0 & \text{otherwise} \end{cases}$$
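For a single feature, the search for the threshold and polarity can be sketched by brute force. This is an illustration under stated assumptions: the function name and the toy feature values are hypothetical, labels are 1 for face and 0 for non-face, and every candidate threshold is tried in turn rather than using a faster sorted pass.

```python
def train_stump(values, labels, weights):
    """Return (threshold, polarity, weighted_error) minimising the error.

    The stump predicts 1 when polarity * value < polarity * threshold.
    """
    levels = sorted(set(values))
    # Candidate thresholds: midpoints between neighbours plus one past each end.
    thresholds = ([levels[0] - 1]
                  + [(a + b) / 2 for a, b in zip(levels, levels[1:])]
                  + [levels[-1] + 1])
    best = (None, 1, float("inf"))
    for theta in thresholds:
        for polarity in (1, -1):
            err = sum(w for v, y, w in zip(values, labels, weights)
                      if (1 if polarity * v < polarity * theta else 0) != y)
            if err < best[2]:
                best = (theta, polarity, err)
    return best


# Hypothetical feature values: faces (label 1) respond with larger values.
values = [2, 3, 5, 8, 9, 11]
labels = [0, 0, 0, 1, 1, 1]
weights = [1 / 6] * 6

theta, polarity, err = train_stump(values, labels, weights)
print(theta, polarity, err)   # 6.5 -1 0
```

The polarity of -1 flips the inequality so that values above the threshold are classified as faces. In boosting, this search is repeated for every feature in each round with the current example weights.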



THE ATTENTIONAL CASCADE

Simpler classifiers are used to reject the majority of sub-windows before more complex classifiers are called upon to achieve low false positive rates.

Stages in the cascade are constructed by training classifiers using AdaBoost.

Starting with a two-feature strong classifier, an effective face filter can be obtained by adjusting the strong classifier threshold to minimize false negatives.
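The effect of adjusting the stage threshold can be shown on a handful of scores. The values below are invented for illustration, as are the function names; the sketch simply lowers the threshold until every training face passes and then reports how the false positive rate changes.

```python
def lowest_passing_threshold(face_scores):
    """Largest threshold that still accepts every face (score >= threshold)."""
    return min(face_scores)


def rates(threshold, face_scores, nonface_scores):
    """Return (detection_rate, false_positive_rate) at the given threshold."""
    detection = sum(s >= threshold for s in face_scores) / len(face_scores)
    false_pos = sum(s >= threshold for s in nonface_scores) / len(nonface_scores)
    return detection, false_pos


# Hypothetical strong-classifier scores for one stage.
faces = [0.9, 0.7, 0.4, 0.8]
nonfaces = [0.1, 0.5, 0.2, 0.45, 0.6, 0.05, 0.15, 0.25]

default = rates(0.5, faces, nonfaces)
adjusted = rates(lowest_passing_threshold(faces), faces, nonfaces)
print(default)    # (0.75, 0.25)
print(adjusted)   # (1.0, 0.375)
```

The adjusted stage misses no faces at the price of passing more non-faces, which the later, more complex stages are then expected to reject.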



HAAR-FEATURE BASED OBJECT DETECTION ALGORITHM (FRANK VAHID)

Algorithm overview

Image scaling

Haar-feature and integral image

Decision cascade



DESIGNED ARCHITECTURE

CLASSIFIER DESIGN

PARALLELIZED ARCHITECTURE OF MULTIPLE CLASSIFIERS FOR FACE DETECTION (J. CHO)

Face Detection Algorithm.

Integral Image.

Haar Feature

Cascade

HARDWARE ARCHITECTURE

BLOCK DIAGRAM OF PROPOSED FACE DETECTION SYSTEM

ARCHITECTURE FOR GENERATING INTEGRAL IMAGE WINDOW

EQUATIONS INVOLVED FOR INTEGRAL IMAGE CALCULATION

HAAR FEATURE CALCULATION OF HAAR CLASSIFIER

ARCHITECTURE FOR PERFORMING HAAR CLASSIFICATION

NUMBER OF HAAR CLASSIFIERS IN EACH STAGE

PERFORMANCE OF PROPOSED FACE DETECTION SYSTEMS

MODIFIED ARCHITECTURE FOR REAL-TIME FACE DETECTION USING FPGA

The system is based on the well-known Viola Jones framework, which consists of the AdaBoost algorithm integrated with Haar features.

The hardware design techniques are modified to achieve more parallel processing and a higher detection speed.

The system, implemented on a Xilinx Virtex-5 FPGA development board, achieves a high face detection rate (91.3%) at 60 frames/second for a VGA (640×480) video source.

The power consumption of the implementation is 2.1 W.

FACE DETECTION ARCHITECTURE

FRAME STORE MODULE

The frame store system performs four functions:

Storing the incoming frame line by line.

Sending the stored line to the integral image generator.

Indicating the detected results on the stored frame.

Sending the processed frame out to the DVI interface.


INTEGRAL IMAGE GENERATOR

The integral image generator performs two functions:

It converts the given 24-bit RGB image into 8-bit gray scale.

After conversion it generates the integral image of the gray scale image. The integral image is generated line by line as the gray scale image is formed.

The expression for evaluating the integral image is as follows:

$$II(x, y) = \sum_{k \le x} I(k, y) + II(x, y-1)$$
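A software model of this line-by-line generation might look like the sketch below. It is not the VHDL design: the function names are hypothetical, and the integer luma weights used for the gray scale conversion are an assumption, since the slides do not say which conversion the hardware uses.

```python
def to_gray(rgb):
    """24-bit (r, g, b) pixel to 8-bit gray; the weights are an assumption."""
    r, g, b = rgb
    return (77 * r + 150 * g + 29 * b) >> 8    # weights sum to 256


def integral_lines(rgb_rows):
    """Yield one integral-image line for each incoming RGB line."""
    previous = None                             # II(x, y-1); absent for line 0
    for row in rgb_rows:
        running = 0                             # sum of I(k, y) for k <= x
        line = []
        for x, pixel in enumerate(row):
            running += to_gray(pixel)
            above = previous[x] if previous is not None else 0
            line.append(running + above)
        previous = line
        yield line


# Toy 2x2 frame of gray-valued RGB pixels (1, 2 / 3, 4).
frame = [[(1, 1, 1), (2, 2, 2)],
         [(3, 3, 3), (4, 4, 4)]]
print(list(integral_lines(frame)))   # [[1, 3], [4, 10]]
```

Only the previous integral line has to be kept to produce the next one, which suits a design that receives the frame one line at a time.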


IMAGE SCALER SYSTEM

Instead of applying the image scaling method to the original image, the algorithm scales only the integral image, by a factor of 2.

This provides two advantages:

First, the overall system accuracy increases due to a reduction in scaling error.

Second, it only requires selecting every other data value from every other line of the image.
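The selection of alternate values can be checked in software. The sketch below assumes the scaler keeps the odd-indexed values of the odd-indexed lines (the slides do not say which ones the hardware keeps); under that assumption the subsampled integral image is exactly the integral image of the frame after summing each 2×2 block, so no interpolation is involved. All function names are hypothetical.

```python
def integral_image(img):
    """Line-by-line integral image of a 2-D list."""
    ii = []
    for y, row in enumerate(img):
        running, line = 0, []
        for x, value in enumerate(row):
            running += value
            line.append(running + (ii[y - 1][x] if y > 0 else 0))
        ii.append(line)
    return ii


def halve_integral(ii):
    """Keep every other value from every other line (odd indices)."""
    return [row[1::2] for row in ii[1::2]]


def block_sums(img):
    """Direct 2x downscale: each output pixel is the sum of a 2x2 block."""
    return [[img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]
             for x in range(0, len(img[0]) - 1, 2)]
            for y in range(0, len(img) - 1, 2)]


img = [[1, 2, 3, 4],
       [5, 6, 7, 8],
       [9, 10, 11, 12],
       [13, 14, 15, 16]]

scaled = halve_integral(integral_image(img))
print(scaled)                                         # [[14, 36], [60, 136]]
print(scaled == integral_image(block_sums(img)))      # True
```

The scaled values are sums rather than averages, so they are four times larger than those of an averaged half-size image.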


INTEGRAL SUB-SYSTEM

It comprises an N×N window, so an N²-to-1 multiplexer has to be designed in order to provide parallel access to all features.

(Difference from Architecture-1.)

A 32-bit N²-to-1 multiplexer is required, as the integral of the whole line is generated at a time.

Instead of generating 12 such multiplexers for feature extraction, only 9 multiplexers are required.


CLASSIFICATION SYSTEM

The system basically consists of 3 classification hardware systems.

The whole classification is handled by a controller, from the selection of the first classifier to the decision of each stage.

The controller holds all the values for the system, including the stage threshold for every stage and the number of each type of classifier in every stage.

CLASSIFIER TYPE 1 AND 2

CLASSIFIER TYPE 3, 4 AND 5

CLASSIFIER TYPE 5

IMPLEMENTATION

The whole system is implemented on a Xilinx Virtex-5 LX110T FPGA board using VHDL.

The classifier set directly available from the OpenCV face-classifier system has been used.

The sub-window size used in this system is 20x20, and the classifier consists of 22 stages and 2135 feature classifiers.

The frame store module is implemented on the SRAM memory chip available on the kit.

The integral image generator is built using the BRAM available within the FPGA.

Each BRAM is configured for a 32-bit memory word and can store up to 1024 such words.


It requires 20 active BRAMs to store 20 lines.

The sub-system, as well as the multiplexer system, is implemented completely on LUT resources.

The classification system uses both BRAM and LUTs.

Two BRAM modules store the complete classifier node positions for types 1 and 2, and three BRAM modules store them for types 3, 4 and 5.

Both the left and right values are 8 bits and the largest classifier threshold is 16 bits, so only one memory word is required to store all these values.

The detected-face store consists of two BRAM modules, which hold the positions of the detected windows for display.
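The memory figures quoted above can be sanity-checked with a little arithmetic. The sketch below uses the frame size, window height and BRAM word size from the slides; the field order inside the packed word and the treatment of all three fields as unsigned are assumptions made for illustration.

```python
WIDTH, HEIGHT = 640, 480           # VGA frame, from the slides
WINDOW_LINES = 20                  # lines buffered for the 20x20 sub-window
WORD_BITS, WORDS_PER_BRAM = 32, 1024

# Worst case for an integral value: every 8-bit gray pixel is 255.
max_full_frame = WIDTH * HEIGHT * 255
max_window_lines = WIDTH * WINDOW_LINES * 255
print(max_full_frame.bit_length(), max_window_lines.bit_length())   # 27 22

# One 640-value line fits in one 1024-word BRAM, so 20 lines need 20 BRAMs.
lines_per_bram = 1 if WIDTH <= WORDS_PER_BRAM else 0
print(WINDOW_LINES // lines_per_bram)                               # 20


def pack_node(left, right, threshold):
    """Pack two 8-bit values and a 16-bit threshold into one 32-bit word."""
    if not (0 <= left < 256 and 0 <= right < 256 and 0 <= threshold < 65536):
        raise ValueError("field out of range")
    return (left << 24) | (right << 16) | threshold   # assumed field order


def unpack_node(word):
    return (word >> 24) & 0xFF, (word >> 16) & 0xFF, word & 0xFFFF


word = pack_node(200, 17, 54321)
print(hex(word), unpack_node(word))
```

Even a full-frame integral value needs only 27 bits, so a 32-bit BRAM word is sufficient, and 8 + 8 + 16 bits fill exactly one word.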


RESOURCE UTILIZATION AND PERFORMANCE

COMPARISON WITH OTHER IMPLEMENTATIONS

REFERENCES

P. Viola and M. J. Jones, "Robust real-time face detection," Int. J. Comput. Vision, vol. 57, no. 2, pp. 137-154, 2004.

C. Huang and F. Vahid, "Scalable Object Detection Accelerators on FPGAs Using Custom Design Space Exploration," in Proceedings of the IEEE 9th Symposium on Application Specific Processors, 2011, pp. 115-121.

