face detection system
1. INTRODUCTION
1.1 Outline of a Typical Face Detection System:
1.1.1. The acquisition module:
This is the entry point of the face recognition process. It is the module where the
face image under consideration is presented to the system. In other words, the user is
asked to present a face image to the face recognition system in this module. An
acquisition module can request a face image from several different environments: The
face image can be an image file that is located on a magnetic disk, it can be captured by
a frame grabber and camera or it can be scanned from paper with the help of a scanner.
1.1.2. The pre-processing module:
In this module, by means of early vision techniques, face images are normalized
and if desired, they are enhanced to improve the recognition performance of the system.
Some or all of the pre-processing steps may be implemented in a face recognition
system.
1.1.3. The feature extraction module:
After performing some pre-processing (if necessary), the normalized face image
is presented to the feature extraction module in order to find the key features that are
going to be used for classification. In other words, this module is responsible for
composing a feature vector that represents the face image well enough.
1.1.4. The classification module:
In this module, with the help of a pattern classifier, the extracted features of the face
image are compared with the ones stored in a face library (or face database). After this
comparison, the face image is classified as either known or unknown.
Principal component analysis, based on information theory concepts, seeks a
computational model that best describes a face by extracting the most relevant
information contained in that face. The Eigenfaces approach is a principal component
analysis method in which a small set of characteristic pictures is used to describe the
variation between face images. The goal is to find the eigenvectors (Eigenfaces) of the
covariance matrix of the distribution spanned by a training set of face images. Every
face image is then represented by a linear combination of these eigenvectors.
Computing these eigenvectors directly is quite difficult for typical image sizes, but
an approximation that is suitable for practical purposes is also presented. Recognition is
performed by projecting a new image into the subspace spanned by the Eigenfaces
and then classifying the face by comparing its position in face space with the
positions of known individuals.
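The projection and comparison steps described above can be sketched in a few lines of
MATLAB. This is only an illustrative sketch, not the code of this project: random
vectors stand in for face images, and the image size, the number of training images
and the number of Eigenfaces kept (k) are assumed values. The eigenvectors are taken
from the small M-by-M matrix A'*A rather than from the full covariance matrix, which
is one common way to make the computation practical.

```matlab
% Eigenfaces sketch on synthetic data (random vectors stand in for face images).
rng(0);                            % reproducible random data
h = 8; w = 8;                      % assumed image size
M = 10;                            % assumed number of training images
X = rand(h*w, M);                  % each column is one vectorized training image

meanFace = mean(X, 2);             % average face
A = X - repmat(meanFace, 1, M);    % centered training set

% Eigenvectors of the small M-by-M matrix A'*A instead of the large
% (h*w)-by-(h*w) covariance matrix A*A'.
L = A' * A;
L = (L + L') / 2;                  % enforce exact symmetry
[V, D] = eig(L);
[~, order] = sort(diag(D), 'descend');

k = 5;                             % number of Eigenfaces kept (assumption)
eigenfaces = A * V(:, order(1:k)); % map back to image space
eigenfaces = eigenfaces ./ repmat(sqrt(sum(eigenfaces.^2, 1)), h*w, 1);

% Project the training images and a new image into face space.
trainWeights = eigenfaces' * A;                   % k-by-M
probe = X(:, 3) + 0.01 * randn(h*w, 1);           % noisy copy of image 3
probeWeights = eigenfaces' * (probe - meanFace);  % k-by-1

% Classify by nearest neighbor in face space.
dists = sqrt(sum((trainWeights - repmat(probeWeights, 1, M)).^2, 1));
[~, bestMatch] = min(dists);
fprintf('Closest training image: %d\n', bestMatch);
```

In a real system a distance threshold would also be needed to decide whether the
closest match is close enough to count as a known face.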
The Eigenfaces approach seems to be an adequate method for face
recognition due to its simplicity, speed and learning capability. Experimental results are
given to demonstrate the viability of the proposed “face detection method”.
1.2 Definition:
Face detection is concerned with finding whether or not there are any faces in a
given image (usually in gray scale) and, if present, returning the image location and
content of each face. This is the first step of any fully automatic system that analyzes the
information contained in faces (e.g., identity, gender, expression, age, race and pose). While
earlier work dealt mainly with upright frontal faces, several systems have been
developed that are able to detect faces fairly accurately with in-plane or out-of-plane
rotations in real time. Although a face detection module is typically designed to deal
with single images, its performance can be further improved if a video stream is available.
The advances of computing technology in recent years have facilitated the development of
real-time vision modules that interact with humans. Examples abound,
particularly in biometrics and human computer interaction, as the information contained
in faces needs to be analyzed for systems to react accordingly. For biometric systems that
use faces as non-intrusive input modules, it is imperative to locate faces in a scene
before any recognition algorithm can be applied. An intelligent vision-based user interface
should be able to tell the attention focus of the user (i.e., where the user is looking)
in order to respond accordingly. To detect facial features accurately for applications
such as digital cosmetics, faces need to be located and registered first to facilitate further
processing. It is evident that face detection plays an important and critical role in the
success of any face processing system.
The face detection problem is challenging as it needs to account for all possible
appearance variations caused by changes in illumination, facial features, occlusions, etc. In
addition, it has to detect faces that appear at different scales and poses, and with in-plane
rotations. In spite of all these difficulties, tremendous progress has been made in the last
decade and many systems have shown impressive real-time performance. The recent advances
of these algorithms have also made significant contributions in detecting other objects
such as humans/pedestrians and cars.
Operation of a Face Detection System:
Most detection systems carry out the task by extracting certain properties (e.g., local
features or holistic intensity patterns) of a set of training images acquired at a fixed pose
(e.g., upright frontal pose) in an off-line setting. To reduce the effects of illumination change,
these images are processed with histogram equalization [3, 1] or standardization (i.e.,
zero mean unit variance) [2]. Based on the extracted properties, these systems typically
scan through the entire image at every possible location and scale in order to locate
faces. The extracted properties can be either manually coded (with human knowledge) or
learned from a set of data, as adopted in the recent systems that have demonstrated
impressive results [3, 1, 4, 5, 2].
In order to detect faces at different scales, the detection process is usually
repeated on a pyramid of images whose resolutions are reduced by a certain factor (e.g.,
1.2) from the original one [3, 1]. Such procedures may be expedited when other visual
cues (e.g., color and motion) can be accurately incorporated as pre-processing steps to
reduce the search space. As faces are often detected across scales, the raw detected
faces are usually further processed to combine overlapped results and remove false
positives with heuristics (e.g., faces typically do not overlap in images) or further
processing (e.g., edge detection and intensity variance).
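To give a feeling for the size of this search space, the following sketch counts the
pyramid levels and window positions for one image. The image size matches the 320 by
240 example used later in the text and the factor 1.2 comes from the paragraph above;
the 24 by 24 window and the one-pixel stride are assumptions made for illustration.

```matlab
imgSize   = [240 320];   % rows, cols of the input image (example value)
winSize   = [24 24];     % detector window size (assumption)
scaleStep = 1.2;         % resolution reduction factor per pyramid level
stride    = 1;           % window shift in pixels (assumption)

level = 0;
totalWindows = 0;
while true
    sz = floor(imgSize / scaleStep^level);            % image size at this level
    if any(sz < winSize)
        break;                                        % window no longer fits
    end
    nPos = prod(floor((sz - winSize) / stride) + 1);  % window positions
    fprintf('level %2d: %3d x %3d, %6d windows\n', level, sz(1), sz(2), nPos);
    totalWindows = totalWindows + nPos;
    level = level + 1;
end
fprintf('%d levels, %d windows in total\n', level, totalWindows);
```

The count grows quickly with image size, which is why cues such as color or motion
are attractive for pruning the search before the detector runs.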
Numerous representations have been proposed for face detection, including
pixel-based [3, 1, 5], parts-based [6, 4, 7], local edge features [8, 9], Haar wavelets [10,
4] and Haar-like features [2, 11]. While earlier holistic representation schemes are able
to detect faces [3, 1, 5], the recent systems with Haar-like features [2, 12, 13] have
demonstrated impressive empirical results in detecting faces under occlusion. A large
and representative training set of face images is essential for the success of learning-
based face detectors. From the set of collected data, more positive examples can be
synthetically generated by perturbing, mirroring, rotating and scaling the original face
images [3, 1]. On the other hand, it is relatively easy to collect negative examples by
randomly sampling images that contain no faces [3, 1].
As face detection can be mainly formulated as a pattern recognition problem,
numerous algorithms have been proposed to learn generic face templates (e.g.,
eigenface and statistical distribution) or discriminant classifiers (e.g., neural networks,
Fisher linear discriminant, sparse network of Winnows, decision tree, Bayes classifiers,
support vector machines, and AdaBoost).
Typically, a good face detection system needs to be trained over several
iterations. One common method to further improve the system is to bootstrap a trained
face detector with test sets, and re-train the system with the false positives as well as
the false negatives. This process is repeated several times in order to further improve the
performance of a face detector. Surveys on these topics can be found in the literature, and
the most recent advances are discussed in the next section.
1.3 Recent Advances:
The AdaBoost-based face detector by Viola and Jones demonstrated that faces
can be fairly reliably detected in real-time (i.e., more than 15 frames per second on 320
by 240 images with desktop computers) under partial occlusion. While Haar
wavelets had been used in earlier work for representing faces and pedestrians, they
proposed the use of Haar-like features, which can be computed efficiently with the integral
image. Figure 1 shows four types of Haar-like features that are used to encode the
horizontal, vertical and diagonal intensity information of face images at different positions
and scales. Given a sample image of 24 by 24 pixels, the exhaustive set of parameterized
Haar-like features (at different positions and scales) is very large (about 160,000). Contrary
to most of the prior algorithms that use one single strong classifier (e.g., neural networks
and support vector machines), they used an ensemble of weak classifiers where each one is
constructed by thresholding one Haar-like feature. The weak classifiers are selected
and weighted using the AdaBoost algorithm. As there is a large number of weak
classifiers, they presented a method to arrange these classifiers into a cascade of several
stages using a set of optimization criteria.
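The reason the integral image makes Haar-like features cheap can be shown with a small
sketch: once the cumulative sums are stored, the sum over any rectangle needs only four
lookups, so a two-rectangle feature needs a handful of additions regardless of its
size. The 6 by 6 test patch and the rectangle coordinates below are arbitrary
illustration values, and rectSum is a helper name introduced here.

```matlab
img = magic(6);                       % small stand-in for a grayscale patch

% Integral image with a zero row and column prepended, so that
% ii(r+1, c+1) equals the sum of img(1:r, 1:c).
ii = zeros(size(img) + 1);
ii(2:end, 2:end) = cumsum(cumsum(img, 1), 2);

% Sum of img(r1:r2, c1:c2) from four lookups (hypothetical helper).
rectSum = @(r1, c1, r2, c2) ...
    ii(r2+1, c2+1) - ii(r1, c2+1) - ii(r2+1, c1) + ii(r1, c1);

% Two-rectangle Haar-like feature: left half minus right half of the
% 4-by-4 block whose top-left corner is at (2, 2).
leftSum  = rectSum(2, 2, 5, 3);
rightSum = rectSum(2, 4, 5, 5);
feature  = leftSum - rightSum;

% Check against direct summation over the pixels.
direct = sum(sum(img(2:5, 2:3))) - sum(sum(img(2:5, 4:5)));
fprintf('feature = %d, direct = %d\n', feature, direct);
```

A weak classifier in the sense of the text would then simply compare such a feature
value against a learned threshold.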
Within each stage, an ensemble of several weak classifiers is trained using the
AdaBoost algorithm. The motivation behind the cascade of classifiers is that simple
classifiers at early stages can filter out most negative examples efficiently, and stronger
classifiers at later stages are only necessary to deal with instances that look like faces.
The final detector, a 38-layer cascade of classifiers with 6,060 Haar-like features,
demonstrated impressive real-time performance with fairly high detection and low false
positive rates. Several extensions to detect faces in multiple views with in-plane rotation
have since been proposed. An implementation of the AdaBoost-based face detector can
be found in the Intel OpenCV library.
Despite the excellent run-time performance of the boosted cascade classifier, the
training time of such a system is rather lengthy. In addition, the classifier cascade is an
example of a degenerate decision tree with an unbalanced data set (i.e., a small set of
positive examples and a huge set of negative ones). Numerous algorithms have been
proposed to address these issues and have been extended to detect faces in multiple views.
To handle the asymmetry between the positive and negative data sets, Viola and Jones
proposed the asymmetric AdaBoost algorithm, which keeps most of the weights on the
positive examples.
The AdaBoost algorithm is used to select a specified number of weak classifiers
with the lowest error rates for each stage of the cascade, and the process is repeated until
a set of optimization criteria (i.e., the number of stages, the number of features of each
stage, and the detection/false positive rates) is satisfied. As each weak classifier is made
of one single Haar-like feature, the process within each stage can be considered as a feature
selection problem. Instead of repeating the feature selection process at each stage, Wu et
al. presented a greedy algorithm for determining the set of features for all stages first,
before training the cascade classifier. With the greedy feature selection algorithm used
as a pre-computing procedure, they reported that the training time of the classifier
cascade with AdaBoost is reduced by 50 to 100 times. For learning in each stage (or
node) within the classifier cascade, they also exploited the asymmetry between positive
and negative data using a linear classifier with the assumption that they can be modeled
with Gaussian distributions. The merits and drawbacks of the proposed linear
asymmetric classifier as well as the classic Fisher linear discriminant were also
examined in their work. Recently, Pham and Cham proposed an online algorithm that
learns asymmetric boosted classifiers with significant gain in training time. An
algorithm that aims to automatically determine the number of classifiers and stages for
constructing a boosted ensemble has also been proposed. While a greedy optimization
algorithm was employed in earlier work, Brubaker et al. proposed an algorithm for
determining the number of weak classifiers and training each node classifier of a cascade by
selecting operating points within a receiver operating characteristic (ROC) curve. They
solved the optimization problem using linear programs that maximize the detection rates while
satisfying the constraints on false positive rates.
Although the original four types of Haar-like features are sufficient to encode
upright frontal face images, other types of features are essential to represent more
complex patterns (e.g., faces in different poses). Most systems take a divide-and-conquer
strategy in which a face detector is constructed for each fixed pose, thereby covering a wide
range of angles (e.g., yaw and pitch angles). A test image is either sent to all detectors
for evaluation, or to a decision module with a coarse pose estimator for selecting the
appropriate detectors for further processing. The ensuing problems are how the types of
features are constructed, and how the most important ones from a large feature space are
selected. More generalized Haar-like features have been defined in which the rectangular
image regions are not necessarily adjacent, and furthermore the number of such
rectangular blocks is randomly varied. Several greedy algorithms have been proposed to
select features efficiently by exploiting the statistics of features before training boosted
cascade classifiers.
There are also other fast face detection methods that demonstrate promising
results, including the component-based face detector using Naive Bayes classifiers, the
face detectors using support vector machines, the Anti-face method which consists of a
series of detectors trained with positive images only, and the energy-based method that
simultaneously detects faces and estimates their pose in real time.
1.4 Quantifying Performance:
There are numerous metrics to gauge the performance of face detection systems,
including detection frame rate, false positive/negative rate, number of classifiers,
number of features, number of training images, training time, accuracy and memory
requirements. In addition, the reported performance also depends on the definition of a
“correct” detection result. Figure 2 shows the effects of detection results versus different
criteria; more discussion can be found in the literature.
The most commonly adopted method is to plot the ROC curve using the de facto
standard MIT + CMU data set, which contains frontal face images. Another data set
from CMU contains images with faces that vary in pose from frontal to side view.
It has been noticed that although the face detection methods nowadays have impressive
real-time performance, there is still much room for improvement in terms of accuracy.
The detected faces returned by state-of-the-art algorithms are often a few pixels off the
“accurate” locations, which is significant as face images are usually standardized to 21
by 21 pixels. While such results are the trade-offs between speed, robustness and
accuracy, they inevitably degrade the performance of any biometric applications using
the contents of detected faces. Several post-processing algorithms have been proposed to
better locate faces and extract facial features (when the image resolution of the detected
faces is sufficiently high).
1.5 Applications
As face detection is the first step of any face processing system, it finds
numerous applications in face recognition, face tracking, facial expression recognition,
facial feature extraction, gender classification, clustering, attentive user interfaces,
digital cosmetics, and biometric systems, to name a few. In addition, most of the face
detection algorithms can be extended to recognize other objects such as cars, humans,
pedestrians, and signs.
2. MATLAB
2.1 MATLAB deals with:
1. Basic flow control and programming language
2. How to write scripts (main functions) with MATLAB
3. How to write functions with MATLAB
4. How to use the debugger
5. How to use the graphical interface
6. Examples of useful scripts and functions for image processing
After learning about MATLAB we will be able to use it as a tool to help us
with our maths, electronics, signal & image processing, statistics, neural networks,
control and automation.
2.2 Matlab resources:
Language: high-level matrix/vector language with
  Scripts and main programs
  Functions
  Flow statements (for, while)
  Control statements (if, else)
  Data structures (struct, cells)
  Input/outputs (read, write, save)
  Object-oriented programming
Environment:
  Command window
  Editor
  Debugger
  Profiler (evaluates performance)
Mathematical libraries:
  Vast collection of functions
API:
  Call C functions from MATLAB
  Call MATLAB functions from C
Scripts and main programs
In MATLAB, scripts are the equivalent of main programs. The variables declared in
a script are visible in the workspace and they can be saved. Scripts can therefore take a
lot of memory if you are not careful, especially when dealing with images. To create a
script, start the editor, write your code and run it.
2.3 MATLAB Functions:
imread: Read images from graphics files.
Syntax:
A = imread(filename,fmt)
[X,map] = imread(filename,fmt)
[...] = imread(filename)
[...] = imread(...,idx) (TIFF only)
[...] = imread(...,ref) (HDF only)
[...] = imread(...,'BackgroundColor',BG) (PNG only)
2.4 Description:
A = imread(filename,fmt) reads a grayscale or truecolor image named
filename into A. If the file contains a grayscale intensity image, A is a two-dimensional
array. If the file contains a truecolor (RGB) image, A is a three-dimensional array.
[X,map] = imread(filename,fmt) reads the indexed image in filename into X and
its associated colormap into map. The colormap values are rescaled to the range [0,1]. X
and map are two-dimensional arrays.
[...] = imread(filename) attempts to infer the format of the file from its content.
filename is a string that specifies the name of the graphics file, and fmt is a string that
specifies the format of the file. If the file is not in the current directory or in a directory
in the MATLAB path, specify the full pathname for a location on your system. If
imread cannot find a file named filename, it looks for a file named filename.fmt. If
you do not specify a string for fmt, the toolbox will try to discern the format of the file
by checking the file header.
Format            File type
'bmp'             Windows Bitmap (BMP)
'hdf'             Hierarchical Data Format (HDF)
'jpg' or 'jpeg'   Joint Photographic Experts Group (JPEG)
'pcx'             Windows Paintbrush (PCX)
'png'             Portable Network Graphics (PNG)
'tif' or 'tiff'   Tagged Image File Format (TIFF)
'xwd'             X Windows Dump (XWD)
Table 2.1: Possible values for fmt.
Special Case Syntax:
TIFF-Specific Syntax:
[...] = imread(...,idx) reads in one image from a multi-image TIFF file. idx is an
integer value that specifies the order in which the image appears in the file. For example,
if idx is 3, imread reads the third image in the file. If you omit this argument, imread
reads the first image in the file. To read all images of a TIFF file, call imread once for
each value of idx.
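As an illustration of the idx argument, the sketch below first writes a small
three-image TIFF file to a temporary location and then reads every image back in a
loop, using imfinfo to find out how many images the file holds. The file contents are
made-up test data.

```matlab
fname = [tempname '.tif'];          % temporary file for the demonstration

% Write three small grayscale images into one multi-image TIFF file.
for k = 1:3
    page = uint8(k * 50 * ones(4, 5));
    if k == 1
        imwrite(page, fname);                         % create the file
    else
        imwrite(page, fname, 'WriteMode', 'append');  % add another image
    end
end

% imfinfo returns one struct per image, so numel(info) is the image count.
info  = imfinfo(fname);
pages = cell(1, numel(info));
for k = 1:numel(info)
    pages{k} = imread(fname, k);    % idx selects the k-th image
end
fprintf('Read %d images; image 3 has value %d\n', numel(pages), pages{3}(1));
delete(fname);                      % clean up the temporary file
```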
2.5 PNG-Specific Syntax:
The discussion in this section is only relevant to PNG files that contain
transparent pixels. A PNG file does not necessarily contain transparency data.
Transparent pixels, when they exist, will be identified by one of two components: a
transparency chunk or an alpha channel.
The transparency chunk identifies which pixel values will be treated as
transparent, e.g., if the value in the transparency chunk of an 8-bit image is 0.5020, all
pixels in the image with the color 0.5020 can be displayed as transparent. An alpha
channel is an array with the same number of pixels as are in the image, which indicates
the transparency status of each corresponding pixel in the image (transparent or
nontransparent).
Another potential PNG component related to transparency is the background
color chunk, which (if present) defines a color value that can be used behind all
transparent pixels. This section identifies the default behavior of the toolbox for reading
PNG images that contain either a transparency chunk or an alpha channel, and describes
how you can override it.
Case 1. You do not ask to output the alpha channel and do not specify a background
color to use. For example,
[a,map] = imread(filename);
a = imread(filename);
If the PNG file contains a background color chunk, the transparent pixels will be
composited against the specified background color.
If the PNG file does not contain a background color chunk, the transparent pixels will be
composited against 0 for grayscale (black), 1 for indexed (first color in map), or [0 0
0] for RGB (black).
Case 2. You do not ask to output the alpha channel but you specify the background color
parameter in your call. For example,
[...] = imread(...,'BackgroundColor',bg);
The transparent pixels will be composited against the specified color. The form of bg
depends on whether the file contains an indexed, intensity (grayscale), or RGB image. If
the input image is indexed, bg should be an integer in the range [1,P] where P is the
colormap length. If the input image is intensity, bg should be a scalar in the range
[0,1]. If the input image is RGB, bg should be a 3-element vector whose values are in
the range [0,1].
There is one exception to the toolbox's behavior of using your background color. If you
set background to 'none' no compositing will be performed. For example,
[...] = imread(...,'Back','none');
Case 3. You ask to get the alpha channel as an output variable. For example,
[a,map,alpha] = imread(filename);
[a,map,alpha] = imread(filename,fmt);
No compositing is performed; the alpha channel will be stored separately from
the image (not merged into the image as in cases 1 and 2). This form of imread returns
the alpha channel if one is present, and also returns the image and any associated
colormap. If there is no alpha channel, alpha returns []. If there is no colormap, or the
image is grayscale or truecolor, map may be empty.
2.6 HDF-Specific Syntax:
[...] = imread(...,ref) reads in one image from a multi-image HDF file. ref
is an integer value that specifies the reference number used to identify the image. For
example, if ref is 12, imread reads the image whose reference number is 12. (Note that
in an HDF file the reference numbers do not necessarily correspond to the order of the
images in the file. You can use imfinfo to match up image order with reference
number.) If you omit this argument, imread reads the first image in the file.
Format   Variants
BMP      1-bit, 4-bit, 8-bit, and 24-bit uncompressed images; 4-bit and 8-bit
         run-length encoded (RLE) images
HDF      8-bit raster image datasets, with or without associated colormap; 24-bit
         raster image datasets
JPEG     Any baseline JPEG image (8 or 24-bit); JPEG images with some commonly
         used extensions
PCX      1-bit, 8-bit, and 24-bit images
PNG      Any PNG image, including 1-bit, 2-bit, 4-bit, 8-bit, and 16-bit grayscale
         images; 8-bit and 16-bit indexed images; 24-bit and 48-bit RGB images
TIFF     Any baseline TIFF image, including 1-bit, 8-bit, and 24-bit uncompressed
         images; 1-bit, 8-bit, 16-bit, and 24-bit images with packbits compression;
         1-bit images with CCITT compression; also 16-bit grayscale, 16-bit indexed,
         and 48-bit RGB images
XWD      1-bit and 8-bit ZPixmaps; XYBitmaps; 1-bit XYPixmaps
Table 2.2: Types of images that imread can read.
Examples:
This example reads the sixth image in a TIFF file:
[X,map] = imread('flowers.tif',6);
This example reads the fourth image in an HDF file.
info = imfinfo('skull.hdf');
[X,map] = imread('skull.hdf',info(4).Reference);
This example reads a 24-bit PNG image and sets any of its fully transparent (alpha
channel) pixels to red.
bg = [1 0 0];
A = imread('image.png','BackgroundColor',bg);
This example returns the alpha channel (if any) of a PNG image.
[A,map,alpha] = imread('image.png');
imshow: Display image
Syntax
imshow(I)
imshow(I,[low high])
imshow(RGB)
imshow(BW)
imshow(X,map)
imshow(filename)
himage = imshow(...)
imshow(..., param1, val1, param2, val2,...)
2.7 Description:
imshow(I) displays the grayscale image I.
imshow(I,[low high]) displays the grayscale image I, specifying the display
range for I in [low high]. The value low (and any value less than low) displays as black;
the value high (and any value greater than high) displays as white. Values in between are
displayed as intermediate shades of gray, using the default number of gray levels. If you
use an empty matrix ([]) for [low high], imshow uses [min(I(:)) max(I(:))]; that is, the
minimum value in I is displayed as black, and the maximum value is displayed as white.
imshow(RGB) displays the truecolor image RGB.
imshow(BW) displays the binary image BW. imshow displays pixels with the value 0
(zero) as black and pixels with the value 1 as white.
imshow(X,map) displays the indexed image X with the colormap map. A color
map matrix may have any number of rows, but it must have exactly 3 columns. Each
row is interpreted as a color, with the first element specifying the intensity of red light,
the second green, and the third blue. Color intensity can be specified on the interval 0.0
to 1.0.
imshow(filename) displays the image stored in the graphics file filename. The
file must contain an image that can be read by imread or dicomread. imshow calls
imread or dicomread to read the image from the file, but does not store the image data in
the MATLAB workspace. If the file contains multiple images, the first one will be
displayed. The file must be in the current directory or on the MATLAB path.
2.8 Remarks
imshow is the toolbox's fundamental image display function, optimizing figure,
axes, and image object property settings for image display. imtool provides all the image
display capabilities of imshow but also provides access to several other tools for
navigating and exploring images, such as the Pixel Region tool, Image Information tool,
and the Adjust Contrast tool. imtool presents an integrated environment for displaying
images and performing some common image processing tasks.
Examples
Display an image from a file.
X = imread('moon.tif'); imshow(X);
3. DEFINITIONAL ENTRIES
3.1 AdaBoost:
AdaBoost (short for Adaptive Boosting) is a machine learning algorithm
formulated by Freund and Schapire that learns a strong classifier by combining an
ensemble of weak (moderately accurate) classifiers with weights. The discrete AdaBoost
algorithm was originally developed for classification using the exponential loss function
and is an instance within the boosting family.
3.2 Haar-like features:
Just as Haar wavelets were developed as basis functions to encode
signals, the objective of two-dimensional Haar features is to collect local oriented
intensity differences at different scales for representing image patterns. This representation
transforms an image from pixel space to the space of wavelet coefficients with an
over-complete dictionary of features. Such features have been used to represent face
and pedestrian images. The Haar-like features, similar to Haar wavelets, compute local
oriented intensity differences using rectangular blocks (rather than pixels), which can be
computed efficiently with the integral image.
3.3 ROC curve:
An ROC (receiver operating characteristic) curve is a plot commonly used in
machine learning and data mining for exhibiting the performance of a classifier under
different criteria. The y-axis is the true positive rate and the x-axis is the false positive
rate (i.e., false alarm rate). A point on the ROC curve shows the trade-off between the
achieved true positive detection rate and the accepted false positive rate.
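The points of such a curve can be obtained by sweeping a decision threshold over the
classifier outputs, as in the sketch below. The scores and labels are made-up values
chosen only to illustrate the computation.

```matlab
scores = [0.9 0.8 0.7 0.6 0.55 0.4 0.3 0.2];   % made-up classifier outputs
labels = [1   1   0   1   0    1   0   0  ];   % 1 = face, 0 = non-face

thresholds = sort(unique(scores), 'descend');
tpr = zeros(size(thresholds));                 % true positive rate per threshold
fpr = zeros(size(thresholds));                 % false positive rate per threshold
for k = 1:numel(thresholds)
    detected = scores >= thresholds(k);        % everything at or above threshold
    tpr(k) = sum(detected & labels == 1) / sum(labels == 1);
    fpr(k) = sum(detected & labels == 0) / sum(labels == 0);
end

% Columns: threshold, false positive rate, true positive rate.
disp([thresholds' fpr' tpr']);
```

Plotting tpr against fpr (for example with plot(fpr, tpr)) gives the curve; lowering
the threshold moves the operating point up and to the right.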
3.4 Classifier cascade:
In face detection, a classifier cascade is a degenerate decision tree where each
node (decision stump) consists of a binary classifier. In [2], each node is a boosted
classifier consisting of several weak classifiers. These boosted classifiers are constructed
so that the ones near the root can be computed very efficiently at very high detection rate
with acceptable false positive rate.
Typically, most patches in a test image can be classified as faces/non-faces using
simple classifiers near the root, and relatively few difficult ones need to be analyzed by
nodes with deeper depth. With this cascade structure, the total computation of examining
all scanned image patches can be reduced significantly.
Fig: 3.1.a Face images 3.1.b Non-face images
Fig. 1. Four types of Haar-like features. These features appear at different
positions and scales. The Haar-like features are computed as the difference of dark and
light regions. They can be considered as features that collect local edge information at
different orientations and scales. The set of Haar-like features is large, and only a small
number of them are learned from positive and negative examples for face detection.
Fig: 3.2.a Test image 3.2.b Detection results
Fig. 3.2. Detection results depend heavily on the adopted criteria. Suppose all the
sub-images in (b) are returned as face patterns by a detector. A loose criterion may
declare all the faces as “successful” detections, while a more strict one would declare
most of them as non-faces.
gabor.m
This script contains the Gabor equation and is used to generate one Gabor filter based on
some parameters.
create_gabor.m
This script uses gabor.m to generate forty 32x32 Gabor filters and saves them in a
cell array called “G” and in a file named “gabor.mat”. This script will be
invoked only once unless we delete “gabor.mat”.
Fig: 3.3 Gabor Filters in Time Domain
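The Gabor equation itself is not reproduced in this report, so the following is only a
sketch of how a bank of forty 32x32 filters could be generated. The split into 5
scales and 8 orientations, the wavelength schedule and the envelope width are all
assumptions for illustration and are not taken from gabor.m or create_gabor.m.

```matlab
nScales = 5; nOrient = 8;          % 5 x 8 = 40 filters (assumed split)
sz = 32;                           % filter resolution, as in the text
[x, y] = meshgrid(-sz/2 : sz/2 - 1);

G = cell(nScales, nOrient);
for s = 1:nScales
    lambda = 4 * sqrt(2)^(s - 1);  % wavelength in pixels (assumption)
    sigma  = 0.5 * lambda;         % Gaussian envelope width (assumption)
    for o = 1:nOrient
        theta = (o - 1) * pi / nOrient;              % filter orientation
        xr = x * cos(theta) + y * sin(theta);        % coordinate along the wave
        envelope = exp(-(x.^2 + y.^2) / (2 * sigma^2));
        carrier  = exp(1i * 2 * pi * xr / lambda);   % complex sinusoid
        G{s, o} = envelope .* carrier;               % spatial-domain filter
    end
end
fprintf('%d filters of size %dx%d\n', numel(G), size(G{1}, 1), size(G{1}, 2));
```

Each filter here is kept in the spatial domain; applying fft2 to each cell would give
the frequency-domain form that the text says is stored in “gabor.mat”.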
main.m
The main menu and the only file you need to run the program.
createffnn.m
This function creates a feed-forward neural network with one hundred neurons
in the hidden layer and one neuron in the output layer. The network will be saved in
“net.mat” for further use. To learn more about how to customize a neural network, see
“MATLAB help > Neural Network Toolbox > Advanced Topics”.
loadimages.m
This function prepares images for the training phase. All data from both the “face” and
“non-face” folders will be gathered in a large cell array. Each column represents the
features of an image, which could be a face or not. Rows are as follows:
Row 1: File name
Row 2: Desired output of the network corresponding to the feature vector
Row 3: Prepared vector for the training phase
This script also saves the database to a file named “imgdb.mat”, so we do not
need to create the database more than once unless we add or delete some photos to/from
the “face” and “non-face” folders.
Every time we do this, after recreating the database, we should initialize and train
the network again. This script uses “im2vec.m” to extract features from images and
vectorize them for the database.
im2vec.m
This function takes a 27x18 image. It adjusts the histogram of the image for
better contrast. The image is then convolved with the Gabor filters by multiplying the
image by the Gabor filters in the frequency domain. The Gabor filters are stored in
“gabor.mat”; to save time, they have been saved in the frequency domain beforehand.
Features135x144 is a cell array that contains the result of the convolution of the image
with each of the forty Gabor filters. These matrices are concatenated to form a big
135x144 matrix of complex numbers. Only the magnitude of the result is needed, which is
why “abs” is used.
A 135x144 matrix has 19,440 elements. This means that the input vector of the network
would have 19,440 values, which implies a large amount of computation. So the matrix
size is reduced to one-third of its original size by deleting some rows and columns.
Deleting is not the best way, but it saves more time than other methods such as PCA.
This function should be optimized as much as possible.
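The pipeline described above can be sketched in base MATLAB. Random data stands in for the image and for the filter bank, the contrast adjustment is omitted, and the zero-padding, the cropping region and the choice to keep every third row and column are assumptions rather than details confirmed by im2vec.m.

```matlab
% Sketch only: random stand-ins for the image and the filters; the padding,
% cropping and subsampling choices are assumptions.
img = rand(27, 18);                          % stand-in for a 27x18 window
G = cell(5, 8);
for k = 1:40
    G{k} = fft2(rand(32, 32));               % placeholder frequency-domain filters
end
F = fft2(img, 32, 32);                       % zero-pad the image to 32x32
parts = cell(5, 8);
for k = 1:40
    r = ifft2(F .* G{k});                    % product in frequency = circular convolution
    parts{k} = r(1:27, 1:18);                % crop back to the window size
end
features = abs(cell2mat(parts));             % 135x144 matrix of magnitudes
small = features(1:3:end, 1:3:end);          % keep every third row and column
vec = small(:);                              % column vector for the network
```

Under these assumptions the 5x8 grid of 27x18 responses gives the 135x144 matrix (5*27 by 8*18), and the subsampled result has 45x48 = 2160 values.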
Fig: 3.4
Trainnet.m
This function trains the neural network and returns the trained network.
imscan.m
First Section:
Fig: 3.5
Second Section:
In this section the algorithm uses the neural network to check all windows that
potentially contain a face, together with the windows around them. The result is the
output of the neural network for the checked regions.
Fig: 3.6
Third Section:
1- Filtering the above pattern for values above the threshold (xy_)
2- Dilating the pattern with a disk structure (xy_)
3- Finding the center of each region
4- Drawing a rectangle for each point
5- Final result
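Steps 1 to 3 above can be sketched without the Image Processing Toolbox. The output map, the threshold (0.95) and the disk radius (2) are made-up values, and the small flood fill is a stand-in for a toolbox labeling routine; imscan.m itself may use different functions and parameters.

```matlab
% Sketch only: the map, threshold and radius are made-up values.
out = -ones(20, 20);                         % stand-in for the network output map
out(5, 6) = 0.97; out(14, 15) = 0.99;        % two strong responses
mask = out > 0.95;                           % step 1: keep values above the threshold
rad = 2;
[dx, dy] = meshgrid(-rad:rad);
disk = double(dx.^2 + dy.^2 <= rad^2);       % disk-shaped structuring element
dil = conv2(double(mask), disk, 'same') > 0.5;   % step 2: binary dilation
labels = zeros(size(dil)); n = 0;            % step 3: label regions by flood fill
for idx = find(dil)'
    if labels(idx) == 0
        n = n + 1; stack = idx;
        while ~isempty(stack)
            p = stack(end); stack(end) = [];
            if labels(p) ~= 0, continue; end
            labels(p) = n;
            [r, c] = ind2sub(size(dil), p);
            for d = [-1 0 1 0; 0 -1 0 1]     % visit the four neighbours
                rr = r + d(1); cc = c + d(2);
                if rr >= 1 && rr <= size(dil, 1) && cc >= 1 && cc <= size(dil, 2) ...
                        && dil(rr, cc) && labels(rr, cc) == 0
                    stack(end+1) = sub2ind(size(dil), rr, cc);
                end
            end
        end
    end
end
centers = zeros(n, 2);
for k = 1:n
    [r, c] = find(labels == k);
    centers(k, :) = round([mean(r), mean(c)]);   % [row, column] of each region
end
```

Each row of centers is one candidate face location; step 4 would then draw a box around it, for example with the built-in rectangle function.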
4. SOURCE CODE
4.1 How to run the program:
1- Copy all files and directories to MATLAB’s work folder.
(You may also create a folder there to avoid conflicts with other programs.)
2- Find the file named “main.m”.
3- Double-click on the file or type “main” in the command window.
The first time it runs, the program will create three files automatically:
gabor.mat: this file contains a cell array called “G”. Forty Gabor filters are
stored in “G” in the frequency domain, each with a resolution of 32x32.
net.mat: the feed-forward neural network structure
imgdb.mat: all images that are going to be used in training
4- A menu will be shown. Click on “Train Network” and wait until the program trains
your neural network.
5- Click on “Test on Photos”. A dialog box will appear. Select a .jpg photo.
“Im1.jpg” is a small image that is good for a first run of the program.
Your selected photo will be shown on the screen. You can maximize the window if
you want.
6- Wait until the program detects some faces. During this phase you should see some
activity on the selected photo.
4.2 Requirements:
1- MATLAB 7.0 or Later
2- Image Processing Toolbox
3- Neural Network Toolbox
4.3 MATLAB Program:
% Face recognition by Santiago Serrano
clear all
close all
clc
% number of images on your training set.
M=40;
% Chosen std and mean.
% These can be any values close to the std and mean of most of the images.
um=100;
ustd=80;
% read and show image
S=[]; % img matrix
figure(1);
for i=1:M
str=strcat(int2str(i),'.bmp'); % concatenates two strings that form the name of the image
img=imread(str);
subplot(ceil(sqrt(M)),ceil(sqrt(M)),i)
imshow(img)
if i==3
title('Training set','fontsize',18)
end
drawnow;
[irow icol]=size(img); % get the number of rows (N1) and columns (N2)
temp=reshape(img',irow*icol,1); % creates a (N1*N2)x1 vector
S=[S temp]; % S is a N1*N2xM matrix after finishing the sequence
end
% Here we change the mean and std of all images. We normalize all images.
% This is done to reduce the error due to lighting conditions and background.
for i=1:size(S,2)
temp=double(S(:,i));
m=mean(temp);
st=std(temp);
S(:,i)=(temp-m)*ustd/st+um;
end
% show normalized images
figure(2);
for i=1:M
str=strcat(int2str(i),'.jpg');
img=reshape(S(:,i),icol,irow);
img=img';
imwrite(img,str);
subplot(ceil(sqrt(M)),ceil(sqrt(M)),i)
imshow(img)
drawnow;
if i==3
title('Normalized Training Set','fontsize',18)
end
end
% mean image
m=mean(S,2); % obtains the mean of each row instead of each column
tmimg=uint8(m); % converts to unsigned 8-bit integer. Values range from 0 to 255
img=reshape(tmimg,icol,irow); % takes the N1*N2x1 vector and creates an N2xN1 matrix (transposed below to N1xN2)
img=img';
figure(3);
imshow(img);
title('Mean Image','fontsize',18)
% Change image for manipulation
dbx=[]; % A matrix
for i=1:M
temp=double(S(:,i));
dbx=[dbx temp];
end
%Covariance matrix C=A'A, L=AA'
A=dbx';
L=A*A';
% vv are the eigenvector for L
% dd are the eigenvalue for both L=dbx'*dbx and C=dbx*dbx';
[vv dd]=eig(L);
% Sort and eliminate those whose eigenvalue is zero
v=[];
d=[];
for i=1:size(vv,2)
if(dd(i,i)>1e-4)
v=[v vv(:,i)];
d=[d dd(i,i)];
end
end
%sort, will return an ascending sequence
[B index]=sort(d);
ind=zeros(size(index));
dtemp=zeros(size(index));
vtemp=zeros(size(v));
len=length(index);
for i=1:len
dtemp(i)=B(len+1-i);
ind(i)=len+1-index(i);
vtemp(:,ind(i))=v(:,i);
end
d=dtemp;
v=vtemp;
%Normalization of eigenvectors
for i=1:size(v,2) %access each column
kk=v(:,i);
temp=sqrt(sum(kk.^2));
v(:,i)=v(:,i)./temp;
end
%Eigenvectors of C matrix
u=[];
for i=1:size(v,2)
temp=sqrt(d(i));
u=[u (dbx*v(:,i))./temp];
end
%Normalization of eigenvectors
for i=1:size(u,2)
kk=u(:,i);
temp=sqrt(sum(kk.^2));
u(:,i)=u(:,i)./temp;
end
% show eigenfaces
figure(4);
for i=1:size(u,2)
img=reshape(u(:,i),icol,irow);
img=img';
img=histeq(img,255);
subplot(ceil(sqrt(M)),ceil(sqrt(M)),i)
imshow(img)
drawnow;
if i==3
title('Eigenfaces','fontsize',18)
end
end
% Find the weight of each face in the training set
omega = [];
for h=1:size(dbx,2)
WW=[];
for i=1:size(u,2)
t = u(:,i)';
WeightOfImage = dot(t,dbx(:,h)');
WW = [WW; WeightOfImage];
end
omega = [omega WW];
end
% Acquire new image
% Note: the input image must have a bmp or jpg extension.
% It should have the same size as the ones in your training set.
% It should be placed on your desktop
%InputImage = input('Please enter the name of the image and its extension \n','s');
InputImage = imread('1.bmp');
%InputImage = imread(strcat('D:\Documents and Settings\user\Desktop\face recognition\',InputImage));
figure(5)
subplot(1,2,1)
imshow(InputImage); colormap('gray');title('Input image','fontsize',18)
InImage=reshape(double(InputImage)',irow*icol,1);
temp=InImage;
me=mean(temp);
st=std(temp);
temp=(temp-me)*ustd/st+um;
NormImage = temp;
Difference = temp-m;
p = [];
aa=size(u,2);
for i = 1:aa
pare = dot(Difference,u(:,i)); % project the mean-subtracted image, so that m + u*p reconstructs the input
p = [p; pare];
end
ReshapedImage = m + u(:,1:aa)*p; %m is the mean image, u is the eigenvector
ReshapedImage = reshape(ReshapedImage,icol,irow);
ReshapedImage = ReshapedImage';
%show the reconstructed image.
subplot(1,2,2)
imagesc(ReshapedImage); colormap('gray');
title('Reconstructed image','fontsize',18)
InImWeight = [];
for i=1:size(u,2)
t = u(:,i)';
WeightOfInputImage = dot(t,Difference');
InImWeight = [InImWeight; WeightOfInputImage];
end
ll = 1:M;
figure(68)
subplot(1,2,1)
stem(ll,InImWeight)
title('Weight of Input Face','fontsize',14)
% Find Euclidean distance
e=[];
for i=1:size(omega,2)
q = omega(:,i);
DiffWeight = InImWeight-q;
mag = norm(DiffWeight);
e = [e mag];
end
kk = 1:size(e,2);
subplot(1,2,2)
stem(kk,e)
title('Euclidean distance of input image','fontsize',14)
MaximumValue=max(e) % maximum Euclidean distance
MinimumValue=min(e) % minimum Euclidean distance
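The listing computes the eigenvectors of the small M x M matrix L = dbx'*dbx instead of the large matrix C = dbx*dbx'. This works because if L*v = lambda*v, then C*(dbx*v) = dbx*(dbx'*dbx*v) = lambda*(dbx*v), so dbx*v is an eigenvector of C with the same eigenvalue. The sketch below checks this numerically on random data; the sizes N and M are made up for illustration.

```matlab
% Sketch only: random data with made-up sizes (N pixels, M images).
N = 50; M = 4;
dbx = rand(N, M);                    % one image per column, as in the listing
L = dbx' * dbx;                      % small M x M matrix
L = (L + L') / 2;                    % enforce exact symmetry before eig
C = dbx * dbx';                      % large N x N matrix, formed only for this check
[vv, dd] = eig(L);
[lambda, order] = sort(diag(dd), 'descend');
v = vv(:, order);                    % eigenvectors by decreasing eigenvalue
u = dbx * v;                         % lift to the N-dimensional image space
u = u ./ repmat(sqrt(sum(u.^2, 1)), N, 1);   % normalize each column to unit length
residual = norm(C * u - u * diag(lambda)) / norm(C);   % should be close to zero
```

For real images N is the number of pixels and is far larger than M, so solving the M x M problem is much cheaper than decomposing C directly.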
5. FUTURE SCOPE
Face detection is the first step in a face recognition system.
General face recognition steps:
A) Face Detection
B) Face Normalization
C) Face Identification
In the future, we are going to do our main project on a “FACE RECOGNITION SYSTEM”.
6. CONCLUSION
As face detection is the first step of any face processing system, it finds
numerous applications in face recognition, face tracking, facial expression recognition,
facial feature extraction, gender classification, clustering, attentive user interfaces,
digital cosmetics, and biometric systems, to name a few. In addition, most face
detection algorithms can be extended to recognize other objects such as cars, humans,
pedestrians, and signs.
Face recognition has been successfully implemented using the eigenface approach.
The eigenface approach to face recognition has been found to be a robust technique that
can be used in security systems.