
Neurocomputing 199 (2016) 16–30


Face recognition under varying illuminations using logarithmic fractal dimension-based complete eight local directional patterns

Mohammad Reza Faraji*, Xiaojun Qi
Department of Computer Science, Utah State University, Logan, UT 84322-4205, USA
* Corresponding author. E-mail address: Mohammadereza.Faraji@aggiemail.usu.edu (M.R. Faraji).

Article history: Received 8 October 2015; Received in revised form 6 January 2016; Accepted 31 January 2016; Available online 16 March 2016.

Keywords: Face recognition; Illumination variations; Complete eight local directional patterns; Logarithmic fractal dimension

Abstract

Face recognition under varying illumination is challenging. This paper proposes an effective method to produce illumination-invariant features for images with various levels of illumination. The proposed method seamlessly combines adaptive homomorphic filtering, simplified logarithmic fractal dimension, and complete eight local directional patterns to produce illumination-invariant representations. Our extensive experiments show that the proposed method outperforms two of its variant methods and nine state-of-the-art methods, achieving overall face recognition accuracies of 99.47%, 94.55%, 99.53%, and 86.63% on the Yale B, extended Yale B, CMU-PIE, and AR face databases, respectively, when using one image per subject for training. It also outperforms the compared methods on the Honda UCSD video database using five images per subject for training and considering all necessary steps, including face detection, landmark localization, face normalization, and face matching, to recognize faces. Our evaluations using receiver operating characteristic (ROC) curves also verify that the proposed method has the best verification and discrimination ability compared with other peer methods.

http://dx.doi.org/10.1016/j.neucom.2016.01.094
0925-2312/© 2016 Elsevier B.V. All rights reserved.

1. Introduction

Face recognition, with a wide range of applications in security, forensic investigation, and law enforcement, is a good compromise between reliability and social acceptance [1,2]. A typical face recognition system includes four main stages (shown in Fig. 1 [1]): face detection, landmark localization, face normalization, and face matching. In the first stage, face detectors such as the SNoW-based (sparse network of windows) face detector [3], the Viola-Jones detector [4], and the skin-color-pixel-based Viola-Jones face detector [5] detect faces in images or video frames. In the second stage, facial landmarks are localized for each detected face. Recently proposed facial landmark localization techniques include boosted regression with Markov networks (BoRMaN) [6], local evidence aggregation for regression-based facial point detection (LEAR) [7], discriminative response map fitting (DRMF) [8], and Chehra v.1 (meaning "face" in Hindi) [9]. In the third stage, detected faces are geometrically normalized and corrected using the facial landmarks. First, geometric rectification is applied to normalize the pose: pose normalization uses the facial landmarks to estimate the frontal view of each face [10]. Face images can be automatically cropped using facial landmarks, since landmarks exactly indicate the locations of key facial points in the image. An illumination preprocessing method is finally applied to the cropped images to obtain illumination-insensitive face images. In this stage, if the face images contain facial expressions such as smiles, anger, or screams, Gabor features-based expression normalization can be employed to extract facial features that are robust to expression variations [11,12]. In the final stage, a face matching process is used on the normalized faces to recognize the identity of the faces. This stage includes feature extraction and face classification. In feature extraction, each face is converted to a one-dimensional vector to be used for classification. Each face might also be divided into a regular grid of cells to extract histograms [13]. Dimension reduction methods can also be used to reduce the dimension of the data before a classifier is employed to verify the identity of the input face image. Representative dimension reduction methods include principal component analysis (PCA) [14-16], independent component analysis (ICA) [17,15], kernel principal component analysis (KPCA) [18], Fisher's linear discriminant analysis (FDA) [19], kernel Fisher's linear discriminant analysis (KFDA) [20,21], and singular value decomposition (SVD). Finally, the face classification stage uses the extracted features to classify and verify a new unknown face image with respect to the image database. Popular classification methods used in face recognition systems include k nearest neighbors [22,23], neural networks [24,25], and the sparse representation classifier (SRC) [26,27].

Fig. 1. Typical framework for illumination-invariant face recognition [1].

However, five key factors, namely illumination variations, pose changes, facial expressions, age variations, and occlusion, significantly affect the performance of face recognition systems. Among these factors, illumination variations such as shadows, underexposure (too dark), and overexposure (too bright) have attracted much attention in the last decade [28,1]. In the image of a familiar face, changing the direction of the light source leads to shifts in the location and shape of shadows, changes in highlights (i.e., making the face appear too bright or too dark), and reversal of contrast. Braje et al. [29] studied the effects of illumination in face recognition from a psychological perspective. They found that illumination variations have enormously complex effects on the image of an object. Specifically, varying the illumination direction resulted in larger image differences than did varying the identity of the face. Furthermore, cast shadows introduced spurious luminance edges that may be confused with object contours and consequently significantly hindered the performance of face recognition systems. As a result, there is a need for extracting facial features which are invariant or insensitive to illumination variations. These invariant facial features can then be used for recognizing the face. Various methods have been proposed to deal with illumination variations. They can be categorized into three groups: gray-level transformation methods, gradient or edge extraction methods, and reflectance field estimation methods [1,30].

Gray-level transformation methods redistribute the intensities in a face image with a linear or non-linear transformation function to correct the uneven illumination to some extent [1]. They are simple, computationally efficient, and fast, but they are not effective in eliminating the effects of illumination and therefore achieve inferior performance compared to the methods in the other two groups. Representative methods include histogram equalization (HE) [31] and gamma intensity correction (GIC) [32]. HE adjusts and flattens the contrast of the image using the image's histogram. GIC maps the image I(x,y) to the image G(x,y) using αI(x,y)^{1/γ}, where α is a gray-stretch parameter and γ is the Gamma coefficient.
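To make the transformation concrete, here is a minimal Python sketch of GIC; the paper prescribes no particular α or γ, so the defaults below are illustrative assumptions:

```python
import numpy as np

def gamma_intensity_correction(img, alpha=1.0, gamma=2.2):
    """Map I(x, y) to G(x, y) = alpha * I(x, y)^(1 / gamma).

    img is assumed to be a float array scaled to [0, 1]; alpha is the
    gray-stretch parameter and gamma the Gamma coefficient."""
    return alpha * np.power(img, 1.0 / gamma)
```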

Gradient or edge extraction methods produce illumination-insensitive representations by extracting gray-level gradients or edges from face images [1]. Representative methods include local binary patterns (LBP) [33], local directional patterns (LDP) [34], enhanced LDP (EnLDP) [35], local directional number patterns (LDN) [36], eight local directional patterns (ELDP) [37], adaptive homomorphic eight local directional patterns (AH-ELDP) [38], discriminant face descriptor (DFD) [39], local tetra patterns (LTrP) [40], enhanced center-symmetric local binary pattern (ECS-LBP) [41], and logarithmic fractal dimension (LFD) [30]. LBP takes P pixels in a circle of radius R around each pixel and thresholds these pixels based on the value of the central pixel, where P and R are usually set to 8 and 2, respectively, for face recognition applications. LDP, EnLDP, LDN, and ELDP produce eight directional edge images using Kirsch compass masks and encode the directional information to obtain noise- and illumination-invariant representations. AH-ELDP filters face images using the homomorphic filter, enhances the filtered images using an interpolative function, and then performs the ELDP method to produce the final AH-ELDP images. The DFD method is an improvement of the LBP descriptor. It involves a three-step feature extraction that maximizes the appearance difference between different persons and minimizes the difference for the same person. LTrP encodes the relationship among the center pixel and its neighborhood pixels using derivatives in the vertical and horizontal directions. LFD performs a log function on face images and transfers images to the fractal dimension domain to produce illumination-invariant face representations.

Reflectance field estimation methods estimate the face reflectance field, which is illumination-invariant, from a face image. They usually apply the Lambertian reflectance-illumination model (i.e., retinex theory) [1,42] as their face imaging model. These methods are generally effective. However, the surface of the human face is only approximately Lambertian, which limits the sufficiency of the resulting illumination-invariant representations [43]. Self-quotient image (SQI) [44,45], adaptive smoothing (AdaS) [46], Gradientface [47], Weberface [48], generalized Weberface (GWF) [49], and local Gradientface XOR and binary pattern (LGXBP) [50] are representative methods. SQI, which is a ratio image between a given test image and its smoothed version, implicitly assumes that each preprocessed image is illumination-invariant [51]. The AdaS method estimates the illumination by smoothing the input image with an iterative method. The Gradientface method creates a ratio image between the y-gradient and the x-gradient of a given image. The Weberface method, inspired by Weber's law and based on the Weber local descriptor (WLD) [52], creates a ratio image between the local intensity variation and the background. The Gradientface and Weberface methods are proven to be illumination-insensitive. The GWF method is a generalized multi-scale version of the Weberface method. It also considers an inner and an outer ground for each pixel and assigns different weights to them to develop a weighted GWF (wGWF). The LGXBP method uses a histogram-based descriptor to produce an illumination-invariant representation of a face image. It first transforms a face image into the logarithm domain and obtains a pair of illumination-insensitive components: gradient-face orientation and gradient-face magnitude. These two components are then encoded by LBP and local XOR patterns to form the local descriptor and the histogram for face recognition.

In this paper, we propose an effective method which encodes face images with various levels of illumination. To this end, we first filter images using the adaptive homomorphic filter to partially reduce the illumination. We then use the simplified logarithmic fractal dimension transformation as an edge-enhancing technique to enhance facial features and further remove illumination. We finally produce eight directional edge images using Kirsch compass masks and propose a complete ELDP (CELDP), which considers both the directions and the magnitudes of the edge responses, to obtain illumination-invariant representations.

The contributions of the proposed method are:

(1) Using the adaptive homomorphic filter to partially reduce the illumination by attenuating the low-frequency (i.e., illumination) component of each face image based on the characteristics of that image.

(2) Transforming the filtered face images to the LFD domain while skipping every other adjacent scale (i.e., reducing the computational time by half) to enhance facial features such as the eyes, eyebrows, nose, and mouth while keeping noise at a low level.

(3) Employing a gradient-based descriptor which considers the relations among the directions and magnitudes of all eight directional edge responses to represent more valuable structural information from the neighborhood, capture the distinct characteristics of each face image, and therefore achieve robustness against illumination variations and noise.

(4) Achieving the highest face recognition accuracy on five face databases using the same set of pre-determined parameters and a simple nearest-neighbor classifier, compared with nine recently proposed state-of-the-art methods.

The rest of this paper is organized as follows: Section 2 reviews related work and presents the proposed method. Section 3 presents experimental results on four publicly available image databases as well as a standard video database, and evaluates the verification and discrimination ability of the proposed method using ROC curves. Finally, Section 4 draws conclusions and presents directions for future work.

2. Methodology

In this section, we first describe some of the related methods inmore detail and then present the proposed method.

2.1. Previous work

The LDP-based approaches such as LDP, EnLDP, LDN, and ELDP operate in the gradient domain to produce illumination-invariant representations. They use directional information to produce edge responses which are insensitive to lighting variations [36]. Specifically, for each face image, eight directional edge images are computed by the eight Kirsch compass masks (M0, M1, ..., M7) shown in Fig. 2. The eight masks can be produced by rotating the first Kirsch mask (M0) in 45° steps through the eight directions. The ith directional edge image, I_i, is computed as the convolution of the face image I(x,y) with the corresponding mask M_i. Each LDP-based approach then performs a different operation to produce illumination-invariant images. For example, for each pixel (x,y) of the image I, LDP [34] considers an 8-bit binary code and assigns 1's to the three bits corresponding to the three most prominent numbers and 0's to the other five bits. EnLDP [35] and LDN [36] consider a 6-bit binary code: the first and second groups of three bits respectively code the positions of the top and second top positive directional numbers for EnLDP, and the positions of the top positive and top negative directional numbers for LDN. ELDP [37] considers an 8-bit binary code and assigns 1's to the bits corresponding to positive numbers and 0's to the bits corresponding to negative numbers. In this way, ELDP uses all eight directional numbers and represents more valuable structural information. However, it requires a Gaussian filter to smooth images before the ELDP process is performed. The codes generated by all the LDP-based methods are then converted to their corresponding decimal values to produce the LDP, EnLDP, LDN, and ELDP images, which are illumination-invariant.

Fig. 2. Kirsch compass masks in eight directions.

The recently proposed AH-ELDP method [38] combines three steps to produce illumination-invariant images. It first applies an adaptive homomorphic filter to the original face image in the frequency domain to attenuate the low-frequency signal (i.e., illuminance) and strengthen the high-frequency signal (i.e., texture information) of the image. An adaptive c is used to adjust the homomorphic filter for each face image, based on the slope of the changes in the low-frequency components of the image, to reduce the influence of illumination. Next, AH-ELDP adopts the idea of interval type-2 fuzzy sets [53] to apply an interpolative enhancement function that stretches the contrast of the filtered image. Finally, similar to the ELDP method, it uses the Kirsch compass masks to compute edge responses and considers the relations among the eight directional numbers to obtain the AH-ELDP image.
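As an illustration of how the eight masks follow from M0, the sketch below rotates the border of a base Kirsch mask in 45° steps and convolves an image with each mask. The base-mask orientation and the clockwise ring order are common conventions assumed here, not prescriptions from this paper:

```python
import numpy as np
from scipy.ndimage import convolve

def kirsch_masks():
    """Generate the eight Kirsch compass masks M0..M7 by rotating the
    border of the base mask 45 degrees at a time."""
    m0 = np.array([[-3, -3, 5],
                   [-3,  0, 5],
                   [-3, -3, 5]], dtype=float)
    # Border positions in clockwise order, starting at the top-left corner.
    ring = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    masks = []
    for k in range(8):
        m = np.zeros((3, 3))
        for idx, (r, c) in enumerate(ring):
            r2, c2 = ring[(idx + k) % 8]   # shift the border values by k steps
            m[r2, c2] = m0[r, c]
        masks.append(m)
    return masks

def edge_responses(img):
    """Convolve an image with each Kirsch mask to obtain the eight
    directional edge images I_0..I_7."""
    return [convolve(img, m, mode='nearest') for m in kirsch_masks()]
```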

These LDP-based approaches are similar to LBP in one respect, since the LBP method also transfers an 8-bit binary code to a decimal value. However, LDP approaches compute the eight directional edge images using Kirsch compass masks and use the directional numbers to generate the 6-bit or 8-bit code accordingly. LBP, on the other hand, uses the center pixel of the original image in each neighborhood as the threshold and computes the sparse points to generate the 8-bit code. Therefore, LDP approaches use more edge information from the entire neighborhood, while LBP discards most of the information in the neighborhood by using only a few intensities, which makes the LBP method sensitive to noise and limits its accuracy [38,36].

Fractal analysis (FA), a type of texture analysis, has recently been used in medical imaging [54-56], signal processing [57], and face recognition under varying illuminations [30]. Specifically, the recently proposed LFD method [30] applies a log function to a face image to expand dark pixels and compress bright pixels for partial illumination reduction. It then transfers the face image to the fractal dimension (FD) domain using the differential box-counting (DBC) algorithm, a popular computationally efficient method for dealing with large images, to produce the illumination-invariant representation.

2.2. The proposed method

The proposed method consists of three steps. The first step uses the adaptive homomorphic filter to partially reduce the illumination. The second step applies the simplified LFD method to enhance facial features. The last step employs a complete ELDP (CELDP), which uses both the directions and the magnitudes of the edge responses, to produce the illumination-invariant representation. Fig. 3 illustrates the framework of the proposed method to produce the final matching scores for each face image.

Fig. 3. The framework of the proposed method.

2.2.1. Adaptive homomorphic filter

In the first step, we apply the adaptive homomorphic filter as a Gaussian high-pass filter on a face image to strengthen the high-frequency signal and attenuate the low-frequency signal. A face image I can be represented by the Lambertian-reflectance model as I(x,y) = R(x,y)L(x,y), where I(x,y), R(x,y), and L(x,y) are respectively the pixel intensity, reflectance, and illuminance at position (x,y) of the face image [38,58]. R corresponds to the texture information, such as the contrast arrangement of the skin, eyebrows, eyes, and lips of a face, and is therefore considered the high-frequency signal. L contains the illumination values of neighboring pixels, which are similar to each other, and is therefore considered the low-frequency signal [59,38]. The process of applying the filter is summarized in Fig. 4. The logarithmic function is used to separate the R and L components so that the operation can be performed in the frequency domain. The Fourier transform result is then multiplied by the adaptive homomorphic filter to reduce the illuminance. The filtered image is obtained by taking the exponential of the inverse Fourier transform.

Fig. 4. Homomorphic filtering approach to partially reduce the illuminance of the face image I(x,y).

The homomorphic filter H(u,v) is represented as follows:

$H(u,v) = (\gamma_H - \gamma_L)\left[1 - \exp\left(-c\,D^2(u,v)/D_0^2\right)\right] + \gamma_L$   (1)

where D(u,v) is the distance from (u,v) to the origin of the centered Fourier transform, D_0 is the cutoff distance measured from the origin, γ_L < 1 and γ_H > 1 are the parameters of the filter, and c is a constant that controls the sharpness of the slope of the filter function as it transitions between γ_H and γ_L. Since c together with D_0 (i.e., $D_0/\sqrt{c}$) determines the actual cutoff distance, it is considered the most important parameter of the filter [38]. As proposed in [38], the c parameter is computed by:

$c = \mathrm{Mag}_1 / \mathrm{Mag}_2$   (2)

where Mag_i indicates the ith largest magnitude value (i.e., the strength of the frequency components) at non-zero frequency locations within a square window centered at the origin of the Fourier transform.
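A minimal numpy sketch of this filtering step follows. The size of the square window used to pick Mag1 and Mag2 and the coarse suppression of conjugate-symmetric duplicates are our assumptions; the AH-ELDP paper [38] should be consulted for the exact selection rule:

```python
import numpy as np

def adaptive_homomorphic_filter(img, gamma_h=1.1, gamma_l=0.5, d0=15.0, win=5):
    """Sketch of Eqs. (1)-(2): log, FFT, adaptive Gaussian high-pass, inverse."""
    # I = R * L  ->  log I = log R + log L, so the filter acts on both terms.
    F = np.fft.fftshift(np.fft.fft2(np.log1p(img.astype(float))))

    M, N = img.shape
    cy, cx = M // 2, N // 2

    # Adaptive c = Mag1 / Mag2 from a (2*win+1)^2 window around the origin.
    w = np.abs(F[cy - win:cy + win + 1, cx - win:cx + win + 1]).copy()
    w[win, win] = 0.0       # exclude the zero-frequency (DC) term
    w[:, :win] = 0.0        # drop conjugate-symmetric duplicates (left half)
    w[:win, win] = 0.0      # ... and the upper half of the centre column
    mags = np.sort(w.ravel())[::-1]
    c = mags[0] / mags[1]   # two largest non-zero-frequency magnitudes

    # Gaussian high-pass filter of Eq. (1).
    u = (np.arange(M) - cy)[:, None]
    v = (np.arange(N) - cx)[None, :]
    H = (gamma_h - gamma_l) * (1.0 - np.exp(-c * (u**2 + v**2) / d0**2)) + gamma_l

    # Back to the spatial domain; undo the log.
    return np.expm1(np.real(np.fft.ifft2(np.fft.ifftshift(H * F))))
```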

2.2.2. Logarithmic fractal dimension

In the second step, we perform the LFD method on top of the filtered image. The LFD method first applies a log function to the filtered image to expand the values of dark pixels and compress the values of bright pixels. Then, it transfers the obtained image to the FD domain using the DBC algorithm [30]. To this end, LFD computes a sequence of two-dimensional matrices Mat to overlay the filtered image f of size M × N at each pixel (x,y) as follows:

$\mathrm{Mat}_{d-1}(x,y) = \left(\left\lfloor \frac{\max(W_d(x,y)) - \min(W_d(x,y))}{d} \right\rfloor + 1\right)\left(\frac{d_{\max}}{d}\right)^{2}$   (3)

where d is the scaling factor, with a minimum value of 2 and a maximum value of d_max, that represents how much a specific structure of pixels is self-similar to its surroundings. W_d(x,y) is a d × d neighborhood block whose center is at location (x,y) of the image. When d is an odd number, the center is in the middle of the neighborhood block; when d is an even number, the center is to the left of and above the middle of the neighborhood block. The max and min operations obtain the highest and lowest intensity values in the image block covered by the d × d neighborhood block centered at location (x,y), respectively. Specifically, for each scaling factor d, the center of a sliding neighborhood block of size d × d is positioned at each pixel in the image. The maximum and minimum intensity values of the image portion covered by the sliding neighborhood block are used to compute the new value at the corresponding pixel location following Eq. (3).

Each two-dimensional matrix Mat_{d-1} of size M × N is then converted to a one-dimensional column vector of size (M × N) × 1, and the converted (d_max − 1) column vectors are concatenated to obtain the two-dimensional matrix V, which has (M × N) rows and (d_max − 1) columns. Each row in the matrix V contains the values computed by Eq. (3) at a particular location in the image across scales from 2 to d_max. For example, v_{(x,y)} denotes the obtained values at location (x,y) of the image across scales from 2 to d_max. A log function is then applied to all the elements of V and to the respective scaling factors d. The LFD image is finally computed as follows:

$\mathrm{LFD}(x,y) = \frac{S_v(x,y)}{S}$   (4)

where S and S_v are the following sums of squares:

$S = \sum_{d=1}^{d_{\max}-1} (\log d)^2 - \frac{\left(\sum_{d=1}^{d_{\max}-1} \log d\right)^{2}}{d_{\max}-1}$   (5)

$S_v(x,y) = \sum_{d=1}^{d_{\max}-1} \log d \cdot \log v_{(x,y),d} - \frac{\left(\sum_{d=1}^{d_{\max}-1} \log d\right)\left(\sum_{d=1}^{d_{\max}-1} \log v_{(x,y),d}\right)}{d_{\max}-1}$   (6)

The original LFD method [30] sets d_max to 10 and recommends setting d_max between 9 and 12 for face images. However, we observe that the structure of pixels in a face image changes slowly across scales. Therefore, skipping every other adjacent scale does not lose much information, and we propose to set d = 2, 4, 6, ..., d_max. In this way, the number of overlay two-dimensional matrices Mat (i.e., the column dimension of the matrix V) decreases by half (i.e., to d_max/2), and only these retained scales are involved in the computation of Eqs. (5) and (6). This halves the time needed to transfer an image to its FD image and therefore significantly reduces the run time of the method.
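The sketch below implements this simplified transform under our reading of Eqs. (3)-(6): the log is applied once to the elements of V, the sums run over the d_max/2 retained scales, and the sliding max/min windows use scipy's default centering for even sizes (the paper's even-window alignment may differ):

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def simplified_lfd(img, d_max=10):
    """Simplified LFD: Eq. (3) for d = 2, 4, ..., d_max, then the
    least-squares slope of Eqs. (4)-(6) over the retained scales."""
    f = np.log1p(img.astype(float))         # expand dark, compress bright pixels
    scales = np.arange(2, d_max + 1, 2)     # skip every other adjacent scale
    log_d = np.log(scales.astype(float))

    mats = []
    for d in scales:
        rng = maximum_filter(f, size=int(d)) - minimum_filter(f, size=int(d))
        mat = (np.floor(rng / d) + 1.0) * (d_max / d) ** 2    # Eq. (3)
        mats.append(np.log(mat))            # log applied to the elements of V
    V = np.stack(mats, axis=-1)             # shape (M, N, number of scales)

    n = len(scales)
    S = np.sum(log_d**2) - np.sum(log_d)**2 / n               # Eq. (5)
    Sv = V @ log_d - np.sum(log_d) * V.sum(axis=-1) / n       # Eq. (6)
    return Sv / S                                             # Eq. (4)
```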

2.2.3. Complete ELDP (CELDP)

The LFD image obtained from the adaptive homomorphic filtered image is fed into the ELDP process. To this end, we compute eight directional edge images using the Kirsch compass masks to produce the ELDP image. Given a pixel (x,y) in the LFD image, if I_i(x,y) denotes the ith (i = 0, 1, ..., 7) directional number computed by the ith Kirsch compass mask M_i, the respective ELDP value at the pixel (x,y) is defined as follows:

$\mathrm{ELDP}(x,y) = \sum_{i=0}^{7} D(I_i(x,y))\,2^{i}$   (7)

where

$D(z) = \begin{cases} 1 & z > 0 \\ 0 & \text{otherwise} \end{cases}$

Fig. 5 shows the process of producing the ELDP value of a pixel using its eight directional numbers. Their directions (i.e., positive and negative signs) are used to obtain the ELDP image, while their respective magnitudes are discarded. We note that a directional number with a higher value is expected to contain more information about the texture changes in its respective direction. Therefore, we propose to consider and integrate the influence of the magnitudes of the directional numbers in the ELDP process. To this end, we adopt the idea of the completed LBP (CLBP) method [60], which considers the magnitudes of the differences between the neighborhood pixels and the central pixel as well as their corresponding signs, to design the proposed CELDP. We define CELDP as the composition of a direction component (CELDP_D) and a magnitude component (CELDP_M), which are respectively defined as follows:

$\mathrm{CELDP}_D(x,y) = \mathrm{ELDP}(x,y)$   (8)

$\mathrm{CELDP}_M(x,y) = \sum_{i=0}^{7} M(|I_i(x,y)|, m)\,2^{i}$   (9)

where

$M(z,m) = \begin{cases} 1 & z > m \\ 0 & \text{otherwise} \end{cases}$

and m is the average of the absolute values of all eight directional edge images. Basically, CELDP_D is the traditional ELDP and contains the information of the directions of the edge responses.

Fig. 5. CELDP_D (i.e., ELDP) code computation using the signs of the eight directional numbers computed from the eight Kirsch compass masks for a pixel.

CELDP_M has a similar 8-bit binary code structure. However, it thresholds the absolute values of the eight magnitude values for each pixel based on the average of the absolute values of all eight edge images (i.e., m). Fig. 6 shows the process of producing a CELDP_M value of a pixel using its eight magnitudes.

Fig. 6. CELDP_M code computation using eight magnitudes thresholded by the average of the absolute values of all eight edge images (i.e., 83).

Fig. 7 illustrates the intermediate results of the proposed method together with the final CELDP_D and CELDP_M images for an input face image. Clearly, both the CELDP_D and CELDP_M images contain important features which are illumination-invariant.

Fig. 7. Illustration of the intermediate results of the proposed method.
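Putting Eqs. (7)-(9) together, a compact sketch (reusing the edge_responses helper from the Kirsch sketch in Section 2.1) is:

```python
import numpy as np

def celdp(lfd_img):
    """Compute the CELDP_D (Eq. (8), i.e., ELDP of Eq. (7)) and CELDP_M
    (Eq. (9)) codes from the eight Kirsch edge responses of an LFD image."""
    stack = np.stack(edge_responses(lfd_img))     # I_0..I_7, shape (8, M, N)
    m = np.mean(np.abs(stack))                    # average absolute edge response

    weights = (1 << np.arange(8))[:, None, None]  # 2^i for bit i
    celdp_d = np.sum((stack > 0) * weights, axis=0).astype(np.uint8)
    celdp_m = np.sum((np.abs(stack) > m) * weights, axis=0).astype(np.uint8)
    return celdp_d, celdp_m
```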

2.2.4. Classification

Similar to the Weberface, AH-ELDP, and LFD methods, the one nearest neighbor (1NN) method with the l2 norm (1NN-l2) is used as the classifier. 1NN-l2 simply assigns an input image to its nearest neighbor image in the database. Therefore, the recognition accuracy only reflects the effectiveness of the preprocessing methods in producing illumination-invariant features. To utilize the CELDP_D and CELDP_M images of each input image for the classification task, we first compute the similarity distances Dis_D and Dis_M between the CELDP_D and CELDP_M images of the input image and the corresponding CELDP_D and CELDP_M preprocessed images of all images in the database, respectively. We then independently normalize all distances Dis_D and Dis_M to the range [0, 1] and denote the normalized distances as NDis_D and NDis_M, respectively. The final similarity measure Dis is computed by fusing NDis_D and NDis_M as follows:

$\mathrm{Dis} = \alpha \cdot \mathrm{NDis}_D + (1-\alpha) \cdot \mathrm{NDis}_M$   (10)

where α is the weight of the direction component of the edge images at the decision level. We set α to 0.80, since Guo et al. [60] showed that the local difference error made by using the sign component of the LBP code to reconstruct an image is only 1/4 of the respective error made by the magnitude component. This argument is used in our system to compute the final similarity measure as a weighted combination of the similarity measures obtained from the CELDP_D and CELDP_M components.
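A sketch of this decision-level fusion, assuming min-max normalization to map the distances into [0, 1] (the paper states the range but not the normalization formula):

```python
import numpy as np

def match_1nn(probe_d, probe_m, gallery_d, gallery_m, alpha=0.80):
    """Fused 1NN-l2 matching of Eq. (10). gallery_d / gallery_m are arrays
    of flattened CELDP_D / CELDP_M images, one row per gallery image."""
    dis_d = np.linalg.norm(gallery_d - probe_d.ravel(), axis=1)
    dis_m = np.linalg.norm(gallery_m - probe_m.ravel(), axis=1)

    def normalize(d):                       # min-max mapping into [0, 1]
        rng = d.max() - d.min()
        return (d - d.min()) / rng if rng > 0 else np.zeros_like(d)

    dis = alpha * normalize(dis_d) + (1 - alpha) * normalize(dis_m)  # Eq. (10)
    return int(np.argmin(dis))              # index of the nearest gallery image
```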

3. Experimental results

3.1. Experimental settings

We evaluate the proposed method by conducting experiments on four publicly available image databases with large illumination variations, namely, the Yale B, extended Yale B, PIE, and AR image databases [61-63], as well as on the Honda UCSD video database [64]. We use the recognition accuracy, computed as the percentage of correctly recognized faces among all probe (testing) images, as the performance measure. We also use ROC curves to illustrate the ability of each method to verify and discriminate face images.

We compare the proposed method with nine recently proposed state-of-the-art methods: Gradientface, Weberface, LFD, LBP, LDP, EnLDP, LDN, ELDP, and AH-ELDP. Fig. 8 shows six original images from the Yale B face database and their corresponding preprocessed images generated by the nine compared state-of-the-art methods, together with the CELDP_D and CELDP_M images produced by the proposed method. All of these methods are implemented in MATLAB, and their parameters are set as recommended by the respective researchers. For the LBP method, we apply the uniform LBP operator with 8 pixels in a circle of radius 2. We use a Gaussian filter with a standard deviation (SD) of 0.3 to smooth the image for the Gradientface and ELDP methods, and a Gaussian filter with an SD of 1 for the Weberface method, since these methods achieve decent accuracy on all five databases using these values. Also, similar to [38,30], the parameters of the proposed system are set to γ_H = 1.1, γ_L = 0.5, D_0 = 15, and d_max = 10. However, when images are of size 64 × 64, we set D_0 proportionally to the image size (i.e., D_0 = 64 × 0.15 = 9.6).

Fig. 8. Comparison of preprocessed images obtained by different methods. (a) Original face images in S0 of the Yale B face database. Illumination-invariant images produced by the methods of (b) Gradientface, (c) Weberface, (d) LBP, (e) LDP, (f) EnLDP, (g) LDN, (h) LFD, (i) ELDP, (j) AH-ELDP, (k) CELDP_D, and (l) CELDP_M.


Results on the Yale B face database: The Yale B database contains grayscale face images of 10 subjects under nine poses. We use the frontal face images in our experiments. Each subject has 64 images (480 × 640 pixels). All face images are cropped and resized to 100 × 100 pixels. These images are categorized into six subsets based on the angle between the light source direction and the central camera axis: S0 (0°, 6 images per subject), S1 (1–12°, 8 images per subject), S2 (13–25°, 10 images per subject), S3 (26–50°, 12 images per subject), S4 (51–77°, 10 images per subject), and S5 (above 78°, 18 images per subject). The 6 images per subject in S0 are shown in the first column of Fig. 8; they respectively have −35°, −20°, 0°, 20°, 45°, and 90° of elevation of the light source. Positive elevation implies that the light source is above the horizon, while negative elevation implies that it is below the horizon. Similar to the experimental settings in [38,30], we conduct six experiments to evaluate the performance of each compared method. In the ith experiment, we use the ith image of each subject in S0 as the reference (training) image, and the remaining five images of each subject in S0 (50 images in total), together with all the images in each of the other five subsets, as probe (testing) images. In each experiment, the remaining five probe images of each subject in S0 form a new subset S0' with 50 images. Then, for each experiment, we compute the recognition accuracy for each of the six subsets (i.e., S0', S1, S2, S3, S4, and S5). Next, the average face recognition accuracy of each subset across all six experiments is computed and considered as the recognition accuracy of the respective subset. We also report the standard deviation of the recognition accuracy across all six experiments for each subset. Finally, we compute the average and standard deviation of the face recognition accuracy across all six subsets. Table 1 summarizes the average recognition accuracy and its SD for each subset for the nine state-of-the-art compared methods and the proposed method. We include the average accuracy together with its SD over all subsets, S0', S1, S2, S3, S4, and S5 (i.e., 630 testing images in total), in the last column of Table 1. The proposed method outperforms the nine state-of-the-art methods with the best face recognition accuracy of 99.47% and the smallest SD of 0.004, while the second highest accuracy of 96.67% is obtained by the AH-ELDP method. It improves the face recognition accuracy of Gradientface, Weberface, LBP, LDP, EnLDP, LDN, LFD, ELDP, and AH-ELDP by 9.18%, 7.52%, 10.92%, 15.69%, 15.73%, 20.72%, 4.56%, 8.85%, and 2.90%, respectively.

Table 1
Recognition accuracy (%) and corresponding SD in parentheses for Yale B face images.

Method         S0'            S1             S2             S3             S4             S5             Ave.
Gradientface   83.67 (0.10)   95.62 (0.06)   90.83 (0.09)   84.86 (0.10)   91.00 (0.09)   95.56 (0.05)   91.11 (0.06)
Weberface      87.33 (0.12)   93.96 (0.09)   90.17 (0.12)   89.17 (0.06)   89.83 (0.08)   98.33 (0.01)   92.51 (0.06)
LBP            83.33 (0.08)   96.25 (0.05)   91.83 (0.09)   85.28 (0.07)   88.50 (0.07)   90.93 (0.05)   89.68 (0.04)
LDP            81.67 (0.12)   92.71 (0.10)   86.67 (0.17)   87.22 (0.03)   84.00 (0.03)   84.07 (0.08)   85.98 (0.03)
EnLDP          79.67 (0.11)   92.92 (0.10)   87.00 (0.13)   81.81 (0.10)   85.33 (0.12)   87.13 (0.09)   85.95 (0.05)
LDN            77.00 (0.13)   91.25 (0.13)   84.50 (0.16)   80.14 (0.08)   81.33 (0.11)   80.93 (0.10)   82.40 (0.05)
LFD            92.33 (0.13)   95.83 (0.10)   93.50 (0.13)   93.06 (0.10)   95.67 (0.06)   97.59 (0.03)   95.13 (0.08)
ELDP           83.33 (0.12)   95.42 (0.06)   89.83 (0.10)   84.86 (0.09)   90.50 (0.10)   95.46 (0.05)   91.38 (0.06)
AH-ELDP        93.67 (0.06)   99.58 (0.01)   96.50 (0.05)   94.31 (0.06)   96.17 (0.04)   98.15 (0.02)   96.67 (0.03)
CELDP_M        92.00 (0.10)   96.04 (0.07)   93.67 (0.10)   94.58 (0.06)   94.17 (0.07)   96.85 (0.03)   95.00 (0.06)
CELDP_D        99.33 (0.01)   100.00 (0.00)  99.67 (0.01)   98.47 (0.01)   99.50 (0.01)   99.44 (0.00)   99.36 (0.004)
CELDP          99.33 (0.01)   100.00 (0.00)  99.50 (0.01)   99.03 (0.01)   99.67 (0.01)   99.44 (0.00)   99.47 (0.004)

We also include the performance of the two variant systems, namely CELDP_M (i.e., CELDP with only magnitude information) and CELDP_D (i.e., CELDP with only directional information), in the tenth and eleventh rows of Table 1. The CELDP_D method achieves the second best recognition accuracy, while the CELDP_M method achieves the fifth best accuracy. This is expected and confirms the claim that the direction carries more important information in edge response images than the magnitude [60].

Results on the extended Yale B face database: The extended Yale B database contains grayscale face images of 38 subjects under nine poses. We use the frontal face images in our experiments. Each subject has 64 images (2414 of the 2432 images are used, since 18 images are corrupted). All face images are cropped and resized to 100 × 100 pixels. These frontal face images are categorized into six subsets based on the angle between the light source direction and the central camera axis: S0 (0°, 228 images), S1 (1–12°, 301 images), S2 (13–25°, 380 images), S3 (26–50°, 449 images), S4 (51–77°, 380 images), and S5 (above 78°, 676 images). S0 contains 6 images per subject, which are similar to S0 in the Yale B database and have −35°, −20°, 0°, 20°, 45°, and 90° of elevation of the light source. We conduct six experiments for each compared method using the same experimental settings as for the Yale B database. Table 2 summarizes the average recognition accuracy and its SD for each subset for the nine state-of-the-art compared methods and the proposed method. S0' for each experiment contains the images in S0 excluding the training image. We also include the average accuracy and its SD over S0', S1, S2, S3, S4, and S5 (i.e., 2376 testing images in total) in the last column of Table 2. The proposed method outperforms the nine compared methods with the best face recognition accuracy of 94.55% and the smallest SD of 0.02, while the second highest accuracy of 85.77% is obtained by the AH-ELDP method. It improves the face recognition accuracy of Gradientface, Weberface, LBP, LDP, EnLDP, LDN, LFD, ELDP, and AH-ELDP by 20.05%, 16.27%, 28.74%, 32.83%, 29.01%, 38.09%, 14.04%, 18.41%, and 10.24%, respectively.

Table 2
Recognition accuracy (%) and corresponding SD in parentheses for extended Yale B face images.

Method         S0'            S1             S2             S3             S4             S5             Ave.
Gradientface   68.25 (0.15)   86.54 (0.19)   79.69 (0.21)   73.05 (0.07)   77.59 (0.13)   82.17 (0.13)   78.76 (0.07)
Weberface      71.84 (0.17)   84.38 (0.15)   78.55 (0.21)   77.43 (0.06)   78.68 (0.12)   88.24 (0.05)   81.32 (0.09)
LBP            63.07 (0.17)   82.45 (0.18)   76.32 (0.24)   67.52 (0.07)   72.54 (0.10)   75.17 (0.11)   73.44 (0.07)
LDP            67.89 (0.18)   84.60 (0.23)   77.10 (0.25)   73.31 (0.11)   65.96 (0.06)   64.32 (0.11)   71.18 (0.08)
EnLDP          65.61 (0.19)   86.21 (0.19)   77.41 (0.21)   68.93 (0.07)   71.58 (0.13)   71.23 (0.17)   73.29 (0.08)
LDN            61.58 (0.21)   83.17 (0.24)   74.30 (0.26)   65.07 (0.09)   66.27 (0.10)   64.08 (0.16)   68.47 (0.09)
LFD            79.30 (0.23)   85.88 (0.24)   84.82 (0.26)   83.03 (0.21)   83.33 (0.17)   81.21 (0.06)   82.91 (0.17)
ELDP           70.53 (0.13)   87.76 (0.11)   81.14 (0.14)   72.94 (0.09)   78.25 (0.15)   83.70 (0.13)   79.85 (0.09)
AH-ELDP        77.46 (0.10)   88.26 (0.05)   84.82 (0.12)   81.89 (0.06)   86.32 (0.11)   89.79 (0.06)   85.77 (0.05)
CELDP_M        79.30 (0.20)   86.38 (0.21)   81.32 (0.23)   82.52 (0.13)   83.33 (0.17)   84.96 (0.07)   83.38 (0.15)
CELDP_D        90.96 (0.04)   93.63 (0.02)   94.34 (0.05)   92.69 (0.03)   95.26 (0.03)   93.96 (0.02)   93.71 (0.02)
CELDP          92.46 (0.06)   94.07 (0.03)   94.34 (0.06)   93.98 (0.03)   96.05 (0.03)   95.02 (0.01)   94.55 (0.02)

Similar to the Yale B database, we include the results for CELDP_M and CELDP_D in Table 2; they again confirm that the direction carries more important information in edge response images than the magnitude.

Results on the PIE face database: The PIE database contains 41,368 grayscale images (486 × 640 pixels) of 68 subjects, captured under various poses, illuminations, and expressions. Frontal images from the illumination subset (C27) are used in our experiments. This subset contains 21 images per subject. All face images are cropped and resized to 100 × 100 pixels. Fig. 9 shows all 21 images of one subject from this database and their corresponding CELDP_D and CELDP_M images. We conduct 21 experiments to evaluate the performance of the proposed method. In the ith experiment, we use the ith image of each subject as the reference (training) image and all the other 20 images as the probe (testing) images. Fig. 10 shows the recognition accuracy of the different methods for each reference set. The proposed method outperforms the other methods for all reference images, except that the AH-ELDP method achieves accuracy comparable to the proposed method.

Fig. 9. Illustration of sample images of the CMU-PIE database and their CELDP images. (a) 21 samples for a subject. (b) Corresponding CELDP_D images. (c) Corresponding CELDP_M images.

Fig. 10. Comparison of the recognition accuracy of ten methods for PIE face images.

Table 3 summarizes the average recognition accuracy and its SD of the ten compared methods over all reference images. The proposed method achieves the highest average recognition accuracy of 99.53% and the smallest SD of 0.01, while the second highest accuracy of 99.45% is obtained by the AH-ELDP method. Compared with the other nine methods, the proposed method improves the face recognition accuracy of Gradientface, Weberface, LBP, LDP, EnLDP, LDN, LFD, ELDP, and AH-ELDP by 2.46%, 5.10%, 4.36%, 19.31%, 6.10%, 10.05%, 1.71%, 0.60%, and 0.08%, respectively. This clearly shows the effectiveness of the proposed method.

Table 3
Average recognition accuracy (%) and corresponding SD in parentheses for PIE face images.

Method         Accuracy
Gradientface   97.14 (0.03)
Weberface      94.70 (0.06)
LBP            95.37 (0.04)
LDP            83.42 (0.10)
EnLDP          93.81 (0.07)
LDN            90.44 (0.09)
LFD            97.86 (0.02)
ELDP           98.91 (0.01)
AH-ELDP        99.45 (0.01)
CELDP_M        96.25 (0.04)
CELDP_D        99.29 (0.01)
CELDP          99.53 (0.01)

The accuracies obtained for our two variant systems (i.e., CELDP_M and CELDP_D) in Table 3 indicate that CELDP improves the performance of both variants and outperforms all the compared methods. In addition, CELDP_D and CELDP_M achieve the third and fourth best accuracy, respectively.

Results on the AR face database: The AR database, collected at Ohio State University, contains over 4000 color images corresponding to the faces of 126 people (70 men and 56 women). They are frontal-view faces with different facial expressions, illumination conditions, and occlusions (sunglasses and scarf) [63]. Each person participated in two sessions, separated by a two-week (14 days) period, and the same pictures were taken in both sessions. We use the cropped version of the database [65], which contains 100 subjects (50 men and 50 women) with images of size 165 × 120 pixels. However, we still crop the faces to be consistent with the other databases we used. We only use images with illumination conditions and neutral expressions: for each subject, two images with neutral expressions, two images with light from the left, two images with light from the right, and two images with light from both sides. Since the original size is small, all images are resized to 64 × 64 pixels. This also helps us to evaluate the robustness of the parameters of the proposed method with images of a different size. Fig. 11 shows all eight images of one subject from the AR database. We conduct eight experiments to evaluate the performance of the proposed method. In the ith experiment, we use the ith image of each subject as the reference image and all the other seven images as the testing images.

Fig. 11. Eight images with light changes and neutral expression from a subject in the AR face database.

Table 4 summarizes the average recognition accuracy of the ten compared methods over all reference images. The proposed CELDP method achieves the highest average recognition accuracy of 86.63%, while the second highest accuracy of 85.95% is obtained by the AH-ELDP method. Compared with the other nine methods, the proposed CELDP method improves the face recognition accuracy of Gradientface, Weberface, LBP, LDP, EnLDP, LDN, LFD, ELDP, and AH-ELDP by 19.49%, 27.47%, 14.20%, 12.51%, 1.64%, 4.44%, 1.59%, 1.26%, and 0.08%, respectively.

Table 4
Average recognition accuracy (%) and corresponding SD in parentheses for AR face images.

Method         Accuracy
Gradientface   72.50 (0.03)
Weberface      67.96 (0.03)
LBP            75.86 (0.03)
LDP            77.00 (0.01)
EnLDP          85.23 (0.01)
LDN            82.95 (0.02)
LFD            85.27 (0.02)
ELDP           85.55 (0.01)
AH-ELDP        85.95 (0.01)
CELDP_M        80.07 (0.02)
CELDP_D        83.76 (0.01)
CELDP          86.63 (0.01)

The accuracies obtained for our two variant systems in Table 4 indicate that CELDP improves the performance of both variants and outperforms all the compared methods. This clearly verifies the effectiveness of incorporating both the direction and the magnitude of the edge responses to achieve significantly better results, while CELDP_D and CELDP_M alone yield inferior results.

Results on the Honda UCSD video database: The Honda UCSD video database is a standard video database for evaluating face tracking/recognition algorithms [64]. Each video sequence is recorded in an indoor environment at 15 frames per second and lasts for at least 15 s. The resolution of each video sequence is 640 × 480. Each individual is recorded in at least two video sequences. All the video sequences contain significant 2-D (in-plane) and 3-D (out-of-plane) head rotations. In each video, the person rotates and turns his/her head in his/her own preferred order and speed, so that in about 15 s the individual typically provides a wide range of different poses. However, some of these sequences contain difficult events such as partial occlusion, the face partly leaving the field of view, and large scale changes. The database contains two datasets corresponding to the two video sequences. We use the training and testing subsets from the first dataset, which respectively contain 20 videos for training and 39 videos for testing. The training videos belong to 20 human subjects, while the testing videos contain 19 of these subjects. We exclude the subject that is not in the testing set from our experiments.

In this research, the Viola-Jones face detector [4] is used to detect faces in each frame. It is a robust real-time face detector that distinguishes faces from non-faces. In our experiments, we also use the most recently proposed facial landmark localization technique, Chehra v.1 [9], to localize the landmarks of each face. Chehra is fully automatic real-time face and eye landmark detection and tracking software capable of handling faces under uncontrolled natural settings. It determines 49 landmarks for each face. Fig. 12 shows a video frame from the Honda database together with the face detection and landmark localization results, where the detected face is enclosed in the green bounding box and its 49 landmarks are marked by "*".

Fig. 12. A detected face from the Honda database. Each of the 49 landmarks extracted by Chehra v.1 is represented by "*".

Illumination-invariant methods require the face images to be in the frontal view and of the same size. Therefore, a series of preprocessing steps must be applied. Since the bounding box of a detected face contains many non-facial features, especially background, each face needs to be cropped. If all faces were in the frontal view, like the face shown in Fig. 12, we would only need to crop and resize them. We can use the landmarks, i.e., the locations of the eyebrows, the external corners of the left and right eyes, and the lower part of the mouth, to automatically crop the face. Fig. 13 shows the cropped version of the face detected in Fig. 12.

Fig. 13. Cropped face for the detected face in Fig. 12 using the facial landmarks.

However, in the Honda UCSD video database, most of the faces are in non-frontal views. Fig. 14(a) shows a typical detected face in a non-frontal view and its landmarks. More operations must be applied to normalize the pose and estimate the frontal view so that the processed face can be used in the proposed system as well as in the other compared systems. Here, we use the landmarks for several tasks. To this end, we first compute the roll value for each detected face. Roll is defined as the angle between the horizontal line and the line connecting the external corners of the eyes, as shown in Fig. 14(b). We then determine whether the face is in the frontal, right, or left view. As shown in Fig. 14(c), y1 is computed as the distance from the nose tip to the external corner of the right eye, and y2 as the distance from the nose tip to the external corner of the left eye. If y1 is equal to y2, the face is in the frontal view. If y1 is greater than y2, the face is in the right view. Otherwise, it is in the left view.

Next, we rotate the face image using the computed roll value. We also rotate the landmarks to their new locations. Fig. 14(d) shows the result after rotating the face and its respective landmarks. In the proposed system, we assume the face is in the left view before we estimate its frontal view. Therefore, if the face is in the right view (i.e., y1 > y2), we flip the face to obtain its respective left view. Fig. 14(f) shows the flipped face for Fig. 14(e), which is now in the left view. Then, we appropriately stretch each row from the left side of the face to estimate the frontal view of that half of the face. The stretching operation needs to keep the important regions of a face in an appropriate scale with respect to each other. Therefore, we divide the half face into four vertical bands. The boundary of each band is determined using specific landmarks or the midpoint between a pair of specific landmarks. For every row in the cropped image, its four segments in the four bands are stretched to equal sizes to produce the estimated frontal view of the half face. The bands and the landmarks employed to estimate the frontal view of the half face, together with the estimated frontal view, are shown in Fig. 14(g). Finally, the right view of the face is reconstructed by flipping the estimated frontal left view. The final estimated frontal view of the face is shown in Fig. 14(h).

Fig. 14. Pose normalization process. (a) A detected face and its respective landmarks; (b) the angle between the external corners of the eyes and the horizontal axis determines the roll value (here, r = 9.52°); (c) the distances between the external corners of the eyes and the nose tip determine whether the face is in the left view or in the right view (e.g., y1 > y2 means it is in the right view); (d) the rotated face using the roll value and its respective landmarks, which are also rotated to the new locations; (e) the cropped face using the landmarks; (f) the cropped face is flipped; (g) the estimated frontal view of the left side of the face; and (h) the right side of the face is reconstructed using the left side of the face.

After estimating the frontal view of each face, all face images are resized to 64 × 64. In this experiment, due to the large number of frames, we use every tenth frame to evaluate the performance of the compared methods and exclude the other frames. As a result, there are on average 42 face images per subject in this reduced database, on which we conduct the experiment. For the Honda UCSD video database, we use five estimated frontal views of each subject from the training subset for training and the estimated frontal views from the testing subset for testing. Since the subjects show facial expressions, we apply Gabor filters as a post-processing step to deal with expression variations. We also use the SRC [26,27] as the classifier, since more than one image per subject is used for training. The SRC classifier takes advantage of sparse representation to achieve better performance when each subject has multiple images in training.
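The following sketch illustrates the roll and view computations from the 49 Chehra landmarks; the specific landmark indices (external eye corners, nose tip) and the tolerance used in place of the strict y1 = y2 test are assumptions of this illustration:

```python
import numpy as np

def roll_and_view(landmarks, right_eye_idx, left_eye_idx, nose_tip_idx):
    """Estimate the roll angle and the frontal/left/right view of a face.

    landmarks is a (49, 2) array of (x, y) points from Chehra v.1; the
    caller supplies the indices of the external eye corners and nose tip."""
    re = landmarks[right_eye_idx]            # external corner of the right eye
    le = landmarks[left_eye_idx]             # external corner of the left eye
    nose = landmarks[nose_tip_idx]

    # Roll: angle between the horizontal and the line joining the eye corners.
    roll = np.degrees(np.arctan2(le[1] - re[1], le[0] - re[0]))

    y1 = np.linalg.norm(nose - re)           # nose tip to right external corner
    y2 = np.linalg.norm(nose - le)           # nose tip to left external corner
    if np.isclose(y1, y2):                   # tolerance stands in for y1 == y2
        view = 'frontal'
    elif y1 > y2:
        view = 'right'
    else:
        view = 'left'
    return roll, view
```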


Fig. 15. ROC curves for the Yale B database.

Fig. 16. ROC curves for the extended Yale B database.

Table 5. Recognition accuracy (%) for Honda UCSD video face images.

Methods         Accuracy
Gradientface    71.72
Weberface       74.49
LBP             74.51
LDP             60.35
EnLDP           72.73
LDN             70.33
LFD             75.13
ELDP            75.15
AH-ELDP         71.84
CELDPM          68.16
CELDPD          75.53
CELDP           76.89


Table 5 summarizes the recognition accuracy of the ten compared methods. The proposed CELDP method achieves the highest average recognition accuracy of 76.89%, while the second highest accuracy of 75.15% is obtained by the ELDP method. Compared with the other nine methods, the proposed CELDP method improves the face recognition accuracy of Gradientface, Weberface, LBP, LDP, EnLDP, LDN, LFD, ELDP, and AH-ELDP by 7.21%, 3.22%, 3.19%, 27.41%, 5.72%, 9.33%, 2.34%, 2.31%, and 7.03%, respectively. These figures are relative gains; for example, the improvement over Gradientface is (76.89 − 71.72)/71.72 ≈ 7.21%. Compared with the accuracy of the two variant systems, the CELDP method improves the performance of both variants and thus outperforms all the compared methods.

It should be mentioned that the compared methods behave differently on the Yale B, extended Yale B, CMU-PIE, AR, and Honda UCSD databases. For example, the Weberface method achieves the fourth best accuracy of 92.51% and 81.32% on the Yale B and extended Yale B databases, the seventh best accuracy of 94.70% on the CMU-PIE database, the tenth best accuracy of 67.96% on the AR database, and the fifth best accuracy of 74.49% on the Honda UCSD database. LFD achieves the third best accuracy on the Yale B, extended Yale B, and Honda UCSD databases, while it achieves the fourth best accuracy on the AR and CMU-PIE databases. Clearly, the proposed CELDP consistently outperforms all the other methods by achieving the highest recognition accuracy on all these databases with different illuminations.

On the other hand, comparing the performance of our two variant systems with the nine state-of-the-art methods on the five databases indicates that the CELDPD method outperforms all nine compared methods on the Yale B, extended Yale B, and Honda UCSD databases and yields comparable results on the CMU-PIE and AR databases. However, the CELDPM method only yields comparable performance on the five databases. This performance difference between CELDPD and CELDPM is expected, since the direction contains more important information than the magnitude.

3.2. ROC curves and computational time

The ROC curve is a graphical plot that illustrates the verification and discrimination ability of a binary classifier. It plots the true positive rate (TPR) against the false positive rate (FPR) of the classifier. A classifier or method with a higher TPR and a lower FPR has better verification and discrimination ability. Since we use the same classifier for all methods, the ROC curve can illustrate the ability of each method to verify and discriminate face images. Here, a positive match indicates that an input image of a subject is determined to be the same subject as the reference (training) image.

We plot ROC curves for the four image databases. For the Yale B and extended Yale B databases, we use each of the six images in S0 as the reference image and all the rest as probe images, and compute the average of the respective similarity measures as true and false scores. In a similar setting, we use each of the 21 images for each subject in the CMU-PIE database as the reference image and all the rest as probe images, and compute the average of the respective similarity measures as true and false scores. Similarly, we use each of the eight images for each subject in the AR database as the reference image and all the rest as probe images, and compute the average of the respective similarity measures as true and false scores. These scores are then used to compute the TPR and the FPR for each reference image. Finally, the TPR and FPR for each database are computed as the average of all the TPR and FPR values across the experiments over the different reference images.
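As an illustration of how such scores yield a ROC curve, the sketch below computes (FPR, TPR) pairs for one reference configuration; the function name and the explicit threshold sweep are our own and not taken from the paper.

```python
import numpy as np


def roc_points(scores, same_subject, thresholds):
    """Compute (FPR, TPR) pairs from probe-to-reference similarity scores.

    scores: 1-D array of similarity scores, one per probe image.
    same_subject: boolean array, True where the probe and the reference share a subject.
    """
    fpr, tpr = [], []
    n_pos = max(np.sum(same_subject), 1)
    n_neg = max(np.sum(~same_subject), 1)
    for t in thresholds:
        accepted = scores >= t                          # positive match at this threshold
        tpr.append(np.sum(accepted & same_subject) / n_pos)
        fpr.append(np.sum(accepted & ~same_subject) / n_neg)
    return np.array(fpr), np.array(tpr)
```

Averaging the resulting TPR and FPR values over all reference images gives one curve per database, as described above.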

For the Honda UCSD video database, since we use multiple images from each subject for training and conduct one experiment using the remaining images for testing, we compute the average of the similarity measures between each probe image from subject i and the training images of subject i as true scores. We then compute the average of the similarity measures between each probe image from subject i and the training images of all subjects excluding subject i as false scores.
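A minimal sketch of this score computation, assuming a user-supplied similarity function `sim` (a hypothetical placeholder for the similarity measure applied to the preprocessed images), might look as follows.

```python
import numpy as np


def true_false_scores(sim, probes, probe_ids, train, train_ids):
    """Average each probe's similarity to same-subject vs. other-subject training images."""
    train_ids = np.asarray(train_ids)
    true_scores, false_scores = [], []
    for p, pid in zip(probes, probe_ids):
        s = np.array([sim(p, t) for t in train])  # similarity to every training image
        same = train_ids == pid
        true_scores.append(s[same].mean())        # vs. the subject's own training images
        false_scores.append(s[~same].mean())      # vs. all other subjects' training images
    return true_scores, false_scores
```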


Fig. 18. ROC curves for the AR database.

Fig. 19. ROC curves for the Honda UCSD video database.

Fig. 20. Average computational run-time for a face image (100 × 100).

Fig. 17. ROC curves for the PIE database.



Figs. 15–19 compare the ROC curves of the ten compared methods on the Yale B, extended Yale B, CMU-PIE, AR, and Honda UCSD databases, respectively. Clearly, the proposed CELDP method achieves the best verification and discrimination ability compared with the other state-of-the-art methods, since it has the largest area under the ROC curve.

The proposed method also has a computational time comparable to the compared state-of-the-art methods. Fig. 20 shows the average computational run-time to preprocess a face image of size 100 × 100 and produce the illumination-invariant features from the preprocessed image. The LFD method, which is also employed in the proposed system, has the highest computational time. However, we efficiently simplify it by skipping every other adjacent scale while using the same dmax of 10.

4. Conclusions

This paper proposes a method to produce illumination-invariant representations for face images using logarithmic fractal dimension-based complete eight local directional patterns. The proposed method uses an adaptive homomorphic filter to partially reduce the illumination effect by attenuating the low-frequency component of the face image. It then applies the simplified logarithmic fractal dimension operation as an edge-enhancing technique to enhance facial features such as eyes, eyebrows, nose, and mouth while keeping noise at a low level. Finally, the proposed method achieves robustness against illumination variations and noise using CELDP, which not only considers the relations among the directions and magnitudes of all eight directional edge responses but also represents more valuable structural information from the neighborhood. Our experimental results verify that the proposed method outperforms its two variant methods and nine recent state-of-the-art methods and also obtains the best verification and discrimination ability on four publicly available image databases and one video database.

References

[1] H. Han, S. Shan, X. Chen, W. Gao, A comparative study on illumination preprocessing in face recognition, Pattern Recognit. 46 (6) (2013) 1691–1699.


[2] M.R. Faraji, X. Qi, An effective neutrosophic set-based preprocessing method for face recognition, In: Proceedings of the International Conference on Multimedia and Expo, 2013.

[3] M. Yang, D. Roth, N. Ahuja, A SNoW-based face detector, In: Advances in Neural Information Processing Systems 12, Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts, USA, 2000, pp. 855–861.

[4] P. Viola, M. Jones, Rapid object detection using a boosted cascade of simple features, Comput. Vis. Pattern Recognit. (2001) I-511–I-518.

[5] G.C. Luh, Face detection using combination of skin color pixel detection and Viola-Jones face detector, In: International Conference on Machine Learning and Cybernetics, vol. 1, 2014, pp. 364–370. http://dx.doi.org/10.1109/ICMLC.2014.7009143.

[6] M. Valstar, B. Martinez, X. Binefa, M. Pantic, Facial point detection using boosted regression and graph models, In: IEEE Conference on Computer Vision and Pattern Recognition, 2010, pp. 2729–2736.

[7] B. Martinez, M. Valstar, X. Binefa, M. Pantic, Local evidence aggregation for regression-based facial point detection, IEEE Trans. Pattern Anal. Mach. Intell. 35 (5) (2013) 1149–1163.

[8] A. Asthana, S. Zafeiriou, S. Cheng, M. Pantic, Robust discriminative response map fitting with constrained local models, Comput. Vis. Pattern Recognit. (2013).

[9] A. Asthana, S. Zafeiriou, S. Cheng, M. Pantic, Incremental face alignment in the wild, In: Computer Vision and Pattern Recognition, 2014, pp. 1859–1866.

[10] M. De Marsico, M. Nappi, D. Riccio, H. Wechsler, Robust face recognition for uncontrolled pose and illumination changes, IEEE Trans. Syst. Man Cybern. Syst. 43 (1) (2013) 149–163, http://dx.doi.org/10.1109/TSMCA.2012.2192427.

[11] V. Štruc, N. Pavešić, The complete Gabor–Fisher classifier for robust face recognition, EURASIP J. Adv. Signal Process. 2010 (2010) 31:1–31:13, http://dx.doi.org/10.1155/2010/847680.

[12] Z. Lai, D. Dai, C. Ren, K. Huang, Multiscale logarithm difference edge maps for face recognition against varying lighting conditions, IEEE Trans. Image Process. 24 (6) (2015) 1735–1747, http://dx.doi.org/10.1109/TIP.2015.2409988.

[13] X. Tan, B. Triggs, Enhanced local texture feature sets for face recognition under difficult lighting conditions, IEEE Trans. Image Process. 19 (6) (2010) 1635–1650, http://dx.doi.org/10.1109/TIP.2010.2042645.

[14] M. Turk, A. Pentland, Eigenfaces for recognition, J. Cogn. Neurosci. 3 (1) (1991) 71–86.

[15] B.A. Draper, K. Baek, M.S. Bartlett, J. Beveridge, Recognizing faces with PCA and ICA, Comput. Vis. Image Understand. 91 (1–2) (2003) 115–137, http://dx.doi.org/10.1016/S1077-3142(03)00077-8 (Special issue on Face Recognition).

[16] J.H. Shah, M. Sharif, M. Raza, M. Murtaza, Saeed-Ur-Rehman, Robust face recognition technique under varying illumination, J. Appl. Res. Technol. 13 (1) (2015) 97–105, http://dx.doi.org/10.1016/S1665-6423(15)30008-0.

[17] M. Bartlett, J.R. Movellan, T. Sejnowski, Face recognition by independent component analysis, IEEE Trans. Neural Netw. 13 (6) (2002) 1450–1464, http://dx.doi.org/10.1109/TNN.2002.804287.

[18] K.I. Kim, K. Jung, H.J. Kim, Face recognition using kernel principal component analysis, IEEE Signal Process. Lett. 9 (2) (2002) 40–42, http://dx.doi.org/10.1109/97.991133.

[19] D.L. Swets, J. Weng, Using discriminant eigenfeatures for image retrieval, IEEE Trans. Pattern Anal. Mach. Intell. 18 (8) (1996) 831–836.

[20] Q. Liu, R. Huang, H. Lu, S. Ma, Face recognition using kernel-based Fisher discriminant analysis, In: Fifth IEEE International Conference on Automatic Face and Gesture Recognition, 2002, pp. 197–201.

[21] Z. Sun, J. Li, C. Sun, Kernel inverse Fisher discriminant analysis for face recognition, Neurocomputing 134 (2014) 46–52, http://dx.doi.org/10.1016/j.neucom.2012.12.075.

[22] D. Masip, J. Vitria, Shared feature extraction for nearest neighbor face recognition, IEEE Trans. Neural Netw. 19 (4) (2008) 586–595, http://dx.doi.org/10.1109/TNN.2007.911742.

[23] F. Song, Z. Guo, Q. Chen, Two-dimensional nearest neighbor classifiers for face recognition, In: International Conference on Systems and Informatics, 2012, pp. 2682–2686. http://dx.doi.org/10.1109/ICSAI.2012.62236077.

[24] S. Lawrence, C. Giles, A.C. Tsoi, A. Back, Face recognition: a convolutional neural-network approach, IEEE Trans. Neural Netw. 8 (1) (1997) 98–113, http://dx.doi.org/10.1109/72.554195.

[25] T.H. Le, Applying artificial neural networks for face recognition, Advances in Artificial Neural Systems, vol. 2011, 2011, 16 pp. http://dx.doi.org/10.1155/2011/673016.

[26] J. Wright, A.Y. Yang, A. Ganesh, S.S. Sastry, Y. Ma, Robust face recognition via sparse representation, IEEE Trans. Pattern Anal. Mach. Intell. 31 (2) (2009) 210–227.

[27] A. Wagner, J. Wright, A. Ganesh, Z. Zhou, H. Mobahi, Y. Ma, Toward a practical face recognition system: robust alignment and illumination by sparse representation, IEEE Trans. Pattern Anal. Mach. Intell. 34 (2) (2012) 372–386, http://dx.doi.org/10.1109/TPAMI.2011.112.

[28] A.F. Abate, M. Nappi, D. Riccio, G. Sabatino, 2D and 3D face recognition: a survey, Pattern Recognit. Lett. 28 (2007) 1885–1906.

[29] W. Braje, D. Kersten, M. Tarr, N. Troje, Illumination effects in face recognition, Psychobiology 26 (4) (1998) 371–380.

[30] M.R. Faraji, X. Qi, Face recognition under varying illumination with logarithmic fractal analysis, IEEE Signal Process. Lett. 21 (12) (2014) 1457–1461, http://dx.doi.org/10.1109/LSP.2014.2343213.

[31] S.M. Pizer, E.P. Amburn, J.D. Austin, R. Cromartie, A. Geselowitz, T. Greer, B. ter Haar Romeny, J.B. Zimmerman, K. Zuiderveld, Adaptive histogram equalization and its variations, Comput. Vis. Graph. Image Process. 39 (3) (1987) 355–368.

[32] S. Shan, W. Gao, B. Cao, D. Zhao, Illumination normalization for robust face recognition against varying lighting conditions, In: IEEE International Workshop on Analysis and Modeling of Faces and Gestures, 2003, pp. 157–164.

[33] T. Ahonen, A. Hadid, M. Pietikainen, Face description with local binary patterns: application to face recognition, IEEE Trans. Pattern Anal. Mach. Intell. 28 (12) (2006) 2037–2041.

[34] T. Jabid, M. Kabir, O. Chae, Local directional pattern (LDP) for face recognition, In: Digest of Technical Papers International Conference on Consumer Electronics, 2010, pp. 329–330. http://dx.doi.org/10.1109/ICCE.2010.5418801.

[35] F. Zhong, J. Zhang, Face recognition with enhanced local directional patterns, Neurocomputing 119 (2013) 375–384.

[36] A. Ramirez Rivera, R. Castillo, O. Chae, Local directional number pattern for face analysis: face and expression recognition, IEEE Trans. Image Process. 22 (5) (2013) 1740–1752, http://dx.doi.org/10.1109/TIP.2012.2235848.

[37] M.R. Faraji, X. Qi, Face recognition under illumination variations based on eight local directional patterns, IET Biom. 4 (1) (2015) 10–17.

[38] M.R. Faraji, X. Qi, Face recognition under varying illumination based on adaptive homomorphic eight local directional patterns, IET Comput. Vis. 9 (3) (2015) 390–399.

[39] Z. Lei, M. Pietikainen, S. Li, Learning discriminant face descriptor, IEEE Trans. Pattern Anal. Mach. Intell. 36 (2) (2014) 289–302, http://dx.doi.org/10.1109/TPAMI.2013.112.

[40] K. Juneja, A. Verma, S. Goel, An improvement on face recognition rate using local tetra patterns with support vector machine under varying illumination conditions, In: International Conference on Computing, Communication and Automation (ICCCA), 2015, pp. 1079–1084. http://dx.doi.org/10.1109/CCAA.2015.7148566.

[41] D.-J. Kim, M.-K. Shon, S. Lee, E. Kim, Illumination-robust face recognition approach using enhanced preprocessing and feature extraction, J. Nanoelectron. Optoelectron. 11 (2) (2016) 141–147, http://dx.doi.org/10.1166/jno.2016.1855.

[42] M. Ochoa-Villegas, J. Nolazco-Flores, O. Barron-Cano, I. Kakadiaris, Addressing the illumination challenge in two-dimensional face recognition: a survey, IET Comput. Vis. 9 (6) (2015) 978–992, http://dx.doi.org/10.1049/iet-cvi.2014.0086.

[43] X. Zou, J. Kittler, K. Messer, Illumination invariant face recognition: a survey, In: First IEEE International Conference on Biometrics: Theory, Applications, and Systems, 2007, pp. 1–8. http://dx.doi.org/10.1109/BTAS.2007.4401921.

[44] H. Wang, S. Li, Y. Wang, J. Zhang, Self quotient image for face recognition, In: International Conference on Image Processing, vol. 2, 2004, pp. 1397–1400. http://dx.doi.org/10.1109/ICIP.2004.1419763.

[45] H. Wang, S.Z. Li, Y. Wang, Face recognition under varying lighting conditions using self quotient image, In: Proceedings of the Sixth IEEE International Conference on Automatic Face and Gesture Recognition, 2004, pp. 819–824. http://dx.doi.org/10.1109/AFGR.2004.1301635.

[46] Y.K. Park, S.L. Park, J.K. Kim, Retinex method based on adaptive smoothing for illumination invariant face recognition, Signal Process. 88 (8) (2008) 1929–1945, http://dx.doi.org/10.1016/j.sigpro.2008.01.028.

[47] T. Zhang, Y.Y. Tang, B. Fang, Z. Shang, X. Liu, Face recognition under varying illumination using gradientfaces, IEEE Trans. Image Process. 18 (11) (2009) 2599–2606, http://dx.doi.org/10.1109/TIP.2009.2028255.

[48] B. Wang, W. Li, W. Yang, Q. Liao, Illumination normalization based on Weber's law with application to face recognition, IEEE Signal Process. Lett. 18 (8) (2011) 462–465, http://dx.doi.org/10.1109/LSP.2011.2158998.

[49] Y. Wu, Y. Jiang, Y. Zhou, W. Li, Z. Lu, Q. Liao, Generalized Weber-face for illumination-robust face recognition, Neurocomputing 136 (2014) 262–267, http://dx.doi.org/10.1016/j.neucom.2014.01.006.

[50] T. Song, K. Xiang, X. Wang, Face recognition under varying illumination based on gradientface and local features, IEEJ Trans. Electr. Electron. Eng. 10 (2015) 222–228.

[51] A. Baradarani, Q.M.J. Wu, M. Ahmadi, An efficient illumination invariant face recognition framework via illumination enhancement and DD-DTcWT filtering, Pattern Recognit. 46 (1) (2013) 57–72.

[52] J. Chen, S. Shan, C. He, G. Zhao, M. Pietikainen, X. Chen, W. Gao, WLD: a robust local image descriptor, IEEE Trans. Pattern Anal. Mach. Intell. 32 (9) (2010) 1705–1720, http://dx.doi.org/10.1109/TPAMI.2009.155.

[53] M.H. Fazel Zarandi, M.R. Faraji, M. Karbasian, Interval type-2 fuzzy expert system for prediction of carbon monoxide concentration in mega-cities, Appl. Soft Comput. 12 (1) (2012) 291–301.

[54] C.-C. Chen, J. DaPonte, M. Fox, Fractal feature analysis and classification in medical imaging, IEEE Trans. Med. Imag. 8 (2) (1989) 133–142.

[55] O. Al-Kadi, D. Watson, Texture analysis of aggressive and nonaggressive lung tumor CE CT images, IEEE Trans. Biomed. Eng. 55 (7) (2008) 1822–1830.

[56] P. Kim, K. Iftekharuddin, P. Davey, M. Toth, A. Garas, G. Hollo, E. Essock, Novel fractal feature-based multiclass glaucoma detection and progression prediction, IEEE J. Biomed. Health Inform. 17 (2) (2013) 269–276.

[57] A. Zlatintsi, P. Maragos, Multiscale fractal analysis of musical instrument signals with application to recognition, IEEE Trans. Audio Speech Lang. Process. 21 (4) (2013) 737–748.

[58] R.C. Gonzalez, R.E. Woods, Digital Image Processing, Prentice-Hall, Upper Saddle River, New Jersey, USA, 2007.


[59] Y.S. Huang, C.Y. Li, An effective illumination compensation method for face recognition, In: Advances in Multimedia Modeling, Lecture Notes in Computer Science, vol. 6523, 2011, pp. 525–535.

[60] Z. Guo, D. Zhang, D. Zhang, A completed modeling of local binary pattern operator for texture classification, IEEE Trans. Image Process. 19 (6) (2010) 1657–1663, http://dx.doi.org/10.1109/TIP.2010.2044957.

[61] T. Sim, S. Baker, M. Bsat, The CMU pose, illumination, and expression (PIE) database, In: Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, 2002, pp. 46–51. http://dx.doi.org/10.1109/AFGR.2002.1004130.

[62] A.S. Georghiades, P.N. Belhumeur, D. Kriegman, From few to many: illumination cone models for face recognition under variable lighting and pose, IEEE Trans. Pattern Anal. Mach. Intell. 23 (6) (2001) 643–660, http://dx.doi.org/10.1109/34.927464.

[63] A.M. Martinez, R. Benavente, The AR face database, CVC Technical Report 24, 1998.

[64] K.-C. Lee, J. Ho, M.-H. Yang, D. Kriegman, Visual tracking and recognition using probabilistic appearance manifolds, Comput. Vis. Image Understand. 99 (3) (2005) 303–331, http://dx.doi.org/10.1016/j.cviu.2005.02.002.

[65] A.M. Martinez, A.C. Kak, PCA versus LDA, IEEE Trans. Pattern Anal. Mach. Intell. 23 (2) (2001) 228–233.

Mohammad Reza Faraji received the B.S. degree in Applied Mathematics from Zanjan University in 2005 and the M.S. degree in Systems Engineering from Amirkabir University of Technology in 2009. He received his Ph.D. degree in Computer Science from Utah State University in Summer 2015. He is currently a postdoctoral researcher with the Computer Science Department at Utah State University. His research interests include pattern recognition, computer vision, and fuzzy systems.

Xiaojun Qi received her M.S. and Ph.D. degrees in Computer Science from Louisiana State University in 1999 and 2001, respectively. In Fall 2002, she joined the Department of Computer Science at Utah State University as a tenure-track assistant professor. In July 2008, she received tenure and was promoted to Associate Professor. In July 2015, she was promoted to Professor. She has been a senior IEEE member since 2010. Her research interests include content-based image retrieval, digital image watermarking and information hiding, and pattern recognition and computer vision.