

Neurocomputing 73 (2010) 2217–2224


Robust face recognition based on illumination invariant in nonsubsampled contourlet transform domain

Yong Cheng a,b,*, Yingkun Hou a,c, Chunxia Zhao a, Zuoyong Li a,d, Yong Hu a, Cailing Wang a

a School of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing 210094, China
b School of Communication Engineering, Nanjing Institute of Technology, Nanjing 211167, China
c School of Information Science and Technology, Taishan University, Taian, Shandong 271021, China
d Department of Computer Science, Minjiang University, Fuzhou 350108, China

* Corresponding author at: School of Computer Science and Technology, Nanjing University of Science and Technology, Nanjing 210094, China. Tel.: +86 025 84315751. E-mail address: [email protected] (Y. Cheng).

doi:10.1016/j.neucom.2010.01.012

Article info

Article history:

Received 25 April 2009

Received in revised form 7 November 2009

Accepted 19 January 2010

Communicated by Y. Fu

Available online 1 March 2010

Keywords:

Face recognition

Illumination invariant

Nonsubsampled contourlet transform

(NSCT)

NormalShrink

Image denoising

12/$ - see front matter & 2010 Elsevier B.V. A

016/j.neucom.2010.01.012

esponding author at: School of Computer Scie

ity of Science and Technology, Nanjing 21009

02584315751.

ail address: [email protected] (Y. Ch

Abstract

In order to alleviate the effect of illumination variations on face recognition, a novel face recognition algorithm based on an illumination invariant in the nonsubsampled contourlet transform (NSCT) domain is proposed. The algorithm first performs a logarithm transform on original face images under various illumination conditions, which changes the multiplicative illumination model into an additive one. Then NSCT is used to decompose the logarithm-transformed images. After that, adaptive NormalShrink is applied to each directional subband of the NSCT for illumination invariant extraction. Experimental results on the Yale B, the extended Yale and the CMU PIE face databases show that the proposed algorithm can effectively alleviate the effect of illumination on face recognition.

© 2010 Elsevier B.V. All rights reserved.

1. Introduction

Face recognition has become a very active research field in pattern recognition and computer vision due to its wide applications in human computer interaction, security, law enforcement and entertainment [1,2]. It has been proved that illumination variations are more significant than the inherent differences between individuals for face recognition [3,4]. Although various methods for face recognition have been proposed, such as PCA [5], LDA [6], LFA [7], EBGM [8], Probabilistic and Bayesian Matching [9] and SVM [10], the performance of most existing algorithms is highly sensitive to illumination variations. In order to solve this problem, a number of methods have been proposed; they fall into three groups.

The first is to preprocess face images with normalization techniques such as logarithm transform and histogram equalization (HE), which are robust under different illumination conditions and often used for illumination normalization [11,21]. However, it is still difficult to deal with complicated illumination variations by these global processing techniques. Lately, block-based histogram equalization (BHE) and adaptive histogram equalization (AHE) were proposed to cope with illumination variations [24,25]. Their performances are still not satisfactory, even though the recognition rates are slightly improved compared with HE.

The second is to construct a generative 3-D face model for rendering face images with different poses and varying illumination conditions [13,14]. A generative appearance-based model was presented for face recognition under variations in illumination conditions [13]. Its main idea is that face images under various illumination conditions can be represented by an illumination convex cone, and the corresponding cone can be approximated in a low-dimensional linear subspace. Basri and Jacobs [14] approximated the set of convex Lambertian images under a variety of illumination conditions by a 9D linear subspace. The drawbacks of the 3D model-based methods are that many training samples under varying illumination conditions are needed, and that the human face is assumed to be a convex object, which makes these methods inapplicable in practice.

The third is to deal with illumination variations based on the Lambertian model, such as Multiscale Retinex (MSR) [28], Self Quotient Image (SQI) [15,16], logarithmic total variation (LTV) [17], Gross and Brajovic (GB) [29], wavelet-based illumination normalization (WIN) [26], wavelet-based illumination invariant preprocessing (WIIP) [27], multiscale facial structure representation (MFSR) [32] and illumination normalization by a series of processing (INPS) [30]. MSR deals with illumination variations by the difference between an original image and its smoothed version in the logarithm domain, obtained by combining several low-pass filters with different cut-off frequencies; however, the halo effect of MSR is serious. GB, introduced by Gross and Brajovic, can reduce the halo effect to some extent by using an anisotropic filter. In the SQI model, the illumination effect is normalized by division over a smoothed version of the image itself. This model is very simple and can be applied to a single image, but its use of the weighted Gaussian filter makes it difficult to keep sharp edges in low frequency illumination fields. LTV improves SQI by using logarithmic total variation, but it can only process images at a certain scale and has quite high computational expense. WIN, WIIP and MFSR attempt to normalize varying illumination by modifying wavelet coefficients, but the illumination effect cannot be completely removed and Gibbs phenomena are serious. INPS copes with the illumination effect by a combination of gamma correction, difference of Gaussian filtering and contrast equalization [30]. However, its parameter selection is usually empirical and complicated, and the number of parameters is no less than five.

To cope with illumination variations in face recognition, we propose a novel method employing a NormalShrink filter in the NSCT domain to extract an illumination invariant. Compared with existing methods, our method has the following advantages: (1) it can better preserve edges due to the multiscale, multidirection analysis and shift-invariance of the nonsubsampled contourlet transform; (2) it can directly detect the multiscale contour structure that is illumination invariant in the logarithm domain of a single face image; (3) prior information (i.e., light source assumptions and large training samples) is necessary for 3D face models, but it is unnecessary for our method. Experimental results on the Yale B, the extended Yale and the CMU PIE face databases show that the proposed method is robust and effective for face recognition under varying illumination conditions.

The rest of this paper is organized as follows: Section 2 describes the NSCT-based method for illumination invariant extraction. The experimental results are presented in Section 3. Finally, we give conclusions and future work in Section 4.

Fig. 1. Logarithm transform: (a) original and (b) logarithm image of (a).

2. Methodology

2.1. Illumination model and logarithm transform

On the basis of the Lambertian model, a facial gray image F under illumination conditions is generally described by the following model [18]:

$$F(x,y) = I(x,y) \cdot R(x,y), \qquad (1)$$

where I(x,y) and R(x,y) are respectively the illumination and the reflectance at a point (x,y). R can also be regarded as an illumination invariant feature. For robust face recognition under various illumination conditions, R is taken as the key facial feature due to its stability. However, it is an ill-posed problem to extract key facial features by solving Eq. (1) directly. Hence, a common assumption is that I changes slowly and R varies abruptly; that is, I is regarded as the low frequency part of the signal F, and R is the high frequency part, which can be regarded as "noise" in a noisy image.

In our method, a NormalShrink-based denoising model is employed to extract R. In order to use the denoising model for extracting the illumination invariant R, we first apply the logarithm operator to the image, which changes the noise model from a multiplicative one to an additive one. Thus, Eq. (1) can be rewritten as

$$F' = I' + R', \qquad (2)$$

where F' = log(F), I' = log(I), R' = log(R). In addition, the logarithm transform is often used for image enhancement, because it can compress light pixel values and expand dark ones [19]. As a result, the logarithm transform can partially reduce the effect of various illumination conditions. Fig. 1 shows an original image and its corresponding logarithm image.
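As a quick illustration of Eq. (2), the following sketch (Python with NumPy; our tooling choice, not from the paper) applies the logarithm transform to an 8-bit gray face image. The offset `eps` is an implementation detail we add to avoid log(0) at black pixels.

```python
import numpy as np

def log_transform(face_u8, eps=1.0):
    """Map F(x,y) = I(x,y) * R(x,y) into the additive model F' = I' + R' of Eq. (2).

    face_u8: 2-D uint8 gray image; eps guards against log(0) at black pixels.
    """
    F = face_u8.astype(np.float64)
    return np.log(F + eps)  # F' = log(F), so I' and R' are now additive terms
```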

2.2. Nonsubsampled contourlet transform

In [12], Do and Vetterli proposed the contourlet transform (CT) to represent two-dimensional singularities; it is composed of a Laplacian pyramid (LP) and a directional filter bank (DFB). The transform can represent curves more sparsely due to its directionality and anisotropy. However, frequency aliasing exists in the contourlet transform process.

In order to eliminate the frequency aliasing and enhance directional selectivity and shift-invariance, Cunha et al. [20] proposed the nonsubsampled contourlet transform (NSCT) based on nonsubsampled pyramid decomposition and nonsubsampled filter banks (NSFB). NSCT is the shift-invariant version of CT; it uses iterated nonseparable two-channel NSFB to obtain shift-invariance, avoiding pseudo-Gibbs phenomena around singularities. NSCT not only provides multiresolution analysis, but also contains geometric and directional representation. Fig. 2 gives an overview of the nonsubsampled contourlet transform: the NSFB structure is shown in Fig. 2(a), and the bank of filters splitting the 2-D frequency plane into subbands is illustrated in Fig. 2(b). The inverse NSCT is simply the reverse of the forward transform, although the filter banks of the inverse transform differ from the forward ones; NSCT achieves perfect reconstruction through its reconstruction filter banks. The concrete filter bank construction methods and more NSCT details can be found in [20]. NSCT is more efficient than other multiresolution analyses in image denoising and image enhancement due to its multiscale, multidirection, anisotropy and shift-invariance properties. Therefore, we perform multiscale decomposition of face images by NSCT in the proposed method.

Fig. 2. Nonsubsampled contourlet transform: (a) NSFB structure and (b) idealized frequency partitioning obtained by NSFB.

2.3. NSCT-based denoising model for illumination invariant extraction

As mentioned above, under the assumption that I' varies slowly and R' changes abruptly, I' and R' can be regarded as the low frequency and high frequency parts of a face, respectively. Hence, the illumination invariant under different illumination conditions is equivalent to the "noise" of the images. But it is still an ill-posed problem to completely separate R' and I', because the reflection coefficients of a face under distinctive illuminations differ from each other. In order to extract a robust illumination invariant R' under different illumination conditions, we resort to an optimization problem [32], that is,

$$\arg\min {R'}^2 = \arg\min \left(F' - I'\right)^2. \qquad (3)$$

Here, Eq. (3) can be solved by image denoising techniques. Besides, NSCT-based denoising techniques can better preserve curve information in low frequency illumination fields than other multiscale analyses [20]. So NSCT-based image denoising techniques are used to estimate the low frequency part I', and the illumination invariant R' can be extracted by the following equation:

$$R' = F' - I'. \qquad (4)$$

I' can be obtained by a nonlinear approximation of the NSCT coefficients of the logarithm image F'. That is, we first decompose the image by NSCT, then apply the denoising model to the high frequency subbands of the NSCT, and finally reconstruct the illumination I' from the modified NSCT coefficients by the inverse NSCT.

All the NSCT subbands are given as follows:

$$\{C_{i_0}, C_{i,j}\}, \quad i, i_0 = 1, 2, \ldots, n, \; i \le i_0, \; j = 2, 4, 8, \ldots, m, \; n \in \mathbb{N}, \; m \in 2^{\mathbb{N}}, \qquad (5)$$

where i is the scale of decomposition, j is the direction of decomposition, C_{i_0} is the low frequency coefficient, and C_{i,j} is the high frequency coefficient at the j-th directional subband of the i-th scale. In our model, i_0 = 3 and j = {2, 4, 8}. In other words, the scale of decomposition is 3, and the numbers of directions at the three scales are 2, 4 and 8, respectively.
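To make the indexing of Eq. (5) concrete, the snippet below enumerates the subbands produced by the settings used in the paper (3 scales with 2, 4 and 8 directions). The 48×42 size is the crop used in Section 3.1; because NSCT is nonsubsampled, every subband keeps the full image resolution.

```python
# Subband bookkeeping for Eq. (5): one lowpass band C_{i0} plus j directional
# bands C_{i,j} at each scale i. NSCT is nonsubsampled, so all bands share
# the input resolution (here the 48x42 crop from Section 3.1).
shapes = {"C_low": (48, 42)}
for i, num_dirs in enumerate([2, 4, 8], start=1):
    for j in range(num_dirs):
        shapes[f"C_{i},{j}"] = (48, 42)
print(len(shapes))  # 1 + 2 + 4 + 8 = 15 bands in total
```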

Selecting the threshold is a key issue in image denoising. There are several methods for image denoising in the wavelet domain, such as VisuShrink [34], SureShrink [35] and BayesShrink [36]. However, threshold selection in these methods does not consider the texture of an image. Kaur et al. proposed the NormalShrink method, which generally outperforms the above methods [22,23]. Therefore, we use it to select the threshold; the corresponding threshold t_{i,j} of C_{i,j} can be estimated as follows [22]:

$$t_{i,j} = \frac{\beta \hat{\sigma}_{i,j}^{2}}{\hat{\sigma}_{x_{i,j}}}, \qquad (6)$$

$$\beta = \sqrt{\log\left(\frac{L_{i,j}}{S}\right)}, \qquad (7)$$

$$\hat{\sigma}_{i,j} = \frac{\operatorname{Median}\left(|C_{i,j}|\right)}{\lambda}, \qquad (8)$$

$$\hat{\sigma}_{x_{i,j}}^{2} = \frac{1}{m \cdot n} \sum_{k=1}^{m} \sum_{l=1}^{n} \left(C_{i,j}(k,l) - \bar{C}_{i,j}\right)^{2}, \qquad (9)$$

where λ is a parameter whose general value is 0.6745, m and n are the height and width of an image, L_{i,j} is the length of the j-th directional subband of the i-th scale, S is the maximal scale of the NSCT, and C̄_{i,j} is the average of the coefficients at the j-th directional subband of the i-th scale.
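A sketch of the threshold computation of Eqs. (6)–(9) in Python/NumPy follows. We read Eq. (9) as the subband variance whose square root enters the denominator of Eq. (6), consistent with the original NormalShrink formulation [22]; the zero guard is our addition.

```python
import numpy as np

def normalshrink_threshold(C, L, S, lam=0.6745):
    """Adaptive NormalShrink threshold t_{i,j} for one directional subband C.

    C: 2-D array of subband coefficients; L: length of the subband (Eq. (7));
    S: maximal NSCT scale; lam: the parameter of Eq. (8), generally 0.6745.
    """
    beta = np.sqrt(np.log(L / S))                    # Eq. (7)
    sigma_hat = np.median(np.abs(C)) / lam           # Eq. (8): noise estimate
    sigma_x = np.sqrt(np.mean((C - C.mean()) ** 2))  # square root of Eq. (9)
    return beta * sigma_hat ** 2 / max(sigma_x, 1e-12)  # Eq. (6), zero-guarded
```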

In general, soft thresholding obtains better performance than hard thresholding. However, soft thresholding introduces a large deviation between the denoised coefficients and the original ones. To extract a more robust illumination invariant by the soft thresholding filter, the above threshold t_{i,j} is independently applied to the coefficients in each directional subband, and the soft thresholding filter is defined as follows [23]:

$$C_{i,j}^{thr} = \begin{cases} 0, & |C_{i,j}| < t_{i,j} \\ C_{i,j} - w\, t_{i,j}, & C_{i,j} \ge t_{i,j} \\ C_{i,j} + w\, t_{i,j}, & C_{i,j} \le -t_{i,j}, \end{cases} \qquad (10)$$

$$w = \frac{t_{i,j}}{|C_{i,j}(k,l)| \exp\left(|C_{i,j}(k,l)| - t_{i,j}\right)}, \qquad (11)$$

where C_{i,j}(k,l) is the coefficient at point (k,l) of the j-th directional subband of the i-th scale, t_{i,j} is the threshold of the j-th directional subband of the i-th scale, and w is a weighting factor for different subbands. From Eq. (11), we can see that w decreases as C_{i,j}(k,l) increases. Therefore, the soft thresholding filter can reduce the deviation between C^{thr}_{i,j} and C_{i,j}.

In our denoising model, we only remove the noise signal from the coefficients of the directional subbands C_{i,j} by using the soft thresholding filter, while keeping the low frequency coefficients C_{i_0} unchanged. Hence, good curve information in low frequency illumination fields can be preserved. After the denoising process, the illumination I' can be reconstructed from the modified NSCT coefficients. Then, we can extract the illumination invariant R' by Eq. (4). Some samples of illumination invariants extracted by our model with different λ are illustrated in Fig. 3.
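The weighted soft thresholding of Eqs. (10) and (11) can be sketched elementwise as below (our NumPy rendering; the small division guard is an implementation detail). For large |C| the weight w collapses toward zero, so strong edge coefficients are barely shrunk, which is exactly the deviation-reducing behavior described above.

```python
import numpy as np

def weighted_soft_threshold(C, t):
    """Apply the weighted soft thresholding filter of Eqs. (10)-(11) to a subband."""
    absC = np.abs(C)
    # Eq. (11): w shrinks rapidly as |C| grows past t (overflow of exp merely
    # drives w to 0, which is the intended limit for very large coefficients).
    w = t / (np.maximum(absC, 1e-12) * np.exp(absC - t))
    return np.where(C >= t, C - w * t,          # Eq. (10), upper branch
           np.where(C <= -t, C + w * t, 0.0))   # lower branch; |C| < t -> 0
```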

On the basis of the above description, the proposed method for illumination invariant extraction can be summarized as follows (a code sketch is given after the steps):

Step 1. Take the logarithm of the original face image F: F' = log(F).

Step 2. Perform multiscale decomposition of the logarithm facial image F' by using NSCT.

Step 3. Compute the threshold t_{i,j} for each directional subband of each scale by using Eqs. (6)–(9).

Step 4. Apply soft thresholding to each directional subband of each scale by using Eqs. (10) and (11).

Step 5. Reconstruct the illumination I' from the modified NSCT coefficients by the inverse NSCT.

Step 6. Obtain the illumination invariant R' by using Eq. (4).

Fig. 3. Image and illumination invariant: (a) originals, (b) illumination invariant with λ = 0.1 and (c) illumination invariant with λ = 3.1.
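Putting Steps 1–6 together, a minimal pipeline sketch follows. `nsct_decompose` and `nsct_reconstruct` are hypothetical wrappers around an NSCT implementation such as that of [20] (no standard Python binding is assumed); `log_transform`, `normalshrink_threshold` and `weighted_soft_threshold` are the sketches given earlier. Using each subband's coefficient count for L_{i,j} is our reading of "length" in Eq. (7).

```python
def extract_illumination_invariant(face_u8, nsct_decompose, nsct_reconstruct,
                                   num_scales=3, lam=2.25):
    """Steps 1-6: return the illumination invariant R' = F' - I' (Eq. (4)).

    nsct_decompose(img, dirs) -> (C_low, highs), where highs[i] lists the
    directional subbands of scale i; nsct_reconstruct(C_low, highs) -> image.
    Both are assumed wrappers around a real NSCT implementation [20].
    """
    F_log = log_transform(face_u8)                   # Step 1: F' = log(F)
    C_low, highs = nsct_decompose(F_log, [2, 4, 8])  # Step 2: 3-level NSCT
    denoised = []
    for scale in highs:
        bands = []
        for C in scale:
            t = normalshrink_threshold(C, L=C.size, S=num_scales, lam=lam)  # Step 3
            bands.append(weighted_soft_threshold(C, t))                     # Step 4
        denoised.append(bands)
    I_log = nsct_reconstruct(C_low, denoised)        # Step 5: C_low kept unchanged
    return F_log - I_log                             # Step 6: R' = F' - I'
```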

Table 1. Parameter settings of the proposed method and other methods.

Method             Parameters
MSR                k = 5, 9, 15
LTV                λ = 0.8
INPS               γ = 0.2, σ0 = 1, σ1 = 2, α = 0.1, τ = 10
Proposed method    λ = 2.25

Fig. 4. Original images and their illumination invariants: (a) originals, (b) GB, (c) MSR, (d) LTV, (e) INPS and (f) the proposed method.



2.4. Comparison of illumination invariants with MSR, GB, LTV and INPS

In this section, we compare the illumination invariants produced by the proposed method, MSR, GB, LTV and INPS. The parameters employed in our paper are presented in Table 1 and are consistent with those in MSR [28], LTV [17] and INPS [30]. Fig. 4 illustrates the illumination invariants extracted by these methods from four images with different illumination conditions. As can be seen, the proposed method produces more natural images than MSR, GB, LTV and INPS. Moreover, these methods produce larger differences among the illumination invariants of the four images than ours does. Therefore, the results show that the proposed method is more robust to different illumination conditions than MSR, GB, LTV and INPS.

3. Experimental results and discussions

To evaluate the performance of the proposed method for illumination invariant extraction, we have applied it to three well-known databases (i.e., the Yale B, the extended Yale and the CMU PIE face databases), which are often utilized to examine system performance when facial illumination varies. The results yielded by our method were compared with those obtained by Log [11], MSR [28], LTV [17], GB [29], WIIP [27] and INPS [30]. The quality of the results is quantitatively evaluated by recognition rates. In the recognition phase, PCA is employed to extract global features, and the nearest neighbor classifier based on Euclidean distance is used for classification.
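For the recognition phase, a minimal sketch with scikit-learn (our tooling choice) is given below: PCA for global features plus a Euclidean 1-nearest-neighbor classifier. The number of principal components is an assumption; the paper does not state it.

```python
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier

def recognize(X_train, y_train, X_test, n_components=50):
    """PCA feature extraction + 1-NN (Euclidean) classification.

    Rows of X_train/X_test are vectorized illumination invariants R';
    n_components=50 is a placeholder, not a value from the paper.
    """
    pca = PCA(n_components=n_components).fit(X_train)
    clf = KNeighborsClassifier(n_neighbors=1, metric="euclidean")
    clf.fit(pca.transform(X_train), y_train)
    return clf.predict(pca.transform(X_test))
```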

3.1. Experiments on the Yale face database B

The Yale face database B contains ten individuals under 64 different illumination conditions for each of nine poses [13]. Since we only focus on the illumination problem in this paper, only frontal face images under varying illumination conditions are chosen as samples. All the original images are of size 640×480. In our experiments, all the images are manually cropped based on the positions of the eyes and resized to 48×42, so that they include only the face with as little hair and background as possible. In addition, the cropped images fall into five subsets, i.e., subset 1 (0–12), subset 2 (13–25), subset 3 (26–50), subset 4 (51–77) and subset 5 (above 78), according to illumination angle. Fig. 5 shows five images of each subset for one person and the corresponding illumination invariants obtained by the proposed method.

In the proposed method, we use a three-level NSCT decomposition with 2, 4 and 8 directions at the three scales, respectively, and the parameter λ in Eq. (8) is set to 2.25. In order to evaluate the performance of the proposed method, we perform three groups of experiments. The first group chooses one of the five subsets as the training set and the rest as testing sets; Tables 2–6 list the corresponding quantitative results. From the tables, one can observe that the recognition rates of the proposed method are higher than those of the other methods, implying better recognition performance. In the second group, a random subset with 10 images per person is chosen as the training set, and the rest of the images form the testing set. Since the training and testing sets are random, we run the simulation 60 times and average the results (a sketch of this protocol follows below). The average recognition rates of the various methods are listed in Table 7. From the table, one can conclude that our average recognition rate reaches 100%, higher than the others. Compared with the previous group, the last group only changes the value of λ. The average recognition rates are shown in Fig. 6, which further demonstrates the effectiveness of our method.
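The random-split protocol of the second experiment can be sketched as follows (assumptions: `recognize` from the previous sketch, a label vector y, and a fixed seed for reproducibility; the paper itself only reports the 60-run average).

```python
import numpy as np

def average_recognition_rate(X, y, per_person=10, runs=60, seed=0):
    """Average accuracy over random per-person training subsets (60 runs)."""
    rng = np.random.default_rng(seed)
    rates = []
    for _ in range(runs):
        train = np.zeros(len(y), dtype=bool)
        for person in np.unique(y):
            idx = np.flatnonzero(y == person)
            train[rng.choice(idx, size=per_person, replace=False)] = True
        pred = recognize(X[train], y[train], X[~train])
        rates.append(np.mean(pred == y[~train]))
    return float(np.mean(rates))
```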

3.2. Experiments on the extended Yale face database

The extended Yale face database is composed of 38 human subjects under 9 poses and 64 illumination conditions [31]. The data format of this database is the same as the Yale B. Only frontal face images under varying illumination conditions are used in this experiment.


Fig. 5. Original images and their illumination invariants: (a) originals and (b) illumination invariant obtained by the proposed method.

Table 2. Recognition rates (%) of various methods when using images of subset 1 as the training set.

Method       Subset 2   Subset 3   Subset 4   Subset 5
Original     100.00      74.17      45.00      18.95
Log          100.00      75.00      56.43      49.47
WIIP          94.17      90.00      82.86      55.26
GB           100.00      99.17      97.86      98.42
MSR          100.00      98.33      92.86      90.00
LTV          100.00      99.17      97.86      95.79
INPS         100.00      99.17      94.29      95.79
Our method   100.00     100.00     100.00     100.00

Table 3. Recognition rates (%) of various methods when using images of subset 2 as the training set.

Method       Subset 1   Subset 3   Subset 4   Subset 5
Original     100.00      56.67      31.43      16.32
Log          100.00      66.68      45.71      55.79
WIIP         100.00      84.17      83.57      62.63
GB           100.00      98.33      99.29      99.47
MSR          100.00      96.67      93.57      94.21
LTV          100.00      98.33      99.29      99.47
INPS         100.00     100.00      97.14      99.47
Our method   100.00     100.00     100.00     100.00

Table 4. Recognition rates (%) of various methods when using images of subset 3 as the training set.

Method       Subset 1   Subset 2   Subset 4   Subset 5
Original      95.71      72.50      22.14      14.74
Log           95.71      78.33      35.00      42.11
WIIP          97.14      89.17      90.71      80.53
GB           100.00     100.00      98.57      99.47
MSR          100.00     100.00      98.57      98.42
LTV          100.00     100.00      99.29      99.47
INPS         100.00     100.00     100.00      98.95
Our method   100.00     100.00     100.00      99.47

Table 5. Recognition rates (%) of various methods when using images of subset 4 as the training set.

Method       Subset 1   Subset 2   Subset 3   Subset 5
Original      58.57      53.33      37.50      33.68
Log           82.86      65.83      54.17      53.68
WIIP          90.00      86.67      90.83      97.90
GB           100.00     100.00      99.17      99.47
MSR           97.14     100.00      99.17     100.00
LTV          100.00     100.00     100.00      99.47
INPS          98.57      97.50     100.00      99.47
Our method   100.00     100.00     100.00     100.00

Table 6. Recognition rates (%) of various methods when using images of subset 5 as the training set.

Method       Subset 1   Subset 2   Subset 3   Subset 4
Original      17.14      16.67      25.00      45.00
Log           72.86      74.17      66.67      63.57
WIIP          55.71      69.17      69.17      85.00
GB            97.14     100.00     100.00     100.00
MSR           87.14      92.50      96.67      99.29
LTV          100.00      94.17      99.17      98.57
INPS          87.14      90.00      99.17      99.29
Our method   100.00     100.00     100.00     100.00


The image cropping mode and parameter selection are the same as those in Section 3.1. A random subset with 10 images per person from subsets 1–5 is chosen as the training set, and the rest of the images are used for testing. To obtain a reliable result, we average the results over 60 random tests. The corresponding average recognition rates are listed in Table 8, which shows that our recognition rate is still higher than the others.

3.3. Experiments on the CMU PIE face database

The CMU PIE face database consists of 68 subjects with different poses, illuminations and expressions [33]. Here, only frontal face images under different illumination conditions (without expression variations) are used.


Table 7. Average recognition rates (%) of various methods when randomly choosing ten images per person as the training set.

Method                     Original   Log     WIIP    GB      MSR     LTV     INPS    Our method
Average recognition rate   49.32      68.29   91.38   99.51   97.74   99.60   99.53   100.00


Fig. 6. Recognition rates (%) obtained by the proposed method on the Yale B under different values of λ.

Table 8. Average recognition rates (%) of various methods on the extended Yale database when randomly choosing ten images per person as the training set.

Method                     Original   Log     WIIP    GB      MSR     LTV     INPS    Our method
Average recognition rate   34.96      53.04   76.36   91.38   90.73   94.04   93.43   96.09

Fig. 7. Original images and illumination invariants: (a) originals on CMU PIE and (b) illumination invariants obtained by the proposed method.

Table 9. Average recognition rates (%) of various methods on CMU PIE when randomly choosing three images per person as the training set.

Method                     Original   Log     WIIP    GB      MSR     LTV     INPS    Our method
Average recognition rate   54.27      58.77   90.60   98.24   95.18   99.20   99.28   99.73


The face images in this experiment are all cropped in the same way as for the Yale B and resized to 64×64. Fig. 7(a) gives 21 images of one subject under different illumination conditions, and the corresponding illumination invariants obtained by the proposed method are displayed in Fig. 7(b).

In our experiments, three images per person are randomly chosen as the training set and the rest as the testing set, with λ = 0.15. To obtain a reliable result, we run the simulation 20 times and average the results. The corresponding average recognition rates of the various methods are listed in Table 9.



Fig. 8. Recognition rates (%) obtained by the proposed method on CMU PIE under different values of λ.


After changing the value of λ, we repeat the previous experiment and display the corresponding results in Fig. 8. Both the table and the figure further demonstrate the effectiveness of our method.

4. Conclusions and future work

In this paper, a novel NSCT-based illumination invariant extraction method is developed. The proposed method can extract the illumination invariant from a multiscale space. Compared with other methods, it can be applied directly to a single face image without any preprocessing or extra conditions. The new method can better extract geometric structure (i.e., the illumination invariant) without pseudo-Gibbs phenomena around singularities or halo artifacts, which is attributable to the properties of the nonsubsampled contourlet transform, i.e., its multiscale, multidirection analysis capability and shift-invariance. Accurate extraction of geometric structure improves recognition rates. Experimental results on three well-known face databases show that the proposed method is effective for extracting the illumination invariant. In addition, it can be applied to other tasks such as image alignment and object tracking.

However, some issues deserve further study. For example, the illumination invariant extracted by our method changes with λ, so adaptive selection of λ for various illumination images is still an open problem. In addition, it remains future work to develop a more suitable denoising model for illumination invariant extraction in the nonsubsampled contourlet transform domain.

Acknowledgments

This work is partially supported by the National Science Foundation of China (Grant nos. 90820306, 60632050 and 60503026), the Nanjing Institute of Technology Internal Fund (Grant no. KXJ06037) and the Technology Project of Provincial Universities of Fujian Province (Grant no. 2008F5045). Finally, the authors would like to express their heartfelt thanks to the anonymous reviewers for their constructive advice.

References

[1] R. Chellappa, C.L. Wilson, S. Sirohey, Human and machine recognition of faces: a survey, Proc. IEEE 83 (5) (1995) 705–741.
[2] K.W. Bowyer, K. Chang, P.J. Flynn, A survey of approaches and challenges in 3D and multi-modal 3D+2D face recognition, Comput. Vis. Image Understanding 101 (1) (2006) 1–15.
[3] Y. Adini, Y. Moses, S. Ullman, Face recognition: the problem of compensating for changes in illumination direction, IEEE Trans. Pattern Anal. Mach. Intell. 19 (7) (1997) 721–732.
[4] W. Zhao, R. Chellappa, Robust face recognition using symmetric shape-from-shading, Technical Report, Center for Automation Research, University of Maryland, 1999.
[5] M. Turk, A. Pentland, Eigenfaces for recognition, J. Cognitive Neurosci. 3 (1) (1991) 71–86.
[6] P.N. Belhumeur, J.P. Hespanha, D.J. Kriegman, Eigenfaces vs. fisherfaces: recognition using class specific linear projection, IEEE Trans. Pattern Anal. Mach. Intell. 19 (7) (1997) 711–720.
[7] P. Penev, J. Atick, Local feature analysis: a general statistical theory for object representation, Network: Comput. Neural Syst. 7 (3) (1996) 477–500.
[8] L. Wiskott, J.M. Fellous, N. Kruger, C.V.D. Malsburg, Face recognition by elastic bunch graph matching, IEEE Trans. Pattern Anal. Mach. Intell. 19 (7) (1997) 775–779.
[9] B. Moghaddam, T. Jebara, A. Pentland, Bayesian face recognition, Pattern Recognition 33 (11) (2000) 1771–1782.
[10] G. Guo, S.Z. Li, K. Chan, Face recognition by support vector machines, in: Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, 2000, pp. 196–201.
[11] M. Savvides, V. Kumar, Illumination normalization using logarithm transforms for face authentication, in: Proceedings of the IAPR AVBPA, 2003, pp. 549–556.
[12] M.N. Do, M. Vetterli, The contourlet transform: an efficient directional multiresolution image representation, IEEE Trans. Image Process. 14 (12) (2005) 2091–2106.
[13] A.S. Georghiades, P.N. Belhumeur, D.W. Jacobs, From few to many: illumination cone models for face recognition under variable illumination and pose, IEEE Trans. Pattern Anal. Mach. Intell. 23 (6) (2001) 630–660.
[14] R. Basri, D.W. Jacobs, Lambertian reflectance and linear subspaces, IEEE Trans. Pattern Anal. Mach. Intell. 25 (2) (2003) 218–233.
[15] A. Shashua, T. Riklin-Raviv, The quotient image: class-based re-rendering and recognition with varying illuminations, IEEE Trans. Pattern Anal. Mach. Intell. 23 (2) (2001) 129–139.
[16] H. Wang, S.Z. Li, Y. Wang, Face recognition under varying illumination conditions using self quotient image, in: Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, 2004, pp. 819–824.
[17] T. Chen, W. Yin, X.S. Zhou, D. Comaniciu, T.S. Huang, Total variation models for variable illumination face recognition, IEEE Trans. Pattern Anal. Mach. Intell. 28 (9) (2006) 1519–1524.
[18] B.K.P. Horn, Robot Vision, MIT Press, Cambridge, MA, 1986.
[19] Y. Adini, Y. Moses, S. Ullman, Face recognition: the problem of compensating for changes in illumination direction, IEEE Trans. Pattern Anal. Mach. Intell. 19 (7) (1997) 721–732.
[20] A.L. da Cunha, J.P. Zhou, M.N. Do, The nonsubsampled contourlet transform: theory, design, and applications, IEEE Trans. Image Process. 15 (10) (2006) 3089–3101.
[21] S. Shan, W. Gao, B. Cao, D. Zhao, Illumination normalization for robust face recognition against varying illumination conditions, in: Proceedings of the IEEE Workshop on AMFG, 2003, pp. 157–164.
[22] L. Kaur, S. Gupta, R.C. Chauhan, Image denoising using wavelet thresholding, in: Indian Conference on Computer Vision, Graphics and Image Processing, 2002.
[23] L. Huang, H. Wang, B. Zhu, Adaptive thresholds algorithm of image denoising based on nonsubsampled contourlet transform, in: Proceedings of the IEEE International Conference on Computer Science and Software Engineering, vol. 6, 2008, pp. 209–212.
[24] S.M. Pizer, E.P. Amburn, Adaptive histogram equalization and its variations, Comput. Vis. Graph. Image Process. 39 (3) (1987) 355–368.
[25] X. Xie, K.-M. Lam, Face recognition under varying illumination based on a 2D face shape model, Pattern Recognition 38 (2) (2005) 221–230.
[26] S. Du, R. Ward, Wavelet-based illumination normalization for face recognition, in: Proceedings of the IEEE International Conference on Image Processing, vol. 2, 2005, pp. 954–957.
[27] Y.Z. Goh, A.B.J. Teoh, M.K.O. Goh, Wavelet based illumination invariant preprocessing in face recognition, in: Proceedings of the Congress on Image and Signal Processing, vol. 3, 2008, pp. 421–425.
[28] D.J. Jobson, Z. Rahman, G.A. Woodell, A multiscale Retinex for bridging the gap between color images and the human observation of scenes, IEEE Trans. Image Process. 6 (7) (1997) 965–976.
[29] R. Gross, V. Brajovic, An image preprocessing algorithm for illumination invariant face recognition, in: J. Kittler, M.S. Nixon (Eds.), AVBPA 2003, Lecture Notes in Computer Science, vol. 2688, 2003, pp. 10–18.
[30] X. Tan, B. Triggs, Enhanced local texture feature sets for face recognition under difficult illumination conditions, AMFG 2007, Lecture Notes in Computer Science, vol. 4778, 2007, pp. 168–182.
[31] K.C. Lee, J. Ho, D. Kriegman, Acquiring linear subspaces for face recognition under variable lighting, IEEE Trans. Pattern Anal. Mach. Intell. 27 (5) (2005) 684–698.
[32] T. Zhang, B. Fang, Y. Yuan, Y.Y. Tang, Z. Shang, D. Li, F. Lang, Multiscale facial structure representation for face recognition under varying illumination, Pattern Recognition 42 (2) (2009) 251–258.
[33] T. Sim, S. Baker, M. Bsat, The CMU pose, illumination, and expression (PIE) database, in: Proceedings of the IEEE International Conference on Automatic Face and Gesture Recognition, 2002.
[34] D.L. Donoho, I.M. Johnstone, Ideal spatial adaptation via wavelet shrinkage, Biometrika 81 (3) (1994) 425–455.
[35] D.L. Donoho, I.M. Johnstone, Adapting to unknown smoothness via wavelet shrinkage, J. Am. Stat. Assoc. 90 (432) (1995) 1200–1224.
[36] S.G. Chang, B. Yu, M. Vetterli, Adaptive wavelet thresholding for image denoising and compression, IEEE Trans. Image Process. 9 (9) (2000) 1532–1546.

Yong Cheng received the M.S. degree in computer science and technology from Nanjing University of Information Science and Technology in 2004. He is now a Ph.D. candidate at Nanjing University of Science and Technology, China. His research interests include image processing and pattern recognition.

Yingkun Hou received the M.S. degree in applied mathematics from Shandong University of Science and Technology in 2006. He is now a Ph.D. candidate at Nanjing University of Science and Technology, China. His research interests include image processing and pattern recognition.

Chunxia Zhao received the B.S. degree in electric engineering and automation from Harbin Institute of Technology in 1985. She received the M.S. degree in pattern recognition and artificial intelligence and the Ph.D. degree from Harbin Institute of Technology in 1988 and 1998, respectively. Currently, she is a professor in the School of Computer Science and Technology of Nanjing University of Science and Technology, China. Her current research interests are in the areas of pattern recognition, robot vision, image processing and artificial intelligence.

Zuoyong Li received the B.S. degree in computer science and technology from Fuzhou University in 2002 and the M.S. degree in computer science and technology from Fuzhou University in 2006. He is now a Ph.D. candidate at Nanjing University of Science and Technology, China. His research interests include image segmentation and pattern recognition.

Yong Hu is currently a Ph.D. candidate in the Department of Computer Science and Technology at Nanjing University of Science and Technology (NUST). His research interests include image processing, computer vision, and pattern recognition.

Cailing Wang received her M.S. degree in thermal and dynamic engineering from Nanjing University of Science and Technology. She is a Ph.D. candidate at Nanjing University of Science and Technology, China. Her research interests include image processing and pattern recognition.