Journal of Neuroscience Methods 141 (2005) 61–73

Quantification of facial expressions using high-dimensional shape transformations

Ragini Verma a,*, Christos Davatzikos a, James Loughead b, Tim Indersmitten b, Ranliang Hu c, Christian Kohler b, Raquel E. Gur b, Ruben C. Gur b

a Department of Radiology, Section of Biomedical Image Analysis, University of Pennsylvania, 3600 Market Street, Suite 380, Philadelphia, PA 19104, USA
b Brain Behavior Laboratory, Department of Psychiatry, Neuropsychiatry Section, HUP, 10 Gates Pavilion, Hospital of the University of Pennsylvania, 3400 Spruce Street, Philadelphia, PA 19104, USA
c Department of Bioengineering, University of Pennsylvania, BX 0600/RM 1710 Harrison, 3910 Irving Street, Philadelphia, PA 19104-6007, USA

Received 27 February 2004; received in revised form 20 May 2004; accepted 20 May 2004

Abstract

We present a novel methodology for quantitative analysis of changes in facial display as the intensity of an emotion evolves from neutral to peak expression. The face is modeled as a combination of regions and their boundaries. An expression change in a face is characterized and quantified through a combination of non-rigid (elastic) deformations, i.e., expansions and contractions of these facial regions. After elastic interpolation, this yields a geometry-based high-dimensional 2D shape transformation, which is used to register regions defined on subjects (i.e., faces with expression) to those defined on the reference template face (a neutral face). This shape transformation produces a vector-valued deformation field and is used to define a scalar-valued regional volumetric difference (RVD) function, which characterizes and quantifies the facial expression. The approach is applied to a standardized database consisting of single images of professional actors expressing emotions at predefined intensities. We perform a detailed analysis of the deformations generated and the regional volumetric differences computed for expressions. We were able to quantify subtle changes in expression that can distinguish the intended emotions. A model for the average expression of specific emotions was also constructed using the RVD maps. This method can be applied in basic and clinical investigations of facial affect and its neural substrates.

© 2004 Elsevier B.V. All rights reserved.

Keywords: Expression analysis; Emotion intensity quantification; Shape transformation; Elastic deformations; Facial action coding system (FACS)

* Corresponding author. Tel.: +1-215-662-7471; fax: +1-215-614-0266. E-mail addresses: [email protected] (R. Verma), [email protected] (C. Davatzikos), [email protected] (J. Loughead), [email protected] (T. Indersmitten), [email protected] (R. Hu), [email protected] (C. Kohler), [email protected] (R.E. Gur), [email protected] (R.C. Gur).

1. Introduction

Facial expression is the main means for humans and other species to communicate emotions and intentions (Darwin, 1872). Such expressions provide information not only about the affective state, but also about the cognitive activity, temperament, personality and psychopathology of an individual (Darwin, 1872; Schmidt and Cohn, 2001). Facial expression analysis has been increasingly used in basic research on healthy people and in clinical investigations of neuropsychiatric disorders including affective disorders and schizophrenia.
Lacking objective methods for quantifying facial change, research in this area has focused on emotion recognition capabilities of patients compared to healthy controls (Sakheim et al., 1982; Salem et al., 1996).

Since brain disorders likely affect emotional expressions, and not only perception (Gainnoti et al., 1993; Sakheim et al., 1982), it would be of value to develop methods of expression quantification and analysis that can compare how emotions are expressed by an individual suffering from a brain disorder and by healthy controls. The problem of expression quantification is rendered extremely challenging by individual physical facial differences such as wrinkles, skin texture, etc. These may confound efforts to quantify differences in expressiveness, degree of facial mobility, and frequency and rate of expression, all of which could be associated with brain disorders. A major neuropsychiatric disorder characterized by deficits in emotional expressiveness is schizophrenia, where "flat affect" is a hallmark of the illness (Rinn, 1984).

Methods of expression rating currently employed are time consuming and depend on the subjective judgment of raters. We propose an efficient and objective expression quantification scheme which is reproducible, robust, automated and fast. In order to develop objective measures of changes in facial expressions, it is necessary to: (1) quantify the change in expression as the emotion changes in intensity from mild to peak; (2) quantify the difference in facial expression of the same emotion between patients and healthy people; (3) construct a model of an expression using healthy people as a standard against which the expression of patients with different disorders can be tested. Quantification of fine-grained structural changes in the face is necessary to capture the subtlety of human expression changes, and this requires advanced morphometric tools. We created a model for each of four universally recognized expressions: anger, fear, sadness and happiness. However, our proposed method of expression quantification is general and can be applied to any face undergoing an expression change.

We treat the face as a combination of regions and their boundaries. A neutral face is chosen as a template and a face with expression, to be compared to and quantified against the template, is chosen as a subject. Corresponding regions are identified on each of these faces. We then compute an elastic transformation which warps the template to the subject face, mapping the corresponding boundaries to each other (Davatzikos, 2001). These regions are chosen such that all the distinctly identifiable features of the face are separated and accounted for (see Fig. 1). Such a shape transformation can be defined for each expression and at varying intensities of expression. The differences can be analyzed on a region-wise basis by comparing the corresponding shape transformations. These transformations quantify the volumetric differences among faces with expression. The resultant deformation field characterizes how the neutral face deforms to the face with expression. These deformation fields can be averaged to generate a model for each expression, which can be used for the visual presentation of the quantitative measurements of expression.

Fig. 1. Regions demarcated in the face. B depicts one of the boundaries and R indicates one of the regions. Corresponding regions are painted with the same color. (a) is the neutral face chosen as the template and (b) is the happy face taken as the subject.

1.1. Literature survey

Facial expressions have been investigated as a tool for understanding the regulation of emotions in health and disease and for investigating its neural substrates. Facial expression analysis consists of two sub-problems: expression recognition and expression quantification. Each of these requires facial modeling. Expression recognition involves classifying the expression as one of the several possible emotions (Ekman and Rosenberg, 1997). In expression quantification, the intensity of the emotion needs to be quantified on a region-wise basis, to understand how much each region contributes to an expression as well as to its intensity. Automatic expression analysis has attracted attention in the computer vision literature because of its importance in clinical investigations, but the efforts have been focused on expression recognition (Bartlett et al., 1999; Black and Yacoob, 1997; Cohn et al., 1999; Essa and Pentland, 1997; Lien et al., 2000; Tian et al., 2001; Terzopoulos and Waters, 1990; Yacoob and Davis, 1994; Zhang, 1999). We propose a powerful method for expression quantification that is applicable region-wise and can be extended to automate the recognition of facial expressions.

Expression recognition methods can be categorized on the basis of the data used for the analysis (single images: Martinez, 2002; video sequences: Essa and Pentland, 1997; Lien et al., 2000; Tian et al., 1999, 2001; Yacoob and Davis, 1996) and the methodology adopted for facial modeling. Both 2D and 3D models (Karpouzis et al., 2004; Terzopoulos and Waters, 1990) have been applied. Expression recognition requires modeling the face in terms of features and then coding each expression as a unique combination of values of these features. In facial feature extraction, there are mainly two types of approaches: geometric feature-based methods, which are local, and appearance-based methods, which are global. In geometric feature-based methods (Yuille et al., 1992; Zhang, 1999), a large number of landmark facial feature points are extracted, on which the features are computed. In video-based methods, these feature points are tracked (Black and Yacoob, 1997). The values of these features are used as an input to a classification system and the output is one of the pre-selected emotion categories. These methods differ in the features extracted. Identifying the feature points is especially difficult for regions of the face such as cheeks, forehead and chin, the deformations of which play an important role in expression analysis but do not have distinguishable feature points. Furthermore, these unconnected feature points do not have enough spatial information detail to account for the elastic changes that take place in each of the facial regions as the expression changes. Hence, they are unsuitable for expression quantification, being unable to produce the dense deformation field required for facial expression quantification. In appearance-based global methods (Martinez and Kak, 2001) the whole face is used to define regional appearance patterns. Image filters such as Gabor wavelets are applied to the whole face or to specific regions in a face to extract the feature vectors. The shape of the whole face is modeled using a principal component analysis (PCA) on these feature vectors or an alternate approach (Martinez and Kak, 2001), to reduce the number of features used to define a face. These methods do not require the a priori definition of a large number of feature points and are global in nature. Yet they use information from the whole face as a single feature, and hence are unable to account for the fine-grained, subtle changes in the facial regions as required for expression quantification.

Some investigators have attempted to overcome these problems by basing their approach on the facial action coding system (FACS) (Ekman and Friesen, 1978). FACS, the leading method for measuring facial movement in behavioral science, is an anatomically based coding system for recording appearance changes caused by the activation of individual facial muscles. The measurement units of FACS are action units (AUs), which are described in terms of the muscular actions responsible for each facial movement. Therefore, using FACS, each facial expression can be described in terms of the activated AUs. FACS is performed manually by highly trained experts. There are several automated versions of FACS, which automate the process of studying facial action (Bartlett et al., 1999; Cohn et al., 1999; Donato et al., 1999; Essa and Pentland, 1997; Lien et al., 2000; Tian et al., 2001). The approach of all these systems is video-based. Features are identified on the face (e.g. outline of lips, eyes, eyebrows, cheeks and furrows) in the first frame of the video sequence, and are tracked through the video using optical flow (Yacoob and Davis, 1996) or other tracking techniques. These features can be modeled as a collection of feature points (Cohn et al., 1999; Tian et al., 2001) or deformable templates (Terzopoulos and Waters, 1990; Yuille et al., 1992). The change in the states of these features is used to determine the action unit activated, which in turn identifies the emotion. Most of these approaches use a manual pre-registration step to initialize the features, as they vary from expression to expression and also between individuals. These methods do not attempt to quantify the difference of the same emotion between two faces. In the 3D approach, the faces are modeled using an anatomical basis of their muscle structure (Karpouzis et al., 1999; Terzopoulos and Waters, 1990) or by using multiple views (Gur et al., 2002). These models are used to study the changes in each of the facial regions (Indersmitten and Gur, 2003).

Such methods address expression recognition as a classification problem. However, for expression quantification the faces need to be compared on a region-wise basis, as each region may uniquely contribute to an expression. Faces are a complex combination of regions that deform (expand and contract) non-rigidly and elastically as the expression changes. Expression recognition methods have concentrated only on some specific facial regions, and these methods fall short of characterizing the full elastic deformation between two faces. In feature-based methods, it is computationally expensive to compute and analyze features at all points, and for global methods there is insufficient information to quantify expressions on a region-wise basis. For successful quantification, we need a deformable registration capable of providing more than simply a combination of rotation, translation and scaling: a technique that will map one face to the other non-rigidly and produce a dense deformation field which accounts for all points on the face.

Several techniques that afford quantitative morphological analysis of complex structures such as the brain and heart, which transform non-rigidly over time, have been developed in medical imaging (Christensen et al., 1997; Collins and Evans, 1996; Maintz and Viergever, 1998). These techniques spatially normalize, i.e., elastically register, deformable, elastic body parts of a subject to those of a template. We propose to use one such technique, called shape transformations, to model facial deformations. Shape transformations provide a powerful way to obtain a detailed and localized structural characterization of complex objects such as the brain (Bajcsy and Kovacic, 1989; Davatzikos, 2001; Davatzikos et al., 1996). The principle of this approach is that morphological characteristics of two individuals or populations can be contrasted by comparing the corresponding transformations (Thompson, 1917). As these shape transformations map morphological characteristics of a subject to a common reference system of a template, they facilitate comparisons and pooling of data from various sources. Their highly non-linear nature makes them optimal for facial expression analysis and gives them high flexibility in adjusting the shape of one region to another. However, shape transformations that are merely intensity-based (Collins and Evans, 1996; Thompson, 1917) do not account for the underlying anatomy and the geometrical shape of facial regions, and hence cannot be applied to large uniform regions such as cheeks and foreheads. We therefore define a geometry-based shape transformation (Davatzikos, 2001; Davatzikos et al., 1996; Subsol et al., 1996) between neutral faces and faces with expression.

1.2. Our work in perspective


In our expression quantification method, the shape transformation maps a neutral face, taken as a template, to a face with expression, which is the subject. Our shape transformation is based on point correspondences determined on distinct region boundaries along with some landmark points, demarcated on the subject and template face. We compute an elastic transformation, which warps the template to the subject face, mapping the corresponding boundaries to each other and elastically interpolating the enclosed region (Davatzikos, 2001). Comparison of the properties of these shape transformations helps to quantify recognized emotions. Our specific contributions are:

• We present a method for quantifying expression changes between various intensities, e.g. mild, medium and peak, and across individuals expressing the same or different emotion, by using high-dimensional shape transformations. The current literature on facial expression recognition is able to recognize only peak emotion, i.e., when all the action units are activated, and not grades of that emotion.
• We obtain regional volumetric difference maps using the shape transformations, which quantify the deformation between two faces. We design a model for the average of each expression based on these regional volumetric difference maps. Such models designed using expressions of healthy people are useful for examining individual differences and in clinical investigations, as they can be used as guides for a visual-based expression analysis. Such analysis, in conjunction with clinical methods, can be helpful for comparison of expressions of individuals with brain disorders. We can also predict the expression of a neutral face using the deformation field generated by the shape transformation. The shape transformation and the quantification maps will be described in detail in Section 2.
• We can perform our analysis on single images and do not need video sequences. This makes the proposed method widely applicable. Existing standardized databases used in clinical investigations mainly consist of single images and hence cannot be analyzed with any of the video-based techniques available in the literature, which also require changes between subsequent frames to be small for tracking to be possible.
• Our method is general and applicable to all emotions at all levels of intensity.

In Section 2, we describe the method of computing the shape transformation and the creation of the quantification maps. Experiments carried out to validate the method and demonstrate its applicability are described in Section 3. Section 3 also describes the implementation details of our method. In Section 4, we discuss the results of the experiments and identify future directions that we propose to follow.

2. Methodology of expression quantification

In this section, we first give an overview of our proposed approach. We then detail the mathematical design and computation of the shape transformation.

2.1. Approach

We choose a neutral face (face with no expression) as a template. All the other expressions of the same individual are categorized as subjects, to be analyzed against this template. A subject face is one that has expression. The expression could vary in intensity from mild to medium to peak.

Our general approach can be outlined in the following steps:

1. Identify regions on the template that characterize the various facial features. These regions are demarcated by marking their boundaries (cf. Fig. 1). Some landmark points are also marked; curve segments in between are parameterized via constant-speed parameterization.
2. For each region picked on the template, identify the corresponding region on the subject.
3. Compute the elastic transformation from one face to the other, so that the demarcated regions of the face from the subject are mapped to their counterparts in the template.

The shape transform provides us with the information regarding the deformation produced by the expression change. Also, we define a regional volumetric difference function, which provides a numeric value for each pixel on the face. This value quantifies the expansion and contraction of the region on a pixel-wise basis. Additional implementation details of the method are described in Section 3.2.

2.2. Regional analysis of the face through the shape transformation

The shape transformation used in this paper is adapted from the work of Davatzikos (2001). Let Ωs and Ωt denote the subject space and template space, respectively:

Ωt = {F_t | face with neutral expression}

Ωs = {F_s | faces with expressions at varying intensities (mild, medium and peak)}

Now we define each template and subject face as a union of regions and their boundaries:

F_t = ∪_i (B_t^i, R_t^i)

F_s = ∪_i (B_s^i, R_s^i)

where B_t^i and B_s^i are the boundaries of the corresponding regions R_t^i and R_s^i, respectively.

Fig. 1 shows some of the regions that have been demarcated on the face. Some landmark points (typically two to four) are also selected on the boundaries. The boundary segments between two consecutive landmarks are parameterized by a constant-speed parameterization, i.e., by evenly spaced points (after discretization).


This parameterization effectively sets up point correspondences between the boundaries of the subject regions and the boundaries of the corresponding template regions. These point correspondences can be used to define a transformation between the template face and the subject face.

In order to determine a shape transformation that accounts for the morphological or elastic changes occurring in the enclosed regions, we compute the elastic transformation S which:

• maps the corresponding points between the boundaries of the template to their counterparts on the subject boundaries, and
• warps the enclosed regions from the template to the subject, elastically.
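As an illustration of these two requirements, the following minimal sketch resamples corresponding boundary segments at equal arc length and interpolates the resulting boundary displacements to a dense field. It is not the elastic model of Davatzikos (2001) used in the paper; the thin-plate-spline interpolator and all function and variable names are illustrative stand-ins only.

```python
# Sketch: constant-speed boundary correspondences and a dense displacement field.
# A thin-plate-spline interpolator is used here only as a stand-in for the
# elastic model of the paper. All names are illustrative.
import numpy as np
from scipy.interpolate import RBFInterpolator

def resample_arclength(curve, n_points):
    """Resample an open curve segment (N x 2 array of (x, y) points) at
    n_points evenly spaced along its arc length (constant-speed sampling)."""
    seg = np.diff(curve, axis=0)
    dist = np.concatenate([[0.0], np.cumsum(np.hypot(seg[:, 0], seg[:, 1]))])
    targets = np.linspace(0.0, dist[-1], n_points)
    x = np.interp(targets, dist, curve[:, 0])
    y = np.interp(targets, dist, curve[:, 1])
    return np.stack([x, y], axis=1)

def dense_displacement(template_curves, subject_curves, shape, n_points=100):
    """Interpolate matched boundary displacements to every pixel of the template."""
    src, dst = [], []
    for ct, cs in zip(template_curves, subject_curves):  # corresponding segments
        src.append(resample_arclength(ct, n_points))
        dst.append(resample_arclength(cs, n_points))
    src = np.vstack(src)
    disp = np.vstack(dst) - src                          # (dx, dy) at boundary points
    interp = RBFInterpolator(src, disp, kernel='thin_plate_spline')
    h, w = shape
    grid = np.stack(np.meshgrid(np.arange(w), np.arange(h)), axis=-1).reshape(-1, 2)
    return interp(grid).reshape(h, w, 2)                 # dense (dx, dy) field
```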

We adapt the shape transformation computed in Davatzikos (2001) and compute the elastic shape transformation S:

S : Ωt ∋ F_t → F_s ∈ Ωs

such that

• the boundaries are mapped to each other, f : B_t^i → B_s^i, ∀i, by mapping the corresponding points on the boundaries, and
• the enclosed template regions are elastically warped to the enclosed regions on the subject, R_t^i → R_s^i, ∀i.

Thus S elastically matches the boundaries and accounts for the elastic (deformable) changes not only on the boundary but also within the enclosed region. The resulting shape transformation S quantifies the shape properties of the subject face, F_s, with respect to the template face, F_t. Therefore, two faces with different emotion expression can be analyzed on the basis of a point-wise comparison of the shape transformations. In the method, the regions can be overlapping; as long as the constraints on the driving points/curves are consistent, they will be interpolated to a dense vector field. However, we choose non-overlapping regions for ease of interpretation.

The shape transformation produces a map of vectors (one vector at each pixel) called a vector field. These vectors provide the direction of movement of the pixel as the result of the deformation caused by the change in expression. Each vector is denoted by a 2-tuple (dx, dy), which denotes the displacement in the x and y directions that each pixel on the template undergoes when it is transformed to the face with expression. On adding the vector (dx, dy) to the position of the template pixel (x, y), we get the deformed position (x + dx, y + dy) which the pixel on the template undergoes. The position to which the pixel on the neutral template moves as a result of the action of this vector denotes the deformed position in the subject after undergoing an expression change. This is known as the deformation field. Additional details of the method can be found in Davatzikos (2001) and Davatzikos et al. (1996).

We obtain two quantities from the shape transformation that we use for our analysis:

• The scalar field of values of the regional volumetric difference function (RVDF), which is evaluated at each pixel on the face. We define the RVDF as:

RVDF(s) = det(∇S(s)) = determinant of the Jacobian of S evaluated at each point s on the subject.

The RVDF value quantifies the volumetric difference (variability in expansion and contraction) between regions. The map containing the RVDF values for each pixel of the face is called the RVD map of the face. Various inferences may be drawn from the values of the RVD function. If RVDF1(u) ≥ RVDF2(u) for the same region on two faces which have the same expression, at the same intensity, then it quantifies that the region in face 1 has deformed (expanded or contracted) more than the region in face 2, relative to their respective neutral states. It, therefore, quantifies the variability in expressing the same emotion across individuals. If this is on the same face and same region, it indicates and quantifies the change in expression. Fig. 2(c) shows a color visualization of an RVD map.
• The vector displacement fields of the deformation. These characterize the direction and degree of movement of each pixel of the face during the course of an expression change. These vectors can be used to quantify temporal changes in an expression. Fig. 2(d) shows the deformation of each pixel of the template face 2(a) as a result of the expression change from 2(a) to 2(b).

Fig. 2. Information obtained from the shape transformation: (a) template face, (b) subject face with expression, (c) intensity normalized RVD map, (d) vector deformation field and (e) color map for visualization of RVD maps.


In this paper, the quantification measure will be the RVD map of the shape transformation of a face with respect to a template. However, the vector deformation field can be used to provide additional information about the direction of movement of the regions. Implementation issues of the algorithm will be discussed in detail in Section 3.2.
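Since the RVDF is the determinant of the Jacobian of S, the RVD map can be computed directly from a displacement field. A minimal sketch, assuming S(x, y) = (x + dx(x, y), y + dy(x, y)) and finite-difference derivatives (variable names are illustrative):

```python
# Sketch: regional volumetric difference (RVD) map from a displacement field.
# Assumes S(x, y) = (x + dx(x, y), y + dy(x, y)); the Jacobian of S is
# approximated with finite differences.
import numpy as np

def rvd_map(disp):
    """disp: (H, W, 2) array of (dx, dy) displacements.
    Returns det(grad S) at every pixel."""
    dx, dy = disp[..., 0], disp[..., 1]
    # np.gradient returns derivatives along (rows, cols) = (y, x).
    ddx_dy, ddx_dx = np.gradient(dx)
    ddy_dy, ddy_dx = np.gradient(dy)
    # Jacobian of S = I + grad u, so det = (1 + ddx/dx)(1 + ddy/dy) - (ddx/dy)(ddy/dx)
    return (1.0 + ddx_dx) * (1.0 + ddy_dy) - ddx_dy * ddy_dx
```

Under this convention, values above 1 indicate local expansion and values below 1 indicate local contraction; the rescaling used purely for display is described in Section 3.3.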

3. Experiments and results

We have carried out experiments in facial expression analysis to highlight the features of our approach, by applying our method to a database created for the purpose of studying expression-related disorders (Gur et al., 2002). The experiments are not meant to be exhaustive, but focus on the applicability of our approach in the investigation of affect processing. They demonstrate the flexibility and power of our method to aid in the diagnosis of disorders of emotion processing.

3.1. Database used in study

In order to conduct expression analysis research, investigators have collected databases of emotion, which can contain either posed or evoked expressions (Tian et al., 2001). Posed databases have been acquired using posers (actors and non-actors) (Ekman and Rosenberg, 1997). However, there is evidence that posed emotions are less accurately perceived than evoked expressions (Gur et al., 2002). Therefore, we use only evoked expressions for our analysis. Such a database, with posed and the more difficult to acquire evoked expressions, has been described and validated (Gur et al., 2002), and we selected a subset of the available stimuli.

Our methodology is tested in this paper using a subset of the database created for investigations of schizophrenia (Kohler et al., 2004). For this database, a group of actors were photographed while portraying posed and evoked expressions of the emotions of happiness, sadness, fear, disgust and anger. Each of the actors was initially photographed with a neutral face. They were then guided by professional theatre directors through enactments of each emotion using both posed (mechanical, the "English" acting method) and evoked (the Stanislawski or "Russian" acting method) procedures. To generate evoked expressions, the actors were instructed to remember a past experience which had generated the same emotion. Expressions were photographed at three predetermined levels of emotion intensity: mild, medium and peak. The images were captured using a Nikon N90 35 mm SLR camera (with Kodak Ektachrome 320T film, exposed at 1/60 s and f/5.6). The images from this camera were scanned into the computer off-line. The details of the method can be found in Gur et al. (2002). The images are of size 1024 × 768. The computational algorithms are independent of the size of the image. The images were acquired in color; however, we do not require color for our algorithm.

For this study, we selected evoked images from 30 male and 30 female Caucasian actors between the ages of 18 and 72 years (average 39) portraying expressions of happiness, sadness, fear and anger. Disgust was not included due to poor recognition in the validation of the full database (Gur et al., 2002; Kohler et al., 2004). The intensity of expressions ranged from mild to peak. The validity and intensity of the emotions expressed were established by controlled rating procedures. The images were evaluated by expert raters (N = 8) to ensure that the ease of recognition of the target emotion and intensity level were comparable for each image included in the subset. The raters were required to evaluate the intensity of each expression as mild, medium or peak, as well as classify each facial expression as being of anger, fear, happiness or sadness. Percentage values were calculated for each face with expression with respect to the intensity and the expression category. All faces whose expression was identified as being of the correct category of emotion by more than 60% of the raters were chosen for analysis using our method. Those with a higher than 60% score on intensity were chosen as peak expressions for analysis. The thresholds were kept low so that greater variability is incorporated. These controls are essential, as evaluation of an expression is valid only if the intended expression is perceived correctly by a large proportion of healthy research participants. The aim of this rating was to determine a subset of facial expressions to be used in our analysis, the responses from which can be validated by the raters. Validation of the database has been performed in Gur et al. (2002), Indersmitten and Gur (2003), Kohler et al. (2004) and Palermo and Coltheart (2004).

3.2. Implementation issues

In this section we present some details of the algorithm which play an important role in the implementation and the results produced for analysis.

3.2.1. Choice of template

In order to be able to compare the difference in expression between two individuals, the RVD maps are computed using the same neutral face as the template face. Any face can be used as a template. However, faces with neutral expression provide a natural base against which any change of expression can be compared. Hence we use a neutral face as a template. In addition, the face must be completely facing the camera, with no sideways or forward/backward head tilt. Optimally, the size of the head should be average, i.e., occupying two-thirds of the image, and it should fit completely in the frame so that the outer boundary can be clearly marked. The template is the neutral face of a person who is of approximately the average age of the population under study. In our case we have performed the analysis on several templates in the age range of 30–40, the average age of our participants being 39. We have experimentally found that the same template can be used for both genders. We did not have enough data to test this on faces of different ethnic groups, but we propose to incorporate this once additional data is available in the future.


As the neutral template face may be of a different individual than the subject face with expression, the indications of expansion and contraction may be due to the fact that the size of the face used as a template differs highly from that of the subjects. To obtain consistent results, we normalize the RVD maps obtained for the subjects by using the RVD map of the neutral face of the individual chosen as subject. This is explained in detail in the experiments associated with Figs. 7 and 8.

3.2.2. Choice of regions

The choice of regions does affect accuracy. In densely sampled regions, the estimates of the deformation will be more accurate, although this depends on how complex the deformation itself is. However, dense sampling of regions also increases the human error possible in outlining these regions. We chose the regions so that they cover the whole face and are mutually exclusive and non-overlapping. They were carefully drawn so that they correspond to natural lines on the face, or to clearly identifiable markers on the face, such as the hairline, outline of lips, nose, eyebrows and eyes, etc. The aim of this choice was to facilitate repeatability and to avoid, as much as possible, inter-user variability and the human error involved in identifying and marking these outlines and landmark points.

We selected non-overlapping regions for facial expression analysis, in order to best interpret regional changes due to the change in facial expression. We found that the expansion and contraction of overlapping regions were not meaningful, unless one region was completely contained in the other. In this case, the smaller region is a constraint on the larger region. For example, the lips within the outline of the face show how a region within the face changes when the face changes. However, if the lips and the nose intersect, it is difficult to interpret the individual changes. Smaller regions may allow better quantification; for example, dividing the brow would help identify changes in the inner brow and the outer brow separately. However, in the absence of natural subdivisions of the brow, this would increase the human error in identifying these regions, making the analysis of expansion and contraction in these regions difficult. Also, since the regions are not normalized with respect to the size of the neighboring regions, their relative size differences do not weight one region more than the other.

The regions defined on the templates are fixed. It is important to identify the regions manually, as these region demarcations may vary between individuals and also from one expression to another. The ability to manually identify the regions also provides the user with flexibility to alter the regions picked.

3.2.3. Choice of landmark points

Two to four landmark points need to be marked on the boundaries outlining the regions. These serve as seed points to establish correspondence between the outlines on the two faces. Curves between these landmark points are parameterized using constant-speed parameterization. This parameterization is used to establish full correspondence between the outlines. We have experimentally found that up to four landmark points are sufficient for the size of the contours on the face. In the case of larger contours, a larger number of points may be marked; however, this increases the chance of human error. Similarly, for smaller contours we use a smaller number of landmarks.

3.3. Experimental analysis

We apply our approach to the database described above. Fig. 1 shows some of the regions that have been demarcated on the faces. Our method is flexible, as these regions may be altered depending on the analysis requested. The algorithm was tested on images from other databases acquired under different settings and different lighting conditions. It performed with accuracy as long as boundaries could be clearly demarcated.

In Fig. 2, we show the information that is produced by the shape transformation. Fig. 2(a) shows the template neutral face and Fig. 2(b) shows the corresponding subject face expressing fear. We then compute the shape transformation that elastically warps the regions demarcated on the template to the corresponding regions identified on the subject (see Fig. 1 for regions). A positive RVD value indicates an expansion and a negative RVDF value indicates a contraction. These are the values used in the analyses. However, these RVD values are normalized to a specific range for visualization of the expression changes in the form of a color map. In our case, we choose the range to be 0–90, as it provides the best demarcation. In doing so, the base value of 0, indicating no change, is shifted to 30. The range for displaying the color map can be changed by the user. Fig. 2(c) shows the color-coded RVD map of RVD values computed at each pixel of the face, the color map for which is in Fig. 2(e). After normalization, an increase in RVDF values from the template to the subject indicates the expansion and a decrease indicates the contraction of the corresponding region in the subject. These maps are computed at varying intensities of the same emotion, to study expression changes. In general, darker blues indicate contraction and yellow to red depicts increasing expansion. Fig. 2(d) depicts the deformation field indicating the pixel movements due to the expression change from 2(a) to 2(b). In order to create 2(d), we represented the pixels of the template face as a grid of the same size. Then to each point of the grid, the displacement of the vector field produced as a result of the shape transformation was applied. This produces the deformed grid shown in Fig. 2(d). In this paper, we will use RVD maps (as shown in Fig. 2(c)) for quantification, as these provide a numeric value of the changes at each pixel as a result of the expression change.
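The display conventions described here can be sketched as follows; the linear gain used to spread the values over the 0–90 range is an assumption, as the exact mapping is not specified, and all names are illustrative:

```python
# Sketch: visualization helpers for RVD maps and the deformed grid.
# The 0-90 display range with the "no change" baseline shifted to 30 follows
# the text; the linear scaling below is an assumption.
import numpy as np
import matplotlib.pyplot as plt

def rvd_for_display(rvd, lo=0.0, hi=90.0, base=30.0, gain=30.0):
    """Linearly map RVD values so that det = 1 (no change) sits at `base`."""
    return np.clip(base + gain * (rvd - 1.0), lo, hi)

def plot_deformed_grid(disp, step=10, ax=None):
    """Draw the template pixel grid after adding the displacement field."""
    ax = ax or plt.gca()
    h, w = disp.shape[:2]
    ys, xs = np.mgrid[0:h:step, 0:w:step]
    gx = xs + disp[::step, ::step, 0]
    gy = ys + disp[::step, ::step, 1]
    for row in range(gx.shape[0]):      # horizontal grid lines
        ax.plot(gx[row], gy[row], lw=0.5, color='k')
    for col in range(gx.shape[1]):      # vertical grid lines
        ax.plot(gx[:, col], gy[:, col], lw=0.5, color='k')
    ax.invert_yaxis()
    return ax
```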

3.3.1. Regional changes and movements

The RVD maps can be used to determine movement of facial regions. This is achieved by studying the region as a combination of expansion and contraction of several regions.


Fig. 3. Quantification of intensities of happiness.

In Fig. 2(c), the forehead contracts and the upper lid and the region between the eyes expand. This indicates that the eyebrows have been raised. The mouth and upper lip expand. This, in conjunction with the fact that the chin and cheeks expand slightly and the lower face expands, indicates a jaw drop. The actual expansion and contraction of these regions and the direction of their movement can be verified against the actual changes in the face in 2(a) and (b). Other changes in the face can be determined as a combination of expansion and contraction of several regions, as will be explained for Figs. 3–5.

We now describe the detailed experiments performed to test various aspects of the approach.

3.3.2. Quantification of expressions

In Figs. 3–5 we show the RVD maps for the expressions of happiness, sadness and anger. The RVD values for the expression of fear have been shown and discussed in Fig. 2. The RVD maps are color-coded on the basis of the color map shown in Fig. 2(e). The regions identified on the faces have been shown in Fig. 1.

In Fig. 3, we quantify three intensities of happiness. Part (a) shows the face with neutral expression and is the template, and parts (b)–(d) are the three subjects representing three intensities of happiness. The RVD maps shown in Fig. 3(e)–(g) depict the pixel-wise RVD values, which indicate expansion and contraction. The color map is the same as in Fig. 2(e). In this case, the eyes contract, the mouth expands and the cheeks contract, indicating a sideways expansion of the mouth. The lower lids expand, depicting a cheek raise. The forehead shows no change and neither does the region between the eyes and the eyebrows. The contraction is indicated by a decrease in RVD values from the template to the subject images. The expansion of the regions is indicated by the increase in RVD values from the template to the subjects.

Fig. 5. Quantification of three intensities of anger.

Fig. 4 shows sadness at varying intensities. Fig. 4(a) shows the template neutral face and parts (b)–(d) show the subject faces showing various intensities of sadness. The eyes contract, the region between the eyes and the eyebrows expands and the forehead contracts, indicating an eyebrow raise; the upper lip expands, the mouth expands and the cheeks contract. The chin expands and the whole face contracts, indicating a sideways expansion of the lips and the chin.

In Fig. 5, the same analysis is carried out for the expression of anger. Fig. 5(a) shows a neutral face and parts (b)–(d) show the varying intensities of anger. The second row shows the color-coded RVD maps for each of the four images. In this case, images 5(b)–(d) are the subjects, to which the template in 5(a) is elastically warped. The RVD values of pixels in the corresponding regions in the faces in Fig. 5 indicate that the eyes contract, the forehead expands and the regions between the eyes contract, indicating a brow lowering. The cheeks contract and the lower lids expand, indicating a cheek raise. The continuous increase in RVD values of the mouth from (e) to (g) shows that the region continues to expand. There is a continuous decrease in RVD values of the eyes indicating a contraction. This can be validated against the actual images of the expression in 5(b)–(d).

Fig. 4. Quantification of various intensities of sadness using RVD maps.

The comparison with FACS reveals that our method is able to identify changes in regions which are also identified as being activated by action units by FACS raters (Kohler et al., 2004). A full "action-unit based" comparison is beyond the scope of the present work.

3.3.3. Stages of deformation leading to the peak

While computing the shape transformation for each of the expressions, we also obtained the deformation fields between the neutral template and the faces with expressions. By studying the deformation fields, we can analyze how the face deforms in stages from the neutral to the peak. This can be seen in Fig. 6, which shows some of the intermediate stages (frames) of a face transforming from a neutral face to an angry face. These intermediate facial expressions are generated by interpolating the deformation fields produced by the shape transformation mapping the neutral face to three intensities of expressions of the same emotion. The interpolation is computed to emulate 25 frames/s, i.e., the increments in the displacement are generated such that the change in expression is at video rate. The intermediate expressions can be used for normalization of expressions and also for predicting the expression of a person.
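A minimal sketch of this frame generation, assuming piecewise-linear interpolation between the zero (neutral) field and the fields estimated at mild, medium and peak intensity, and backward warping of the neutral image with scipy; the interpolation scheme and timing are assumptions, as the text only states that increments emulate 25 frames/s:

```python
# Sketch: intermediate expression frames by interpolating deformation fields.
# Piecewise-linear interpolation between intensity levels is an assumption.
import numpy as np
from scipy.ndimage import map_coordinates

def warp_image(image, disp):
    """Backward-warp a grayscale image with a (H, W, 2) (dx, dy) field."""
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    coords = np.stack([ys + disp[..., 1], xs + disp[..., 0]])
    return map_coordinates(image, coords, order=1, mode='nearest')

def intermediate_frames(neutral, fields, seconds_per_stage=1.0, fps=25):
    """fields: displacement fields for increasing intensities (mild..peak)."""
    stages = [np.zeros_like(fields[0])] + list(fields)
    frames = []
    for a, b in zip(stages[:-1], stages[1:]):
        for t in np.linspace(0.0, 1.0, int(seconds_per_stage * fps), endpoint=False):
            frames.append(warp_image(neutral, (1.0 - t) * a + t * b))
    frames.append(warp_image(neutral, stages[-1]))
    return frames
```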

3.3.4. Effect of choice of initial template

The template is used as a measurement unit on which all the experiments are based. In order for the RVD maps to be comparable for analysis across individuals, it is important to show that the choice of the template does not affect the results. This means that irrespective of the neutral face chosen, the RVD map of the change of expression from neutral to peak shows the same region-wise changes for the same expression. This has been discussed mathematically in Davatzikos et al. (1996) for shape transformations used for the brain. We have conducted two sets of experiments (cf. Figs. 7 and 8) to demonstrate the practical feasibility of the claim. Fig. 7(b) and (e) show the peak anger expressions for the neutral faces in 7(a) and (d). The RVD map of the shape transformation from 7(a) to (b) is shown in 7(c) and that from 7(d) to (e) is in 7(f). The RVD maps show that similar patterns of expansion and contraction are obtained even when we use completely different neutral faces as in 7(a) and (d). The values of the measurements of expansion and contraction change as the template changes, since the area of a region would have different values in different metric systems. However, relative magnitudes should not change, to the extent that there is registration error between the template and the subject. This means that the relative expansion and contraction of the regions do not change. This is demonstrated in Fig. 7. The forehead, mouth and lower eyelids expand. The eyes, upper eyelids and area between the eyes contract. The expansion of the forehead and contraction of the upper eye region indicate a lowering of the eyebrows. The triangle between the eyes contracts, indicating a knitting of brows. The cheeks expand on one side and contract on the other, indicating a facial contortion. The change in the cheeks is partially due to the turn of the head; however, the difference in deformation between the cheeks is evident even in the presence of a head turn. All these deformations are typical of an angry face, and are highlighted even with a change in template.

The choice of template presents another issue: if the template used is of an individual different from those chosen as subjects, the expansion and contraction evident in the RVD maps could be due to the difference in the sizes of the facial features between the subject and the template. In Fig. 8, the template 8(a) and the subject 8(b) are of two different individuals. The RVD map of (b) with respect to (a) is shown in (c). However, the expansion and contraction shown in (c) can be due to the difference in size of these regions between the individuals, as can be seen in the neutral face of the subject, shown in (d).

Fig. 6. Stages of deformation from the neutral to the peak expression of anger.

Fig. 7. Effect of change of template.

Fig. 8. Correction of RVD map using the corresponding neutral face.

In order to correct the RVD map in (c) for these size differences of the facial features, we compute the RVD map of (d) with respect to (a), as shown in (e). We then use the shape transformations corresponding to the RVD maps to obtain the corrected shape transformation and hence the corrected RVD map. The corrected RVD map is shown in (f). In (f), we can see that there is a contraction in the eyes, the inner eyebrows and the region between the eyebrows. Although there is an expansion in the region above the eyes, as seen in (c) and (e), this was due to the difference in size between the two faces and it is eliminated in (f), except in the area immediately between, but not above, the eyebrows. There is an expansion in the mouth even after the correction.
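The correction can be approximated as follows. The paper composes the shape transformations of the subject's expressive and neutral faces; this sketch only divides the two Jacobian-determinant (RVD) maps pixel-wise, which ignores the change of coordinates between them and is therefore a simplification, and the function name is illustrative:

```python
# Sketch: correcting an RVD map for facial-feature size differences between
# subject and template. The pixel-wise ratio of determinants is used as a
# stand-in for composing the underlying shape transformations, relying on
# the fact that the determinant of a composition is a product of determinants.
import numpy as np

def corrected_rvd(rvd_expression, rvd_neutral, eps=1e-6):
    """rvd_expression: RVD of the subject's expressive face vs. the template.
    rvd_neutral: RVD of the same subject's neutral face vs. the template.
    Both maps are assumed to be defined on the same template grid."""
    return rvd_expression / np.maximum(rvd_neutral, eps)
```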

As has been explained in Section 3.2, we use a neutral face as a template for all our analyses. The face is of an individual who is approximately the average age of the group under study (39 years for our database). The template should be directly facing the camera.

3.3.5. Average expressions

Our approach can be used to model the average expression for each emotion. This can be obtained by averaging (1) the deformation vector fields and (2) the RVD maps generated by the shape transformation, which are the two forms of information that we obtain from the shape transformation, as explained in Section 2. We take all the images of a particular emotion and intensity and compute the shape transformations that map each of these to a template face. The same template (neutral) face is used for the whole analysis. It may be noted that we can use all faces for independent analysis; however, for computing the averages, we used the images that were perceived to be of peak expression by the raters (see Section 3.1). This significantly reduces the number of images that we have available for averaging. The vector deformation fields generated by these shape transformations are averaged using Procrustes techniques in order to produce an average deformation for each of the expressions. That is, we average the x and y displacements of each of the pixels and produce values which indicate the average deformation of the pixels on a template. The average vector field is applied to two neutral faces (9(a) and (c)). Fig. 9(b) shows the average angry face obtained when the average deformation corresponding to anger is applied to the neutral face in 9(a). 9(c) and (d) show a neutral template and the corresponding average happy face, respectively. The black regions show the regions of expansion of the neutral face, either a muscle stretch or the appearance of features like teeth, the evidence of which was absent in the neutral face, and whose appearance produces a black patch.
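A sketch of the averaging step, assuming that all per-actor displacement fields are already expressed on the common template grid; the Procrustes alignment mentioned above is omitted and all names are illustrative:

```python
# Sketch: an average expression model from several actors' deformation fields.
# A plain pixel-wise mean of the x and y displacements is shown; the
# Procrustes alignment used in the paper is omitted here.
import numpy as np
from scipy.ndimage import map_coordinates

def average_field(fields):
    """fields: list of (H, W, 2) displacement fields, one per actor."""
    return np.mean(np.stack(fields), axis=0)

def apply_field(neutral, disp):
    """Synthesize the average expressive face from a neutral image."""
    h, w = neutral.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(float)
    coords = np.stack([ys + disp[..., 1], xs + disp[..., 0]])
    return map_coordinates(neutral, coords, order=1, mode='constant', cval=0.0)

# mean_anger = average_field(anger_fields)          # anger_fields: per-actor fields
# avg_face = apply_field(neutral_image, mean_anger)
```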

The RVD maps for each of these shape transformations are averaged to produce the average RVD values for each expression. The average RVD maps for the emotions of happiness, sadness, anger and fear in men are shown in Fig. 10. The average RVD maps indicate the regions of expansion and contraction for each expression: happiness (10(a): the mouth and the region between the eyes expand and the cheeks contract, indicating sideways stretching; the eyes contract; the lower lid expands and the cheeks contract, indicating a raising of the cheek); sadness (10(b): uneven expansion and contraction of the face, indicating contortion; the eyes contract and the forehead contracts, indicating a lowering of the brow, while the chin and mouth expand, and there is sideways stretching); anger (10(c): the mouth opens, the region between the eyes and the brows contracts and the eyes contract, and the region between the eyes and the nose contracts, indicating wrinkling of the nose); fear (10(d): the mouth opens wide, the eyes and the regions above them expand, and the overall face expands, indicating a jaw drop). The average deformed faces give a good indication of the regions of the face which are undergoing a change. These are some of the regions in which change is also indicated by FACS (Ekman and Friesen, 1978; Kohler et al., 2004). As the number of samples is small and there is a large variability in the expressions, some subtle differences are not evident in the averages. Additional information can be obtained through the individual's RVD map (as in Figs. 2–6) and the direction and extent of motion of each facial region can be obtained through the vector field of deformation (cf. Fig. 2).

Fig. 9. Average expression deformation vector fields. (a) and (c) are neutral faces; (b) is the average anger deformation field of male actors applied to the neutral face in (a); (d) is the average happiness deformation field applied to the neutral face in (c). The black dots indicate pixels which have evolved due to expansion, which were not present in the neutral face.

Fig. 10. Mean of expressions for male actors.

3.3.6. Point-wise t-test on the RVDF images to study regions of significant expansion and contraction

We have carried out a point-wise t-test between men and women, for the expression of anger, to identify the regions which show significant deformation difference between the two groups. We generated RVD maps for facial expressions of anger of 15 men and 10 women. A point-wise t-test was carried out on these RVD maps. The t-test was in the direction of women to men because the RVD maps of the individual faces, as well as the group averages, indicated that there was larger deformation (expansion and contraction) in women than in men. The results for these are shown in Fig. 11, where the regions marked with white are the regions with low p-value (<0.01). As seen in women relative to men, there is a greater contraction of the forehead, nose and the region between the eyes, and a greater expansion of the mouth, lower eyelid, upper lip and cheeks. The low p-values indicate that the significant difference in the deformation between men and women is restricted to the forehead, mouth, upper lip and lower eyelid regions.

Fig. 11. p-values of difference between expressions of anger in men and women.

As the number of images for peak expression of sadness and fear were too few after rating by healthy observers to make comparison meaningful, no comparison was performed. Also, the variability is very high between these images of sadness and fear, and a statistical comparison will not be meaningful here either. The experiments we have performed are indicative of the tests which can be conducted on these RVD maps when the number of images available for each gender and expression is large.
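The comparison can be sketched as an independent two-sample t-test applied at every pixel of the registered RVD maps and thresholded at p < 0.01; no correction for multiple comparisons is applied in this sketch, and the variable names are illustrative:

```python
# Sketch: point-wise two-sample t-test between groups of RVD maps.
# Assumes all maps are registered to the same template grid.
import numpy as np
from scipy import stats

def pointwise_ttest(maps_a, maps_b, alpha=0.01):
    """maps_a, maps_b: lists of (H, W) RVD maps for the two groups.
    Returns the t map, the p map and a boolean significance mask."""
    a = np.stack(maps_a)    # (Na, H, W)
    b = np.stack(maps_b)    # (Nb, H, W)
    t, p = stats.ttest_ind(a, b, axis=0)
    return t, p, p < alpha

# t_map, p_map, sig = pointwise_ttest(rvd_women_anger, rvd_men_anger)
```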

3.3.7. PCA to show that expressions can be distinguished

We have carried out PCA (Duda et al., 2001) on the RVD maps of the images of the four expressions for all the male actors, accepted after they were rated as peak expression of the emotion that was enacted. We have used 11 images for the happy expression, 6 for sad, 8 for fear and 15 for anger. Each of these groups includes one image which was rated to be of low intensity by the raters. This image was not used for training, but only for testing. We cannot combine the expressions for the male and female actors, as there is a large variability between the groups. We formed four classes, one pertaining to each emotion, by training on the RVD maps of the images available for that expression (except the one rated low) and used the leave-one-out paradigm to check all the images (including the one rated low intensity), using the Mahalanobis distance. All the 11 happy faces were classified correctly. The one with low intensity was classified correctly; however, the difference in voting for classification to the classes of anger and happy was very small. Three out of the six sad images were correctly classified, but of the remaining three, two were classified as anger and one as sad.

Page 12: Quantification of facial expressions using high-dimensional

72 R. Verma et al. / Journal of Neuroscience Methods 141 (2005) 61–73

two were classified as anger and one as sad. Twelve out ofthe 15 angry expressions were classified correctly. Of the re-maining three, one was classified as sad and the other twoas fear. Four out of the eight faces of fear expression wereclassified correctly. Of the remaining four, two were clas-sified as sad, and the other two including the one with lowintensity rating were classified as anger. This follows the find-ings that humans recognize the expression of happy emotionbetter than the expression of other emotions (Kohler et al.,2004). However, the results regarding faces with expressionof sadness and fear are inconclusive, as the number of facesavailable is too few for the PCA analysis to be feasible forclassification.
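The following sketch shows one way such a leave-one-out experiment could be run on flattened RVD maps, using PCA for dimensionality reduction and a Mahalanobis distance to each class mean under a pooled, regularized covariance; the per-class models and voting scheme of the actual experiment are simplified here, and all variable names, the number of components and the regularization constant are illustrative choices.

import numpy as np
from sklearn.decomposition import PCA

def classify_loo(rvd_by_class, n_components=5):
    """Leave-one-out classification of flattened RVD maps.

    rvd_by_class: dict mapping an emotion label to an array of shape (n_i, D),
    each row a flattened RVD map.  For every held-out map we fit PCA on the
    remaining maps and assign the label of the nearest class mean under a
    Mahalanobis distance with a pooled, regularized covariance.
    """
    labels = list(rvd_by_class)
    X = np.vstack([rvd_by_class[l] for l in labels])
    y = np.concatenate([[l] * len(rvd_by_class[l]) for l in labels])
    correct = 0
    for i in range(len(X)):
        train = np.delete(np.arange(len(X)), i)
        pca = PCA(n_components=n_components).fit(X[train])
        Z = pca.transform(X[train])
        z = pca.transform(X[i:i + 1])[0]
        # Pooled covariance in the reduced space, regularized to stay invertible.
        cov = np.cov(Z, rowvar=False) + 1e-3 * np.eye(n_components)
        icov = np.linalg.inv(cov)
        dists = {}
        for l in labels:
            d = z - Z[y[train] == l].mean(axis=0)
            dists[l] = float(d @ icov @ d)
        correct += min(dists, key=dists.get) == y[i]
    return correct / len(X)

Calling classify_loo with, for example, {"happy": happy_maps, "sad": sad_maps, "fear": fear_maps, "anger": anger_maps} (hypothetical arrays) returns the leave-one-out accuracy of this simplified classifier.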

4. Discussion, conclusions and future work

We have proposed an effective, flexible and objective method of quantifying the intensity of emotions. The applicability of our approach has been demonstrated on various expressions at varying levels of intensity. We have also been able to associate a pixel-wise value with an expression change, based on the expansion/contraction of that region. The creation of this pixel-wise association makes it evident that the method can quantify even subtle differences on a region-wise basis, for expressions at all levels of intensity. This is important for any facial expression analysis, as a single number quantifying the whole face is of limited significance because various regions of the face undergo different changes in the same expression of emotion.

The average of the RVD maps (see Fig. 10) quantifies the changes in each part of the face for each of the four emotions. Also, the deformation fields produced by the shape transformation can be averaged to show which regions of the face are deformed the most during the course of the expression change (see Fig. 9). However, due to the small number of images available for each expression and gender, and due to the large variability, some subtle expression changes are lost in the average model. This will be improved by augmenting the database with more images.

In this paper, we included a demonstration of the method's potential to address issues related to gender differences. In the future, we propose to give this topic the comprehensive treatment it deserves, by augmenting the database with additional images and testing the algorithm extensively on images of both genders and various ethnicities. We see widespread application of our approach in clinical investigation of behavioral disorders that impair emotion expression ability, as in schizophrenia. We will generate RVD maps using healthy controls and produce average maps of the deformations; these would then serve as a benchmark against which the RVD map of any patient with a brain disorder can be analyzed. On the basis of the averages of healthy controls, we propose to quantify how much the expression of the impaired subject differs from the average, thereby providing a method of identifying the degree of impairment.
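A minimal sketch of the proposed benchmark comparison is given below, assuming an average RVD map and a per-pixel standard deviation are built from healthy controls and a patient's map is expressed as a z-map; the summary score (mean absolute z) is an illustrative choice, not a measure defined in this paper.

import numpy as np

def deviation_from_controls(patient_map, control_maps, eps=1e-6):
    """Compare one RVD map with an average map built from healthy controls.

    control_maps: array (n_controls, H, W); patient_map: array (H, W), all
    registered to the same neutral template.  Returns a per-pixel z-map and a
    single summary score (mean absolute z) expressing how far the patient's
    expression deviates from the control benchmark.
    """
    mean = control_maps.mean(axis=0)
    std = control_maps.std(axis=0) + eps
    z_map = (patient_map - mean) / std
    return z_map, float(np.abs(z_map).mean())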

We have also shown the applicability of our method for the gender-wise analysis of facial expression data (in Fig. 11). We have been able to identify areas that show a significant difference in deformation between men and women for the expression of anger. The analysis has been limited by the size of the database and the facial images available. With a larger number of samples, it will be possible to perform such an analysis for the other expressions as well; we propose to do this in the future. In addition, we will be able to identify regions that are more deformed in one gender compared to the other. This is important for clinical investigations of disorders such as schizophrenia and affective illness, where gender differences are salient (Gur et al., 1992, 2004; Shtasel et al., 1992). Also, the quantification of a person's expression may indicate how the person responds to medication or other factors. In the future, we propose to carry out quantitative and statistical analysis to identify differences between evoked and posed expressions and the variability of expressions across ethnicities and gender. We also intend to study the differences in expressibility of the left and right sides of the face (Indersmitten and Gur, 2003; Sackeim et al., 1978). In addition, we will investigate the differences in classification with PCA when faces of women are used for analysis, as opposed to male facial expressions.

Our technique is able to capture very subtle differences in facial expression change. This was evident in the fact that the PCA analysis on the four expressions was able to distinguish some of the expressions correctly, even with small sample sizes and large inter-sample variability. For the expressions of sadness and fear, the error rate was high, as the sample size was too small and there was a large variability in the expression. As was seen in the comparison of anger and happiness, the results improved considerably with the increase in sample size. Based on the results of the PCA, through which we were able to classify some emotions, we propose to extend our method further so that it can be used for recognition/classification of expressions as well. For classification, the measurements of expansion and contraction will be fed either to a voxel-wise analysis or to a classification technique, with appropriate normalization for magnitude differences. We are currently investigating this with various alternate learning approaches, such as support vector machines and methods which require smaller datasets for training. Augmenting the database with more images will also give us a better classification of a test expression. Using these learning approaches, we propose to develop a method of face recognition and quantification, which will be validated against clinical expression rating using FACS.
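As a sketch of the classification pipeline suggested above, the following uses per-feature standardization as the normalization for magnitude differences, followed by a linear support vector machine evaluated with leave-one-out cross-validation; the choice of scaler, kernel and regularization constant are assumptions made only for this example.

import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import LeaveOneOut, cross_val_score

def svm_accuracy(rvd_vectors, labels):
    """Leave-one-out accuracy of an SVM on flattened RVD maps.

    rvd_vectors: array (n_samples, n_pixels); labels: array of emotion labels.
    Per-feature standardization is one way to normalize for magnitude
    differences before the classifier.
    """
    clf = make_pipeline(StandardScaler(), SVC(kernel="linear", C=1.0))
    scores = cross_val_score(clf, rvd_vectors, labels, cv=LeaveOneOut())
    return scores.mean()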

The analysis in this paper has been carried out on single images captured at varying intensities. In order to capture all the motion-dependent subtleties of the emotional expression process (like eye blinks or involuntary tics), we propose to extend our technique into a fully automated method applicable to video sequences of expression formation (automated after the regions have been identified in the first frame).


In addition, there is a difference between individuals in the degree of deformation from the neutral to the peak expression. We propose to analyze and compare the rate of expression across individuals by applying our approach to videos of the expression change. The analysis of subsequent frames of the video sequence will help generate a temporal trajectory of the expression change. This will also be used in gender-based analysis.
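A minimal sketch of such a temporal trajectory is given below, assuming an RVD map has already been computed for every frame by registering it to the neutral first frame; summarizing each frame by its mean absolute RVD is one illustrative choice of scalar, not a measure defined in this paper.

import numpy as np

def expression_trajectory(frame_rvd_maps):
    """Summarize a video as a temporal trajectory of expression intensity.

    frame_rvd_maps: sequence of (H, W) RVD maps, one per frame (registration
    of each frame to the neutral first frame is not shown).  The per-frame
    mean absolute RVD is a simple scalar summary of how far the face has
    moved from neutral, so the returned 1D array traces the evolution from
    neutral to peak expression.
    """
    return np.array([np.abs(m).mean() for m in frame_rvd_maps])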

The novel method proposed in this paper, and the applications demonstrated, show its potential applicability to behavioral investigations. In the future we propose a multi-faceted extension and application of our approach to larger datasets with adequate representation of both genders and different ethnicities. We propose to extend our approach to videos of expression change, in order to obtain a temporal quantification. We will perform extensive statistical and computational analysis and validation of the RVD maps and deformation fields on large datasets.

References

Bajcsy R, Kovacic S. Multiresolution elastic matching. Comput VisionGraphics Image Process 1989;46:1–21.

Bartlett MR, Hager JC, Ekman P, Sejnowski TJ. Measuring facial expres-sions by computer image analysis. Psychophysiology 1999;36:253–64.

Black MJ, Yacoob Y. Recognizing facial expressions in image sequences using local parameterized models of image motion. Int J Comput Vision 1997;25(1):23–48.

Christensen GE, Joshi SC, Miller MI. Volumetric transformation of brain anatomy. IEEE Trans Med Imaging 1997;16(6):864–77.

Cohn JF, Zlochower AJ, Lien J, Kanade T. Automated face analysis by feature point tracking has high concurrent validity with manual FACS coding. Psychophysiology 1999;36:35–43.

Collins DL, Evans AC. Automatic 3D estimation of gross morphometric variability in human brain. Neuroimage 1996;3(3).

Darwin C. The expression of the emotions in man and animals. London: John Murray; 1872.

Davatzikos C. Measuring biological shape using geometry-based shape transformations. Image Vision Comput 2001;19:63–74.

Davatzikos C, Vaillant M, Resnick SM, Prince JL, Letovsky S, Bryan RN. A computerized approach for morphological analysis of the corpus callosum. J Comput Assisted Tomography 1996;20:88–97.

Donato G, Bartlett MR, Ekman P, Sejnowski TJ. Classifying facial actions. IEEE Trans Pattern Anal Mach Intell 1999;21(10):974–89.

Duda RO, Hart PE, Stork DG. Pattern classification. Wiley; 2001.

Ekman P, Friesen WV. Facial action coding system. Consulting Psychologists Press; 1978.

Ekman P, Rosenberg E. What the face reveals. Oxford University Press; 1997.

Essa IA, Pentland AP. Coding, analysis, interpretation and recognition of facial expressions. IEEE Trans Pattern Anal Mach Intell 1997;19(7):757–63.

Gainotti G, Caltagirone C, Zoccolotti P. Left/right and cortical/subcortical dichotomies in the neuropsychological study of human emotions. Cognition Emotion 1993;7:71–94.

Gur RC, Erwin RJ, Gur RE, Zwil AS, Heimberg C, Kraemer HC. Facial emotion discrimination. II. Behavioral findings in depression. Psychiatry Res 1992;42:241–51.

Gur RC, Sara R, Hagendoorn M, Marom O, Hughett P, Macy L, et al. A method for obtaining 3-dimensional facial expressions and its standardization for use in neurocognitive studies. J Neurosci Methods 2002;115:137–43.

Gur RE, Kohler C, Turetsky BI, Seigel SJ, Kanes SJ, Bilker WB. Asexually dimorphic ratio of orbitofrontal to amygdala volume is alteredin schizophrenia. Biol Psychiatry 2004;55:512–7.

Indersmitten T, Gur RC. Emotion processing in chimeric faces: hemi-spheric asymmetries in expression and recognition of emotions. JNeurosci 2003;23:3820–5.

Karpouzis K, Votsis G, Moschovitis G, Kollias S. Emotion recognition us-ing feature extraction and 3D models. Computational intelligence andapplications. World Scientific and Engineering Society Press; 1999.pp. 342–7.

Kohler CG, Turner T, Stolar NM, Bilker WB, Brensinger CM, Gur RE, et al. Differences in facial expression of four universal emotions. Psychiatry Res 2004, in press.

Lien J, Kanade T, Cohn J, Li CC. Detection, tracking and classifi-cation of action units in facial expression. J Robot Autonom Syst2000;31:131–46.

Maintz JBA, Viergever MA. A survey of medical image registration. MedImage Anal 1998;2(1):1–36.

Martinez A. Recognizing imprecisely localized, partially occluded andexpression variant faces from a single sample per class. IEEE TransPattern Anal Mach Intell 2002;24(6):748–63.

Martinez A, Kak AC. PCA versus LDA. IEEE Trans Pattern Anal MachIntell 2001;23(2):228–33.

Palermo R, Coltheart M. Photographs of facial expression: accuracy, response times and ratings of intensity. Behav Res Methods Instrum Comput 2004, in press.

Rinn WE. The neuropsychology of facial expression: a review of theneurological and psychological mechanisms for producing facial ex-pressions. Psychol Bull 1984;95:52–77.

Sackeim HA, Gur RC, Saucy MC. Emotions are expressed more intensely on the left side of the face. Science 1978;202:434–6.

Sackeim HA, Greenberg MS, Weiman AL, Gur RC, Hungerbuhler JP, Geschwind N. Hemispheric asymmetry in the expression of positive and negative emotions: neurological evidence. Arch Neurol 1982;39:210–8.

Salem JE, Kring AM, Kerr SL. More evidence for generalized poor performance in facial emotion perception in schizophrenia. J Abnorm Psychol 1996;105:480–3.

Schmidt KL, Cohn JF. Human facial expressions as adaptations: evolutionary questions in facial expression research. Yearbook Phys Anthropol 2001;44:3–24.

Shtasel DL, Gur RE, Gallacher F, Heimberg C, Gur RC. Gender differences in the clinical expression of schizophrenia. Schizophr Res 1992;7:225–32.

Subsol G, Thirion JP, Ayache N. Application of an automatically built 3D morphometric brain atlas: study of cerebral ventricle shape. In: Visualization in biomedical computing. Lecture notes in computer science; 1996. pp. 373–82.

Terzopoulos D, Waters K. Analysis of facial images using physical and anatomical models. In: Proceedings of the International Conference on Computer Vision; 1990.

Thompson DW. On growth and form. Cambridge: Cambridge University Press; 1917.

Tian Y-L, Kanade T, Cohn J. Recognizing action units for facial expression analysis. CMU Technical Report CMU-RI-TR-99-40; 1999.

Tian Y-L, Kanade T, Cohn J. Recognizing action units for facial expression analysis. IEEE Trans Pattern Anal Mach Intell 2001;23:97–115.

Yacoob Y, Davis L. Recognizing human facial expression. Center for Automation Research, University of Maryland; 1994.

Yacoob Y, Davis L. Recognizing human facial expression from long image sequences using optical flow. IEEE Trans Pattern Anal Mach Intell 1996;18(6):636–42.

Yuille A, Hallinan P, Cohen DS. Feature extraction from faces using deformable templates. Int J Comput Vision 1992;8(2):99–111.

Zhang Z. Feature-based facial expression recognition: sensitivity analysis and experiments with a multilayer perceptron. Int J Pattern Recog Artif Intell 1999;13(6):898–911.