
Facial Expression Analysis

Jeff Cohn
Fernando De la Torre

Tutorial "Looking @ People", CVPR 2012
June 2012

Human Sensing Laboratory


Outline

• Introduction

• Facial Action Coding System (FACS)

– Discrete vs. dimensional approaches

• Applications of FEA

• Databases

• Algorithms

– Supervised

– Unsupervised

• Conclusions and open problems


Supervised Facial Expression Analysis (FEA)

• Most work on FEA has been supervised, using different registration, features, and classifiers.

Supervised FEA (II): the pipeline

2D/3D face tracking (AAM) → Registration (remove 3D rigid motion) → Features (robust to illumination and identity) → Classifiers (generalize across identity, discriminate AUs) → AU present?

• Generative (Parameterized Appearance Models)
 – Active Appearance Models (e.g., Cootes et al. 98, Romdhani et al. 99, De la Torre 00, Matthews & Baker 05, De la Torre & Nguyen 08, Gong et al. 00)
 – Eigentracking (e.g., Black & Jepson 98)
 – Morphable models (e.g., Jones & Poggio 98, Blanz & Vetter 99)

• Discriminative
 – Regression:
  • Classifier fitting (e.g., Liu 09)
  • Continuous regression (e.g., Sauer et al. 11, Saragih 11)
  • Cascaded regression (e.g., Dollar et al. 10, Cao et al. 12)
 – Local models:
  • Constrained Local Model (e.g., Cristinacce & Cootes 08, Lucey et al. 09, Saragih et al. 10)
  • Part-based model (Zhu & Ramanan 12)

Facial feature detection

Parameterized Appearance Models are learned from hand-labeled training data: Procrustes alignment of the labeled landmarks yields the shape modes, and the shape-normalised images yield the appearance modes B0, B1, B2, ...


Detection as an Optimization Problem

Fitting a parameterized appearance model minimizes the reconstruction error of the warped image in an appearance subspace learned off-line:

  E(a, c) = || d(f(x, a)) − B c ||_2^2

where a collects the rigid (translation, rotation, scale) and non-rigid shape parameters of the warp f(x, a), c are the appearance parameters, and B = [B0 B1 B2 ...] is the appearance basis.

Problems:
• Prone to local minima
• Does not generalize well (e.g., to different people)

(Nguyen & De la Torre 10)
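The fitting objective above can be sketched numerically. This is a minimal, hedged illustration with synthetic 1-D "images" and an orthonormal PCA basis B; an exhaustive search over an integer shift stands in for the gradient-based warp optimization used in real AAM fitting, and all data is made up for the example.

```python
import numpy as np

# Sketch: minimize ||d(f(x, a)) - B @ c||^2 over a warp parameter a (here a
# 1-D translation) and appearance coefficients c. Synthetic data throughout.

rng = np.random.default_rng(0)

# Build an orthonormal appearance basis B off-line via PCA on "training" signals.
train = rng.normal(size=(50, 5)) @ rng.normal(size=(5, 64))   # 50 signals, rank 5
train -= train.mean(axis=0)
_, _, Vt = np.linalg.svd(train, full_matrices=False)
B = Vt[:5].T                                                  # 64 x 5, orthonormal columns

def residual(d):
    """For an orthonormal basis the optimal c is B.T @ d; return the error norm."""
    c = B.T @ d
    return np.linalg.norm(d - B @ c)

# A "test image": a signal in the appearance subspace, shifted by 7 samples.
template = B @ np.array([1.0, -0.5, 0.3, 0.0, 0.2])
image = np.roll(template, 7)

# Exhaustive search over integer translations stands in for gradient descent.
errors = [residual(np.roll(image, -a)) for a in range(64)]
a_hat = int(np.argmin(errors))
print(a_hat)  # recovers the true shift of 7
```

The local-minima problem noted on the slide shows up here too: the error surface over the shift is non-convex, which is why a gradient-based fitter needs a good initialization.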


Discriminative models

Instead of minimizing a reconstruction error, a sequence of learned regressors S(k) maps image features, evaluated at the current rigid and non-rigid parameters, directly to parameter updates:

  [Δa(1), Δc(1)] = S(1) · features( f(x, a0 + Δa, c0 + Δc) )
  [Δa(2), Δc(2)] = S(2) · features( f(x, a1 + Δa, c1 + Δc) )
  ...

• In general these improve generalization (e.g., Liu 09, Sauer et al. 11, Saragih 11, Dollar et al. 2010, Cao et al. 2012)
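The cascade above can be illustrated with a toy problem. This is a hedged sketch, not any published method: the feature function and the 2-D parameters are synthetic stand-ins, and each stage's linear regressor is trained by least squares to predict the parameter update from features at the current estimate.

```python
import numpy as np

# Sketch of cascaded regression: regressors S_k map features at the current
# parameter estimate to an update, p <- p + phi(p) @ S_k. Synthetic data.

rng = np.random.default_rng(1)

def phi(p_true, p):
    # Stand-in for image features indexed at the current estimate:
    # a mildly nonlinear function of the residual p_true - p.
    r = p_true - p
    return np.concatenate([r, np.tanh(r)])

n, K = 500, 3
P_true = rng.normal(size=(n, 2))
P = P_true + rng.normal(scale=1.0, size=(n, 2))   # perturbed initializations
err0 = np.mean(np.linalg.norm(P - P_true, axis=1))

for k in range(K):
    F = np.stack([phi(t, p) for t, p in zip(P_true, P)])   # n x 4 features
    D = P_true - P                                         # target updates
    S, *_ = np.linalg.lstsq(F, D, rcond=None)              # train stage k
    P = P + F @ S                                          # apply stage k
errK = np.mean(np.linalg.norm(P - P_true, axis=1))
print(err0, errK)  # mean error shrinks across the cascade
```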

Discriminative models (II)

• Local discriminative models
 – Constrained Local Model (e.g., Cristinacce & Cootes 08, Lucey et al. 09, Saragih et al. 10)
 – Part-based model (Zhu & Ramanan 2012)

(Slides courtesy of Saragih/Lucey)

Face registration

• What are the three most important aspects of face recognition? "Registration, registration, registration" (Takeo Kanade '90)

• Similarity registration: rotate and scale (e.g., Bartlett et al. 05, Whitehill et al. 11)
• Piece-wise warping (e.g., Cootes et al. 98, Gong et al. 00, Tong et al. 07, De la Torre & Nguyen 08, Jones & Poggio 98, Lucey et al. 09, Saragih et al. 10)
• 3D registration (thanks to Laszlo Jeni)

Benefits of accurate registration:
 – Subtle AUs
 – Out-of-plane rotation (3D models)
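Similarity registration as listed above can be sketched with a standard Procrustes alignment. This is a minimal illustration on synthetic 2-D landmarks, not the tutorial's own code; the SVD-based rotation estimate is the classical orthogonal Procrustes solution.

```python
import numpy as np

# Sketch: similarity (Procrustes) registration of a landmark shape to a
# reference, removing translation, scale, and in-plane rotation.

def procrustes_align(src, ref):
    """Align src onto ref with the best similarity transform."""
    mu_s, mu_r = src.mean(axis=0), ref.mean(axis=0)
    s, r = src - mu_s, ref - mu_r
    U, sv, Vt = np.linalg.svd(s.T @ r)        # cross-covariance SVD
    d = np.sign(np.linalg.det(U) * np.linalg.det(Vt))
    D = np.eye(len(sv)); D[-1, -1] = d        # guard against reflections
    R = U @ D @ Vt                            # optimal rotation
    scale = (sv * np.diag(D)).sum() / (s ** 2).sum()
    return scale * (s @ R) + mu_r

# Sanity check: a shape rotated, scaled, and translated by a known similarity
# transform should align back onto the reference.
rng = np.random.default_rng(5)
ref = rng.normal(size=(6, 2))
t = 0.7
Rot = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
src = 1.5 * (ref @ Rot) + np.array([3.0, -2.0])
aligned = procrustes_align(src, ref)
print(np.abs(aligned - ref).max())  # ~0: the similarity transform is removed
```

Piece-wise warping and 3D registration extend this idea: after the global similarity fit, residual non-rigid and out-of-plane variation is handled per triangle or with a 3D model.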


Features

• Three types: (1) shape, (2) appearance, (3) temporal features.

• Shape features (e.g., Sebe et al. 07, Asthana et al. 09, Lucey et al. 07, Chew et al. 11, Zhou et al. 10, Valstar et al. 12)

• Appearance features:
 – Raw pixels (e.g., Kanade et al. 2000)
 – Gabor bank (e.g., Donato et al. 99, Bartlett 04, Littlewort et al. 2006, Whitehill et al. 11)
 – Local binary patterns (e.g., Shan et al. 09, Zhao et al. 10, Jiang et al. 11)
 – Box filters (e.g., Whitehill & Omlin 2006)
 – SIFT/HOG (e.g., Zhu et al. 2011, Simon et al. 2010, Dhall et al. 11)
 – NMF (e.g., Zhi et al. 11, Zafeiriou and Petrou 10)

• Warning: appearance features typically need dimensionality reduction and/or feature selection.
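To make the warning concrete, here is a hedged sketch of one of the listed descriptors, the basic 3×3 local binary pattern: even a single region produces a 256-bin histogram, which is why dimensionality reduction or feature selection usually follows. Real systems typically use library implementations and uniform-pattern variants; this toy version is only illustrative.

```python
import numpy as np

# Sketch: 8-neighbour LBP codes for interior pixels, pooled into a histogram.

def lbp_histogram(img):
    """Return a normalized 256-bin histogram of 3x3 LBP codes."""
    c = img[1:-1, 1:-1]                       # interior (center) pixels
    shifts = [(-1,-1), (-1,0), (-1,1), (0,1), (1,1), (1,0), (1,-1), (0,-1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        nb = img[1+dy:img.shape[0]-1+dy, 1+dx:img.shape[1]-1+dx]
        code |= (nb >= c).astype(np.uint8) << bit   # one bit per neighbour
    hist = np.bincount(code.ravel(), minlength=256)
    return hist / hist.sum()

rng = np.random.default_rng(2)
img = rng.integers(0, 256, size=(32, 32)).astype(np.int32)
h = lbp_histogram(img)
print(h.shape)  # 256 dimensions per region, before any reduction
```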

Temporal features
 – Motion units/trajectories (e.g., Cohen et al. 02, Li et al. 01)
 – Optical flow (e.g., Essa and Pentland 97, Gunes and Piccardi 05)
 – Motion history (e.g., Valstar et al. 04, Koelstra et al. 10)
 – Bag of temporal words (e.g., Simon et al. 10)
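One of the temporal features above, the motion history image, is easy to sketch: each pixel stores how recently motion occurred, decaying over time. This is a minimal illustration on a synthetic clip, with a made-up threshold, not the exact formulation used in the cited work.

```python
import numpy as np

# Sketch: fold a frame sequence into one motion-history map in [0, tau].

def motion_history(frames, thresh=10, tau=1.0):
    T = len(frames) - 1
    mhi = np.zeros_like(frames[0], dtype=float)
    for t in range(1, len(frames)):
        moving = np.abs(frames[t] - frames[t-1]) > thresh
        decay = np.maximum(mhi - tau / T, 0)   # fade old motion linearly
        mhi = np.where(moving, tau, decay)     # stamp fresh motion at tau
    return mhi

# Synthetic clip: a bright square moves one pixel to the right per frame.
frames = []
for t in range(5):
    f = np.zeros((16, 16))
    f[4:8, 2 + t:6 + t] = 255
    frames.append(f)

mhi = motion_history(frames)
print(mhi.max())  # the most recent motion carries the highest value
```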


Classifiers

• Static
 – Exemplar + GMM (Wen and Huang, 2003)
 – Neural network (Kapoor and Picard, 2005)
 – SVM/AdaBoost (Bartlett et al., 2005)
 – Linear discriminant classifiers (Wang et al., 2006)
 – Gaussian process (Chen et al., 2009)
 – Boosting (Shan et al. 2006, Zhu et al. 2010)

• Dynamic
 – Hidden Markov models (Lien et al., 2000)
 – Dynamic Bayesian networks (Tong et al., 2007)
 – Conditional random fields (Chang and Liu, 2009)
 – Temporal bag of words (Simon et al. 2010)
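A static, frame-level AU detector reduces to binary classification on the extracted features. As a hedged stand-in for the SVM/boosting classifiers listed above, here is a tiny logistic regression trained by gradient descent on synthetic, linearly separable "AU present?" data; any of the listed classifiers could be dropped into the same slot.

```python
import numpy as np

# Sketch: frame-level "AU present?" classifier on synthetic features.

rng = np.random.default_rng(3)
n, d = 400, 10
X = rng.normal(size=(n, d))                  # per-frame feature vectors
w_true = rng.normal(size=d)
y = (X @ w_true > 0).astype(float)           # synthetic AU labels

w = np.zeros(d)
for _ in range(300):                         # batch gradient descent
    p = 1 / (1 + np.exp(-(X @ w)))           # sigmoid probabilities
    w -= 0.1 * (X.T @ (p - y)) / n           # logistic-loss gradient step

acc = np.mean(((X @ w) > 0) == (y > 0.5))
print(acc)  # high training accuracy on this separable toy problem
```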

The million $ question

• Which are the best feature and classifier?

• Data
 – Have access to reliable, well-annotated data
 – The more data, the better
• Features
 – The best feature is AU-dependent
 – In general, feature fusion works best (e.g., multiple kernel learning)
• Classifier
 – Depends on the amount of training data
 – And on how familiar you are with the classifier


Sample selection

An AU event evolves over time through onset, peak, and offset, with intensity rising and then falling. Positive samples (+) are drawn near the peak, and negative samples (−) away from the event.

• Make good use of the data! (Zhu et al. 11, Simon et al. 10)

Results for AU4 and AU12 (table not reproduced): in each cell, the first number is the area under the ROC curve; the second gives the number of positive / negative samples in the testing set; the third gives the number of positive samples in the training working set / the total frames of the target AU in the training set.
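The sample-selection idea above can be sketched directly. This is a hedged toy illustration with made-up frame indices and margins: positives are taken in a small window around the annotated peak, negatives safely outside the onset-offset span, and ambiguous boundary frames are left out.

```python
import numpy as np

# Sketch: pick training frames from an annotated AU event (onset/peak/offset).

def select_samples(n_frames, onset, peak, offset, pos_radius=2, margin=3):
    frames = np.arange(n_frames)
    positives = frames[np.abs(frames - peak) <= pos_radius]
    negatives = frames[(frames < onset - margin) | (frames > offset + margin)]
    return positives, negatives

pos, neg = select_samples(n_frames=60, onset=20, peak=30, offset=40)
print(pos)  # frames centered on the peak
```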

Bayesian networks

• Bayesian networks model spatial and temporal relationships among different AUs (Tong et al. 05, Shang et al. 07).


Outline

• Introduction

• Facial Action Coding System (FACS)

• Databases

• Applications of FEA

• Algorithms

– Supervised

– Unsupervised

• Conclusions and future work

Motivation

• Mining the facial expressions of one subject:
 – Summarization
 – Visualization / embedding
 – Indexing

(Example discovered facial events: looking up, sleeping, smiling, looking forward, waking up)

• Mining facial expressions across subjects: RU-FACS database (Bartlett et al. '06)

Aligned Cluster Analysis (Zhou et al. '10)

ACA temporally segments a sequence and clusters the segments jointly. The unknowns are the start and end frames of the segments, h1, h2, ..., hm, hm+1, and the segment labels G.

Kernel k-means and spectral clustering (Ding et al. '02, Dhillon et al. '04, Zass and Shashua '05, De la Torre '06) minimize

  J(M, G) = || X − M G ||_F^2

where X contains the samples, G is the cluster-indicator matrix, and M holds the cluster means. Eliminating M gives the equivalent trace form

  J(G) = tr( K (I_n − G^T (G G^T)^{-1} G) ),   with K = φ(X)^T φ(X).

Problem formulation for ACA

The sequence is split into segments X_[h1,h2), X_[h2,h3), ..., X_[hm,hm+1), and segment similarity is measured with the Dynamic Time Alignment Kernel (Shimodaira et al. 01):

  J(M, G, h) = || φ(X) − M G ||_F^2.

Matrix formulation for ACA

  J_km(G) = tr( K L ),     with L = I_n − G^T (G G^T)^{-1} G

  J_aca(G, H) = tr( (K ∘ W) L ),  with L = I_n − (G H)^T ( G H (G H)^T )^{-1} G H

In the toy example of 23 frames, 7 segments and 3 clusters: H ∈ {0,1}^{7×23} maps samples to segments, G ∈ {0,1}^{3×7} maps segments to clusters, W ∈ R^{23×23} holds the Dynamic Time Alignment Kernel weights, and K = φ(X)^T φ(X).
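The kernel k-means trace objective at the heart of the ACA formulation is easy to evaluate. This hedged sketch computes J(G) = tr(K L) with L = I_n − G^T (G G^T)^{-1} G on a synthetic two-blob problem with a linear kernel, and shows that the correct labeling scores lower than a scrambled one; the ACA extension would swap K for the DTAK-weighted segment kernel.

```python
import numpy as np

# Sketch: evaluate the kernel k-means objective J(G) = tr(K L).

def kkm_objective(K, labels, k):
    n = len(labels)
    G = np.zeros((k, n))
    G[labels, np.arange(n)] = 1.0            # k x n cluster-indicator matrix
    L = np.eye(n) - G.T @ np.linalg.inv(G @ G.T) @ G
    return np.trace(K @ L)                   # within-cluster scatter

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 0.3, (10, 2)),  # blob A
               rng.normal(5, 0.3, (10, 2))]) # blob B
K = X @ X.T                                  # linear kernel phi(X)^T phi(X)

good = np.array([0] * 10 + [1] * 10)         # matches the two blobs
bad = np.array([0, 1] * 10)                  # scrambled labeling
print(kkm_objective(K, good, 2), kkm_objective(K, bad, 2))
```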

Facial image features

• Appearance: Active Appearance Models (Baker and Matthews '04), for upper-face and lower-face regions
• Shape features

Facial event discovery across subjects

• Cohn-Kanade: 30 people and five different expressions (surprise, joy, sadness, fear, anger); 10 sets of 30 people.

Clustering accuracy: ACA 0.87 (.05) vs. spectral clustering (SC) 0.56 (.04).

Unsupervised facial event discovery

• Discovered events correspond to FACS coding, e.g., Outer Brow Raiser (AU2), Upper Lid Raiser (AU5), Nose Wrinkler (AU9), Lip Tightener (AU23).

Accuracy by face region:

             ACA         SC
Lower face   0.53 (.09)  0.39 (.14)
Upper face   0.69 (.12)  0.47 (.12)

Conclusions and open problems

• Supervised and unsupervised algorithms for FEA

• Tracking/registration

– Registration, registration, registration… changes in pose (3D models)

– Robustness to occlusion

• Features

– Subtle facial expressions

– Dynamics (e.g., temporal envelope)

• Classifiers

– Expression intensity

– Individual differences

– Predicting onset/offset

– Truly multi-class AU detection

Conclusions and open problems

• Data

– Attention to reliability of ground truth

– Shared, well-annotated video

– Innovative ways to use video that cannot be shared

• Segmentation and timing

– Intra-personal

– Interpersonal

• User-in-loop approaches

– User-assisted coding, e.g., Fast FACS (e.g., Simon et al., 2011)

– Combining manual and automated measurement (e.g., Ambadar et al. 2009)

– Person-dependent classifiers

• Other issues

– Multimodal

– Context


Questions?

Jeff Cohn

Fernando De la Torre
