alumni.cs.ucr.edu/~mkafai/papers/poster_icpr2012.pdf
Mehran Kafai, Bir Bhanu, Le An
Center for Research in Intelligent Systems, University of California, Riverside, USA
1. Introduction
Goal: Robustly estimate the pose of a given face by
classifying the face into one of a predetermined
set of poses.
Sample images for pose estimation
Challenge: Limited discriminative power of commonly
used classifiers such as LDA, SVM, or NN
results in low classification accuracy.
Solution: • We present the Cluster-Classification
Bayesian Network (CCBN), a graphical
model specifically designed for
classification after clustering.
• A pose layout is defined where similar
poses are assigned to the same group.
• Discriminative power within a group increases
when the group contains similar yet distinct
poses.
• The CCBN is trained on multi-pose face
image databases.
• 2200 images from the FEI database: 200 individuals, 11 poses from
profile left to profile right.
• 4200 images from the CAS-PEAL database: 200 individuals, 21 poses
from 9 cameras spaced along a horizontal semicircular shelf.
2. Technical Approach
Sample images from FEI database with pose layout overlay
Sample images from CAS-PEAL database with pose layout overlay
Accuracy comparison on CAS-PEAL
3. Results
Accuracy comparison on FEI
• Define a pose layout such that similar poses are
located in neighboring locations (1-11 are pose
IDs from the FEI database).
• The layout can be one-, two-, or three-dimensional.
• Layout is partitioned into groups. Each group
holds similar poses. Each pose belongs to at least
one group.
• Partitioning may be performed using systematic
or heuristic methods.
Sample pose layout
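The layout and partitioning steps above can be sketched as follows. This is a minimal illustration, assuming a one-dimensional layout and a simple sliding-window partition; the function name `partition_layout` and the parameters `group_size` and `overlap` are hypothetical and do not come from the poster, which leaves the partitioning method open (systematic or heuristic).

```python
# Hypothetical 1-D layout for the 11 FEI pose IDs, ordered so that
# neighboring positions hold similar poses (profile left ... profile right).
layout = list(range(1, 12))


def partition_layout(layout, group_size, overlap):
    """Slide a window over a 1-D layout to form groups of similar poses.

    group_size and overlap are illustrative parameters, not poster values.
    The last group may be shorter than group_size.
    """
    if not 0 <= overlap < group_size:
        raise ValueError("overlap must be in [0, group_size)")
    step = group_size - overlap
    groups = []
    start = 0
    while True:
        groups.append(layout[start:start + group_size])
        if start + group_size >= len(layout):
            break
        start += step
    return groups


groups = partition_layout(layout, group_size=4, overlap=1)

# Every pose must belong to at least one group.
covered = {pose for group in groups for pose in group}
assert covered == set(layout)
print(groups)
```

With a non-zero overlap, poses at group boundaries belong to two groups, which is consistent with the requirement that each pose belongs to at least one group.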
• We introduced a novel pose estimation method using the
Cluster-Classification Bayesian Network (CCBN).
• Clustering similar poses into the same group makes the trained
classifier more discriminative among those similar poses.
• Experimental results show that the CCBN has superior
performance compared to the NN, LDA, and SVM classifiers.
• CCBN with HOG as the feature descriptor achieves the
highest performance.
Goal is to compute $\max_{c} P(C = c \mid F)$, which represents the pose with the highest probability.
Definition of CCBN nodes:
• $C$ : Class node
  • Holds probability distribution over all poses
  • Discrete node
  • Size equal to number of poses
• $F$ : Feature node
  • Corresponds to feature vector representing the data
  • Discrete or continuous based on data
  • Size equal to dimensionality of data
• $B_1, \ldots, B_m$ : Group nodes
  • $B_i$ determines membership probability of data to group $i$ vs. all other groups.
  • Each group is represented by a node $B$ in the middle layer of the corresponding CCBN.
Joint probability distribution with class node $C$, feature node $F$, and group nodes $B_1, B_2, \ldots, B_m$ is defined as:
[equation not recoverable from the transcript]
Probability of a given data sample $f$ being from class $c_k$ is formulated as:
[equation not recoverable from the transcript]
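As a reminder of how a posterior over poses follows from a joint distribution over these nodes, the generic identities below marginalize out the group nodes and normalize over classes. They hold for any joint distribution over $C$, $F$, and $B_1, \ldots, B_m$ and do not reproduce the poster's specific factorization. The index $j$, the number of poses $K$, and the symbol $\hat{c}$ are notation introduced here; for a continuous feature node, read $P(\cdot, F = f)$ as a density.

```latex
\begin{align}
P(C = c_k, F = f)
  &= \sum_{b_1, \ldots, b_m} P(C = c_k, F = f, B_1 = b_1, \ldots, B_m = b_m) \\
P(C = c_k \mid F = f)
  &= \frac{P(C = c_k, F = f)}{\sum_{j=1}^{K} P(C = c_j, F = f)} \\
\hat{c}
  &= \arg\max_{c_k} P(C = c_k \mid F = f)
\end{align}
```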
4. Conclusions
• CCBN has greater accuracy than the other three
classifiers on 9 of the 11 poses in the FEI database.
• The average accuracy of CCBN is 3.48% higher than that of
SVM, 5.81% higher than LDA, and 11.21% higher than NN.
• Images are resized to 32x32; each image is represented by a
240-dimensional HOG feature vector.
• 10-fold cross validation; 150 individuals are used for training
and 50 individuals for testing.
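The poster does not spell out how the folds are drawn. One possible reading, sketched below, is ten random splits of the 200 individuals into 150 for training and 50 for testing, with no individual appearing on both sides of a split. The function `subject_splits`, the seed, and the integer subject IDs are assumptions made for illustration only.

```python
import random


def subject_splits(subject_ids, n_train, n_rounds, seed=0):
    """Yield (train_ids, test_ids) pairs with no individual in both sets.

    Assumed protocol: independent random splits, one per round.
    """
    rng = random.Random(seed)
    ids = list(subject_ids)
    for _ in range(n_rounds):
        shuffled = ids[:]
        rng.shuffle(shuffled)
        yield shuffled[:n_train], shuffled[n_train:]


subjects = range(200)  # hypothetical IDs for the 200 individuals
splits = list(subject_splits(subjects, n_train=150, n_rounds=10))

for train_ids, test_ids in splits:
    assert len(train_ids) == 150 and len(test_ids) == 50
    # All images of one individual stay on the same side of the split.
    assert set(train_ids).isdisjoint(test_ids)
```

Splitting by individual rather than by image keeps every pose of a test subject out of the training set, so reported accuracy reflects unseen faces.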
ROC plot for performance on FEI