PCA and Face Recognition
Dinghuang Ji
What is PCA?
Color, age, hair, eyes, mouth, nose, with/without glasses, with/without earrings, …
1. What features are the most important (but not semantic) to identify different groups of people?
2. Can we combine these features to reduce this list?
Toy example
• 1. Generate some data samples
x = 1:100;                        % independent variable
y = 20 + 3*x + 60*randn(1,100);   % linear trend plus Gaussian noise
scatter(x, y, 25, 'b', '*')
Example courtesy of Marc
Toy example
• 2. Find a line fit y = f(x)
P = polyfit(x, y, 1);             % least-squares fit of y as a function of x
yfit = P(1)*x + P(2);
hold on; plot(x, yfit, 'r-.');
Toy example
• 3. Find a line fit x = g(y)
P1 = polyfit(y, x, 1);            % least-squares fit of x as a function of y
xfit = P1(1)*y + P1(2);
plot(xfit, y, 'b-.');
Toy example
• 4. Find a line fit with the first principal component
x_u = x - mean(x);  y_u = y - mean(y);    % center the data
cov_xy = cov(x_u, y_u);                   % 2x2 covariance matrix
[eigenVec, eigenVal] = eig(cov_xy);       % eigenvalues returned in ascending order
slope = eigenVec(2,2) / eigenVec(1,2);    % slope of the leading eigenvector (last column)
plot(x, slope*x_u + mean(y), 'g-.');
Principal Component Analysis
• Principal component analysis (PCA) is a technique that is useful for the compression and classification of data. The purpose is to reduce the dimensionality of a data set (sample) by finding a new set of variables, smaller than the original set of variables, that nonetheless retains most of the sample's information.
• By information we mean the variation present in the sample, given by the correlations between the original variables. The new variables, called principal components (PCs), are uncorrelated, and are ordered by the fraction of the total information each retains.
Slides courtesy of Frank Masci
Principal Component Analysis
[Figure: first principal component of the data]
Slides courtesy of Deng Cai
Principal Component Analysis
• Multi-dimension
• The 1st PC z1 is a minimum distance fit to a line in X space
• The 2nd PC z2 is a minimum distance fit to a line in the plane perpendicular to the 1st PC, and has the largest variance among directions perpendicular to the 1st PC.
• The PCs are a series of linear least-squares fits, each orthogonal to all previous PCs and having the largest variance among the remaining perpendicular directions.
Principal Component Analysis
• Main steps for computing PCA:
• Form the covariance matrix S.
• Compute its eigenvectors a_1, …, a_p.
• Use the first d eigenvectors a_1, …, a_d to form the d PCs.
• The transformation A is given by A = [a_1, ⋯, a_d].
• Dimension reduction: X ∈ R^(p×n) → A^T X ∈ R^(d×n)
• Reconstruction of the original data: A^T X ∈ R^(d×n) → X̂ = A(A^T X) ∈ R^(p×n)
Slides courtesy of Deng Cai
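A minimal MATLAB sketch of these steps, assuming the data matrix X is p × n with one sample per column and d is the number of PCs to keep (variable names are illustrative):
d  = 10;                              % number of PCs to keep (assumed)
Xc = X - mean(X, 2);                  % center each dimension
S  = (Xc * Xc') / (size(X, 2) - 1);   % p x p covariance matrix
[V, D] = eig(S);                      % eigenvectors / eigenvalues of S
[~, order] = sort(diag(D), 'descend');
A  = V(:, order(1:d));                % first d principal directions (p x d)
Y  = A' * Xc;                         % reduced data, d x n
Xhat = A * Y + mean(X, 2);            % reconstruction back in R^p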
Face recognition based on PCA models
• Face Recognition using Eigenfaces
• Facial Recognition Using Active Shape Models, Local Patches and Support Vector Machines
• Face Recognition Based on Fitting a 3D Morphable Model
EigenFace
EigenFace
• The test image x is projected into the face space to obtain a vector p:
p = A^T (x − m)
• The distance of p to each face class is defined by
ε_k² = ‖p − p_k‖²,  k = 1, …, m
• A distance threshold θ_c is half the largest distance between any two face images:
θ_c = ½ max_{j,k} ‖p_j − p_k‖,  j, k = 1, …, m
Slides courtesy of Peter N. Belhumeur
EigenFace
• Find the distance ε between the original image x and its reconstruction from the eigenface space, x_f:
ε² = ‖x − x_f‖²,  where x_f = A p + m
• Recognition process:
• IF ε ≥ θ_c, then the input image is not a face image;
• IF ε < θ_c AND ε_k ≥ θ_c for all k, then the input image contains an unknown face;
• IF ε < θ_c AND ε_k* = min_k{ε_k} < θ_c, then the input image contains the face of individual k*.
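A minimal MATLAB sketch of this projection and distance test, assuming a matrix `faces` (p × m) of vectorized training faces, the mean face `m_face`, the eigenface basis `A` (p × d) from the PCA step above, and a test image vector `x`; here the class prototypes p_k are simply the projections of the training images:
P  = A' * (faces - m_face);             % projections p_k of the m training faces (d x m)
p  = A' * (x - m_face);                 % projection of the test image
xf = A * p + m_face;                    % reconstruction of x from face space
eps_face = norm(x - xf);                % distance to face space
eps_k    = sqrt(sum((P - p).^2, 1));    % distances to each face class
G = sum(P.^2, 1);
theta_c = 0.5 * sqrt(max(max(G' + G - 2*(P'*P))));   % half the largest pairwise distance
[eps_min, kstar] = min(eps_k);
if eps_face >= theta_c
    disp('not a face image');
elseif eps_min >= theta_c
    disp('unknown face');
else
    fprintf('face of individual %d\n', kstar);
end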
Eigenface
• Limitations
• Variations in lighting conditions
- Different lighting conditions for enrolment and query.
- Bright light causing image saturation.
• Differences in pose (head orientation)
- 2D feature distances appear to distort.
• Expression
- Change in feature location and shape.
Active shape model
Active shape model
• Proposed by Cootes et al., based on the point distribution model
• For facial images, each shape is represented by a vector of landmark coordinates x = (x_1, y_1, …, x_n, y_n)^T
• Landmarks are manually labelled and aligned with the Procrustes algorithm
• PCA on the aligned shapes gives the model x ≈ x̄ + P b
• b is called the shape parameter and is used to change the facial shape
• Procrustes algorithm
• Finds a rigid transformation between two shapes
• Can be computed by least squares
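A minimal sketch of this least-squares alignment in MATLAB, assuming two n × 2 landmark matrices X and Y in correspondence; this version solves for rotation and translation only, using the standard SVD solution:
% Align Y to X with a rigid transform (rotation R, translation t)
muX = mean(X, 1);  muY = mean(Y, 1);            % centroids
Xc  = X - muX;     Yc  = Y - muY;               % centered shapes
[U, ~, V] = svd(Xc' * Yc);                      % SVD of the 2x2 cross-covariance
R = U * diag([1, det(U*V')]) * V';              % rotation with det = +1 (no reflection)
t = muX' - R * muY';                            % translation
Yaligned = (R * Y' + t)';                       % Y mapped onto X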
Active shape model
Boundary finding with Mahalanobis distance
Shape model
Profile pixel modeling
Active shape model
• Algorithm steps (see the sketch below):
1. Fit the mean model
2. Find accurate landmark positions
3. Optimize to get a better fit
4. Repeat until convergence
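A minimal sketch of this fit-and-constrain loop in MATLAB, assuming the PCA shape model above (mean shape xbar, basis P, eigenvalues lambda) and a hypothetical helper find_best_landmarks that searches along image profiles for better landmark positions; the pose (similarity-transform) update is omitted:
maxIter = 20;                               % assumed iteration budget
x = xbar;                                   % start from the mean shape (2n x 1 vector)
for iter = 1:maxIter
    x_target = find_best_landmarks(img, x); % hypothetical profile search around current landmarks
    b = P' * (x_target - xbar);             % project the suggested shape into the model
    lim = 3 * sqrt(lambda);                 % +/- 3 standard deviations per mode
    b = max(min(b, lim), -lim);             % constrain the shape parameters
    x = xbar + P * b;                       % regenerate a plausible shape
end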
Face recognition with ASM, LP and SVM
• Obtain a set of landmark correspondences.
• Compute local patch features around the landmarks
- 348-dim Gabor wavelet
- Maybe LBP, geometric blur, etc.
• Train a one-versus-all SVM model
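A minimal one-versus-all training loop in MATLAB, assuming a feature matrix `feats` (one row per face), a label vector `labels`, a test feature row `testFeat`, and the Statistics and Machine Learning Toolbox for fitcsvm; this is only an illustrative sketch, not the paper's setup:
classes = unique(labels);
models = cell(numel(classes), 1);
for c = 1:numel(classes)
    y = double(labels == classes(c));        % 1 for this person, 0 for everyone else
    models{c} = fitcsvm(feats, y, 'KernelFunction', 'linear');
end
% At test time, pick the class whose SVM gives the largest decision score
scores = zeros(1, numel(classes));
for c = 1:numel(classes)
    [~, s] = predict(models{c}, testFeat);
    scores(c) = s(:, 2);                     % score for the positive class
end
[~, best] = max(scores);
predicted = classes(best);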
Experiments
• Do PCA on features
Face recognition with ASM, LP and SVM
• Pro: more robust to in-plane rotation and illumination
• Con: can't handle profile-view faces and a wide range of illuminations
Face recognition with 3D Morphable Model
1. Manually label 7 landmarks on the test image
2. Fit the 3D model to the 2D landmarks
3. Project the 3D model to the 2D image and iteratively optimize the model coefficients
4. Minimize the difference between the rendered model and the input image
How Do They Do It?
By exploiting the statistics of known faces.
The morphable model is built from 3D scans of 100 males and 100 females of different ages. The structure of newly generated faces is constrained to be in the range of that of known faces.
Slides courtesy of Volker Blanz
The Morphable 3D Face Model
The actual 3D structure of known faces is captured in the shape vector S = (x_1, y_1, z_1, x_2, …, y_n, z_n)^T, containing the (x, y, z) coordinates of the n vertices of a face, and the texture vector T = (R_1, G_1, B_1, R_2, …, G_n, B_n)^T, containing the color values at the corresponding vertices.
Slides courtesy of Volker Blanz
The Morphable 3D face model
Again, assuming that we have m such vector pairs in full correspondence, we can form new shapes S_model and new textures T_model as:
S_model = Σ_{i=1}^{m} a_i S_i,    T_model = Σ_{i=1}^{m} β_i T_i
The Morphable 3D Face Model
The eigenvalues s_i² of C_S represent the variance of the data set along the direction s_i, the corresponding eigenvector of C_S. So S_model can now be expressed as:
S_model = S_av + Σ_{i=1}^{m} α_i s_i
and the probability density fitted over our data set is a function of α = (α_1, α_2, …, α_m)^T:
p(α) ∝ exp( −½ Σ_{i=1}^{m} α_i² / s_i² )
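A minimal sketch of how new shapes could be sampled from this model in MATLAB, assuming a mean shape vector S_av, a matrix Seig whose columns are the eigenvectors s_i, and a vector sigma of the corresponding standard deviations s_i (all variable names are illustrative):
m = numel(sigma);
alpha   = sigma .* randn(m, 1);        % coefficients drawn from the Gaussian prior p(alpha)
S_model = S_av + Seig * alpha;         % new shape: mean plus weighted eigenvectors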
Optimization
• They employ a maximum a posteriori estimator, minimizing the image reconstruction error together with prior terms on the model coefficients
Experiments
• Can handle harsh illumination, non-frontal views, or glasses
Experiments
3D Morphable model
• Demo
• Facegen
Thank you
• Questions are welcome
Recognition using Compressed Sensing
Sparse signals
Slide credit: Duarte, Marco F., et al. "Single-pixel imaging via compressive sampling." Signal Processing Magazine, IEEE 25.2 (2008): 83-91.
Selection of features is immaterial as long as the feature space is sparse
Eigenfaces, Fisherfaces, Laplacianfaces
Occluded images
Patches of image as features
Slide credit: Wright, John, et al. "Robust face recognition via sparse representation." Pattern Analysis and Machine Intelligence, IEEE Transactions on 31.2 (2009): 210-227.
Sparse feature space and formulation of the recognition problem
Ideal solution (NP-hard):  x̂₀ = argmin_x ‖x‖₀  subject to  A x = y
Compressed sensing solution:  x̂₁ = argmin_x ‖x‖₁  subject to  ‖A x − y‖₂ ≤ ε
Slide credit: Wright, John, et al. "Robust face recognition via sparse representation." Pattern Analysis and Machine Intelligence, IEEE Transactions on 31.2 (2009): 210-227.
L-1 and L-0 minimization routines
• L-1 norm:
– Matching pursuits
– Basis pursuit
– Quadratic solvers
• L-0 norm:
– Smoothed L0 algorithm (SL0)
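As a minimal sketch of one such routine (not one of the specific solvers named above), here is a plain ISTA (iterative soft-thresholding) loop in MATLAB for the unconstrained l1 form min_x ½‖Ax − y‖₂² + λ‖x‖₁; A, y, and lambda are assumed inputs:
t = 1 / norm(A)^2;                            % step size from the largest singular value of A
x = zeros(size(A, 2), 1);
for k = 1:500
    g = x - t * (A' * (A * x - y));           % gradient step on the quadratic term
    x = sign(g) .* max(abs(g) - t*lambda, 0); % soft-thresholding (proximal step for the l1 term)
end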
Valid image vs invalid image
Slide credit: Wright, John, et al. "Robust face recognition via sparse representation." Pattern Analysis and Machine Intelligence, IEEE Transactions on 31.2 (2009): 210-227.
Results
Slide credit: Wright, John, et al. "Robust face recognition via sparse representation." Pattern Analysis and Machine Intelligence, IEEE Transactions on 31.2 (2009): 210-227.
Robust to noise and occlusion
Slide credit: Wright, John, et al. "Robust face recognition via sparse representation." Pattern Analysis and Machine Intelligence, IEEE Transactions on 31.2 (2009): 210-227.
Demo: Raw dataset
• MSI data for 1 user in 1 session: 460 nm, 630 nm, 700 nm, 850 nm, 940 nm, and white illumination
Demo: ROI extraction
Raw image → find angle and largest rectangle → crop out the largest rectangle
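A rough sketch of such an ROI extraction step in MATLAB, assuming a grayscale image `img` and the Image Processing Toolbox; the thresholding and rotation strategy is only illustrative, and the bounding box stands in for the largest rectangle:
bw    = imbinarize(img);                               % separate object from background
bw    = bwareafilt(bw, 1);                             % keep only the largest connected component
stats = regionprops(bw, 'Orientation');                % estimate the region's angle
rot   = imrotate(img, -stats.Orientation, 'bilinear'); % rotate so the region is axis-aligned
bwrot = bwareafilt(imbinarize(rot), 1);
box   = regionprops(bwrot, 'BoundingBox');             % axis-aligned rectangle around the region
roi   = imcrop(rot, box.BoundingBox);                  % crop the region of interest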
Demo: Features for recognition algorithm: image patches
Demo: Successful recognition heat maps
• Smaller number of users:
• Large number of users:
Note that the signal is sparse
Demo: Unsuccessful recognition heat maps
• Smaller number of users:
• Large number of users:
Note that the signal is NOT sparse.
Resources
• http://dsp.rice.edu/cs
Recognizing Actions in Movies
KTH Actions Dataset
Movie Dataset
Space-time Interest Points
• Describe a video segment instead of a single image
• Detected for multiple space-time scales
• Corners in space-time
Optical Flow
• Direction of movement of each pixel
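A minimal sketch of per-pixel flow estimation in MATLAB, assuming the Computer Vision Toolbox and two consecutive grayscale frames frame1 and frame2 (illustrative only; the paper computes its flow-based descriptors differently):
flowModel = opticalFlowLK('NoiseThreshold', 0.01);    % Lucas-Kanade flow estimator
estimateFlow(flowModel, frame1);                      % initialize with the first frame
flow = estimateFlow(flowModel, frame2);               % per-pixel flow for the second frame
imshow(frame2); hold on;
plot(flow, 'DecimationFactor', [10 10], 'ScaleFactor', 5);   % draw the flow vectors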
Space-time Features
• Normalized histograms are concatenated into descriptor vectors
• K-means clustering on training data features to form visual vocabulary
Video Sequence Classification
• Space-time pyramid
• Histogram of visual words occurrences over a space-time volume
• Histograms of subsequences of video are concatenated and normalized
• Non-linear SVM using a Gaussian kernel
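A toy end-to-end sketch of this bag-of-words pipeline in MATLAB, assuming a descriptor matrix `trainDesc` with one space-time feature per row, a cell array `videos` of per-video descriptor matrices, a label vector `labels`, and the Statistics and Machine Learning Toolbox; fitcecoc with a Gaussian-kernel SVM stands in for the paper's classifier:
K = 4000;                                            % vocabulary size (illustrative)
[~, vocab] = kmeans(trainDesc, K, 'MaxIter', 200);   % visual vocabulary from training features
H = zeros(numel(videos), K);
for v = 1:numel(videos)
    words   = knnsearch(vocab, videos{v});           % assign each descriptor to its nearest word
    h       = histcounts(words, 0.5:1:K+0.5);        % histogram of visual-word occurrences
    H(v, :) = h / sum(h);                            % normalize
end
model = fitcecoc(H, labels, 'Learners', templateSVM('KernelFunction', 'gaussian'));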
Results
Using Grammars for Action Recognition
Aniket Bera
Video analysis with CFGs
The “Inverse Hollywood problem”:
From video to scripts and storyboards via causal analysis.
Brand 1997
Action Recognition using Probabilistic Parsing. Bobick and Ivanov 1998
Recognizing Multitasked Activities from Video using Stochastic Context-Free Grammar. Moore and Essa 2001
CFG for human activities
enter detach leave enter detach attach touch touch detach attach leave
M. Brand. The "Inverse Hollywood Problem": From video to scripts and storyboards via causal analysis. AAAI 1997.
Parse tree
[Figure: parse tree for SCENE (open up a PC), expanding through ACTION (open PC) and ACTION (unscrew), IN/OUT, ADD/REMOVE/MOVE, and MOTION nodes down to the primitive string: enter detach leave enter detach attach touch touch detach attach leave]
• Deterministic low-level primitive detection
• Deterministic parsing
M. Brand. The "Inverse Hollywood Problem": From video to scripts and storyboards via causal analysis. AAAI 1997.
Stochastic CFGs
Action Recognition using Probabilistic Parsing. Bobick and Ivanov 1998
Gesture analysis with CFGs
Primitive recognition with HMMs
Action Recognition using Probabilistic Parsing. Bobick and Ivanov 1998
HMM primitives: left-right, up-down, right-left, down-up
Action Recognition using Probabilistic Parsing. Bobick and Ivanov 1998
Parse Tree
[Figure: parse tree with root S expanding through RH, TOP/BOT, UD/DU, and LR/RL non-terminals down to the primitives left-right, up-down, right-left, down-up]
Errors
Likelihood values over time (not discrete symbols)
[Figure: likelihood traces for HMM a and HMM b]
Errors are inevitable, but the grammar acts as a top-down constraint
Action Recognition using Probabilistic Parsing. Bobick and Ivanov 1998
Dealing with uncertainty & errors
Stolcke-Earley (probabilistic) parser
SKIP rules to deal with insertion errors
[Figure: likelihood traces for HMM a, HMM b and HMM c]
Action Recognition using Probabilistic Parsing. Bobick and Ivanov 1998
SCFG for Blackjack
Recognizing Multitasked Activities from Video using Stochastic Context-Free Grammar. Moore and Essa 2001
• Deals with more complex activities
• Deals with more error types
Stochastic Grammars: Overview
• Representation: stochastic grammar
- Terminals: object interactions
- Context-sensitive due to internal scene models
• Domain: Towers of Hanoi
- Requires activities with strong temporal constraints
• Contributions
- Showed recognition & decomposition with very weak appearance models
- Demonstrated usefulness of feedback from high- to low-level reasoning components
Expectation Grammars (CVPR 2003)
• Analyze video of a person physically solving the Towers of Hanoi task
• Recognize valid activity
• Identify each move
• Segment objects
• Detect distracters / noise
System Overview
ToH: Low-Level Vision
[Pipeline: raw video → background model → foreground components (foreground and shadow detection)]
Low-Level Features
• Explanation-based symbols
• Blob interaction events
- merge, split, enter, exit, tracked, noise
- Future work: hidden, revealed, blob-part, coalesce
• All possible explanations generated
• Inconsistent explanations heuristically pruned
[Figure: example blob interaction events — enter, merge]
Contributions
• Showed activity recognition and decomposition without appearance models
• Demonstrated usefulness of feedback from high-level, long-term interpretations to low-level, short-term decisions