Principal Components Analysis on Images and Face Recognition


Page 1: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Principal Components Analysis on Images and Face Recognition

Most Slides by S. Narasimhan

Page 2: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Data Presentation

• Example: 53 Blood and urine measurements (wet chemistry) from 65 people (33 alcoholics, 32 non-alcoholics).

• Matrix Format

• Spectral Format

        H-WBC   H-RBC   H-Hgb    H-Hct    H-MCV     H-MCH   H-MCHC
A1      8.0000  4.8200  14.1000  41.0000   85.0000  29.0000 34.0000
A2      7.3000  5.0200  14.7000  43.0000   86.0000  29.0000 34.0000
A3      4.3000  4.4800  14.1000  41.0000   91.0000  32.0000 35.0000
A4      7.5000  4.4700  14.9000  45.0000  101.0000  33.0000 33.0000
A5      7.3000  5.5200  15.4000  46.0000   84.0000  28.0000 33.0000
A6      6.9000  4.8600  16.0000  47.0000   97.0000  33.0000 34.0000
A7      7.8000  4.6800  14.7000  43.0000   92.0000  31.0000 34.0000
A8      8.6000  4.8200  15.8000  42.0000   88.0000  33.0000 37.0000
A9      5.1000  4.7100  14.0000  43.0000   92.0000  30.0000 32.0000

[Spectral format plot: Measurement Value vs. Measurement index]

Page 3: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

[Univariate plot: H-Bands vs. Person]

[Bivariate plot: C-LDH vs. C-Triglycerides]

[Trivariate plot: M-EPI vs. C-Triglycerides and C-LDH]

Data Presentation

Page 4: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

• Better presentation than ordinate axes?
• Do we need a 53-dimensional space to view the data?
• How to find the ‘best’ low-dimensional space that conveys maximum useful information?
• One answer: Find “Principal Components”

Data Presentation

Page 5: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Principal Components

• All principal components (PCs) start at the origin of the ordinate axes.

• First PC is direction of maximum variance from origin

• Subsequent PCs are orthogonal to 1st PC and describe maximum residual variance

[Scatter plots of Wavelength 2 vs. Wavelength 1; the right-hand plot shows PC 1 along the direction of maximum variance and PC 2 orthogonal to it]

Page 6: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

The Goal

We wish to explain/summarize the underlying variance-covariance structure of a large set of variables through a few linear combinations of these variables.

Page 7: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Applications

• Uses:
– Data Visualization
– Data Reduction
– Data Classification
– Trend Analysis
– Factor Analysis
– Noise Reduction

• Examples:
– How many unique “sub-sets” are in the sample?
– How are they similar / different?
– What are the underlying factors that influence the samples?
– Which time / temporal trends are (anti)correlated?
– Which measurements are needed to differentiate?
– How to best present what is “interesting”?
– Which “sub-set” does this new sample rightfully belong to?

Page 8: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

This is accomplished by rotating the axes.

Suppose we have a population measured on p random variables X1,…,Xp. Note that these random variables represent the p-axes of the Cartesian coordinate system in which the population resides. Our goal is to develop a new set of p axes (linear combinations of the original p axes) in the directions of greatest variability:

[Scatter plot in the original X1, X2 coordinates with the rotated axes overlaid]

Trick: Rotate Coordinate Axes

Page 9: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Algebraic Interpretation

• Given m points in an n-dimensional space, for large n, how does one project onto a low-dimensional space while preserving broad trends in the data and allowing it to be visualized?

Page 10: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Algebraic Interpretation – 1D

• Given m points in an n-dimensional space, for large n, how does one project onto a 1-dimensional space?

• Choose a line that fits the data so the points are spread out well along the line

Page 11: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

• Formally, minimize sum of squares of distances to the line.

• Why sum of squares? Because it allows fast minimization, assuming the line passes through 0

Algebraic Interpretation – 1D

Page 12: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

• Minimizing sum of squares of distances to the line is the same as maximizing the sum of squares of the projections on that line, thanks to Pythagoras.

Algebraic Interpretation – 1D
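A one-line justification of the equivalence, written out here as a sketch (u is a unit vector along a candidate line through the origin, and the x_i are the data points):

```latex
% Pythagoras splits each point's squared length into the squared projection
% onto the line (direction u, ||u|| = 1) and the squared distance d_i to it:
%   ||x_i||^2 = (x_i^\top u)^2 + d_i^2
\sum_i d_i^2 \;=\; \underbrace{\sum_i \|x_i\|^2}_{\text{fixed by the data}} \;-\; \sum_i (x_i^\top u)^2
```

Since the first term does not depend on u, minimizing the total squared distance is the same as maximizing the total squared projection.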

Page 13: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

From k original variables: x1,x2,...,xk:

Produce k new variables: y1,y2,...,yk:

y1 = a11x1 + a12x2 + ... + a1kxk

y2 = a21x1 + a22x2 + ... + a2kxk

...

yk = ak1x1 + ak2x2 + ... + akkxk

such that:

yk's are uncorrelated (orthogonal)
y1 explains as much as possible of the original variance in the data set
y2 explains as much as possible of the remaining variance
etc.

PCA: General
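As an illustration of how the coefficients a_ij and the new variables y_i can be computed in practice (a sketch, not part of the original slides; it assumes NumPy and a data matrix X with one observation per row):

```python
import numpy as np

def pca_components(X):
    """Return eigenvalues and coefficient vectors of the sample covariance matrix.

    X : (n_samples, k) array; columns are the original variables x1..xk.
    The i-th column of the returned eigenvector matrix holds {a_i1, ..., a_ik}.
    """
    Xc = X - X.mean(axis=0)               # center each variable
    C = np.cov(Xc, rowvar=False)          # k x k covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)  # eigh: for symmetric matrices
    order = np.argsort(eigvals)[::-1]     # largest variance first
    return eigvals[order], eigvecs[:, order]

# The new, uncorrelated variables y1..yk for every observation:
#   Y = Xc @ eigvecs
```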

Page 14: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

[Scatter plot of the data with the 1st Principal Component, y1, and the 2nd Principal Component, y2, drawn through it]

Page 15: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

PCA Scores

[Scatter plot showing an observation (xi1, xi2) and its scores yi,1 and yi,2 along the two principal components]

Page 16: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

PCA Eigenvalues

[Scatter plot with the eigenvalues λ1 and λ2 marking the variance along each principal component]

Page 17: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

From k original variables: x1,x2,...,xk:

Produce k new variables: y1,y2,...,yk:

y1 = a11x1 + a12x2 + ... + a1kxk

y2 = a21x1 + a22x2 + ... + a2kxk

...

yk = ak1x1 + ak2x2 + ... + akkxk

such that:

yk's are uncorrelated (orthogonal)
y1 explains as much as possible of the original variance in the data set
y2 explains as much as possible of the remaining variance
etc.

yk's are Principal Components

PCA: Another Explanation

Page 18: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Principal Components Analysis on:

• Covariance Matrix:
– Variables must be in same units
– Emphasizes variables with most variance
– Mean eigenvalue ≠ 1.0

• Correlation Matrix:
– Variables are standardized (mean 0.0, SD 1.0)
– Variables can be in different units
– All variables have same impact on analysis
– Mean eigenvalue = 1.0
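The practical difference is simply whether the variables are standardized before the eigen-decomposition. A small sketch (illustrative; assumes NumPy and a data matrix X with observations in rows):

```python
import numpy as np

X = np.random.default_rng(0).normal(size=(65, 53))  # e.g. 65 people x 53 measurements

cov = np.cov(X, rowvar=False)        # covariance matrix: keeps original units
corr = np.corrcoef(X, rowvar=False)  # correlation matrix: variables standardized

# The trace of a k x k correlation matrix is k, so its mean eigenvalue is 1.0;
# the covariance matrix has no such normalization.
print(np.linalg.eigvalsh(corr).mean())  # ~1.0
print(np.linalg.eigvalsh(cov).mean())   # generally != 1.0
```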

Page 19: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

{a11,a12,...,a1k} is 1st Eigenvector of correlation/covariance matrix, and coefficients of first principal component

{a21,a22,...,a2k} is 2nd Eigenvector of correlation/covariance matrix, and coefficients of 2nd principal component

{ak1,ak2,...,akk} is kth Eigenvector of correlation/covariance matrix, and coefficients of kth principal component

PCA: General

Page 20: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Dimensionality Reduction

• Dimensionality reduction
– We can represent the orange points with only their v1 coordinates
• since v2 coordinates are all essentially 0
– This makes it much cheaper to store and compare points
– A bigger deal for higher-dimensional problems

Page 21: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

A 2D Numerical Example

Page 22: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

PCA Example – STEP 1

• Subtract the mean from each of the data dimensions. All the x values have the mean of x subtracted, and all the y values have the mean of y subtracted from them. This produces a data set whose mean is zero.

Subtracting the mean makes the variance and covariance calculations easier by simplifying their equations. The variance and covariance values are not affected by the mean value.

Page 23: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

PCA Example – STEP 1

http://kybele.psych.cornell.edu/~edelman/Psych-465-Spring-2003/PCA-tutorial.pdf

Page 24: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

PCA Example – STEP 1

DATA:
x    y
2.5  2.4
0.5  0.7
2.2  2.9
1.9  2.2
3.1  3.0
2.3  2.7
2.0  1.6
1.0  1.1
1.5  1.6
1.1  0.9

ZERO MEAN DATA:
x      y
 .69    .49
-1.31  -1.21
 .39    .99
 .09    .29
1.29   1.09
 .49    .79
 .19   -.31
-.81   -.81
-.31   -.31
-.71  -1.01

http://kybele.psych.cornell.edu/~edelman/Psych-465-Spring-2003/PCA-tutorial.pdf

Page 25: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

PCA Example –STEP 2

• Calculate the covariance matrix

cov = .616555556 .615444444

.615444444 .716555556

• Since the non-diagonal elements in this covariance matrix are positive, we should expect that both the x and y variables increase together.
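A short sketch reproducing Steps 1 and 2 in NumPy (added for illustration; the numbers match the example data set above):

```python
import numpy as np

data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
                 [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]])

zero_mean = data - data.mean(axis=0)   # Step 1: subtract the mean of each column
cov = np.cov(zero_mean, rowvar=False)  # Step 2: 2 x 2 covariance matrix
print(cov)
# [[0.61655556 0.61544444]
#  [0.61544444 0.71655556]]
```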

Page 26: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

PCA Example –STEP 3

• Calculate the eigenvectors and eigenvalues of the covariance matrix

eigenvalues = 0.049083399
              1.28402771

eigenvectors = -.735178656  -.677873399
                .677873399  -.735178656
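Continuing the sketch, the eigen-decomposition of that covariance matrix can be obtained directly (note that NumPy may return the eigenvectors with flipped signs, which does not change the directions they describe):

```python
import numpy as np

cov = np.array([[0.616555556, 0.615444444],
                [0.615444444, 0.716555556]])

eigenvalues, eigenvectors = np.linalg.eigh(cov)  # eigh: for symmetric matrices
print(eigenvalues)   # [0.04908340 1.28402771], smallest first
print(eigenvectors)  # columns are the eigenvectors (up to sign)
```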

Page 27: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

PCA Example –STEP 3

http://kybele.psych.cornell.edu/~edelman/Psych-465-Spring-2003/PCA-tutorial.pdf

• Eigenvectors are plotted as diagonal dotted lines on the plot.
• Note they are perpendicular to each other.
• Note one of the eigenvectors goes through the middle of the points, like drawing a line of best fit.
• The second eigenvector gives us the other, less important, pattern in the data: that all the points follow the main line, but are off to the side of the main line by some amount.

Page 28: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

PCA Example –STEP 4

• Reduce dimensionality and form the feature vector: the eigenvector with the highest eigenvalue is the principal component of the data set.

In our example, the eigenvector with the largest eigenvalue was the one that pointed down the middle of the data.

Once eigenvectors are found from the covariance matrix, the next step is to order them by eigenvalue, highest to lowest. This gives you the components in order of significance.

Page 29: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

PCA Example –STEP 4

Now, if you like, you can decide to ignore the components of lesser significance.

You do lose some information, but if the eigenvalues are small, you don’t lose much

• n dimensions in your data
• calculate n eigenvectors and eigenvalues
• choose only the first p eigenvectors
• final data set has only p dimensions

Page 30: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

PCA Example –STEP 4

• Feature Vector

FeatureVector = (eig1 eig2 eig3 … eign)

We can either form a feature vector with both of the eigenvectors:

-.677873399  -.735178656
-.735178656   .677873399

or, we can choose to leave out the smaller, less significant component and only have a single column:

-.677873399
-.735178656

Page 31: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

PCA Example –STEP 5

• Deriving the new data:

FinalData = RowFeatureVector x RowZeroMeanData

RowFeatureVector is the matrix with the eigenvectors in the columns transposed so that the eigenvectors are now in the rows, with the most significant eigenvector at the top.

RowZeroMeanData is the mean-adjusted data transposed, i.e. the data items are in each column, with each row holding a separate dimension.
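In code this step is a single matrix multiplication. A sketch using the numbers from the example (illustrative; signs follow the eigenvectors listed earlier):

```python
import numpy as np

data = np.array([[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
                 [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]])
row_zero_mean_data = (data - data.mean(axis=0)).T      # data items in columns

row_feature_vector = np.array([[-0.677873399, -0.735178656],   # 1st eigenvector (row)
                               [-0.735178656,  0.677873399]])  # 2nd eigenvector (row)

final_data = row_feature_vector @ row_zero_mean_data
# final_data[0]: coordinates along the 1st principal component (-0.82797, 1.77758, ...)
# final_data[1]: coordinates along the 2nd principal component
```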

Page 32: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

PCA Example –STEP 5

FinalData transpose (dimensions along columns):

x              y
-.827970186    -.175115307
1.77758033      .142857227
-.992197494     .384374989
-.274210416     .130417207
-1.67580142    -.209498461
-.912949103     .175282444
.0991094375    -.349824698
1.14457216      .0464172582
.438046137      .0177646297
1.22382056     -.162675287

Page 33: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

PCA Example –STEP 5

http://kybele.psych.cornell.edu/~edelman/Psych-465-Spring-2003/PCA-tutorial.pdf

Page 34: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Reconstruction of original Data

• If we reduced the dimensionality, obviously, when reconstructing the data we would lose those dimensions we chose to discard. In our example let us assume that we considered only the x dimension…
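A sketch of that reconstruction (illustrative; it keeps only the first principal component, so the variation along the second one is lost):

```python
import numpy as np

mean = np.array([1.81, 1.91])                    # column means of the original data
pc1 = np.array([[-0.677873399, -0.735178656]])   # the single kept eigenvector (1 x 2)

x_only = np.array([[-0.827970186, 1.77758033, -0.992197494, -0.274210416,
                    -1.67580142, -0.912949103, 0.0991094375, 1.14457216,
                    0.438046137, 1.22382056]])   # FinalData reduced to 1 dimension (1 x 10)

# Undo the projection and add the mean back in; the result is only approximate
# because the discarded dimension cannot be recovered.
reconstructed = (pc1.T @ x_only).T + mean        # 10 x 2 approximation of the original data
```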

Page 35: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Reconstruction of original Data

x

-.827970186 1.77758033 -.992197494 -.274210416 -1.67580142 -.912949103 .0991094375 1.14457216 .438046137 1.22382056

http://kybele.psych.cornell.edu/~edelman/Psych-465-Spring-2003/PCA-tutorial.pdf

Page 36: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Appearance-based Recognition

• Directly represent appearance (image brightness), not geometry.

• Why?

Avoids modeling geometry, complex interactions between geometry, lighting and reflectance.

• Why not?

Too many possible appearances!

m “visual degrees of freedom” (e.g., pose, lighting, etc.)
R discrete samples for each DOF

How to discretely sample the DOFs?

How to PREDICT/SYNTHESIZE/MATCH with novel views?

Page 37: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Appearance-based Recognition

• Example:

• Visual DOFs: Object type P, Lighting Direction L, Pose R

• Set of R * P * L possible images: { x̂_RPL }

• Image as a point in high-dimensional space: x is an image of N pixels, and a point in N-dimensional space

[Plot: an image x as a point in a space whose axes are "Pixel 1 gray value" and "Pixel 2 gray value"]

Page 38: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

The Space of Faces

• An image is a point in a high-dimensional space
– An N x M image is a point in R^(NM)
– We can define vectors in this space as we did in the 2D case

[Figure: a face image expressed as a sum of image vectors]

[Thanks to Chuck Dyer, Steve Seitz, Nishino]

Page 39: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Key Idea

• Images in the possible set { x̂_RPL } are highly correlated.

• So, compress them to a low-dimensional subspace that captures key appearance characteristics of the visual DOFs.

• EIGENFACES: [Turk and Pentland]

USE PCA!

Page 40: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Eigenfaces

Eigenfaces look somewhat like generic faces.

Page 41: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Problem: Size of Covariance Matrix A

• Suppose each data point is N-dimensional (N pixels)

– The size of covariance matrix A is N x N
– The number of eigenfaces is N

– Example: For N = 256 x 256 pixels, the size of A will be 65536 x 65536! The number of eigenvectors will be 65536!

Typically, only 20-30 eigenvectors suffice. So, this method is very inefficient!

Page 42: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Eigenfaces – summary in words

• Eigenfaces are the eigenvectors of the covariance matrix of the probability distribution of the vector space of human faces

• Eigenfaces are the ‘standardized face ingredients’ derived from the statistical analysis of many pictures of human faces

• A human face may be considered to be a combination of these standardized faces

Page 43: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Generating Eigenfaces – in words

1. A large set of images of human faces is taken.
2. The images are normalized to line up the eyes, mouths and other features.
3. Any background pixels are painted the same color.
4. The eigenvectors of the covariance matrix of the face image vectors are then extracted.
5. These eigenvectors are called eigenfaces.

Page 44: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Eigenfaces for Face Recognition

• When properly weighted, eigenfaces can be summed together to create an approximate gray-scale rendering of a human face.

• Remarkably few eigenvector terms are needed to give a fair likeness of most people's faces.

• Hence eigenfaces provide a means of applying data compression to faces for identification purposes.
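The weighted sum described above is just a matrix-vector product. A minimal sketch (assumed names, not from the slides; `mean_face` is the average training face and the columns of `eigenfaces` are the eigenvectors, each with one entry per pixel):

```python
import numpy as np

def render_face(mean_face, eigenfaces, weights):
    """Approximate a face as the mean face plus a weighted sum of eigenfaces.

    mean_face  : (N,)    average face vector (N = number of pixels)
    eigenfaces : (N, K)  matrix whose columns are eigenfaces
    weights    : (K,)    per-face coefficients
    """
    return mean_face + eigenfaces @ weights
```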

Page 45: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Dimensionality Reduction

The set of faces is a “subspace” of the set of images

– Suppose it is K dimensional

– We can find the best subspace using PCA

– This is like fitting a “hyper-plane” to the set of faces

• spanned by vectors v1, v2, ..., vK

Any face:

Page 46: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Eigenfaces

• PCA extracts the eigenvectors of A
– Gives a set of vectors v1, v2, v3, ...
– Each one of these vectors is a direction in face space
• What do these look like?

Page 47: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Projecting onto the Eigenfaces

• The eigenfaces v1, ..., vK span the space of faces

– A face is converted to eigenface coordinates by
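The conversion amounts to subtracting the mean face and taking dot products with each eigenface. A sketch with assumed names (the slide's own formula is not reproduced here):

```python
import numpy as np

def to_eigenface_coords(face, mean_face, eigenfaces):
    """Project a flattened face image onto the eigenface basis.

    Returns the K coefficients w_k = v_k . (face - mean_face),
    where the columns of `eigenfaces` are v_1, ..., v_K.
    """
    return eigenfaces.T @ (face - mean_face)
```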

Page 48: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Recognition with Eigenfaces

• Algorithm
1. Process the image database (set of images with labels)
   • Run PCA to compute the eigenfaces
   • Calculate the K coefficients for each image
2. Given a new image (to be recognized) x, calculate its K coefficients
3. Detect if x is a face
4. If it is a face, who is it?
   • Find the closest labeled face in the database
   • nearest-neighbor in K-dimensional space
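Steps 2 and 4 of the algorithm can be sketched in a few lines (illustrative; `mean_face`, `eigenfaces`, and the labelled gallery coefficients are assumed to come from the offline PCA in step 1, and the face/non-face test of step 3 is not shown):

```python
import numpy as np

def recognize(x, mean_face, eigenfaces, gallery_coeffs, gallery_labels, threshold):
    """Nearest-neighbor face recognition in K-dimensional eigenface space."""
    w = eigenfaces.T @ (x - mean_face)                   # K coefficients of the new image
    dists = np.linalg.norm(gallery_coeffs - w, axis=1)   # distance to every labelled face
    nearest = int(np.argmin(dists))
    if dists[nearest] > threshold:                       # too far from any known face
        return None                                      # not recognized
    return gallery_labels[nearest]
```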

Page 49: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Key Property of Eigenspace Representation

Given

• Two images x̂1 and x̂2 that are used to construct the Eigenspace

• g1 is the eigenspace projection of image x̂1

• g2 is the eigenspace projection of image x̂2

Then,

||g2 - g1|| ≈ ||x̂2 - x̂1||

That is, distance in Eigenspace is approximately equal to the correlation between the two images.

Page 50: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Choosing the Dimension K

[Plot: eigenvalue magnitude vs. index i = 1 … NM, with a cutoff after the first K eigenvalues]

• How many eigenfaces to use?

• Look at the decay of the eigenvalues

– the eigenvalue tells you the amount of variance “in the direction” of that eigenface

– ignore eigenfaces with low variance
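One common way to turn "look at the decay of the eigenvalues" into a rule is a cumulative-variance cutoff (a sketch, not from the slides):

```python
import numpy as np

def choose_k(eigenvalues, keep=0.95):
    """Smallest K whose eigenfaces explain `keep` of the total variance."""
    eigenvalues = np.sort(eigenvalues)[::-1]                # largest first
    explained = np.cumsum(eigenvalues) / eigenvalues.sum()  # cumulative fraction
    return int(np.searchsorted(explained, keep) + 1)
```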

Page 51: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Sample Eigenfaces

Page 52: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

How many principal components are required to obtain human-recognizable reconstructions?

Page 53: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Totally Correct?

• Each new picture is generated by adding (this time) 8 new principal components.

Page 54: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Remove glasses and lighting changes from the samples

• Very fast convergence!

Page 55: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Can you recognize non-faces by projecting to orthogonal complement?

• Project onto the Principal Components

• Then regenerate the original picture

Page 56: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Papers

Page 57: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan
Page 58: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

More Problems: Outliers

Need to explicitly reject outliers before or during computing PCA.

Sample Outliers

Intra-sample outliers

[De la Torre and Black]

Page 59: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

PCA

RPCA

RPCA: Robust PCA, [De la Torre and Black]

Robustness to Intra-sample outliers

Page 60: Principal Components Analysis on Images and Face Recognition Most Slides by S. Narasimhan

Original    PCA    RPCA    Outliers

Robustness to Sample Outliers

Finding outliers = Tracking moving objects