Lecture: Face Recognition and Feature Reduction
CS131, Stanford University (cs131.stanford.edu/files/12_svd_pca.pdf)
Juan Carlos Niebles and Ranjay Krishna, Stanford Vision and Learning Lab
2-Nov-17
Recap: Curse of dimensionality
• Assume 5000 points uniformly distributed in the unit hypercube, and we want to apply 5-NN. Suppose our query point is at the origin.
– In 1 dimension, we must go a distance of 5/5000 = 0.001 on average to capture the 5 nearest neighbors.
– In 2 dimensions, we must go out to a square that contains 0.001 of the volume.
– In d dimensions, we must go out to (0.001)^(1/d) along each dimension.
What we will learn today
• Singular value decomposition
• Principal Component Analysis (PCA)
• Image compression
Singular Value Decomposition (SVD)
• There are several computer algorithms that can "factorize" a matrix, representing it as the product of some other matrices
• The most useful of these is the Singular Value Decomposition
• Represents any matrix A as a product of three matrices: UΣV^T
• Python command:
– U, S, Vt = numpy.linalg.svd(A)  (note: numpy returns V^T, not V, as the third value)
Singular Value Decomposition (SVD)
• UΣV^T = A
• Where U and V are rotation matrices, and Σ is a scaling matrix

Singular Value Decomposition (SVD)
• Beyond 2x2 matrices:
– In general, if A is m x n, then U will be m x m, Σ will be m x n, and V^T will be n x n
– (Note the dimensions work out to produce m x n after multiplication)
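As a quick check of these shapes in numpy (a minimal sketch; the random matrix A is an arbitrary stand-in, not an example from the slides):

    import numpy as np

    A = np.random.rand(5, 3)           # m x n with m=5, n=3
    U, S, Vt = np.linalg.svd(A)        # full SVD
    print(U.shape, S.shape, Vt.shape)  # (5, 5) (3,) (3, 3)

    # numpy returns the singular values as a vector; rebuild the m x n Sigma
    Sigma = np.zeros(A.shape)
    np.fill_diagonal(Sigma, S)
    print(np.allclose(A, U @ Sigma @ Vt))  # True: A = U Sigma V^T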
Singular Value Decomposition (SVD)
• U and V are always rotation matrices
– Geometric rotation may not be an applicable concept, depending on the matrix. So we call them "unitary" matrices – each column is a unit vector
• Σ is a diagonal matrix
– The number of nonzero entries = rank of A
– The algorithm always sorts the entries high to low
SVD Applications
• We've discussed SVD in terms of geometric transformation matrices
• But SVD of an image matrix can also be very useful
• To understand this, we'll look at a less geometric interpretation of what SVD is doing
SVD Applications
• Look at how the multiplication works out, left to right:
• Column 1 of U gets scaled by the first value from Σ
• The resulting vector gets scaled by row 1 of V^T to produce a contribution to the columns of A

SVD Applications
• Each product of (column i of U) · (value i from Σ) · (row i of V^T) produces a component of the final A
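In code, this sum of rank-1 components looks like the following (a minimal sketch with an arbitrary example matrix):

    import numpy as np

    A = np.random.rand(4, 4)
    U, S, Vt = np.linalg.svd(A)

    # Sum of components: (column i of U) * (value i from S) * (row i of Vt)
    A_rebuilt = sum(S[i] * np.outer(U[:, i], Vt[i, :]) for i in range(len(S)))
    print(np.allclose(A, A_rebuilt))  # True: all components rebuild A exactly

    # A partial reconstruction from only the first k components
    k = 2
    A_partial = sum(S[i] * np.outer(U[:, i], Vt[i, :]) for i in range(k))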
SVD Applications
• We're building A as a linear combination of the columns of U
• Using all columns of U, we'll rebuild the original matrix perfectly
• But, in real-world data, often we can just use the first few columns of U and we'll get something close (e.g. the A_partial above)
SVD Applications
• We can call those first few columns of U the Principal Components of the data
• They show the major patterns that can be added to produce the columns of the original matrix
• The rows of V^T show how the principal components are mixed to produce the columns of the matrix
SVD Applications
• We can look at Σ to see that the first column has a large effect, while the second column has a much smaller effect in this example
SVD Applications
• For this image, using only the first 10 of 300 principal components produces a recognizable reconstruction
• So, SVD can be used for image compression
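A minimal sketch of SVD-based image compression (the image here is a random stand-in, and k = 10 follows the slide's example):

    import numpy as np

    img = np.random.rand(300, 400)  # a grayscale image as a 2-D array

    U, S, Vt = np.linalg.svd(img, full_matrices=False)

    k = 10  # keep only the first k principal components
    img_compressed = U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]

    # Storage drops from 300*400 values to k*(300 + 400 + 1)
    print(img.size, k * (img.shape[0] + img.shape[1] + 1))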
SVD for symmetric matrices
• If A is a symmetric matrix, it can be decomposed as A = UΣU^T
• Compared to a traditional SVD decomposition, here U = V, and U is an orthogonal matrix
Principal Component Analysis
• Remember, columns of U are the Principal Components of the data: the major patterns that can be added to produce the columns of the original matrix
• One use of this is to construct a matrix where each column is a separate data sample
• Run SVD on that matrix, and look at the first few columns of U to see patterns that are common among the columns
• This is called Principal Component Analysis (or PCA) of the data samples
Principal Component Analysis
• Often, raw data samples have a lot of redundancy and patterns
• PCA can allow you to represent data samples as weights on the principal components, rather than using the original raw form of the data
• By representing each sample as just those weights, you can represent just the "meat" of what's different between samples
• This minimal representation makes machine learning and other algorithms much more efficient
How is SVD computed?
• For this class: tell Python to do it. Use the result.
• But, if you're interested, one computer algorithm to do it makes use of eigenvectors!
Eigenvector definition
• Suppose we have a square matrix A. We can solve for a vector x and scalar λ such that Ax = λx
• In other words, find vectors where, if we transform them with A, the only effect is to scale them with no change in direction
• These vectors are called eigenvectors (German for "self vector" of the matrix), and the scaling factors λ are called eigenvalues
• An m x m matrix will have ≤ m eigenvectors where λ is nonzero
Finding eigenvectors
• Computers can find an x such that Ax = λx using this iterative algorithm:
– x = random unit vector
– while (x hasn't converged):
• x = Ax
• normalize x
• x will quickly converge to an eigenvector
• Some simple modifications will let this algorithm find all eigenvectors
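A runnable version of this loop (a minimal sketch; the tolerance and iteration cap are illustrative choices):

    import numpy as np

    def power_iteration(A, tol=1e-10, max_iters=1000):
        """Find a dominant eigenvector of the square matrix A."""
        x = np.random.rand(A.shape[0])
        x /= np.linalg.norm(x)              # x = random unit vector
        for _ in range(max_iters):
            x_new = A @ x                   # x = Ax
            x_new /= np.linalg.norm(x_new)  # normalize x
            # converged? (compare against -x too, since the sign may flip)
            if min(np.linalg.norm(x_new - x), np.linalg.norm(x_new + x)) < tol:
                x = x_new
                break
            x = x_new
        return x @ A @ x, x                 # eigenvalue (Rayleigh quotient), eigenvector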
Finding SVD
• Eigenvectors are for square matrices, but SVD is for all matrices
• To do svd(A), computers can do this:
– Take eigenvectors of AA^T (this matrix is always square)
• These eigenvectors are the columns of U
• The square roots of the eigenvalues are the singular values (the entries of Σ)
– Take eigenvectors of A^T A (this matrix is always square)
• These eigenvectors are columns of V (or rows of V^T)
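To check this relationship numerically (a minimal sketch with an arbitrary example matrix):

    import numpy as np

    A = np.random.rand(5, 3)

    # Eigenvalues of A A^T are the squared singular values of A
    eigvals = np.linalg.eigvalsh(A @ A.T)            # symmetric, so eigvalsh
    roots = np.sort(np.sqrt(np.abs(eigvals)))[::-1]  # sorted high to low

    singular_values = np.linalg.svd(A, compute_uv=False)
    print(np.allclose(roots[:3], singular_values))   # True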
Finding SVD
• Moral of the story: SVD is fast, even for large matrices
• It's useful for a lot of stuff
• There are also other algorithms to compute the SVD or part of the SVD
– Python's np.linalg.svd() command has options to efficiently compute only what you need, if performance becomes an issue

A detailed geometric explanation of SVD is here: http://www.ams.org/samplings/feature-column/fcarc-svd
What we will learn today
• Introduction to face recognition
• Principal Component Analysis (PCA)
• Image compression
Covariance
• Variance and covariance are a measure of the "spread" of a set of points around their center of mass (mean)
• Variance – a measure of the deviation from the mean for points in one dimension, e.g. heights
• Covariance – a measure of how much each of the dimensions varies from the mean with respect to each other
• Covariance is measured between 2 dimensions to see if there is a relationship between the 2 dimensions, e.g. number of hours studied & marks obtained
• The covariance between one dimension and itself is the variance
Covariance
• So, if you had a 3-dimensional data set (x, y, z), then you could measure the covariance between the x and y dimensions, the y and z dimensions, and the x and z dimensions. Measuring the covariance between x and x, or y and y, or z and z would give you the variance of the x, y and z dimensions respectively.
Covariance matrix
• Representing covariance between dimensions as a matrix, e.g. for 3 dimensions:

    C = | cov(x,x)  cov(x,y)  cov(x,z) |
        | cov(y,x)  cov(y,y)  cov(y,z) |
        | cov(z,x)  cov(z,y)  cov(z,z) |

• Diagonal is the variances of x, y and z
• cov(x,y) = cov(y,x), hence the matrix is symmetrical about the diagonal
• N-dimensional data will result in an N x N covariance matrix
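Computing such a matrix with numpy (a minimal sketch; the data values are arbitrary):

    import numpy as np

    # Three dimensions (x, y, z): one row per dimension, one column per sample
    data = np.array([[2.1, 2.5, 3.6, 4.0],     # x
                     [8.0, 10.0, 12.0, 14.0],  # y
                     [1.0, 0.8, 0.6, 0.2]])    # z

    C = np.cov(data)            # 3 x 3 covariance matrix
    print(np.diag(C))           # var(x), var(y), var(z)
    print(np.allclose(C, C.T))  # True: symmetric about the diagonal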
Covariance
• What is the interpretation of covariance calculations?
– e.g.: 2-dimensional data set
– x: number of hours studied for a subject
– y: marks obtained in that subject
– covariance value is, say, 104.53
– what does this value mean?
Covariance interpretation
• The exact value is not as important as its sign.
• A positive value of covariance indicates both dimensions increase or decrease together, e.g. as the number of hours studied increases, the marks in that subject increase.
• A negative value indicates while one increases the other decreases, or vice-versa, e.g. active social life at PSU vs performance in CS dept.
• If covariance is zero: the two dimensions are uncorrelated, e.g. heights of students vs the marks obtained in a subject. (Zero covariance follows from independence, but does not by itself guarantee it.)
Example data
• Covariance between the two axes is high. Can we reduce the number of dimensions to just 1?
Geometric interpretation of PCA
• Let's say we have a set of 2D data points x. But we see that all the points lie on a line in 2D.
• So, 2 dimensions are redundant to express the data. We can express all the points with just one dimension.
[Figure: a 1D subspace in 2D]
PCA: Principal Component Analysis
• Given a set of points, how do we know if they can be compressed like in the previous example?
– The answer is to look into the correlation between the points
– The tool for doing this is called PCA
PCA Formulation
• Basic idea:
– If the data lives in a subspace, it is going to look very flat when viewed from the full space, e.g.
[Figure: a 1D subspace in 2D; a 2D subspace in 3D]
Slide inspired by N. Vasconcelos

PCA Formulation
• Assume x is Gaussian with covariance Σ.
• Recall that a Gaussian is defined by its mean and covariance:
    p(x) = (2π)^(-d/2) |Σ|^(-1/2) exp(-(1/2)(x - μ)^T Σ^(-1) (x - μ))
• Recall that μ and Σ of a Gaussian are defined as:
    μ = E[x],   Σ = E[(x - μ)(x - μ)^T]
[Figure: a 2D Gaussian over (x1, x2) with principal axes φ1, φ2 and principal lengths λ1, λ2]
PCA formulation
• The covariance matrix Σ is symmetric, so we can express it as:
– Σ = UΛU^T = UΛ^(1/2) (UΛ^(1/2))^T
PCA Formulation
• If x is Gaussian with covariance Σ,
– Principal components φi are the eigenvectors of Σ
– Principal lengths λi are the eigenvalues of Σ
• By computing the eigenvalues we know whether the data is
– Not flat if λ1 ≈ λ2
– Flat if λ1 >> λ2
PCA Algorithm (training)
[Slide figure: the PCA training algorithm]

PCA Algorithm (testing)
[Slide figure: the PCA testing algorithm]
Slide inspired by N. Vasconcelos
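The algorithm boxes on these two slides are figures; below is a minimal sketch of the standard PCA training and testing steps, consistent with the "PCA by SVD" summary later in this lecture (function names are illustrative):

    import numpy as np

    def pca_train(X, k):
        """X: d x n data matrix, one example per column; keep k components."""
        mu = X.mean(axis=1)                       # sample mean
        Xc = X - mu[:, None]                      # centered data
        Sigma = (Xc @ Xc.T) / X.shape[1]          # sample covariance (d x d)
        eigvals, eigvecs = np.linalg.eigh(Sigma)  # eigendecomposition
        order = np.argsort(eigvals)[::-1][:k]     # top-k eigenvalues
        return mu, eigvecs[:, order]              # mean and principal components

    def pca_test(x, mu, components):
        """Project a new d-dimensional sample onto the principal components."""
        return components.T @ (x - mu)            # k-dimensional weight vector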
PCA by SVD
• An alternative manner to compute the principal components, based on singular value decomposition
• Quick reminder: SVD
– Any real n x m matrix (n > m) can be decomposed as
    A = MΠN^T
– Where M is an (n x m) column orthonormal matrix of left singular vectors (columns of M)
– Π is an (m x m) diagonal matrix of singular values
– N^T is an (m x m) row orthonormal matrix of right singular vectors (columns of N)
Slide inspired by N. Vasconcelos
PCA by SVD
• To relate this to PCA, we consider the data matrix
    X = [x1  x2  ...  xn]   (one example per column)
• The sample mean is
    μ = (1/n) Σi xi
PCA by SVD
• Center the data by subtracting the mean from each column of X
• The centered data matrix is
    Xc = [x1 - μ  x2 - μ  ...  xn - μ]
PCA by SVD
• The sample covariance matrix is
    Σ = (1/n) Σi (xi - μ)(xi - μ)^T = (1/n) Σi xi_c (xi_c)^T
  where xi_c is the ith column of Xc
• This can be written as
    Σ = (1/n) Xc Xc^T
PCA by SVD
• The matrix Xc^T is real (n x d). Assuming n > d, it has the SVD decomposition
    Xc^T = MΠN^T
• and therefore
    Σ = (1/n) Xc Xc^T = (1/n) N Π M^T M Π N^T = N ((1/n) Π^2) N^T
PCA by SVD
• Note that N is (d x d) and orthonormal, and Π^2 is diagonal. This is just the eigenvalue decomposition of Σ
• It follows that
– The eigenvectors of Σ are the columns of N
– The eigenvalues of Σ are λi = (1/n) πi^2
• This gives an alternative algorithm for PCA
PCA by SVD
• In summary, computation of PCA by SVD
• Given X with one example per column
– Create the centered data matrix Xc
– Compute its SVD: Xc^T = MΠN^T
– Principal components are the columns of N; eigenvalues are λi = (1/n) πi^2
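This summary translates directly into a few lines of numpy (a minimal sketch; the variable names follow the slides, and the data is random):

    import numpy as np

    X = np.random.rand(5, 100)              # d=5 dims, n=100 examples as columns
    d, n = X.shape

    Xc = X - X.mean(axis=1, keepdims=True)  # centered data matrix
    M, Pi, Nt = np.linalg.svd(Xc.T, full_matrices=False)  # Xc^T = M Pi N^T

    principal_components = Nt.T             # columns of N
    eigenvalues = Pi**2 / n                 # lambda_i = (1/n) pi_i^2

    # Check against the eigendecomposition of the sample covariance
    Sigma = Xc @ Xc.T / n
    print(np.allclose(np.sort(np.linalg.eigvalsh(Sigma))[::-1], eigenvalues))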
Rule of thumb for finding the number of PCA components
• A natural measure is to pick the eigenvectors that explain p% of the data variability
– Can be done by plotting the ratio rk = (λ1 + ... + λk) / (λ1 + ... + λn) as a function of k
– E.g. we need 3 eigenvectors to cover 70% of the variability of this data set
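Computing this ratio for every k at once (a minimal sketch; the eigenvalues are example numbers):

    import numpy as np

    eigenvalues = np.array([4.0, 2.5, 1.5, 0.5, 0.3, 0.2])  # sorted high to low
    r = np.cumsum(eigenvalues) / np.sum(eigenvalues)        # r_k for k = 1..n

    p = 0.70
    k = int(np.searchsorted(r, p)) + 1  # smallest k with r_k >= p
    print(r, k)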
What we will learn today
• Introduction to face recognition
• Principal Component Analysis (PCA)
• Image compression
Original Image
• Divide the original 372x492 image into patches:
• Each patch is an instance that contains 12x12 pixels on a grid
• View each as a 144-D vector
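A minimal sketch of this patch construction (the image array is a random stand-in for the slide's 372x492 picture):

    import numpy as np

    img = np.random.rand(372, 492)  # grayscale image
    p = 12                          # 12x12 patches

    # Cut the image into non-overlapping patches on a grid
    patches = [img[r:r+p, c:c+p].ravel()      # each patch -> 144-D vector
               for r in range(0, img.shape[0], p)
               for c in range(0, img.shape[1], p)]
    X = np.array(patches).T         # 144 x n data matrix, one patch per column
    print(X.shape)                  # (144, 1271)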
L2 error and PCA dimension
[Plot: L2 reconstruction error as a function of the number of PCA components]
PCA compression: 144D → 60D
[Figure: the image reconstructed from 60 principal components]

PCA compression: 144D → 16D
[Figure: the image reconstructed from 16 principal components]
16 most important eigenvectors
[Figure: the 16 most important eigenvectors, each displayed as a 12x12 patch]
PCA compression: 144D → 6D
[Figure: the image reconstructed from 6 principal components]
6 most important eigenvectors
[Figure: the 6 most important eigenvectors, each displayed as a 12x12 patch]
PCA compression: 144D → 3D
[Figure: the image reconstructed from 3 principal components]
3 most important eigenvectors
[Figure: the 3 most important eigenvectors, each displayed as a 12x12 patch]
PCA compression: 144D → 1D
[Figure: the image reconstructed from 1 principal component]
What we have learned today
• Introduction to face recognition
• Principal Component Analysis (PCA)
• Image compression