Brief Introduction to PCA & SVD
2008.09.09
Byung-Hyun Ha
Contents
Principal component analysis (PCA)
Singular value decomposition (SVD)
Principal Component Analysis
Steps
• Get some data
• Subtract the mean
• Calculate the covariance matrix
• Calculate the eigenvectors and eigenvalues of the covariance matrix
• Choosing components and forming a feature vector
• Deriving the new data set
• Getting the old data back
Get some data
Subtract the mean
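The mean-subtraction step can be sketched in NumPy as follows. The data values are illustrative (hypothetical) and are not the data shown on the slides; the names `data`, `mean`, and `data_adjust` are likewise chosen only for this sketch, with one row per data point.

```python
import numpy as np

# Illustrative 2-D data set (hypothetical values), one row per data point.
data = np.array([
    [2.0, 1.0],
    [3.0, 4.0],
    [5.0, 4.0],
    [6.0, 7.0],
])

# Mean of each dimension (column), subtracted from every row.
mean = data.mean(axis=0)
data_adjust = data - mean

print("mean:", mean)
print("mean-adjusted data:\n", data_adjust)
```

After this step every column of `data_adjust` averages to zero, which is what the covariance and eigenvector steps assume.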
Calculate the covariance matrix
Calculate the eigenvectors and eigenvalues of the covariance matrix
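A minimal sketch of these two steps, assuming the same kind of illustrative (hypothetical) data with one row per data point. `np.cov` and `np.linalg.eigh` are real NumPy functions; the variable names are chosen only for this example.

```python
import numpy as np

# Illustrative data (hypothetical values), one row per data point.
data = np.array([[2.0, 1.0], [3.0, 4.0], [5.0, 4.0], [6.0, 7.0]])
data_adjust = data - data.mean(axis=0)

# Covariance matrix of the columns (np.cov divides by n - 1 by default).
cov = np.cov(data_adjust, rowvar=False)

# eigh is intended for symmetric matrices. It returns eigenvalues in
# ascending order and the matching unit eigenvectors as columns.
eigvals, eigvecs = np.linalg.eigh(cov)

# Reorder so that the largest eigenvalue (the principal component) is first.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]

print("covariance matrix:\n", cov)
print("eigenvalues:", eigvals)
print("eigenvectors (columns):\n", eigvecs)
```

The sign of each eigenvector is arbitrary, so another tool may return the same directions with flipped signs.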
Choosing components and forming a feature vector (1)
Deriving the new data set (1)
FinalData1 = DataAdjust × FeatureVector1
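The product FinalData1 = DataAdjust × FeatureVector1 can be sketched as below. The transcript does not say which components variant (1) keeps; this sketch assumes it keeps every eigenvector, ordered by decreasing eigenvalue. The data values are illustrative (hypothetical), with one row per data point.

```python
import numpy as np

# Illustrative data (hypothetical values), one row per data point.
data = np.array([[2.0, 1.0], [3.0, 4.0], [5.0, 4.0], [6.0, 7.0]])
data_adjust = data - data.mean(axis=0)

eigvals, eigvecs = np.linalg.eigh(np.cov(data_adjust, rowvar=False))
order = np.argsort(eigvals)[::-1]  # indices of eigenvalues, largest first

# Assumed reading of variant (1): keep all eigenvectors as columns.
feature_vector1 = eigvecs[:, order]

# FinalData1 = DataAdjust x FeatureVector1
final_data1 = data_adjust @ feature_vector1

# In the new coordinates the two columns are uncorrelated, and their
# variances are the eigenvalues of the original covariance matrix.
final_cov = np.cov(final_data1, rowvar=False)
print("FinalData1:\n", final_data1)
print("covariance of FinalData1:\n", final_cov)
```

Because no component is dropped here, the transformation is only a rotation of the axes and can be undone exactly by multiplying with the transpose of `feature_vector1`.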
Choosing components and forming a feature vector (2)
Deriving the new data set (2)
FinalData2 = DataAdjust × FeatureVector2
Getting the old data back
DataAdjust' = FinalData2 × FeatureVector2^T
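A sketch of the reduced transformation and the approximate recovery, assuming variant (2) keeps only the eigenvector with the largest eigenvalue (the transcript does not state this explicitly). The data values are illustrative (hypothetical), with one row per data point.

```python
import numpy as np

# Illustrative data (hypothetical values), one row per data point.
data = np.array([[2.0, 1.0], [3.0, 4.0], [5.0, 4.0], [6.0, 7.0]])
mean = data.mean(axis=0)
data_adjust = data - mean

eigvals, eigvecs = np.linalg.eigh(np.cov(data_adjust, rowvar=False))
order = np.argsort(eigvals)[::-1]  # indices of eigenvalues, largest first

# Assumed reading of variant (2): keep only the leading eigenvector (2 x 1).
feature_vector2 = eigvecs[:, order[:1]]

# FinalData2 = DataAdjust x FeatureVector2 (one number per data point).
final_data2 = data_adjust @ feature_vector2

# DataAdjust' = FinalData2 x FeatureVector2^T, then add the mean back.
data_adjust_restored = final_data2 @ feature_vector2.T
data_restored = data_adjust_restored + mean

# Squared error introduced by dropping the second component.
lost = np.sum((data_adjust - data_adjust_restored) ** 2)
print("restored data:\n", data_restored)
print("squared reconstruction error:", lost)
```

The restored points all lie on the line spanned by the kept eigenvector, and the squared error equals (n - 1) times the discarded eigenvalue, since `np.cov` normalizes by n - 1.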
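In theory, the most important component w1 is the unit vector that maximizes the variance of the projected data:
w1 = arg max_{||w|| = 1} E{(w^T x)^2}
• Here, x is the random variable (vector) of a data point.
After the first k-1 components have been removed, the most important remaining component is wk.
Finding all components that satisfy the condition above is an optimization problem, and it can be solved by computing the eigenvector matrix of the covariance matrix C.
• C = X X^T = W Λ W^T
• Here, X^T is the matrix whose rows are the data points, W is the eigenvector matrix, and Λ is a diagonal matrix.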
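Why the optimization problem reduces to an eigenvector problem can be sketched with a Lagrange multiplier. This is the standard textbook argument rather than something spelled out on the slides; it assumes x has zero mean, so that C = E{x x^T}, and λ denotes the multiplier.

```latex
\begin{aligned}
E\{(w^{T}x)^{2}\} &= w^{T}\,E\{x x^{T}\}\,w = w^{T} C w \\
L(w,\lambda) &= w^{T} C w - \lambda\,(w^{T} w - 1) \\
\nabla_{w} L = 2 C w - 2 \lambda w = 0 \;&\Longrightarrow\; C w = \lambda w \\
w^{T} C w &= \lambda\, w^{T} w = \lambda
\end{aligned}
```

Every stationary point is therefore an eigenvector of C, and the variance it captures is its eigenvalue, so w1 is the eigenvector with the largest eigenvalue. Repeating the argument on what remains after the first k-1 components are removed gives wk as the eigenvector with the k-th largest eigenvalue, which is the content of C = W Λ W^T.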
Singular Value Decomposition
Singular value decomposition
Any m by n matrix A can be factored into
A = Q1 Σ Q2^T = (orthogonal)(diagonal)(orthogonal).
The columns of Q1 (m by m) are eigenvectors of A A^T, and the columns of Q2 (n by n) are eigenvectors of A^T A. The r singular values on the diagonal of Σ (m by n) are the square roots of the nonzero eigenvalues of both A A^T and A^T A.
Example (by MATLAB): A = Q1 × Σ × Q2^T

A =
[2 5 8 7]
[3 5 7 6]
[1 6 4 9]
[2 2 3 4]
[2 4 4 5]

Q1 =
[-0.54 -0.38  0.62 -0.34]
[-0.50 -0.42 -0.23  0.42]
[-0.51  0.82  0.15  0.00]
[-0.26 -0.05 -0.65 -0.72]
[-0.36  0.03 -0.36  0.44]

Σ =
[21.72  0     0     0   ]
[ 0     3.81  0     0   ]
[ 0     0     1.43  0   ]
[ 0     0     0     0.91]

Q2^T =
[-0.20 -0.47 -0.56 -0.66]
[-0.33  0.24 -0.73  0.55]
[-0.92  0.06  0.37 -0.08]
[ 0.05  0.85 -0.13 -0.51]
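The same factorization can be reproduced with NumPy instead of MATLAB, as a rough check of the stated properties. `full_matrices=False` requests the economy-size form, in which Q1 is 5 by 4, which appears to be what the example shows; the names `U`, `s`, and `Vt` are chosen only for this sketch.

```python
import numpy as np

# The 5 x 4 matrix from the example.
A = np.array([
    [2, 5, 8, 7],
    [3, 5, 7, 6],
    [1, 6, 4, 9],
    [2, 2, 3, 4],
    [2, 4, 4, 5],
], dtype=float)

# Economy-size SVD: U is 5 x 4, s holds the 4 singular values, Vt is 4 x 4.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Eigenvalues of A^T A (4 values) and A A^T (5 values), largest first.
eig_AtA = np.linalg.eigvalsh(A.T @ A)[::-1]
eig_AAt = np.linalg.eigvalsh(A @ A.T)[::-1]

print("singular values:", np.round(s, 2))
print("sqrt of eigenvalues of A^T A:", np.round(np.sqrt(eig_AtA), 2))
print("reconstruction matches A:", np.allclose(U @ np.diag(s) @ Vt, A))
```

The signs of individual singular vectors are not unique, so the columns of `U` and rows of `Vt` may differ in sign from the MATLAB output while the product stays the same. The fifth eigenvalue of A A^T is zero up to rounding error, which is why only the nonzero eigenvalues are compared with the singular values.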
Applications
• Image compression
(Q1 and Q2^T are the same factors as in the example above; only the diagonal matrix changes.)

Q1 × diag(21.72, 3.81, 1.43, 0.91) × Q2^T =
[2.0 5.0 8.0 7.0]
[3.0 5.0 7.0 6.0]
[1.0 6.0 4.0 9.0]
[2.0 2.0 3.0 4.0]
[2.0 4.0 4.0 5.0]

Q1 × diag(21.72, 3.81, 1.43, 0.00) × Q2^T =
[2.0 5.3 8.0 6.8]
[3.0 4.7 7.0 6.2]
[1.0 6.0 4.0 9.0]
[2.0 2.6 2.9 3.7]
[2.0 3.7 4.1 5.2]

Q1 × diag(21.72, 3.81, 0.00, 0.00) × Q2^T =
[2.8 5.2 7.6 6.9]
[2.7 4.7 7.2 6.2]
[1.2 6.0 4.0 9.0]
[1.2 2.7 3.3 3.6]
[1.5 3.7 4.2 5.2]

Q1 × diag(21.72, 0.00, 0.00, 0.00) × Q2^T =
[2.4 5.6 6.6 7.7]
[2.2 5.1 6.0 7.1]
[2.2 5.3 6.2 7.3]
[1.1 2.7 3.1 3.7]
[1.6 3.7 4.3 5.1]
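Zeroing the smallest singular values is the same as keeping only the first k columns of Q1, the first k singular values, and the first k rows of Q2^T. A minimal sketch of that truncation, using the example matrix; the helper name `rank_k_approx` is hypothetical and chosen only for this illustration.

```python
import numpy as np

A = np.array([
    [2, 5, 8, 7],
    [3, 5, 7, 6],
    [1, 6, 4, 9],
    [2, 2, 3, 4],
    [2, 4, 4, 5],
], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

def rank_k_approx(k):
    """Rebuild A from its k largest singular values only."""
    return U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

for k in (4, 3, 2, 1):
    A_k = rank_k_approx(k)
    error = np.linalg.norm(A - A_k)  # Frobenius norm of the difference
    stored = k * (A.shape[0] + A.shape[1] + 1)  # numbers needed for rank k
    print(f"k={k}: stored numbers={stored}, error={error:.2f}")
    print(np.round(A_k, 1))
```

For an image stored as a matrix, the saving comes from keeping k × (m + n + 1) numbers instead of m × n. The error of each approximation, measured in the Frobenius norm, equals the square root of the sum of the squared singular values that were dropped.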