least squares, singular value decomposition
TRANSCRIPT
Linear Algebra for Computer Vision - part 2
CMSC 828 D
Outline
• Background and potpourri
• Summation Convention
• Eigenvalues and Eigenvectors
• Rank and Degeneracy
• Gram Schmidt Orthogonalization
• Fredholm Alternative Theorem
• Least Squares Formulation
• Singular Value Decomposition
• Applications
Summary: Linear Spaces
• n dimensional points in a vector space
  – Length, distance, angles
  – Dot product (inner product)
• Linear dependence of a set of vectors
• Basis: a collection of n independent vectors such that any vector can be expressed as a sum of these vectors
• Orthogonality: <a,b> = 0
• Orthogonal basis: basis vectors satisfy <bi,bj> = 0
• A vector is represented in a particular basis (coordinate system) by <u,bi> = ui
• Linear manifolds (M): subsets of the space that are closed under vector addition and scalar multiplication
  – If vectors u and v belong to the manifold then so does α1u + α2v
  – A manifold must contain the zero vector
  – Essentially a full linear space of smaller dimension
• Span of a set of vectors: the set of all vectors that can be created from them by scalar multiplication and addition
• Vectors in the space that are orthogonal to every vector in M form the orthogonal complement M⊥
• Projection Theorem: any vector in the space X can be written in exactly one way as the sum of a vector in M and a vector in M⊥
[Figure: the manifold (a line) spanned by u, showing the points 2u and −u on it and a vector v off it]
Gram Schmidt Orthogonalization
• Given a set of basis vectors (b1, b2, …, bn), construct an orthonormal basis (e1, e2, …, en) from it (a sketch follows below):
  – Set e1 = b1/||b1||
  – g2 = b2 − <b2, e1> e1,  e2 = g2/||g2||
  – For k = 3, …, n:  gk = bk − Σ_{j=1}^{k−1} <bk, ej> ej,  ek = gk/||gk||
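A minimal numpy sketch of this procedure (classical Gram-Schmidt; gram_schmidt is an illustrative name, and B holds the basis vectors as columns):

  import numpy as np

  def gram_schmidt(B):
      """Orthonormalize the columns of B (assumed linearly independent)."""
      E = np.zeros_like(B, dtype=float)
      for k in range(B.shape[1]):
          g = B[:, k].copy()
          for j in range(k):                 # subtract projections on e_1..e_{k-1}
              g -= np.dot(B[:, k], E[:, j]) * E[:, j]
          E[:, k] = g / np.linalg.norm(g)    # normalize
      return E

  B = np.array([[1., 1., 0.], [1., 0., 1.], [0., 1., 1.]])
  E = gram_schmidt(B)
  print(np.round(E.T @ E, 10))               # should print the identity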
Euclidean 3D
• Three directions with basis vectors i, j, k or e1, e2, e3, with ei·ej = δij
• Distance between two vectors u and v is ||u − v||
• Dot product of two vectors u and v is ||u|| ||v|| cos θ
• Cross product of two vectors is u×v
  – Magnitude equal to the area of the parallelogram formed by u and v: ||u|| ||v|| sin θ
  – Direction is perpendicular to u and v so that the three vectors form a right-handed system
  – Also written using the permutation symbol εijk

        | i   j   k  |
  u×v = | u1  u2  u3 |
        | v1  v2  v3 |
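A quick numpy check of the cross-product properties above (the vectors are arbitrary examples):

  import numpy as np

  u = np.array([1., 2., 0.])
  v = np.array([0., 1., 3.])
  w = np.cross(u, v)                       # u x v

  # magnitude equals ||u|| ||v|| sin(theta) = parallelogram area
  cos_t = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
  sin_t = np.sqrt(1 - cos_t**2)
  print(np.linalg.norm(w), np.linalg.norm(u) * np.linalg.norm(v) * sin_t)

  # direction is perpendicular to both u and v
  print(w @ u, w @ v)                      # both ~0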
Summation Convention
• Boldface, transpose symbols and summation signs are tiresome
  – Especially if you have to do things such as differentiation
• Vectors can be written in terms of unit basis vectors: a = a1 i + a2 j + a3 k = a1 e1 + a2 e2 + a3 e3
• However, even this is clumsy. E.g., in 10 dimensions: a = a1 e1 + a2 e2 + … + a10 e10 = Σ_{i=1}^{10} ai ei
• Notice that the index i occurs twice in the expression
  – Einstein noticed this always occurred, so whenever an index was repeated twice he avoided writing Σi
  – Instead of writing Σi ai bi, write ai bi with the Σi implied
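numpy's einsum follows exactly this convention: a repeated index is summed over. A small sketch:

  import numpy as np

  a = np.arange(1., 4.)                 # a_i = [1, 2, 3]
  b = np.arange(4., 7.)                 # b_i = [4, 5, 6]
  A = np.arange(9.).reshape(3, 3)       # A_ij

  print(np.einsum('i,i->', a, b))       # a_i b_i  (dot product): 32.0
  print(np.einsum('ij,j->i', A, b))     # A_ij b_j (matrix-vector product)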
Permutation Symbol
• Permutation symbol εijk
  – If i, j and k are in cyclic order, εijk = 1. Cyclic => (1,2,3) or (2,3,1) or (3,1,2)
  – If in anticyclic order, εijk = −1. Anticyclic => (3,2,1) or (2,1,3) or (1,3,2)
  – Else εijk = 0, e.g., (1,1,2), (2,3,3), …
• c = a×b  =>  ci = εijk aj bk
• ε-δ identity: εijk εirs = δjr δks − δjs δkr
  – Very useful in proving vector identities (a numerical spot-check follows below)
• Indicial notation is also essential for working with tensors
  – Tensors are essentially linear operators (matrices or their generalizations to higher dimensions)
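A sketch that tabulates εijk as a 3×3×3 array and spot-checks both ci = εijk aj bk and the ε-δ identity numerically:

  import numpy as np

  # permutation symbol as a 3x3x3 array (0-based indices)
  eps = np.zeros((3, 3, 3))
  for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
      eps[i, j, k] = 1.0    # cyclic orders
      eps[k, j, i] = -1.0   # anticyclic orders

  a, b = np.array([1., 2., 3.]), np.array([4., 5., 6.])
  print(np.allclose(np.einsum('ijk,j,k->i', eps, a, b), np.cross(a, b)))  # c_i = eps_ijk a_j b_k

  delta = np.eye(3)
  lhs = np.einsum('ijk,irs->jkrs', eps, eps)                 # eps_ijk eps_irs
  rhs = (np.einsum('jr,ks->jkrs', delta, delta)
         - np.einsum('js,kr->jkrs', delta, delta))           # d_jr d_ks - d_js d_kr
  print(np.allclose(lhs, rhs))                               # True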
Examples
• Ai Bi in 2 dimensions: A1B1 + A2B2
• Aij Bjk in 3D? We have 3 indices here (i, j, k), but only j is repeated, so it is Ai1B1k + Ai2B2k + Ai3B3k
• Matrix-vector products: (Ax)i = Aij xj,  (Atx)j = Aij xi
• (a×b)·c = εijk aj bk ci
  – Using indicial notation one can easily show a·(b×c) = (a×b)·c
• Homework: show a×(b×c) = b(a·c) − c(a·b) (a numerical spot-check of both identities follows below)
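A numerical spot-check of the two identities above (random vectors, so this illustrates but does not prove them):

  import numpy as np

  rng = np.random.default_rng(0)
  a, b, c = rng.standard_normal((3, 3))

  print(np.isclose(a @ np.cross(b, c), np.cross(a, b) @ c))   # scalar triple product
  print(np.allclose(np.cross(a, np.cross(b, c)),
                    b * (a @ c) - c * (a @ b)))               # "bac-cab" rule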
Operators / Matrices
• Linear operator: A(α1 u + α2 v) = α1 Au + α2 Av
• Maps one vector to another: Ax = b
  – An m×n dimensional matrix A multiplying an n dimensional vector x produces an m dimensional vector b in the dual space
• A square matrix of dimension n by n takes a vector to another vector in the same space
• Matrix entries are representations of the matrix using basis vectors: Aij = <Abj, bi>
• Eigenvectors are characteristic directions of the matrix
• Matrix decomposition is a factorization of a matrix into matrices with specific properties
Rank and Null Space
• Range of an m×n dimensional matrix A: Range(A) = {y ∈ ℝm : y = Ax for some x ∈ ℝn}
• Null space of A is the set of vectors which it takes to zero: Null(A) = {x ∈ ℝn : Ax = 0}
• Rank of a matrix is the dimension of its range: Rank(A) = Rank(At)
  – Maximal number of independent rows or columns
• Dim Null(A) + Rank(A) = n (a quick numerical check follows below)
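A small numpy illustration of these facts (the example matrix is arbitrary, with one dependent row):

  import numpy as np

  A = np.array([[1., 2., 3.],
                [2., 4., 6.],     # twice row 1: rows are dependent
                [1., 0., 1.]])

  rank = np.linalg.matrix_rank(A)
  print(rank, np.linalg.matrix_rank(A.T))   # Rank(A) = Rank(A^t) = 2

  # dim Null(A) + Rank(A) = n = 3
  print(A.shape[1] - rank)                  # nullity = 1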
Norm of a matrix
• ||A|| ≥ 0, and ||Ax|| ≤ ||A|| ||x||
• ||A||F = [aij aij]^(1/2) is the Frobenius norm. If A is diagonal, ||A||F = [a11² + a22² + … + ann²]^(1/2)
• ||A||2 = max_x ||Ax||2 / ||x||2. One can show the 2-norm equals the square root of the largest eigenvalue of AtA
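A quick numpy check of both norms against their definitions (the example matrix is arbitrary):

  import numpy as np

  A = np.array([[2., 1.], [0., 3.]])

  fro = np.sqrt(np.sum(A * A))                        # a_ij a_ij, summed
  print(fro, np.linalg.norm(A, 'fro'))                # same value

  two = np.sqrt(np.max(np.linalg.eigvalsh(A.T @ A)))  # largest eigenvalue of A^t A
  print(two, np.linalg.norm(A, 2))                    # same value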
Orthogonality
• Two vectors are orthogonal if <u,v> = 0
• An orthogonal matrix is composed of mutually orthogonal unit vectors as columns (e.g., columns p and q satisfy p·q = 0):

  Q = | q1  p1  r1  … |
      | q2  p2  r2  … |
      | ⋮   ⋮   ⋮     |
      | qn  pn  rn  … |

• Usually represented as Q
• By definition QQt = I
• Matrices that rotate coordinate axes are orthogonal matrices
Rotation in 2D and 3D
• Rotation through an angle θ (axes x, y rotated to x', y'):

  | x' |   |  cos θ   sin θ | | x |
  | y' | = | −sin θ   cos θ | | y |

• Rotation + translation (e.g., rotation about a point p = (px, py)):

  | x' |   |  cos θ   sin θ | | x |   | tx |
  | y' | = | −sin θ   cos θ | | y | + | ty |

• Rotation in 3D: φ about the z axis, θ about the new x axis, ψ about the new y axis:

  | x' |   |  cos ψ cos φ − cos θ sin φ sin ψ     cos ψ sin φ + cos θ cos φ sin ψ    sin ψ sin θ | | x |
  | y' | = | −sin ψ cos φ − cos θ sin φ cos ψ    −sin ψ sin φ + cos θ cos φ cos ψ    cos ψ sin θ | | y |
  | z' |   |  sin θ sin φ                         −sin θ cos φ                        cos θ      | | z |
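A minimal numpy sketch of the 2D case, checking orthogonality and length preservation (rot2d is an illustrative helper, not a library routine):

  import numpy as np

  def rot2d(theta):
      c, s = np.cos(theta), np.sin(theta)
      return np.array([[c, s], [-s, c]])   # matches the convention above

  Q = rot2d(np.pi / 6)
  print(np.allclose(Q @ Q.T, np.eye(2)))             # QQ^t = I
  x = np.array([1., 2.])
  print(np.linalg.norm(Q @ x), np.linalg.norm(x))    # lengths are equal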
Rotation matrix
• Rotates a vector represented in one orthogonal coordinate system into a vector in another coordinate system
  – Since the length of a vector should not change, ||Qx|| = ||x|| for all x
  – Since Q maps the orthonormal coordinate directions to orthonormal directions, QQt = I
  – Columns of Q are orthonormal vectors
  – Eigenvalues of Q all have magnitude 1 (det Q = 1 for a proper rotation)
Similarity Transforms
• Transforms a vector represented in one basis to a vector in another basis
• Let X = {x1,…,xn} and Y = {y1,…,yn} be two bases in an n dimensional space
  – There exists a transformation A which takes a vector expressed in X to one expressed in Y
  – The inverse transformation A−1 from Y to X also exists
• Let B and C be two matrices. Then if C = A−1BA, B and C represent the same transformation with respect to different bases and are called Similar Matrices
• If A is orthogonal then C = AtBA, and in components u = αi xi = βi yi with βi = Aij αj
Eigenvalue problem
• Find scalars λ and nonzero vectors x such that Ax = λx
Remarks: Eigenvalues and Eigenvectors
• Eigenvalues and eigenvectors of a real symmetric matrix are real
• In general, since eigenvalues are determined by solving a polynomial equation, they can be complex
• Further, roots can be repeated => multiple eigenvectors may correspond to a single eigenvalue
• Transforming a matrix into its eigenbasis yields a diagonal matrix: QtAQ = Λ, where Λ is the diagonal matrix of eigenvalues
  – Knowing the eigenvectors we can solve an equation Ax = b. Rewrite it as QtAQQtx = Qtb, i.e., Λy = f
  – where y = Qtx and f = Qtb
  – Can get x from y: x = (Qt)−1 y = Qy (a numpy sketch follows below)
• The determinant is unchanged by an orthogonal transformation
• Determinant: Det(A) = λ1 λ2 … λn
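A sketch of this eigenbasis solve for a small symmetric matrix (chosen arbitrarily so that its eigendecomposition is real):

  import numpy as np

  A = np.array([[4., 1.], [1., 3.]])     # real symmetric
  b = np.array([1., 2.])

  lam, Q = np.linalg.eigh(A)             # A = Q Lambda Q^t
  f = Q.T @ b                            # f = Q^t b
  y = f / lam                            # solve Lambda y = f (diagonal system)
  x = Q @ y                              # x = Q y
  print(np.allclose(A @ x, b))           # True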
When is Ax=b Solvable?
• When does the equation Ax = b have a solution?
  – The usual answer is: when A is invertible
  – However, in many situations where A is singular there may still be a meaningful solution
• Fredholm Alternative Theorem
  – Look at the homogeneous systems Ax = 0 (1) and A*y = 0 (2)
  – If (1) has only the trivial solution then so does (2). This occurs only if det(A) ≠ 0 (if A is invertible). Then Ax = b has a unique solution x = A−1b
  – If (1) has nontrivial solutions then det(A) = 0
• This means the rows of A have interdependencies. In this case b must reflect those dependencies
  – If the 2nd row of A is the sum of the 1st and 3rd rows, then b2 = b1 + b3
• If there are k independent solutions to equation (1) then A has a k dimensional nullspace
• A* also has a k dimensional nullspace (but with different solutions)
  – Let these solutions be n*1, n*2, …, n*k
• Then Ax = b can have solutions iff <b, n*j> = 0 for j = 1,…,k
• i.e., b must be orthogonal to the nullspace of A*
• Example:

  | 1 −1  1 | | x1 |   | 1 |        | 1  2  1 | | y1 |   | 0 |
  | 2  1  1 | | x2 | = | 3 |   or   |−1  1  2 | | y2 | = | 0 |
  | 1  2  0 | | x3 |   | 1 |        | 1  1  0 | | y3 |   | 0 |

        Ax = b                          Aty = 0

• Any solution with y2 = −y1 and y3 = y1 satisfies the adjoint equation, so the nullspace of A* is α[1,−1,1]t
• Here <b,n*1> = 1 − 3 + 1 = −1 (≠ 0), so the equation has no solution
• However, if b = [1, 2, 1]t we would have a solution
• The general solution is x = x~ + Σk ck nk, where x~ is a particular solution and the nk span the nullspace of A (a numpy check of this example follows below)
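A numpy check of the example, using the matrix as reconstructed above (row 2 = row 1 + row 3):

  import numpy as np

  A = np.array([[1., -1., 1.],
                [2.,  1., 1.],    # row 2 = row 1 + row 3
                [1.,  2., 0.]])
  n_star = np.array([1., -1., 1.])          # nullspace of A^t
  print(np.allclose(A.T @ n_star, 0))       # True

  b_bad, b_good = np.array([1., 3., 1.]), np.array([1., 2., 1.])
  print(b_bad @ n_star, b_good @ n_star)    # -1 (no solution), 0 (solvable)

  # for the solvable b, lstsq returns a particular solution x~
  x, *_ = np.linalg.lstsq(A, b_good, rcond=None)
  print(np.allclose(A @ x, b_good))         # True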
Least Squares
• The number of equations and unknowns may not match
• Look for a solution by minimizing ||Ax − b||², i.e., minimize (Aij xj − bi)(Aik xk − bi) with respect to xl
• Recall ∂xi/∂xl = δil
• The result is the same as the solution of AtAx = Atb (derivation below)
• Shows the power of the index notation
  – Note again the appearance of AtA
  ∂/∂xl [ (Aij xj − bi)(Aik xk − bi) ] = 0
  Aij δjl (Aik xk − bi) + (Aij xj − bi) Aik δkl = 0
  Ail (Aik xk − bi) + (Aij xj − bi) Ail = 0
  2 (Ail Aik xk − Ail bi) = 0  =>  Ail Aik xk = Ail bi,  i.e.  AtAx = Atb
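A sketch comparing the normal equations with numpy's built-in least-squares solver on a small overdetermined system (the data are arbitrary):

  import numpy as np

  # overdetermined system: 4 equations, 2 unknowns
  A = np.array([[1., 0.], [1., 1.], [1., 2.], [1., 3.]])
  b = np.array([1., 2., 2., 4.])

  x_normal = np.linalg.solve(A.T @ A, A.T @ b)   # A^t A x = A^t b
  x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
  print(np.allclose(x_normal, x_lstsq))          # True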
Singular Value Decomposition
• Chief tool for dealing with m by n systems and singular systems
• Singular values: the non-negative square roots of the eigenvalues of AtA, denoted σi, i = 1,…,n
  – AtA is symmetric => its eigenvalues, and hence the singular values, are real
• SVD: If A is a real m by n matrix then there exist orthogonal matrices U (∈ ℝm×m) and V (∈ ℝn×n) such that UtAV = Σ = diag(σ1, σ2,…, σp), p = min{m,n}, i.e., A = U Σ Vt
• Geometrically, the singular values are the lengths of the semi-axes of the hyperellipsoid E = {Ax : ||x||2 = 1}
• Singular values are arranged in decreasing order
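A minimal numpy sketch verifying the factorization and the ordering of the singular values (random example matrix):

  import numpy as np

  A = np.random.default_rng(1).standard_normal((4, 3))
  U, s, Vt = np.linalg.svd(A)            # full SVD: U is 4x4, Vt is 3x3

  Sigma = np.zeros_like(A)               # embed diag(s) in a 4x3 matrix
  Sigma[:3, :3] = np.diag(s)
  print(np.allclose(U @ Sigma @ Vt, A))  # A = U Sigma V^t
  print(s)                               # sigma_1 >= sigma_2 >= sigma_3 >= 0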
Properties of the SVD
• Suppose we know the singular values of A and r of them are nonzero: σ1 ≥ σ2 ≥ … ≥ σr > σr+1 = … = σp = 0
  – Rank(A) = r
  – Null(A) = span{vr+1,…,vn}
  – Range(A) = span{u1,…,ur}
• ||A||F² = σ1² + σ2² + … + σp²,  ||A||2 = σ1
• Numerical rank: if exactly k singular values of A are larger than a given number ε, then the ε-rank of A is k
• The distance (in the 2-norm) of a rank-r matrix A from the nearest matrix of rank k is σk+1 (a spot-check follows below)
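A numpy spot-check of these properties, including the rank-k distance claim (the best rank-k approximation here is the truncated SVD):

  import numpy as np

  A = np.random.default_rng(2).standard_normal((5, 4))
  U, s, Vt = np.linalg.svd(A)

  print(np.isclose(np.linalg.norm(A, 'fro'), np.sqrt(np.sum(s**2))))  # ||A||_F
  print(np.isclose(np.linalg.norm(A, 2), s[0]))                       # ||A||_2 = sigma_1

  k = 2                                        # best rank-k approximation
  Ak = U[:, :k] @ np.diag(s[:k]) @ Vt[:k]
  print(np.isclose(np.linalg.norm(A - Ak, 2), s[k]))                  # distance = sigma_{k+1}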
Why is it useful?
• A square matrix may be singular due to round-off errors. We can compute a "regularized" solution:

  x = A−1b = (U Σ Vt)−1 b = Σ_{i=1}^{n} (uit b / σi) vi

• If some σi is small (vanishes), the solution "blows up"
• Given a tolerance ε we can determine a solution that is "closest" to the solution of the original equation, but that does not "blow up":

  xr = Σ_{i=1}^{k} (uit b / σi) vi,  where σi > ε for i ≤ k and σk+1 ≤ ε

• The least squares solution is the x that satisfies AtAx = Atb
  – This can be solved effectively using the SVD (a sketch follows below)
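A sketch of such a truncated ("regularized") SVD solve; svd_solve and eps are illustrative names, and the nearly rank-deficient matrix is an arbitrary example:

  import numpy as np

  def svd_solve(A, b, eps=1e-10):
      """Regularized solve: drop terms with sigma_i <= eps."""
      U, s, Vt = np.linalg.svd(A)
      x = np.zeros(A.shape[1])
      for i in range(len(s)):
          if s[i] > eps:                       # keep only well-conditioned directions
              x += (U[:, i] @ b) / s[i] * Vt[i]
      return x

  A = np.array([[1., 1.],
                [1., 1.0000001],
                [1., 0.9999999]])   # columns nearly parallel: nearly rank 1
  b = np.array([2., 2., 2.])
  print(svd_solve(A, b, eps=1e-3))  # stable truncated solution, ~[1, 1]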