golub-kahan iterative bidiagonalization and determining ... · orthogonal projections-based model...
TRANSCRIPT
Golub-Kahan iterative bidiagonalization and
determining the noise level in the data
Iveta Hnetynkova, Marie Michenkova, Martin Plesinger,
Zdenek Strakos
Charles University in Prague
Technical University Liberec
Academy of Sciences of the Czech republic, Prague
TU Liberec & Preciosa, a.s., September 2013
1
Six tons large real world ill-posed problem:
2
Solving large scale discrete ill-posed problems is frequently based upon
orthogonal projections-based model reduction using Krylov sub-
spaces, see, e.g., hybrid methods. This can be viewed as
approximation of a Riemann-Stieltjes distribution function
via matching moments.
3
Outline
1. Problem formulation
2. Propagation of noise in the Golub-Kahan iterative bidiagonaliza-
tion
3. Numerical illustration
4
The underlying problem is a linear algebraic system
Ax ≈ b
which can arise, e.g., in discretization of a Fredholm integral equation
of the 1st kind
b(s)exact =
∫
K(s, t) x(t) dt ≡ A x(t).
The right-hand side b is contaminated by noise
b = bexact + bnoise , δnoise ≡‖bnoise‖
‖bexact‖≪ 1 .
The goal is to approximate
xexact ≡ A† bexact.
5
6
Singular value decomposition in discrete ill-posed problems
A U S VT
A U Σ V T
A A∗
v1σ1−→ u1
σ1−→ v1 ,
v2σ2−→ u2
σ2−→ v2 ,... ... ... ... ...
vrσr−→ ur
σr−→ vr ,
vr+1...
vm
→ ≈ 0 ,ur+1...
un
→ ≈ 0 .
7
Properties (assumptions):
• matrices A, AT , AAT have a smoothing property;
• left singular vectors uj of A represent increasing frequencies
as j increases;
• the system A xexact = bexact satisfies the discrete Picard con-
dition.
Discrete Picard condition (DPC):
On average, the components |(bexact, uj)| of the true right-hand
side bexact in the left singular subspaces of A decay faster than
the singular values σj of A, j = 1, . . . , n .
8
Left singular vectors of A represent a basis with increasing fre-
quencies; reshaped right singular vectors of A (singular images) for
the Gaussian blur
0 200 400−0.1
0
0.1
u1
0 200 400−0.1
0
0.1
u2
0 200 400−0.1
0
0.1
u3
0 200 400−0.1
0
0.1
u4
0 200 400−0.1
0
0.1
u5
0 200 400−0.1
0
0.1
u6
0 200 400−0.1
0
0.1
u7
0 200 400−0.1
0
0.1
u8
0 200 400−0.1
0
0.1
u9
0 200 400−0.1
0
0.1
u10
0 200 400−0.1
0
0.1
u11
0 200 400−0.1
0
0.1
u12
9
Using the SVD the solution of Ax = b can be written as
x = A−1b = V Σ−1UT b =∑N
j=1
uTj b
σjvj.
Recall that σ1 ≥ σ2 ≥ . . . ≥ σN and the exact components and the
noise components behave differently,
x =∑N
j=1
uTj bexact
σjvj +
∑N
j=1
uTj bnoise
σjvj .
10
Outline
1. Problem formulation
2. Propagation of noise in the Golub-Kahan bidiagonalization
3. Numerical illustration
11
Krylov subspace methods are projection methods. Golub-Kahan
iterative bidiagonalization (GK) of A :
Given w0 = 0 , s1 = b / β1 , where β1 = ‖b‖ , for j = 1,2, . . .
αj wj = AT sj − βj wj−1 , ‖wj‖ = 1 ,
βj+1 sj+1 = A wj − αj sj , ‖sj+1‖ = 1 .
Sk = [s1, . . . , sk] , Wk = [w1, . . . , wk] , STk AWk ≡ Lk , where Lk
is lower bidiagonal, Sk and Wk have orthonormal columns.
GK starts with the normalized noisy right-hand side s1 = b / ‖b‖ .
Consequently, vectors sj contain information about the noise. Can
this information be used to estimate the noise level?
12
Components of several bidiagonalization vectors sj
computed via GK with double reorthogonalization:
0 200 400−0.2
−0.1
0
0.1
0.2
s1
0 200 400−0.2
−0.1
0
0.1
0.2
s6
0 200 400−0.2
−0.1
0
0.1
0.2
s11
0 200 400−0.2
−0.1
0
0.1
0.2
s16
0 200 400−0.2
−0.1
0
0.1
0.2
s17
0 200 400−0.2
−0.1
0
0.1
0.2
s18
0 200 400−0.2
−0.1
0
0.1
0.2
s19
0 200 400−0.2
−0.1
0
0.1
0.2
s20
0 200 400−0.2
−0.1
0
0.1
0.2
s21
0 200 400−0.2
−0.1
0
0.1
0.2
s22
13
The first 80 spectral coefficients of the vectors sj
in the basis of the left singular vectors uj of A:
20 40 60 80
100
UTs1
20 40 60 80
100
UTs6
20 40 60 80
100
UTs11
20 40 60 80
100
UTs16
20 40 60 80
100
UTs17
20 40 60 80
100
UTs18
20 40 60 80
100
UTs19
20 40 60 80
100
UTs20
20 40 60 80
100
UTs21
20 40 60 80
100
UTs22
14
Noise is amplified with the ratio αk/βk+1 :
GK for the spectral components:
α1 (V Tw1) = Σ(UTs1) ,
β2 (UTs2) = Σ(V Tw1) − α1 (UTs1) ,
and for k = 2,3, . . .
αk(VTwk) = Σ(UTsk) − βk(V
Twk−1) ,
βk+1(UTsk+1) = Σ(V Twk) − αk(U
Tsk) .
15
Since the dominance in Σ(UT sk) and (V Twk−1) is shifted by one
component, in αk (V Twk) = Σ(UTsk) − βk (V Twk−1) one can notexpect a significant cancelation, and therefore
αk ≈ βk .
Whereas Σ(V Twk) and (UT sk) do exhibit the dominance in the di-rection of the same components. If this dominance is strong enough,
then the required orthogonality of sk+1 and sk in
βk+1 (UTsk+1) = Σ(V Twk) − αk (UT sk)
can not be achieved without a significant cancelation, and one canexpect
βk+1 ≪ αk .
16
Noise level estimation :
GK is closely related to the Lanczos tridiagonalization of the sym-
metric matrix A AT with the starting vector s1 = b / β1.
Spectral properties of
Tk ≡ Lk LTk =
α21 α1 β1
α1 β1 α22 + β2
2. . .
. . . . . . αk−1 βk
αk−1 βk α2k + β2
k
determine an approximation of the Riemann-Stieltjes distribution func-
tion related to the original mapping A.
17
Consider the SVD of the bidiagonal matrix
Lk = Pk Θk QkT ,
Pk = [p(k)1 , . . . , p
(k)k ] , Qk = [q
(k)1 , . . . , q
(k)k ] , Θk = diag (θ
(k)1 , . . . , θ
(k)n ) ,
0 < θ(k)1 < . . . < θ
(k)k .
The weight |(p(k)1 , e1)|
2 of the approximate distribution function cor-
responding to the smallest (θ(k)1 )2 is strictly decreasing. At the so
called noise revealing iteration, it sharply starts to (almost) stagnate
on the level close to the squared noise level δ2noise.
18
Square roots of the weights (left), approximation of ω(λ) (right):
0 5 10 15 20
10−4
10−3
10−2
10−1
100
iteration number k
δnoise
knoise
= 8 − 1 = 7
10−40
10−20
100
10−8
10−6
10−4
10−2
100
the leftmost node
wei
ght
ω(λ): nodes σj2, weights |(b/β
1,u
j)|2
nodes (θ1(k))2, weights |(p
1(k),e
1)|2
δ2noise
19
Outline
1. Problem formulation
2. Propagation of noise in the Golub-Kahan bidiagonalization
3. Numerical illustration
20
Image deblurring problem, image size 324 × 470 pixels,
problem dimension n = 152280, the exact solution (left) and
the noisy right-hand side (right), δnoise = 3 × 10−3.
xexact bexact + bnoise
21
Square roots of the weights |(p(k)1 , e1)|
2, k = 1, 2, . . . (top)
and error history of LSQR solutions (bottom):
0 20 40 60 80 10010
−3
10−2
10−1
100
| ( p1(k) , e
1 ) |, || bnoise ||
2 / || bexact ||
2 = 0.003
| ( p1(k) , e
1 ) |
δnoise
0 20 40 60 80 100
103.7
103.8
103.9
Error history
|| xkLSQR − xexact ||
the minimal error, k = 41
22
The best LSQR reconstruction (left), xLSQR41 ,
and the corresponding componentwise error (right).
GK without any reorthogonalization!
LSQR reconstruction with minimal error, xLSQR41
Error of the best LSQR reconstruction, |xexact − xLSQR41
|
10
20
30
40
50
60
70
80
90
100
110
23
Denoising, problem SHAW(400), maximal amplification factor
2 4 6 8 10 1210
0
101
102
103
k
HF ampl
s4
s5
s6
s7
s8
s9
s10
24
Denoising?
−2
−1
0
1
2x 10−3
−2
−1
0
1
2x 10−3
original noise transformed noise
Subtraction of the approximate noise from the data leads to
the remaining much smoother “transformed noise”
25
Main message :
Whenever you see a blurred elephant which is a bit too noisy,
the best thing is to apply, as quickly as possible,
the GK iterative bidiagonalization.
26
References
• Golub, Kahan: Calculating the singular values and pseudoinverse of a matrix,SIAM J. B2, 1965.
• Hansen: Rank-deficient and discrete ill-posed problems, SIAM MonographsMath. Modeling Comp., 1998.
• Hansen, Kilmer, Kjeldsen: Exploiting residual information in the parameter
choice for discrete ill-posed problems, BIT, 2006.
• Meurant, Strakos: The Lanczos and CG algorithms in finite precision arith-
metic, Acta Numerica, 2006.
• Hnetynkova, Strakos: Lanczos tridiag. and core problem, LAA, 2007.
• Hnetynkova, Plesinger, Strakos: The regularizing effect of the Golub-Kahan
iterative bidiagonalization and revealing the noise level, BIT, 2009.• Michenkova: Regularization techniques based on the least squares method,
diploma thesis, MFF UK, 2013.
• Liesen, Strakos: Krylov Subspace Methods, Principles and Analysis. OxfordUniversity Press, 2013.
• ...
27
Thank you for your kind attention!
28