Lecture 17, 18.086
Krylov subspaces
• There is an integer n such that
586 Chapter 7 Solving Large Systems
7.4 KRYLOV SUBSPACES AND CONJUGATE GRADIENTS
Our original equation is Ax = b. The preconditioned equation is P^{-1}Ax = P^{-1}b. When we write P^{-1}, we never intend that an inverse will be explicitly computed. P may come from incomplete LU, or a few steps of a multigrid iteration, or "domain decomposition." Entirely new preconditioners are waiting to be invented.
The residual is r_k = b - Ax_k. This is the error in Ax = b, not the error in x. An ordinary preconditioned iteration corrects x_k by the vector P^{-1}r_k:

x_{k+1} = x_k + P^{-1}(b - Ax_k).   (1)
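A minimal sketch of that preconditioned correction step may help. The 3x3 matrix, right-hand side, and the Jacobi choice P = diag(A) below are illustrative assumptions, not from the text:

```python
import numpy as np

# Sketch of the preconditioned iteration x_{k+1} = x_k + P^{-1} r_k.
# A, b, and the Jacobi preconditioner P = diag(A) are illustrative choices.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 4.0, 1.0],
              [0.0, 1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])

P_inv = np.diag(1.0 / np.diag(A))   # P = diag(A); we only ever apply P^{-1}

x = np.zeros(3)
for k in range(50):
    r = b - A @ x                   # residual r_k = b - A x_k (error in Ax = b)
    x = x + P_inv @ r               # correct x_k by the vector P^{-1} r_k

print(np.allclose(A @ x, b))
```

For this diagonally dominant A the iteration converges; in general convergence depends on the spectral radius of I - P^{-1}A.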
In describing Krylov subspaces, I should work with P^{-1}A. For simplicity I will only write A! I am assuming that P has been chosen and used, and the preconditioned equation P^{-1}Ax = P^{-1}b is given the notation Ax = b. The preconditioner is now P = I. Our new A is probably better than the original matrix with that name.
With x_1 = b, look first at two steps of the pure iteration x_{j+1} = (I - A)x_j + b:

x_2 = 2b - Ab    and    x_3 = 3b - 3Ab + A^2 b.   (2)
My point is simple but important: x_j is a combination of b, Ab, ..., A^{j-1}b. We can compute those vectors quickly, multiplying at each step by a sparse A. Every iteration involves only one matrix-vector multiplication. Krylov gave a name to all combinations of those vectors b, ..., A^{j-1}b, and he suggested that there might be better combinations (closer to x = A^{-1}b) than the particular choices x_j in (2).
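This can be checked directly: running the pure iteration and comparing against the combinations in (2). The small 2x2 matrix below is an illustrative choice:

```python
import numpy as np

# Sketch: the pure iteration x_{j+1} = (I - A) x_j + b produces iterates that
# are combinations of the Krylov vectors b, Ab, A^2 b, ... (A, b illustrative).
A = np.array([[0.5, 0.1],
              [0.1, 0.5]])
b = np.array([1.0, 2.0])

x = b                                # x_1 = b
x = (np.eye(2) - A) @ x + b          # one iteration
print(np.allclose(x, 2*b - A @ b))   # x_2 = 2b - Ab

x = (np.eye(2) - A) @ x + b          # another iteration
print(np.allclose(x, 3*b - 3*(A @ b) + A @ (A @ b)))   # x_3 = 3b - 3Ab + A^2 b
```

Each step costs only one multiplication by A; the coefficients of the combination change, but the iterate never leaves the Krylov space.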
Krylov Subspaces
The linear combinations of b, Ab, ..., A^{j-1}b form the jth Krylov subspace. This space depends on A and b. Following convention, I will write 𝒦_j for that subspace and K_j for the matrix with those basis vectors in its columns:

Krylov matrix      K_j = [ b  Ab  A^2 b  ...  A^{j-1} b ].
Krylov subspace    𝒦_j = all combinations of b, Ab, ..., A^{j-1} b.   (3)
Thus 𝒦_j is the column space of K_j. We want to choose the best combination as our improved x_j. Various definitions of "best" will give various x_j. Here are four different approaches to choosing a good x_j in 𝒦_j (this is the important decision):
1. The residual r_j = b - Ax_j is orthogonal to 𝒦_j (Conjugate Gradients).
2. The residual r_j has minimum norm for x_j in 𝒦_j (GMRES and MINRES).
3. r_j is orthogonal to a different space 𝒦_j(A^T) (BiConjugate Gradients).
4. The error e_j has minimum norm in 𝒦_j (SYMMLQ).
K_j ⊆ K_n = K_{n+1} = K_{n+2} = ...
• n is the maximum dimension that the Krylov subspaces can have.
• If A is N×N, then n ≤ N.
• Ax* = b ⇒ x* ∈ K_n
• Krylov: good approximations to Ax = b can be found in spaces with j < n.
• Idea: find the optimal x in the j-th Krylov space, then iteratively go to higher Krylov spaces and improve x.
• The optimal x can be defined in terms of the residual, e.g.
• Optimize x such that r_k ⊥ K_j (conjugate gradient; A symmetric, positive definite)
• Optimize x such that |r_k| is minimal (GMRES, MINRES algorithms)
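The residual-minimizing choice can be sketched as a small least-squares problem over a Krylov basis: take x_j = K_j c with c = argmin |b - A K_j c|. The 3x3 matrix and right-hand side below are illustrative assumptions; they are chosen so that b contains only two eigenvectors of A, which makes n = 2 and the j = 2 minimizer exact:

```python
import numpy as np

# Sketch of "choose x_j in K_j to minimize |r_j|" (the GMRES/MINRES idea).
# A and b are illustrative choices.
A = np.array([[3.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 3.0]])
b = np.array([1.0, 0.0, 1.0])

Kj = np.column_stack([b, A @ b])              # basis of the 2nd Krylov space
c, *_ = np.linalg.lstsq(A @ Kj, b, rcond=None)
x2 = Kj @ c                                   # residual-minimizing x in K_2

# This b lies in the span of two eigenvectors of A, so x* is already in K_2.
print(np.allclose(A @ x2, b))
```

Real GMRES never forms the ill-conditioned basis [b, Ab, ...] explicitly; it uses the orthonormal Arnoldi basis discussed next.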
Arnoldi orthogonalization
see lecture / book
=> Need a good basis of Krylov spaces: 1. orthonormal 2. built iteratively
=> Arnoldi orthogonalization
Arnoldi's orthogonalization of b, Ab, ..., A^{n-1} b:
0  q1 = b/||b||;                  % Normalize b to ||q1|| = 1
   for j = 1, ..., n-1            % Start computation of q_{j+1}
1    t = A*qj;                    % one matrix multiplication
     for i = 1, ..., j            % t is in the space K_{j+1}
2      h_{ij} = qi'*t;            % h_{ij} qi = projection of t on qi
3      t = t - h_{ij}*qi;         % Subtract that projection
     end                          % t is orthogonal to q1, ..., qj
4    h_{j+1,j} = ||t||;           % Compute the length of t
5    q_{j+1} = t/h_{j+1,j};       % Normalize t to ||q_{j+1}|| = 1
   end                            % q1, ..., qn are orthonormal
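A NumPy transcription of steps 0-5 may be useful. This is a sketch: dense matrices, and no check for breakdown when h_{j+1,j} = 0. The diagonal test matrix A = diag(1,2,3,4) with b of all ones is the example behind the Vandermonde matrix discussed below:

```python
import numpy as np

def arnoldi(A, b, n):
    """Arnoldi orthogonalization of b, Ab, ..., A^{n-1} b (steps 0-5 above)."""
    N = len(b)
    Q = np.zeros((N, n))
    H = np.zeros((n, n))
    Q[:, 0] = b / np.linalg.norm(b)        # 0: normalize b to ||q1|| = 1
    for j in range(n - 1):                 # build q_{j+2} (the book's q_{j+1})
        t = A @ Q[:, j]                    # 1: one matrix multiplication
        for i in range(j + 1):
            H[i, j] = Q[:, i] @ t          # 2: h_ij = projection coefficient on q_i
            t = t - H[i, j] * Q[:, i]      # 3: subtract that projection
        H[j + 1, j] = np.linalg.norm(t)    # 4: length of t (no breakdown check)
        Q[:, j + 1] = t / H[j + 1, j]      # 5: normalize t
    H[:, -1] = Q.T @ (A @ Q[:, -1])        # final command: H(:,n) = Q'*A*Q(:,n)
    return Q, H

A = np.diag([1.0, 2.0, 3.0, 4.0])
Q, H = arnoldi(A, np.ones(4), 4)
print(np.allclose(Q.T @ Q, np.eye(4)))   # q1, ..., qn are orthonormal
print(np.allclose(A @ Q, Q @ H))         # the Arnoldi factorization AQ = QH
print(np.allclose(H, H.T))               # H is tridiagonal here since A' = A
```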
First Arnoldi cycle (with q1 = [1 1 1 1]'/2):
h11 = 5/2,    t = Aq1 - (5/2)q1 = [-3 -1 1 3]'/4,
h21 = √5/2,   q2 = [-3 -1 1 3]'/√20    (basis for the Krylov space)
You might like to see the four orthonormal vectors in the Vandermonde example. Those columns q1, q2, q3, q4 of Q are still constant, linear, quadratic, and cubic. I can also display the matrix H of numbers h_{ij} that produced the q's from the Krylov vectors b, Ab, A^2 b, A^3 b. (Since Arnoldi stops at j = n - 1, the last column of H is not actually computed. It comes from a final command H(:,n) = Q'*A*Q(:,n).)
H turns out to be symmetric and tridiagonal when A^T = A (as here).
Arnoldi's method for the Vandermonde example V gives Q and H:

Basis in Q:                           Multipliers h_{ij}:

Q = [ 1/2  -3/√20   1/2  -1/√20       H = [ 5/2   √5/2   0      0
      1/2  -1/√20  -1/2   3/√20             √5/2  5/2    2/√5   0
      1/2   1/√20  -1/2  -3/√20             0     2/√5   5/2    3/√20
      1/2   3/√20   1/2   1/√20 ]           0     0      3/√20  5/2 ]
Please notice that H is not upper triangular as in Gram-Schmidt. The usual QR factorization of the original Krylov matrix K (which is V in our example) has this same Q, but Arnoldi's factorization AQ = QH is different from K = QR. The vector t that Arnoldi orthogonalizes against all the previous q1, ..., qj is t = Aqj. This is not column j + 1 of K, as in Gram-Schmidt. Arnoldi is factoring AQ!
Arnoldi factorization AQ = QH for the final subspace 𝒦_n:

A [ q1 ... qn ] = [ q1 ... qn ] H.
This matrix H is upper triangular plus one lower subdiagonal, which makes it "upper Hessenberg." The h_{ij} in step 2 go down column j as far as the diagonal. Then h_{j+1,j} in step 4 is below the diagonal. We check that the first column of AQ = QH (multiplying by columns) is Arnoldi's first cycle that produces q2:
Column 1:   Aq1 = h11 q1 + h21 q2,   which is   q2 = (Aq1 - h11 q1)/h21.   (9)
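Equation (9) is easy to check numerically. The values below use the diagonal matrix A = diag(1,2,3,4) with b = (1,1,1,1)', the matrix behind the Vandermonde example:

```python
import numpy as np

# Check of equation (9): column 1 of AQ = QH reproduces Arnoldi's first cycle.
A = np.diag([1.0, 2.0, 3.0, 4.0])
b = np.ones(4)

q1 = b / np.linalg.norm(b)            # q1 = [1 1 1 1]'/2
h11 = q1 @ (A @ q1)                   # 5/2
t = A @ q1 - h11 * q1                 # [-3 -1 1 3]'/4
h21 = np.linalg.norm(t)               # sqrt(5)/2
q2 = t / h21                          # [-3 -1 1 3]'/sqrt(20)

print(np.allclose(A @ q1, h11 * q1 + h21 * q2))   # Aq1 = h11 q1 + h21 q2
```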
CG algorithm
• Basic idea: iterative method which finds the optimal x_k at each step k, with x_k ∈ K_k
• CG: optimal ⟺ r_k ⊥ K_k
• Again: r_k ⊥ K_k ⇒ r_k^T r_i = 0 for i < k
• In particular: with the q's an orthonormal basis, K_j = span(q1, ..., qj), we get r_k = ±|r_k| q_{k+1}
• Important requirement: A symmetric, positive definite!
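The bullets above can be sketched in code. This is a minimal conjugate gradient loop, not a production implementation (no convergence test, no preconditioner); the 3x3 symmetric positive definite matrix and right-hand side are illustrative assumptions. It records the residuals to exhibit r_k^T r_i = 0 for i < k:

```python
import numpy as np

# Minimal conjugate gradient sketch; A must be symmetric positive definite.
# A and b are illustrative choices.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])

x = np.zeros(3)
r = b - A @ x                         # initial residual
d = r.copy()                          # initial search direction
residuals = [r.copy()]
for _ in range(3):                    # at most N steps in exact arithmetic
    alpha = (r @ r) / (d @ (A @ d))   # step length along d
    x = x + alpha * d
    r_new = r - alpha * (A @ d)       # updated residual
    beta = (r_new @ r_new) / (r @ r)  # makes the directions A-conjugate
    d = r_new + beta * d
    r = r_new
    residuals.append(r.copy())

print(np.allclose(A @ x, b))                      # solved within N = 3 steps
print(abs(residuals[0] @ residuals[1]) < 1e-10)   # r_1 is orthogonal to r_0
```

The orthogonality of successive residuals is exactly the statement r_k ⊥ K_k, since the earlier residuals span the earlier Krylov spaces.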