Lecture 17 - Mathematics (18.086): Krylov subspaces
math.mit.edu/~stoopn/18.086/Lecture17.pdf


Page 1: Lecture 17, 18.086

Page 2: Krylov subspaces


7.4 KRYLOV SUBSPACES AND CONJUGATE GRADIENTS

Our original equation is Ax = b. The preconditioned equation is P^{-1}Ax = P^{-1}b. When we write P^{-1}, we never intend that an inverse will be explicitly computed. P may come from incomplete LU, or a few steps of a multigrid iteration, or "domain decomposition." Entirely new preconditioners are waiting to be invented.

The residual is r_k = b - Ax_k. This is the error in Ax = b, not the error in x. An ordinary preconditioned iteration corrects x_k by the vector P^{-1}r_k:

    x_{k+1} = x_k + P^{-1}(b - Ax_k).    (1)
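A minimal MATLAB sketch of iteration (1), assuming a Jacobi preconditioner P = diag(A) purely for illustration (A, b, the starting vector, and the tolerance are placeholder choices; the iteration converges only when I - P^{-1}A has spectral radius below 1):

    P = diag(diag(A));                 % Jacobi preconditioner: diagonal of A
    x = zeros(size(b));                % arbitrary starting vector
    for k = 1:100
        r = b - A*x;                   % residual r_k = b - A*x_k
        if norm(r) < 1e-10, break; end
        x = x + P \ r;                 % correct x_k by the vector P^{-1}*r_k
    end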

In describing Krylov subspaces, I should work with P^{-1}A. For simplicity I will only write A! I am assuming that P has been chosen and used, and the preconditioned equation P^{-1}Ax = P^{-1}b is given the notation Ax = b. The preconditioner is now P = I. Our new A is probably better than the original matrix with that name.

With x_1 = b, look first at two steps of the pure iteration x_{j+1} = (I - A)x_j + b:

    x_2 = 2b - Ab,    x_3 = 3b - 3Ab + A^2b.    (2)

My point is simple but important: x_j is a combination of b, Ab, ..., A^{j-1}b. We can compute those vectors quickly, multiplying at each step by a sparse A. Every iteration involves only one matrix-vector multiplication. Krylov gave a name to all combinations of those vectors b, ..., A^{j-1}b, and he suggested that there might be better combinations (closer to x = A^{-1}b) than the particular choices x_j in (2).
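To make "one matrix-vector multiplication per step" concrete, here is a small MATLAB sketch that generates the Krylov vectors (A, b, and the dimension j are assumed given):

    j = 5;
    K = zeros(length(b), j);
    K(:,1) = b;
    for i = 2:j
        K(:,i) = A * K(:,i-1);    % each new vector costs one product with sparse A
    end
    % every iterate x_j of (2) lies in the column space of K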

Krylov Subspaces

The linear combinations of b, Ab, ..., A^{j-1}b form the j-th Krylov subspace. This space depends on A and b. Following convention, I will write 𝒦_j for that subspace and K_j for the matrix with those basis vectors in its columns:

Krylov matrix      K_j = [ b  Ab  A^2b  ...  A^{j-1}b ].

Krylov subspace    𝒦_j = all combinations of b, Ab, ..., A^{j-1}b.    (3)

Thus 𝒦_j is the column space of K_j. We want to choose the best combination as our improved x_j. Various definitions of "best" will give various x_j. Here are four different approaches to choosing a good x_j in 𝒦_j (this is the important decision):

1. The residual r_j = b - Ax_j is orthogonal to 𝒦_j (Conjugate Gradients).

2. The residual r_j has minimum norm for x_j in 𝒦_j (GMRES and MINRES).

3. r_j is orthogonal to a different space 𝒦_j(A^T) (BiConjugate Gradients).

4. The error e_j has minimum norm in 𝒦_j (SYMMLQ).
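For reference, MATLAB's built-in Krylov solvers line up with these four approaches; a sketch of the calls (the tolerance and iteration limit are arbitrary choices here):

    tol = 1e-8; maxit = 200;
    x1 = pcg(A, b, tol, maxit);          % 1. Conjugate Gradients (A sym. pos. def.)
    x2 = minres(A, b, tol, maxit);       % 2. minimum residual norm (A symmetric)
    x2 = gmres(A, b, [], tol, maxit);    % 2. GMRES for nonsymmetric A (no restart)
    x3 = bicg(A, b, tol, maxit);         % 3. BiConjugate Gradients
    x4 = symmlq(A, b, tol, maxit);       % 4. SYMMLQ (A symmetric)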

• There is an integer n such that K_j ⊂ K_n = K_{n+1} = K_{n+2} = ...

• n is the maximum dimension that the Krylov subspaces can have.

• If A is N×N, then n ≤ N.

• Ax* = b ⇒ x* ∈ K_n.

• Krylov: good approximations to Ax = b can be found in spaces with j < n.

• Idea: find the optimal x in the j-th Krylov space, then iteratively go to higher Krylov spaces and improve x.

• Optimal x can be defined in terms of the residual, e.g.:
  - optimize x such that r_k ⊥ K_j (conjugate gradient; A symmetric, positive definite),
  - optimize x such that |r_k| is minimal (GMRES, MINRES algorithms).

Page 3: Arnoldi orthogonalization

see lecture / book

=> Need a good basis of Krylov spaces: 1. orthonormal 2. built iteratively

=> Arnoldi orthogonalization


Arnoldi's orthogonalization of b, Ab, ..., A^{n-1}b:

0   q1 = b/||b||;                   % Normalize b to ||q1|| = 1
    for j = 1, ..., n-1             % Start computation of q_{j+1}
1     t = A*qj;                     % one matrix multiplication
      for i = 1, ..., j             % t is in the space K_{j+1}
2       h_ij = qi'*t;               % h_ij*qi = projection of t on qi
3       t = t - h_ij*qi;            % Subtract that projection
      end                           % t is orthogonal to q1, ..., qj
4     h_{j+1,j} = ||t||;            % Compute the length of t
5     q_{j+1} = t/h_{j+1,j};        % Normalize t to ||q_{j+1}|| = 1
    end                             % q1, ..., qn are orthonormal
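The same steps as a runnable MATLAB function (the name arnoldi and the returned pair Q, H are my choices; the final command for the last column of H is taken from the remark below):

    function [Q, H] = arnoldi(A, b, n)
    % Arnoldi orthogonalization of b, Ab, ..., A^{n-1}b
    Q = zeros(length(b), n);
    H = zeros(n, n);
    Q(:,1) = b / norm(b);             % normalize b to ||q1|| = 1
    for j = 1:n-1                     % start computation of q_{j+1}
        t = A * Q(:,j);               % one matrix multiplication
        for i = 1:j
            H(i,j) = Q(:,i)' * t;     % projection coefficient h_ij
            t = t - H(i,j) * Q(:,i);  % subtract the projection on q_i
        end
        H(j+1,j) = norm(t);           % length of t (assumed nonzero here)
        Q(:,j+1) = t / H(j+1,j);      % normalize t to ||q_{j+1}|| = 1
    end
    H(:,n) = Q' * A * Q(:,n);         % last column, as noted in the text
    end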

h11 = 5/2    t = Aq1 - (5/2)q1 = [-3 -1 1 3]'/4    h21 = √5/2    q2 = [-3 -1 1 3]'/√20    (basis for Krylov space)

You might like to see the four orthonormal vectors in the Vandermonde example. Those columns q1, q2, q3, q4 of Q are still constant, linear, quadratic, and cubic. I can also display the matrix H of numbers h_ij that produced the q's from the Krylov vectors b, Ab, A^2b, A^3b. (Since Arnoldi stops at j = n-1, the last column of H is not actually computed. It comes from a final command H(:,n) = Q'*A*Q(:,n).)

H turns out to be symmetric and tridiagonal when A^T = A (as here).

Arnoldi's method for the Vandermonde example V gives Q and H:

    Basis in Q:
    Q = [ 1/2  -3/√20   1/2  -1/√20
          1/2  -1/√20  -1/2   3/√20
          1/2   1/√20  -1/2  -3/√20
          1/2   3/√20   1/2   1/√20 ]

    Multipliers h_ij:
    H = [ 5/2   √5/2   0      0
          √5/2  5/2    2/√5   0
          0     2/√5   5/2    3/√20
          0     0      3/√20  5/2 ]
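These numbers can be checked against the arnoldi sketch above. In my reading of the Vandermonde example, A = diag([1 2 3 4]) and b = ones(4,1):

    A = diag([1 2 3 4]); b = ones(4,1);
    [Q, H] = arnoldi(A, b, 4);
    norm(A*Q - Q*H)       % ~1e-15: the factorization AQ = QH holds
    % Q(:,1) constant, Q(:,2) linear, Q(:,3) quadratic, Q(:,4) cubic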

Please notice that H is not upper triangular as in Gram-Schmidt. The usual QR factorization of the original Krylov matrix K (which is V in our example) has this same Q, but Arnoldi's QH is different from K = QR. The vector t that Arnoldi orthogonalizes against all the previous q1, ..., qj is t = Aqj. This is not column j+1 of K, as in Gram-Schmidt. Arnoldi is factoring AQ!

Arnoldi factorization AQ = QH for the final subspace K_n:

    A [ q1 ... qn ] = [ q1 ... qn ] [ h11  h12  .    h1n
                                      h21  h22  .    h2n
                                      0    h32  .    .
                                      0    0    h_{n,n-1}  hnn ]    (8)

This matrix H is upper triangular plus one lower diagonal, which makes it "upper Hessenberg." The h_ij in step 2 go down column j as far as the diagonal. Then h_{j+1,j} in step 4 is below the diagonal. We check that the first column of AQ = QH (multiplying by columns) is Arnoldi's first cycle that produces q2:

Column 1:    Aq1 = h11*q1 + h21*q2, which is q2 = (Aq1 - h11*q1)/h21.    (9)

Page 4: CG algorithm

• Basic idea: iterative method that finds the optimal x_k at each step k, with x_k ∈ K_k (a code sketch follows this list).

• CG: optimal ⟺ r_k ⊥ K_k.

• Again: r_k ⊥ K_k ⇒ r_k^T r_i = 0 for i < k.

• In particular: with the q's an orthonormal basis, K_j = span(q1, ..., qj), we have r_k = ±|r_k| q_{k+1}.

• Important requirement: A symmetric, positive definite!
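A minimal MATLAB sketch of the conjugate gradient iteration in its classical form (the function name cg, the zero starting vector, and the stopping tolerance are my choices; the updates keep r_k orthogonal to the earlier residuals, as required above):

    function x = cg(A, b, maxit)
    % Conjugate gradients for symmetric positive definite A
    x = zeros(size(b));
    r = b;                                % r_0 = b - A*x_0 with x_0 = 0
    d = r;                                % first search direction
    for k = 1:maxit
        Ad = A * d;                       % one matrix-vector product per step
        alpha = (r' * r) / (d' * Ad);     % step length along d
        x = x + alpha * d;                % optimal x_k in K_k
        rnew = r - alpha * Ad;            % new residual, orthogonal to old ones
        if norm(rnew) < 1e-10, break; end
        beta = (rnew' * rnew) / (r' * r); % makes the next direction A-conjugate
        d = rnew + beta * d;
        r = rnew;
    end
    end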