Lecture 8: Fast Linear Solvers (Part 5)
Conjugate Gradient (CG) Method
• Solve $Ax = b$ with $A$ an $n \times n$ symmetric positive definite matrix.
• Define the quadratic function
    $g(x) = \frac{1}{2} x^T A x - x^T b.$
  If $x^*$ minimizes $g(x)$, then $x^*$ is the solution to $Ax = b$.
• $\nabla g(x) = \left(\frac{\partial g}{\partial x_1}, \dots, \frac{\partial g}{\partial x_n}\right)^T = Ax - b.$
• The iteration takes the form $x^{(k+1)} = x^{(k)} + \alpha_k v^{(k)}$, where $v^{(k)}$ is the search direction and $\alpha_k$ is the step size.
• Define $r^{(k)} = b - A x^{(k)}$ to be the residual vector.
• Let $x$ and $v \neq 0$ be fixed vectors and let $\alpha$ be a real variable. Define
    $h(\alpha) = g(x + \alpha v) = g(x) - \alpha \langle v, b - Ax \rangle + \frac{1}{2} \alpha^2 \langle v, Av \rangle.$
  $h(\alpha)$ has a minimum where $h'(\alpha) = 0$. This occurs when
    $\hat{\alpha} = \frac{v^T (b - Ax)}{v^T A v}.$
  So
    $h(\hat{\alpha}) = g(x) - \frac{\left(v^T (b - Ax)\right)^2}{2\, v^T A v}.$
• Suppose $x^*$ is a vector that minimizes $g(x)$. Then $g(x^* + \alpha v) \geq g(x^*)$ for every $v$ and $\alpha$, which implies $v^T (b - Ax^*) = 0$ for all $v$. Therefore $b - Ax^* = 0$.
• For any $v \neq 0$, $g(x + \hat{\alpha} v) < g(x)$ unless $v^T (b - Ax) = 0$, where
    $\hat{\alpha} = \frac{v^T (b - Ax)}{v^T A v}.$
• How should the search direction $v$ be chosen?
  – Method of steepest descent: $v = -\nabla g(x) = b - Ax$.
• Remark: steepest descent converges slowly for linear systems.
Algorithm. Let $x^{(0)}$ be an initial guess.
for $k = 1, 2, \dots$
    $v^{(k)} = b - A x^{(k-1)}$
    $\alpha_k = \dfrac{\langle v^{(k)}, b - A x^{(k-1)} \rangle}{\langle v^{(k)}, A v^{(k)} \rangle}$
    $x^{(k)} = x^{(k-1)} + \alpha_k v^{(k)}$
end
Steepest descent method when $\lambda_{\max}/\lambda_{\min}$ is large
• Consider solving $Ax = b$ with
    $A = \begin{pmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{pmatrix}$, $b = \begin{pmatrix} b_1 \\ b_2 \end{pmatrix}$, and start vector $x^{(0)} = \begin{pmatrix} -9 \\ -1 \end{pmatrix}$,
  iterating until $\|A x^{(k)} - b\|_2 < 10^{-4}$ (a C sketch follows below).
  – With $\lambda_1 = 1$, $\lambda_2 = 2$, it takes about 10 iterations.
  – With $\lambda_1 = 1$, $\lambda_2 = 10$, it takes about 40 iterations.
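The following is a minimal C sketch of this experiment, not code from the slides; since the slide leaves $b$ unspecified, $b = 0$ (exact solution $x^* = 0$) is assumed here.

```c
#include <stdio.h>
#include <math.h>

/* Steepest descent on A = diag(l1, l2) with b = 0 (assumed) and x0 = (-9, -1). */
int main(void) {
    double l1 = 1.0, l2 = 10.0;      /* eigenvalues of A; try l2 = 2.0 vs. 10.0 */
    double x[2] = {-9.0, -1.0};      /* start vector from the slide */
    for (int k = 0; k < 1000; k++) {
        double v[2] = {-l1 * x[0], -l2 * x[1]};   /* v = b - Ax = -Ax since b = 0 */
        double res = sqrt(v[0] * v[0] + v[1] * v[1]);   /* ||Ax - b||_2 */
        if (res < 1e-4) { printf("converged after %d iterations\n", k); return 0; }
        double alpha = (v[0] * v[0] + v[1] * v[1])
                     / (l1 * v[0] * v[0] + l2 * v[1] * v[1]);  /* <v,v> / <v,Av> */
        x[0] += alpha * v[0];
        x[1] += alpha * v[1];
    }
    printf("no convergence within 1000 iterations\n");
    return 1;
}
```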
• Second approach to choosing the search directions $v$:
  – A-orthogonal approach: use a set of nonzero direction vectors $\{v^{(1)}, \dots, v^{(n)}\}$ that satisfy $\langle v^{(i)}, A v^{(j)} \rangle = 0$ if $i \neq j$. The set $\{v^{(1)}, \dots, v^{(n)}\}$ is called A-orthogonal.
• Theorem. Let $\{v^{(1)}, \dots, v^{(n)}\}$ be an A-orthogonal set of nonzero vectors associated with the symmetric, positive definite matrix $A$, and let $x^{(0)}$ be arbitrary. Define
    $\alpha_k = \frac{\langle v^{(k)}, b - A x^{(k-1)} \rangle}{\langle v^{(k)}, A v^{(k)} \rangle}$ and $x^{(k)} = x^{(k-1)} + \alpha_k v^{(k)}$ for $k = 1, 2, \dots, n$.
  Then $A x^{(n)} = b$ when arithmetic is exact. (A short justification follows.)
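Why $n$ steps suffice (a sketch of the standard argument, not spelled out on the slide): an A-orthogonal set of nonzero vectors is a basis of $\mathbb{R}^n$, so the initial error can be expanded in it,

$$x^* - x^{(0)} = \sum_{k=1}^{n} c_k v^{(k)}, \qquad c_k = \frac{\langle v^{(k)}, A(x^* - x^{(0)}) \rangle}{\langle v^{(k)}, A v^{(k)} \rangle} = \frac{\langle v^{(k)}, b - A x^{(0)} \rangle}{\langle v^{(k)}, A v^{(k)} \rangle}.$$

Since $x^{(k-1)} - x^{(0)} \in \mathrm{span}\{v^{(1)}, \dots, v^{(k-1)}\}$ is A-orthogonal to $v^{(k)}$, we have $\langle v^{(k)}, b - A x^{(k-1)} \rangle = \langle v^{(k)}, b - A x^{(0)} \rangle$, so $\alpha_k = c_k$ and $x^{(n)} = x^{(0)} + \sum_{k=1}^{n} c_k v^{(k)} = x^*$.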
Conjugate Gradient Method
• The conjugate gradient method is due to Hestenes and Stiefel.
• Main idea: construct the directions $\{v^{(1)}, v^{(2)}, \dots\}$ during the iteration so that the residual vectors $r^{(k)}$ are mutually orthogonal.
Algorithm of CG Method
Let $x^{(0)}$ be an initial guess. Set $r^{(0)} = b - A x^{(0)}$; $v^{(1)} = r^{(0)}$.
for $k = 1, 2, \dots$
    $\alpha_k = \dfrac{\langle r^{(k-1)}, r^{(k-1)} \rangle}{\langle v^{(k)}, A v^{(k)} \rangle}$
    $x^{(k)} = x^{(k-1)} + \alpha_k v^{(k)}$
    $r^{(k)} = r^{(k-1)} - \alpha_k A v^{(k)}$    // construct residual
    $t_k = \langle r^{(k)}, r^{(k)} \rangle$; if $t_k < \epsilon$, exit    // convergence test
    $s_k = \dfrac{\langle r^{(k)}, r^{(k)} \rangle}{\langle r^{(k-1)}, r^{(k-1)} \rangle}$
    $v^{(k+1)} = r^{(k)} + s_k v^{(k)}$    // construct new search direction
end
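A minimal, self-contained C sketch of this algorithm on a small dense SPD system (the test matrix and right-hand side are illustrative choices, not from the slides):

```c
#include <stdio.h>
#include <math.h>

#define N 3  /* illustrative problem size */

/* y = A*x for a dense N x N matrix */
static void matvec(const double A[N][N], const double x[N], double y[N]) {
    for (int i = 0; i < N; i++) {
        y[i] = 0.0;
        for (int j = 0; j < N; j++) y[i] += A[i][j] * x[j];
    }
}

static double dot(const double x[N], const double y[N]) {
    double s = 0.0;
    for (int i = 0; i < N; i++) s += x[i] * y[i];
    return s;
}

int main(void) {
    /* Small SPD test matrix (diagonally dominant) and right-hand side. */
    double A[N][N] = {{4, 1, 0}, {1, 3, 1}, {0, 1, 2}};
    double b[N] = {1, 2, 3};
    double x[N] = {0, 0, 0};                         /* x^(0) = 0 */
    double r[N], v[N], Av[N];

    for (int i = 0; i < N; i++) r[i] = b[i];         /* r^(0) = b - A x^(0) */
    for (int i = 0; i < N; i++) v[i] = r[i];         /* v^(1) = r^(0) */
    double rho = dot(r, r);                          /* <r^(0), r^(0)> */

    for (int k = 1; k <= N && rho > 1e-20; k++) {
        matvec(A, v, Av);
        double alpha = rho / dot(v, Av);             /* alpha_k */
        for (int i = 0; i < N; i++) x[i] += alpha * v[i];   /* update iterate */
        for (int i = 0; i < N; i++) r[i] -= alpha * Av[i];  /* update residual */
        double rho_new = dot(r, r);
        double s = rho_new / rho;                    /* s_k */
        for (int i = 0; i < N; i++) v[i] = r[i] + s * v[i]; /* new direction */
        rho = rho_new;
    }
    printf("x = (%g, %g, %g), ||r|| = %g\n", x[0], x[1], x[2], sqrt(rho));
    return 0;
}
```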
Remarks
• The constructed $\{v^{(1)}, v^{(2)}, \dots\}$ are pairwise A-orthogonal.
• Each iteration requires one matrix-vector multiplication, two dot products, and three scalar-vector updates.
• Due to round-off errors, in practice more than $n$ iterations may be needed to reach the solution.
• If the matrix $A$ is ill-conditioned, the CG method is sensitive to round-off errors (as a direct solver, CG is not as good as Gaussian elimination with pivoting).
• The main use of CG is as an iterative method applied to better-conditioned systems.
CG as Krylov Subspace Method
Theorem. The iterate $x^{(k)}$ of the CG method minimizes the function $g(x)$ with respect to the subspace
    $\mathcal{K}_k(A, r^{(0)}) = \mathrm{span}\{r^{(0)}, A r^{(0)}, A^2 r^{(0)}, \dots, A^{k-1} r^{(0)}\},$
i.e.,
    $x^{(k)} = \underset{x \,\in\, x^{(0)} + \mathcal{K}_k(A, r^{(0)})}{\arg\min}\; g(x).$
The subspace $\mathcal{K}_k(A, r^{(0)})$ is called a Krylov subspace.
Error Estimate
• Define the energy norm $\| \cdot \|_A$ of a vector $x$ with respect to the matrix $A$: $\|x\|_A = (x^T A x)^{1/2}$.
• Define the error $e^{(k)} = x^{(k)} - x^*$, where $x^*$ is the exact solution.
• Theorem.
    $\|x^{(k)} - x^*\|_A \leq 2 \left( \dfrac{\sqrt{\kappa(A)} - 1}{\sqrt{\kappa(A)} + 1} \right)^k \|x^{(0)} - x^*\|_A$, with $\kappa(A) = \mathrm{cond}(A) = \dfrac{\lambda_{\max}(A)}{\lambda_{\min}(A)} \geq 1.$
Remark: convergence is fast if the matrix $A$ is well-conditioned.
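A worked instance of the bound (the numbers are illustrative, not from the slides): take $\kappa(A) = 100$, so $\sqrt{\kappa(A)} = 10$. To guarantee an error reduction of $10^{-6}$ in the A-norm we need

$$2 \left( \frac{10 - 1}{10 + 1} \right)^k = 2 \left( \frac{9}{11} \right)^k \leq 10^{-6} \iff k \geq \frac{\ln(2 \times 10^6)}{\ln(11/9)} \approx \frac{14.51}{0.2007} \approx 72.3,$$

i.e., about 73 iterations suffice, independent of the problem size $n$.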
Preconditioning
Let the symmetric positive definite matrix $M$ be a preconditioner for $A$, and let $M = L L^T$ be its Cholesky factorization. $M^{-1} A$ is better conditioned than $A$. The preconditioned system of equations is
    $M^{-1} A x = M^{-1} b$, i.e. $L^{-T} L^{-1} A x = L^{-T} L^{-1} b$, where $L^{-T} = (L^T)^{-1}$.
Multiply by $L^T$ to obtain
    $L^{-1} A L^{-T} (L^T x) = L^{-1} b.$
Define $\tilde{A} = L^{-1} A L^{-T}$, $\tilde{x} = L^T x$, $\tilde{b} = L^{-1} b$.
Now apply CG to $\tilde{A} \tilde{x} = \tilde{b}$.
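Two quick checks that justify this transformation (not spelled out on the slide): $\tilde{A}$ is again symmetric positive definite,

$$\tilde{A}^T = (L^{-1} A L^{-T})^T = L^{-1} A L^{-T} = \tilde{A}, \qquad y^T \tilde{A} y = (L^{-T} y)^T A \,(L^{-T} y) > 0 \ \text{for } y \neq 0,$$

and it has the same eigenvalues as $M^{-1} A$, since

$$\tilde{A} = L^{-1} A L^{-T} = L^T (L^{-T} L^{-1} A) L^{-T} = L^T (M^{-1} A) L^{-T}$$

is a similarity transformation of $M^{-1} A$. So CG applied to $\tilde{A} \tilde{x} = \tilde{b}$ inherits the improved conditioning.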
Preconditioned CG Method
• Define $z^{(k)} = M^{-1} r^{(k)}$ to be the preconditioned residual.

Let $x^{(0)}$ be an initial guess. Set $r^{(0)} = b - A x^{(0)}$; solve $M z^{(0)} = r^{(0)}$ for $z^{(0)}$; set $v^{(1)} = z^{(0)}$.
for $k = 1, 2, \dots$
    $\alpha_k = \dfrac{\langle z^{(k-1)}, r^{(k-1)} \rangle}{\langle v^{(k)}, A v^{(k)} \rangle}$
    $x^{(k)} = x^{(k-1)} + \alpha_k v^{(k)}$
    $r^{(k)} = r^{(k-1)} - \alpha_k A v^{(k)}$
    solve $M z^{(k)} = r^{(k)}$ for $z^{(k)}$
    $t_k = \langle r^{(k)}, r^{(k)} \rangle$; if $t_k < \epsilon$, exit    // convergence test
    $s_k = \dfrac{\langle z^{(k)}, r^{(k)} \rangle}{\langle z^{(k-1)}, r^{(k-1)} \rangle}$
    $v^{(k+1)} = z^{(k)} + s_k v^{(k)}$
end
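A minimal C sketch of this preconditioned iteration, using the diagonal (Jacobi) preconditioner $M = \mathrm{diag}(A)$ discussed later in these notes; the test matrix and right-hand side are illustrative choices:

```c
#include <stdio.h>
#include <math.h>

#define N 3

static void matvec(const double A[N][N], const double x[N], double y[N]) {
    for (int i = 0; i < N; i++) {
        y[i] = 0.0;
        for (int j = 0; j < N; j++) y[i] += A[i][j] * x[j];
    }
}

static double dot(const double x[N], const double y[N]) {
    double s = 0.0;
    for (int i = 0; i < N; i++) s += x[i] * y[i];
    return s;
}

int main(void) {
    double A[N][N] = {{4, 1, 0}, {1, 3, 1}, {0, 1, 2}};  /* SPD test matrix */
    double b[N] = {1, 2, 3};
    double x[N] = {0, 0, 0};
    double r[N], z[N], v[N], Av[N];

    for (int i = 0; i < N; i++) r[i] = b[i];             /* r^(0) = b - A*0 */
    for (int i = 0; i < N; i++) z[i] = r[i] / A[i][i];   /* solve M z = r, M = diag(A) */
    for (int i = 0; i < N; i++) v[i] = z[i];             /* v^(1) = z^(0) */
    double zr = dot(z, r);                               /* <z^(0), r^(0)> */

    for (int k = 1; k <= 2 * N; k++) {
        matvec(A, v, Av);
        double alpha = zr / dot(v, Av);
        for (int i = 0; i < N; i++) x[i] += alpha * v[i];
        for (int i = 0; i < N; i++) r[i] -= alpha * Av[i];
        if (dot(r, r) < 1e-20) break;                    /* convergence test */
        for (int i = 0; i < N; i++) z[i] = r[i] / A[i][i]; /* solve M z = r */
        double zr_new = dot(z, r);
        double s = zr_new / zr;                          /* s_k */
        for (int i = 0; i < N; i++) v[i] = z[i] + s * v[i];
        zr = zr_new;
    }
    printf("x = (%g, %g, %g)\n", x[0], x[1], x[2]);
    return 0;
}
```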
Incomplete Cholesky Factorization
• Assume $A$ is symmetric positive definite and sparse.
• Factor $A = L L^T + R$, $R \neq 0$; $L$ has a sparsity structure similar to that of $A$.
for $k = 1, \dots, n$
    $l_{kk} = \sqrt{a_{kk}}$
    for $i = k + 1, \dots, n$
        $l_{ik} = a_{ik} / l_{kk}$
    endfor
    for $i = k + 1, \dots, n$
        for $j = k + 1, \dots, n$
            if $a_{ij} = 0$ then
                $l_{ij} = 0$    // no fill-in: keep the sparsity pattern of $A$
            else
                $a_{ij} = a_{ij} - l_{ik} l_{jk}$
            endif
        endfor
    endfor
endfor
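A compact C sketch of this factorization on a dense-stored symmetric matrix (illustrative only; production codes work in sparse formats and touch only the lower triangle):

```c
#include <math.h>

/* In-place incomplete Cholesky sketch: on exit the lower triangle of a holds L.
   Entries that are zero in A stay zero in L (no fill-in). */
void ichol(int n, double a[n][n]) {
    for (int k = 0; k < n; k++) {
        a[k][k] = sqrt(a[k][k]);
        for (int i = k + 1; i < n; i++)
            if (a[i][k] != 0.0) a[i][k] /= a[k][k];     /* column scaling */
        for (int j = k + 1; j < n; j++)
            for (int i = j; i < n; i++)
                if (a[i][j] != 0.0) a[i][j] -= a[i][k] * a[j][k]; /* update */
    }
}
```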
Jacobi Preconditioning
• In diagonal (Jacobi) preconditioning, $M = \mathrm{diag}(A)$.
• Jacobi preconditioning is cheap if it works, i.e., solving $M z^{(k)} = r^{(k)}$ for $z^{(k)}$ costs almost nothing (see the sketch below).
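A one-function C sketch of the Jacobi preconditioner solve (dense storage assumed, as in the earlier examples):

```c
/* Apply the Jacobi preconditioner: solve M z = r with M = diag(A). */
void jacobi_precond(int n, const double a[n][n], const double *r, double *z) {
    for (int i = 0; i < n; i++)
        z[i] = r[i] / a[i][i];   /* one division per unknown */
}
```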
Row-wise Block Striped Decomposition of a Symmetrically Banded Matrix
(Figure: row-wise decomposition of a banded matrix among processes.)
Parallel CG Algorithm
• Assume a row-wise block-striped decomposition of the matrix $A$ and partition all vectors uniformly among tasks.

Let $x^{(0)}$ be an initial guess. Set $r^{(0)} = b - A x^{(0)}$; solve $M z^{(0)} = r^{(0)}$ for $z^{(0)}$; set $v^{(1)} = z^{(0)}$.
for $k = 1, 2, \dots$
    $w = A v^{(k)}$    // parallel matrix-vector multiplication
    $\rho = \langle z^{(k-1)}, r^{(k-1)} \rangle$    // parallel dot product via MPI_Allreduce
    $\alpha_k = \rho \,/\, \langle v^{(k)}, w \rangle$    // parallel dot product via MPI_Allreduce
    $x^{(k)} = x^{(k-1)} + \alpha_k v^{(k)}$
    $r^{(k)} = r^{(k-1)} - \alpha_k w$
    solve $M z^{(k)} = r^{(k)}$ for $z^{(k)}$    // preconditioner solve; can involve additional communication
    $t_k = \langle r^{(k)}, r^{(k)} \rangle$    // MPI_Allreduce
    if $t_k < \epsilon$, exit    // convergence test
    $\rho' = \langle z^{(k)}, r^{(k)} \rangle$    // parallel dot product
    $s_k = \rho' / \rho$
    $v^{(k+1)} = z^{(k)} + s_k v^{(k)}$
end
(An MPI sketch of the parallel dot product follows below.)
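A minimal MPI sketch of the parallel dot product used above (not from the slides); each rank is assumed to hold a contiguous block of each vector, matching the row-wise decomposition:

```c
#include <mpi.h>
#include <stdio.h>

/* Parallel dot product: each rank holds n_local entries of x and y;
   MPI_Allreduce sums the partial results so every rank gets <x, y>. */
static double pdot(int n_local, const double *x, const double *y, MPI_Comm comm) {
    double local = 0.0, global = 0.0;
    for (int i = 0; i < n_local; i++) local += x[i] * y[i];
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, comm);
    return global;
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Toy data: two entries per rank, x = all ones, so <x,x> = 2 * nprocs. */
    double x[2] = {1.0, 1.0};
    double d = pdot(2, x, x, MPI_COMM_WORLD);
    if (rank == 0) printf("<x, x> = %g\n", d);

    MPI_Finalize();
    return 0;
}
```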