
Linear Systems of Equations with Dense Matrices GE in Parallel: Blockwise QR-Decomposition

Parallel Numerics, WT 2012/2013

3 Linear Systems of Equations with Dense Matrices


Contents

1 Introduction
    1.1 Computer Science Aspects
    1.2 Numerical Problems
    1.3 Graphs
    1.4 Loop Manipulations

2 Elementary Linear Algebra Problems
    2.1 BLAS: Basic Linear Algebra Subroutines
    2.2 Matrix-Vector Operations
    2.3 Matrix-Matrix-Product

3 Linear Systems of Equations with Dense Matrices
    3.1 Gaussian Elimination
    3.2 Parallelization
    3.3 QR-Decomposition with Householder matrices

4 Sparse Matrices
    4.1 General Properties, Storage
    4.2 Sparse Matrices and Graphs
    4.3 Reordering
    4.4 Gaussian Elimination for Sparse Matrices

5 Iterative Methods for Sparse Matrices
    5.1 Stationary Methods
    5.2 Nonstationary Methods
    5.3 Preconditioning

6 Domain Decomposition
    6.1 Overlapping Domain Decomposition
    6.2 Non-overlapping Domain Decomposition
    6.3 Schur Complements


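3.1. Linear Systems of Equations with Dense Matrices

3.1.1. Gaussian Elimination: Basic Properties

• Linear system of equations:

\[
\begin{aligned}
a_{11} x_1 + \dots + a_{1n} x_n &= b_1 \\
&\;\;\vdots \\
a_{n1} x_1 + \dots + a_{nn} x_n &= b_n
\end{aligned}
\]

• Solve $Ax = b$:

\[
\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{n1} & \cdots & a_{nn} \end{pmatrix}
\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}
=
\begin{pmatrix} b_1 \\ \vdots \\ b_n \end{pmatrix}
\]

• Generate simpler linear equations (matrices). Transform $A$ into triangular form: $A = A^{(1)} \to A^{(2)} \to \dots \to A^{(n)} = U$.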


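Transformation to Upper Triangular Form

\[
A = \begin{pmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{pmatrix}
\]

The row transformations $(2) \to (2) - \frac{a_{21}}{a_{11}} \cdot (1), \; \dots, \; (n) \to (n) - \frac{a_{n1}}{a_{11}} \cdot (1)$ lead to

\[
A^{(2)} = \begin{pmatrix}
a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\
0 & a^{(2)}_{22} & a^{(2)}_{23} & \cdots & a^{(2)}_{2n} \\
0 & a^{(2)}_{32} & a^{(2)}_{33} & \cdots & a^{(2)}_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & a^{(2)}_{n2} & a^{(2)}_{n3} & \cdots & a^{(2)}_{nn}
\end{pmatrix}
\]

Next transformations: $(3) \to (3) - \frac{a^{(2)}_{32}}{a^{(2)}_{22}} \cdot (2), \; \dots, \; (n) \to (n) - \frac{a^{(2)}_{n2}}{a^{(2)}_{22}} \cdot (2)$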


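Transformation to Triangular Form (cont.)

\[
A^{(3)} = \begin{pmatrix}
a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\
0 & a^{(2)}_{22} & a^{(2)}_{23} & \cdots & a^{(2)}_{2n} \\
0 & 0 & a^{(3)}_{33} & \cdots & a^{(3)}_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & a^{(3)}_{n3} & \cdots & a^{(3)}_{nn}
\end{pmatrix}
\]

Next transformations: $(4) \to (4) - \frac{a^{(3)}_{43}}{a^{(3)}_{33}} \cdot (3), \; \dots, \; (n) \to (n) - \frac{a^{(3)}_{n3}}{a^{(3)}_{33}} \cdot (3)$

\[
A^{(n)} = \begin{pmatrix}
a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\
0 & a^{(2)}_{22} & a^{(2)}_{23} & \cdots & a^{(2)}_{2n} \\
0 & 0 & a^{(3)}_{33} & \cdots & a^{(3)}_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & a^{(n)}_{nn}
\end{pmatrix} = U
\]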


Pseudocode Gaussian Elimination (GE)

Simplification: assume that no pivoting is necessary, i.e. $a^{(k)}_{kk} \neq 0$ or $|a^{(k)}_{kk}| \geq \rho > 0$ for $k = 1, 2, \dots, n$.

for k = 1 : n − 1
    for i = k + 1 : n
        l_{i,k} = a_{i,k} / a_{k,k}
    end
    for i = k + 1 : n
        for j = k + 1 : n
            a_{i,j} = a_{i,j} − l_{i,k} · a_{k,j}
        end
    end
end

In practice:
• Include pivoting and include the right-hand side b.
• A triangular system in U still remains to be solved.
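The pseudocode translates almost line by line into runnable code. The following is a minimal sketch in plain Python, under the same assumption as the slide (no pivoting, all pivots nonzero); the function name `lu_no_pivot` and the choice to return L and U as separate matrices are assumptions made for illustration.

```python
def lu_no_pivot(a):
    """Return (L, U) with A = L U, assuming every pivot a_kk^(k) is nonzero."""
    n = len(a)
    u = [[float(x) for x in row] for row in a]                  # working copy, becomes U
    l = [[float(i == j) for j in range(n)] for i in range(n)]   # unit lower triangular
    for k in range(n - 1):
        if u[k][k] == 0.0:
            raise ZeroDivisionError("zero pivot: pivoting would be required")
        for i in range(k + 1, n):
            l[i][k] = u[i][k] / u[k][k]      # multiplier l_ik
            u[i][k] = 0.0                    # entry eliminated by the row operation
            for j in range(k + 1, n):
                u[i][j] -= l[i][k] * u[k][j]
    return l, u


A = [[2, 1, 1], [4, 3, 3], [8, 7, 9]]
L, U = lu_no_pivot(A)
```

The two inner `i` loops of the pseudocode are merged here, which does not change the result because row `i` only depends on row `k`.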


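Intermediate Systems

$A^{(k)}$, $k = 1, 2, \dots, n$, with $A = A^{(1)}$ and $U = A^{(n)}$:

\[
A^{(k)} = \begin{pmatrix}
a^{(1)}_{11} & \cdots & a^{(1)}_{1,k-1} & a^{(1)}_{1,k} & \cdots & a^{(1)}_{1,n} \\
0 & \ddots & \vdots & \vdots & & \vdots \\
\vdots & \ddots & a^{(k-1)}_{k-1,k-1} & a^{(k-1)}_{k-1,k} & \cdots & a^{(k-1)}_{k-1,n} \\
0 & \cdots & 0 & a^{(k)}_{k,k} & \cdots & a^{(k)}_{k,n} \\
\vdots & & \vdots & \vdots & \ddots & \vdots \\
0 & \cdots & 0 & a^{(k)}_{n,k} & \cdots & a^{(k)}_{n,n}
\end{pmatrix}
\]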


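Define Auxiliary Matrices

\[
L = \begin{pmatrix}
1 & 0 & \cdots & 0 \\
l_{2,1} & 1 & \ddots & 0 \\
\vdots & \ddots & \ddots & \vdots \\
l_{n,1} & \cdots & l_{n,n-1} & 1
\end{pmatrix}
\quad \text{and} \quad U = A^{(n)}
\]

\[
L_k := \begin{pmatrix}
0 & \cdots & 0 & 0 & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & \cdots & 0 & 0 & 0 & \cdots & 0 \\
0 & \cdots & 0 & l_{k+1,k} & 0 & \cdots & 0 \\
\vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\
0 & \cdots & 0 & l_{n,k} & 0 & \cdots & 0
\end{pmatrix},
\qquad L = I + \sum_k L_k
\]

The only nonzero entries of $L_k$ are $l_{k+1,k}, \dots, l_{n,k}$ in column $k$.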


Elimination Step in Terms of Auxiliary Matrices

\[ A^{(k+1)} = (I - L_k) \cdot A^{(k)} = A^{(k)} - L_k \cdot A^{(k)} \]

\[ U = A^{(n)} = (I - L_{n-1}) \cdot A^{(n-1)} = \dots = (I - L_{n-1}) \cdots (I - L_1) A^{(1)} = \tilde{L} \cdot A \]

\[ \tilde{L} := (I - L_{n-1}) \cdots (I - L_1) \]

$A = \tilde{L}^{-1} \cdot U$ with $U$ upper triangular and $\tilde{L}$ lower triangular.

• Theorem 2: $\tilde{L}^{-1} = L$ and therefore $A = LU$.

• Advantage: every further problem $Ax = b_j$ can be reduced to $(LU)x = b_j$ for arbitrary $j$.

• Solve two triangular problems: $(LU)x = b$ becomes $Ly = b$ followed by $Ux = y$.
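Once L and U are known, each additional right-hand side costs only two triangular solves. The sketch below illustrates this in plain Python; the function names and the hard-coded factors (those of the example matrix A = [[2,1,1],[4,3,3],[8,7,9]]) are assumptions chosen for illustration.

```python
def forward_substitution(l, b):
    """Solve L y = b for a lower triangular matrix L."""
    y = [0.0] * len(b)
    for i in range(len(b)):
        s = sum(l[i][j] * y[j] for j in range(i))
        y[i] = (b[i] - s) / l[i][i]
    return y


def back_substitution(u, y):
    """Solve U x = y for an upper triangular matrix U."""
    n = len(y)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(u[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / u[i][i]
    return x


def lu_solve(l, u, b):
    """Solve (L U) x = b via L y = b followed by U x = y."""
    return back_substitution(u, forward_substitution(l, b))


# Factors of A = [[2, 1, 1], [4, 3, 3], [8, 7, 9]], reused for two right-hand sides.
L = [[1.0, 0.0, 0.0], [2.0, 1.0, 0.0], [4.0, 3.0, 1.0]]
U = [[2.0, 1.0, 1.0], [0.0, 1.0, 1.0], [0.0, 0.0, 2.0]]
right_hand_sides = [[4.0, 10.0, 24.0], [2.0, 4.0, 8.0]]
solutions = [lu_solve(L, U, b) for b in right_hand_sides]
```

The factorization costs O(n^3) operations, while each call to `lu_solve` costs only O(n^2), which is why reusing L and U pays off for many right-hand sides.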


Theorem 2: $\tilde{L}^{-1} = L \;\Rightarrow\; A = LU$

For $i \le j$: $L_i \cdot L_j = 0$. Therefore

\[ (I + L_j)(I - L_j) = I + L_j - L_j - L_j^2 = I \;\Rightarrow\; (I - L_j)^{-1} = I + L_j \]

\[ (I + L_i)(I + L_j) = I + L_i + L_j + L_i L_j = I + L_i + L_j \]

and hence

\[
\tilde{L}^{-1} = [(I - L_{n-1}) \cdots (I - L_1)]^{-1} = (I - L_1)^{-1} \cdots (I - L_{n-1})^{-1}
= (I + L_1)(I + L_2) \cdots (I + L_{n-1}) = I + L_1 + L_2 + \dots + L_{n-1} = L
\]
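The identities used in the proof can be checked numerically on a small case. The sketch below uses a 3 × 3 example with integer multipliers (the values 2, 4, 3 and all helper names are assumptions for illustration). It also shows that the order of the factors matters: the product only collapses to a sum when the indices increase from left to right.

```python
def matmul(a, b):
    """Plain triple-loop matrix product of two square matrices."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
L1 = [[0, 0, 0], [2, 0, 0], [4, 0, 0]]   # multipliers l_21, l_31 in column 1
L2 = [[0, 0, 0], [0, 0, 0], [0, 3, 0]]   # multiplier l_32 in column 2

zero_product = matmul(L1, L2)                         # L_i L_j = 0 for i <= j
inverse_check = matmul(add(I, L1), sub(I, L1))        # (I + L_1)(I - L_1) = I
ordered = matmul(add(I, L1), add(I, L2))              # equals I + L_1 + L_2 = L
reversed_order = matmul(add(I, L2), add(I, L1))       # picks up the extra term L_2 L_1
```

In the reversed product the term L_2 L_1 does not vanish, so the entry in position (3,1) becomes 4 + 3·2 = 10 instead of 4.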


3.2. GE in Parallel: Blockwise

Main idea: Blocking of GE to avoid data transfer between processors.

Basic Concepts:

Replace GE, or the large LU-decomposition of the full matrix, by a sequence of small block operations:

• Solving a collection of small triangular systems $L U_k = B_k$ (parallelism in the columns of $U$)

• Updating matrices $A \to A - LU$ (also easy to parallelize)

• Small LU-decompositions $B = LU$ (parallelism in the rows of $B$)


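How to Choose Blocks in L/U Satisfying LU = A

\[
\begin{pmatrix} L_{11} & 0 & 0 \\ L_{21} & L_{22} & 0 \\ L_{31} & L_{32} & L_{33} \end{pmatrix}
\begin{pmatrix} U_{11} & U_{12} & U_{13} \\ 0 & U_{22} & U_{23} \\ 0 & 0 & U_{33} \end{pmatrix}
=
\begin{pmatrix} A_{11} & A_{12} & A_{13} \\ A_{21} & A_{22} & A_{23} \\ A_{31} & A_{32} & A_{33} \end{pmatrix}
=
\begin{pmatrix}
L_{11}U_{11} & L_{11}U_{12} & L_{11}U_{13} \\
L_{21}U_{11} & L_{21}U_{12} + L_{22}U_{22} & L_{21}U_{13} + L_{22}U_{23} \\
L_{31}U_{11} & L_{31}U_{12} + L_{32}U_{22} & \ast
\end{pmatrix}
\]

Different ways of computing L and U depending on
• start (assume first entry/row/column of L/U as given)
• how to compute new entry/row/column of L/U
• update of block structure of L/U by grouping in
    – known blocks
    – blocks newly to compute
    – blocks to be computed later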


Crout Form


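Crout Form (cont.)

1. Solve, using the second block column of $LU = A$,

\[ \begin{pmatrix} L_{22} \\ L_{32} \end{pmatrix} U_{22} = \begin{pmatrix} A_{22} \\ A_{32} \end{pmatrix} - \begin{pmatrix} L_{21} \\ L_{31} \end{pmatrix} U_{12} \]

by a small LU-decomposition of the modified part of $A$ → $L_{22}$, $L_{32}$, and $U_{22}$.

2. Solve, using the block equation for $A_{23}$,

\[ L_{22} U_{23} = A_{23} - L_{21} U_{13} \]

by solving small triangular systems of equations in $L_{22}$ → $U_{23}$.

Initial steps:

\[
L_{11} U_{11} = A_{11}, \qquad
\begin{pmatrix} L_{21} \\ L_{31} \end{pmatrix} U_{11} = \begin{pmatrix} A_{21} \\ A_{31} \end{pmatrix}, \qquad
L_{11} \begin{pmatrix} U_{12} & U_{13} \end{pmatrix} = \begin{pmatrix} A_{12} & A_{13} \end{pmatrix}
\]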


New Partitioning

• Combine the already computed parts from the second column of L and the second row of U into the first column of L and the first row of U.

• Split the parts $L_{33}$ and $U_{33}$, which have been ignored so far, into new columns/rows.

• Repeat this overall procedure until L and U are fully computed.


Block Structure

Intermediate block structure: [figure not included in the transcript]

Solve for red blocks.

Reconfigure the block structure: [figure not included in the transcript]

Repeat until done.


Left Looking GE

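• Solve $L_{11} U_{12} = A_{12}$ by a couple of parallel triangular solves. Then update part of $A$ and perform a small LU-decomposition:

\[
\begin{pmatrix} L_{22} \\ L_{32} \end{pmatrix} U_{22}
= \begin{pmatrix} A_{22} \\ A_{32} \end{pmatrix} - \begin{pmatrix} L_{21} \\ L_{31} \end{pmatrix} U_{12}
=: \begin{pmatrix} \hat{A}_{22} \\ \hat{A}_{32} \end{pmatrix}
\]

• Reorder blocks and repeat until ready. Start: $L_{11} U_{11} = A_{11}$, $L_{21} U_{11} = A_{21}$, and $L_{31} U_{11} = A_{31}$.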
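The left-looking scheme can be sketched as a loop over block columns: all updates from the columns to the left are applied only when a block column becomes active. This is an illustrative plain-Python sketch without pivoting; the name `left_looking_lu`, the block-size parameter `nb`, and the packed storage of L and U in one matrix are assumptions, not taken from the slides.

```python
def left_looking_lu(a, nb):
    """Blocked left-looking LU without pivoting.

    Returns L and U packed in one matrix: the strict lower triangle holds L
    (unit diagonal implied), the upper triangle holds U.
    """
    n = len(a)
    a = [[float(x) for x in row] for row in a]  # work on a copy
    for j0 in range(0, n, nb):
        j1 = min(j0 + nb, n)
        # Step 1: L11 U12 = A12, one independent triangular solve per column j.
        for j in range(j0, j1):
            for k in range(j0):
                for p in range(k):
                    a[k][j] -= a[k][p] * a[p][j]
        # Step 2: update the active panel: (A22; A32) - (L21; L31) U12.
        for i in range(j0, n):
            for j in range(j0, j1):
                a[i][j] -= sum(a[i][p] * a[p][j] for p in range(j0))
        # Step 3: small LU-decomposition of the updated panel -> L22, L32, U22.
        for k in range(j0, j1):
            for i in range(k + 1, n):
                a[i][k] /= a[k][k]
                for j in range(k + 1, j1):
                    a[i][j] -= a[i][k] * a[k][j]
    return a


A = [[2, 1, 1], [4, 3, 3], [8, 7, 9]]
packed = left_looking_lu(A, 2)
```

In steps 1 and 2 the loops over `j` (and over `i` in step 2) are independent of each other, which is where the parallelism mentioned on the slide comes from.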


Block Structure

Intermediate block structure: [figure not included in the transcript]

Solve for red blocks.

Reconfigure the block structure: [figure not included in the transcript]

Repeat until done.


Right Looking GE

New blocking:

• Start with $L_{11} U_{11} = A_{11}$ (small LU-decomposition).

• Solving $L_{21} U_{11} = A_{21}$ and $L_{11} U_{12} = A_{12}$ by triangular solves gives $L_{21}$ and $U_{12}$.

• It remains to solve $L_{22} U_{22} = A_{22} - L_{21} U_{12} =: \hat{A}_{22}$.

• To compute the LU-decomposition of the modified block $\hat{A}_{22}$, repeat the 2 × 2 blocking for $\hat{A}_{22}$ and apply the procedure recursively.
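The four steps above can be sketched in plain Python as follows. This is an illustrative sketch without pivoting: the recursion on the trailing block is written as a loop over diagonal blocks, and the name `right_looking_lu`, the block-size parameter `nb`, and the packed storage of L and U are assumptions made here.

```python
def right_looking_lu(a, nb):
    """Blocked right-looking LU without pivoting.

    Returns L and U packed in one matrix: the strict lower triangle holds L
    (unit diagonal implied), the upper triangle holds U.
    """
    n = len(a)
    a = [[float(x) for x in row] for row in a]  # work on a copy
    for k0 in range(0, n, nb):
        k1 = min(k0 + nb, n)
        # Step 1: small LU-decomposition L11 U11 = A11 of the diagonal block.
        for k in range(k0, k1):
            for i in range(k + 1, k1):
                a[i][k] /= a[k][k]
                for j in range(k + 1, k1):
                    a[i][j] -= a[i][k] * a[k][j]
        # Step 2a: L21 U11 = A21, one independent triangular solve per row i.
        for i in range(k1, n):
            for k in range(k0, k1):
                for p in range(k0, k):
                    a[i][k] -= a[i][p] * a[p][k]
                a[i][k] /= a[k][k]
        # Step 2b: L11 U12 = A12, one independent triangular solve per column j.
        for j in range(k1, n):
            for k in range(k0, k1):
                for p in range(k0, k):
                    a[k][j] -= a[k][p] * a[p][j]
        # Step 3: matrix update A22 <- A22 - L21 U12 (matrix-matrix product).
        for i in range(k1, n):
            for j in range(k1, n):
                a[i][j] -= sum(a[i][p] * a[p][j] for p in range(k0, k1))
    return a


A = [[2, 1, 1], [4, 3, 3], [8, 7, 9]]
packed = right_looking_lu(A, 2)
```

With `nb` equal to the matrix size the loop degenerates to the unblocked algorithm; for smaller `nb` most of the work moves into step 3, the matrix-matrix update.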


Block Structure

Intermediate block structure: [figure not included in the transcript]

Solve for blue and both red blocks.

Reconfigure the block structure: [figure not included in the transcript]

Repeat until done.


Comparison and Overview

• In comparison, all methods
    – have nearly the same efficiency in parallel,
    – but perform better (sequentially or in parallel) than the unblocked variants because they are based on BLAS-3.

• Elementary steps of all blocking methods:
    – Matrix-matrix product and sum (easy to parallelize)
    – A couple of triangular solves (easy to parallelize)
    – Small LU-decomposition (parallelizable for long rows)

• Crout and right looking are slightly better because they spend more flops in matrix updates and need fewer triangular solves or LU-decompositions, respectively.


3.3. QR-Decomposition with Householder Matrices

3.3.1. QR-decomposition

• Gaussian elimination → LU-decomposition: sometimes numerically unstable, and not directly applicable to over- or underdetermined systems.

• Improvement: QR-decomposition $A = QR$ with $Q$ orthogonal and $R$ triangular. Solve the linear system $Ax = b$ in a numerically stable way via

\[ b = Ax = QRx \iff Rx = Q^T b \]

by a cheap matrix-vector multiplication and a triangular solve.


Overdetermined Systems

• $Ax = b$ with
    – $A$ an $m \times n$ matrix, $n \ll m$
    – $x$ a vector of length $n$
    – $b$ a vector of length $m$

• Best approximate solution by solving the minimization problem

\[ \min_x \|Ax - b\|_2^2 = \min_x \left( x^T A^T A x - 2 x^T A^T b + b^T b \right) \]

• Gradient equal to zero ⇔ $A^T A x = A^T b$ (normal equations)

• The solution can be obtained from the linear system with matrix $A^T A$, but the condition number is worse:

\[ \operatorname{cond}(A^T A) = \operatorname{cond}(A)^2 \]
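Both claims on this slide follow from short calculations. The derivation below is a sketch under the assumptions that $A$ has full column rank and that cond denotes the 2-norm condition number; the singular value decomposition $A = W \Sigma V^T$ with singular values $\sigma_{\max} \ge \dots \ge \sigma_{\min} > 0$ is notation introduced here, not taken from the slides.

```latex
% Gradient of the least-squares objective
f(x) = \|Ax - b\|_2^2 = x^T A^T A x - 2 x^T A^T b + b^T b,
\qquad
\nabla f(x) = 2 A^T A x - 2 A^T b = 0 \iff A^T A x = A^T b .

% Condition number of the normal equations
A = W \Sigma V^T
\;\Rightarrow\;
A^T A = V \Sigma^T \Sigma V^T ,
\qquad
\operatorname{cond}_2(A^T A) = \frac{\sigma_{\max}^2}{\sigma_{\min}^2}
= \left( \frac{\sigma_{\max}}{\sigma_{\min}} \right)^2 = \operatorname{cond}_2(A)^2 .
```

For example, a matrix with cond(A) = 10^4 leads to normal equations with condition number 10^8, which in double precision already costs roughly half of the available digits.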


Advantage of QR-Decomposition

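\[
A = QR, \qquad R = \begin{pmatrix} R_1 \\ 0 \end{pmatrix}, \qquad \operatorname{cond}(R_1) = \operatorname{cond}(A), \qquad
\tilde{b} := Q^T b = \begin{pmatrix} \tilde{b}_1 \\ \tilde{b}_2 \end{pmatrix}
\]

\[
\begin{aligned}
A^T A x = A^T b
&\iff (QR)^T (QR) x = (QR)^T b \\
&\iff R^T R x = R^T (Q^T b) \\
&\iff \begin{pmatrix} R_1^T & 0 \end{pmatrix} \begin{pmatrix} R_1 \\ 0 \end{pmatrix} x = \begin{pmatrix} R_1^T & 0 \end{pmatrix} \tilde{b} \\
&\iff R_1^T R_1 x = \begin{pmatrix} R_1^T & 0 \end{pmatrix} \begin{pmatrix} \tilde{b}_1 \\ \tilde{b}_2 \end{pmatrix} \\
&\iff R_1^T R_1 x = R_1^T \tilde{b}_1 \\
&\iff R_1 x = \tilde{b}_1
\end{aligned}
\]

• Instead of solving the normal equations we only have to consider the triangular system in $R_1$.

• Cheap and better condition number.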


3.3.2. Householder Method

• Define special orthogonal and simple matrices $H$ called Householder matrices (compare Givens):

\[ u \in \mathbb{R}^n, \quad \|u\|_2 = 1: \qquad H = I - 2uu^T \]

• $H$, as a rank-1 perturbation of the identity, is symmetric, involutory ($H^2 = I$) and orthogonal:

\[ H^T = I - 2uu^T = H \]

\[ H^T H = H^2 = (I - 2uu^T)(I - 2uu^T) = I - 2uu^T - 2uu^T + 4u \underbrace{u^T u}_{=1} u^T = I \]

• For complex problems: orthogonal → unitary, symmetric → Hermitian


Householder Method (cont.)

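• Use $H_1$ with an appropriate vector $u_1$ to eliminate the first column of $A$:

\[
H_1 A = (I - 2u_1u_1^T)(a_1 \cdots a_m) = (a_1 - 2(u_1^T a_1)u_1 \;\; \cdots \;\; \ast)
= \begin{pmatrix} \alpha & \ast \\ 0 & \ast \\ \vdots & \vdots \\ 0 & \ast \end{pmatrix}
\]

• To satisfy this equation we have to find a vector $u_1$ of length 1 with

\[ a_1 - 2(u_1^T a_1)u_1 = \alpha e_1 \]

• Because $H_1$ is orthogonal it holds:

\[ \|H_1 a_1\|_2 = \|a_1\|_2 = \|\alpha e_1\|_2 = |\alpha| \;\Rightarrow\; \alpha = \pm\|a_1\|_2, \quad \text{e.g. } \alpha = \|a_1\|_2 \]

\[ u_1 = \frac{a_1 - \|a_1\|_2 e_1}{2(u_1^T a_1)} = \frac{a_1 - \|a_1\|_2 e_1}{\|a_1 - \|a_1\|_2 e_1\|_2} \]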
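The construction of $u_1$ can be tried out on a small vector. The sketch below follows the sign choice $\alpha = \|a_1\|_2$ from the slide; the function names are assumptions for illustration. Note that practical implementations usually pick the sign of $\alpha$ opposite to the sign of the first entry of $a_1$ to avoid cancellation in $a_1 - \alpha e_1$, which this sketch does not do.

```python
import math


def householder_vector(a):
    """Return (u, alpha) with ||u||_2 = 1 and (I - 2 u u^T) a = alpha * e1.

    Uses alpha = ||a||_2 as on the slide.
    """
    alpha = math.sqrt(sum(x * x for x in a))
    v = list(a)
    v[0] -= alpha                                   # v = a - alpha * e1
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_v == 0.0:
        raise ValueError("a is already a nonnegative multiple of e1")
    return [x / norm_v for x in v], alpha


def apply_householder(u, x):
    """Return H x = x - 2 (u^T x) u without forming the matrix H."""
    s = 2.0 * sum(ui * xi for ui, xi in zip(u, x))
    return [xi - s * ui for ui, xi in zip(u, x)]


a1 = [3.0, 4.0]
u1, alpha = householder_vector(a1)
reflected = apply_householder(u1, a1)   # approximately [5.0, 0.0]
```

Applying $H$ through the formula $x - 2(u^T x)u$ costs O(n) operations per vector, whereas forming $H$ explicitly and multiplying would cost O(n^2).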


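Householder Method (cont. 2)

• Repeat for all columns of $A$:

\[
H_1 A = H_1 A_1 = (I - 2u_1u_1^T) A =
\begin{pmatrix}
\|a_1\|_2 & \ast & \cdots & \ast \\
0 & & & \\
\vdots & & A_2 & \\
0 & & &
\end{pmatrix}
\]

• Apply the same procedure to $A_2$, an $(n-1) \times (m-1)$ matrix:

\[
\tilde{H}_2 A_2 = (I - 2\tilde{u}_2\tilde{u}_2^T) A_2 =
\begin{pmatrix}
\alpha_2 & \ast & \cdots & \ast \\
0 & & & \\
\vdots & & A_3 & \\
0 & & &
\end{pmatrix}
\]

• Extend to the full dimension:

\[
u_2 := \begin{pmatrix} 0 \\ \tilde{u}_2 \end{pmatrix}, \qquad
H_2 := I - 2u_2u_2^T =
\begin{pmatrix}
1 & 0 & \cdots & 0 \\
0 & & & \\
\vdots & & \tilde{H}_2 & \\
0 & & &
\end{pmatrix}
\]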


Householder Method (cont. 3)

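• For columns $1, 2, \dots, m$ this gives Householder matrices $H_1, \dots, H_m$ with

\[
\underbrace{H_m \cdots H_2 H_1}_{= Q^T} \cdot A = H_m \cdots H_3 \cdot
\begin{pmatrix}
\alpha_1 & \ast & \ast & \cdots & \ast \\
0 & \alpha_2 & \ast & \cdots & \ast \\
0 & 0 & & & \\
\vdots & \vdots & & A_3 & \\
0 & 0 & & &
\end{pmatrix}
= \begin{pmatrix} R_1 \\ 0 \end{pmatrix} =: R
\]

• Hence:

\[ A = QR, \qquad Q := (H_m \cdots H_2 H_1)^T = H_1 H_2 \cdots H_m \]

• Remark: for $m = n$, $H_1, \dots, H_{m-1}$ are enough, because the last remaining block is a scalar.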
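Putting the steps together gives a complete Householder QR that can be used for a small least-squares problem. The following is a plain-Python sketch, not a tuned implementation: Q is never formed, only the vectors $u_k$ are stored, and the function names as well as the example data are assumptions chosen for illustration. It uses the sign choice $\alpha = \|a\|_2$ from the slides.

```python
import math


def householder_qr(a):
    """Return (R, us) with R = H_last ... H_1 A and the zero-padded vectors u_k."""
    rows, cols = len(a), len(a[0])
    r = [[float(x) for x in row] for row in a]
    us = []
    for k in range(min(cols, rows - 1)):
        v = [r[i][k] for i in range(k, rows)]        # active part of column k
        v[0] -= math.sqrt(sum(x * x for x in v))     # v = a - ||a||_2 e1
        norm_v = math.sqrt(sum(x * x for x in v))
        if norm_v == 0.0:
            us.append([0.0] * rows)                  # column already reduced, H_k = I
            continue
        u = [0.0] * k + [x / norm_v for x in v]      # extend with leading zeros
        for j in range(k, cols):                     # apply H_k to the remaining columns
            s = 2.0 * sum(u[i] * r[i][j] for i in range(k, rows))
            for i in range(k, rows):
                r[i][j] -= s * u[i]
        us.append(u)
    return r, us


def apply_qt(us, b):
    """Return Q^T b = H_last ... H_1 b by applying H_1 first."""
    y = [float(x) for x in b]
    for u in us:
        s = 2.0 * sum(ui * yi for ui, yi in zip(u, y))
        y = [yi - s * ui for ui, yi in zip(u, y)]
    return y


def least_squares(a, b):
    """Minimize ||A x - b||_2 by solving the triangular system R_1 x = (Q^T b)_1."""
    r, us = householder_qr(a)
    y = apply_qt(us, b)
    n = len(a[0])
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(r[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / r[i][i]
    return x


A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
b = [1.0, 2.0, 2.0]
x = least_squares(A, b)   # normal equations give x = (7/6, 1/2)
```

For this example the normal equations read 3x_1 + 3x_2 = 5 and 3x_1 + 5x_2 = 6, so the result can be checked by hand.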


3.3.3. Householder Method in Parallel - Blockwise

Idea: work again blockwise.

• In a first step compute $u_1$ and the application of $H_1$ to the first $k$ columns of $A$. Do not compute $H_1 A$ fully!

• Then compute $u_2, \dots, u_k$ and the application of $H_k \cdots H_1$ to the first $k$ columns of $A$. With $A = (A_1 \; A_2)$ and $V := H_k \cdots H_1$:

\[ H_k \cdots H_1 (A_1 \;\; A_2) = (H_k \cdots H_1 A_1 \;\;\; (H_k \cdots H_1) A_2) = (A_1^{(k)} \;\;\; V A_2) \]

• Still to compute: $V A_2$.

• How can we take advantage of parallelism in this computation? → Represent $V$ in a special form that allows fast and parallel evaluation of $V A_2$.


Property of Householder matrices

Theorem 3: For Householder matrices $H_k, \dots, H_i$ it holds

\[
H_k \cdots H_i = (I - 2u_ku_k^T) \cdots (I - 2u_iu_i^T)
= I - \underbrace{(u_k \cdots u_i)}_{=: Y} \, T_i \begin{pmatrix} u_k^T \\ \vdots \\ u_i^T \end{pmatrix}
\]

with $T_i$ being upper triangular.

Proof by induction:

The representation is obviously fulfilled for a single Householder matrix, $i = k$.

Assume the representation holds for $H_k, \dots, H_i$. Then . . . (next slide)


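Property of Householder matrices (cont.)

\[
\begin{aligned}
&\left[(I - 2u_ku_k^T) \cdots (I - 2u_iu_i^T)\right](I - 2u_{i-1}u_{i-1}^T) \\
&\quad = \left( I - (u_k \cdots u_i) \, T_i \begin{pmatrix} u_k^T \\ \vdots \\ u_i^T \end{pmatrix} \right) \cdot (I - 2u_{i-1}u_{i-1}^T) \\
&\quad = I - 2u_{i-1}u_{i-1}^T - (u_k \cdots u_i) \, T_i \begin{pmatrix} u_k^T \\ \vdots \\ u_i^T \end{pmatrix}
+ 2 (u_k \cdots u_i) \underbrace{T_i \begin{pmatrix} u_k^T u_{i-1} \\ \vdots \\ u_i^T u_{i-1} \end{pmatrix}}_{=: y} u_{i-1}^T \\
&\quad = I - (u_k \cdots u_i \;\; u_{i-1}) \cdot \begin{pmatrix} T_i & -2y \\ 0 & 2 \end{pmatrix}
\begin{pmatrix} u_k^T \\ \vdots \\ u_i^T \\ u_{i-1}^T \end{pmatrix}
\end{aligned}
\]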


Algorithm for parallel Householder

Computation of $H_k \cdots H_i A = V_{ki} A = (I - Y T Y^T) A$ in the form

\[ V_{ki} A = V_{ki} (A_1 \;\; A_2) = (\ast \;\;\; V_{ki} A_2) \]

and

\[ V_{ki} A_2 = (I - Y T Y^T) A_2 = A_2 - Y [T (Y^T A_2)] \]

Algorithm:

• Compute $u_1$ and $H_1 A_1$; $u_2$ and $H_2 A_1$; . . . ; $u_k$ and $H_k A_1$ (sequential)

• Compute $Y$ and $V A_2$ (parallel)

• Repeat with indices $k + 1, \dots, 2k$; $2k + 1, \dots, 3k$; . . .
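The recursion from the proof of Theorem 3 gives a direct way to build $T$ column by column. The sketch below is plain Python: it builds $Y$ and $T$ for $H_k \cdots H_1$ and applies $I - Y T Y^T$ to a vector. The function names and the test vectors are assumptions for illustration, and the unit vectors need not come from an actual QR step for the identity to hold.

```python
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def compact_wy(us):
    """Given us = [u_1, ..., u_k] (unit vectors), return (cols, T) such that
    H_k ... H_1 = I - Y T Y^T, where cols holds the columns u_k, ..., u_1 of Y."""
    cols = [us[-1]]          # start with the single factor H_k = I - u_k * 2 * u_k^T
    t = [[2.0]]
    for u in reversed(us[:-1]):                      # append H_{k-1}, ..., H_1 on the right
        z = [dot(c, u) for c in cols]                # Y^T u
        y = [dot(row, z) for row in t]               # y = T (Y^T u)
        t = [row + [-2.0 * yi] for row, yi in zip(t, y)]
        t.append([0.0] * len(cols) + [2.0])          # new last row (0 ... 0 2)
        cols.append(u)
    return cols, t


def apply_wy(cols, t, x):
    """Return (I - Y T Y^T) x = x - Y [T (Y^T x)] for a vector x."""
    w = [dot(c, x) for c in cols]                    # Y^T x
    w = [dot(row, w) for row in t]                   # T (Y^T x)
    return [xi - sum(wj * c[i] for wj, c in zip(w, cols)) for i, xi in enumerate(x)]


def apply_sequential(us, x):
    """Reference: apply H_1 first, then H_2, ..., H_k, one reflection at a time."""
    y = list(x)
    for u in us:
        s = 2.0 * dot(u, y)
        y = [yi - s * ui for ui, yi in zip(u, y)]
    return y


us = [[1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0], [0.0, 0.6, 0.8], [0.0, 0.0, 1.0]]
x = [1.0, -2.0, 0.5]
cols, T = compact_wy(us)
blocked = apply_wy(cols, T, x)
reference = apply_sequential(us, x)
```

Applied to all columns of $A_2$ at once, the three products $Y^T A_2$, $T(\cdot)$ and $Y(\cdot)$ are matrix-matrix operations, which is what makes this form attractive for parallel evaluation.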


Communication Avoiding QR

• Four independent QR-factorisations

• Two independent reduced QR-factorisations

• One reduced QR-factorisation


Tall Skinny QR

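\[
A = \begin{pmatrix} A_0 \\ A_1 \\ A_2 \\ A_3 \end{pmatrix}
= \begin{pmatrix} Q_0 R_0 \\ Q_1 R_1 \\ Q_2 R_2 \\ Q_3 R_3 \end{pmatrix}
= \begin{pmatrix} Q_0 & & & \\ & Q_1 & & \\ & & Q_2 & \\ & & & Q_3 \end{pmatrix}
\begin{pmatrix} R_0 \\ R_1 \\ R_2 \\ R_3 \end{pmatrix}
\]

\[
\begin{pmatrix} R_0 \\ R_1 \\ R_2 \\ R_3 \end{pmatrix}
= \begin{pmatrix} \begin{pmatrix} R_0 \\ R_1 \end{pmatrix} \\[2ex] \begin{pmatrix} R_2 \\ R_3 \end{pmatrix} \end{pmatrix}
= \begin{pmatrix} Q_{01} R_{01} \\ Q_{23} R_{23} \end{pmatrix}
= \begin{pmatrix} Q_{01} & \\ & Q_{23} \end{pmatrix}
\begin{pmatrix} R_{01} \\ R_{23} \end{pmatrix}
\]

\[
\begin{pmatrix} R_{01} \\ R_{23} \end{pmatrix} = Q_{0123} R_{0123}
\]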


Tall Skinny QR (cont.)

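\[
A = \begin{pmatrix} A_0 \\ A_1 \\ A_2 \\ A_3 \end{pmatrix}
= \begin{pmatrix} Q_0 & & & \\ & Q_1 & & \\ & & Q_2 & \\ & & & Q_3 \end{pmatrix}
\cdot \begin{pmatrix} Q_{01} & \\ & Q_{23} \end{pmatrix}
\cdot Q_{0123} \cdot R_{0123}
\]

Advantage: Messages in $O(\log(P))$ compared to $O(2n \log(P))$ for ScaLAPACK.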
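The reduction tree can be simulated sequentially to see that it produces the same R factor as a direct QR-decomposition. The sketch below is plain Python and tracks only the R factors, not the Q factors; the local QR is a simple Householder routine, and all names and the example matrix are assumptions for illustration. In a parallel setting each leaf and each pairwise combination would run on its own process.

```python
import math


def qr_r(a):
    """Return the upper triangular R factor (cols x cols) of a block with rows >= cols."""
    rows, cols = len(a), len(a[0])
    r = [[float(x) for x in row] for row in a]
    for k in range(min(cols, rows - 1)):
        v = [r[i][k] for i in range(k, rows)]
        v[0] -= math.sqrt(sum(x * x for x in v))     # v = a - ||a||_2 e1
        norm2 = sum(x * x for x in v)
        if norm2 == 0.0:
            continue                                 # column already reduced
        for j in range(k, cols):
            s = 2.0 * sum(v[i - k] * r[i][j] for i in range(k, rows)) / norm2
            for i in range(k, rows):
                r[i][j] -= s * v[i - k]
    # keep the leading cols x cols block and clear rounding noise below the diagonal
    return [[r[i][j] if j >= i else 0.0 for j in range(cols)] for i in range(cols)]


def tsqr_r(blocks):
    """R factor of the stacked blocks via a binary reduction tree."""
    rs = [qr_r(block) for block in blocks]           # leaves: independent local QRs
    while len(rs) > 1:
        merged = []
        for i in range(0, len(rs) - 1, 2):
            merged.append(qr_r(rs[i] + rs[i + 1]))   # QR of two stacked R factors
        if len(rs) % 2 == 1:
            merged.append(rs[-1])                    # odd block is passed up unchanged
        rs = merged
    return rs[0]


A = [[1, 2], [3, 4], [5, 6], [7, 8], [2, 1], [0, 1], [1, 0], [1, 1]]
blocks = [A[0:2], A[2:4], A[4:6], A[6:8]]            # A_0, A_1, A_2, A_3
R_tree = tsqr_r(blocks)
R_direct = qr_r(A)
```

Each level of the tree halves the number of active R factors, so with P leaves only about log2(P) combination rounds are needed, which matches the message count quoted on the slide.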
