
Matrix decompositions

How can we solve Ax = b?

Linear algebra

A typical linear system of equations:

     5x1 −  x2 + 2x3 = 7
    −2x1 + 6x2 + 9x3 = 0
    −7x1 + 5x2 − 3x3 = 5

The variables x1, x2, and x3 appear only as linear terms (no powers or products).

Some notation

Apply elementary row operations to the augmented matrix to zero out entries below the diagonal and reduce the system to an upper triangular system:

    [  5   −1    2  |  7 ]
    [ −2    6    9  |  0 ]
    [ −7    5   −3  |  5 ]

    [  5   −1     2    |   7   ]
    [  0   28/5  49/5  |  14/5 ]   (eqn 2) − (−2/5)(eqn 1)
    [  0   18/5  −1/5  |  74/5 ]   (eqn 3) − (−7/5)(eqn 1)

    [  5   −1     2     |   7   ]
    [  0   28/5  49/5   |  14/5 ]
    [  0   0    −65/10  |  65/5 ]  (eqn 3) − (9/14)(eqn 2)

The factors −2/5, −7/5, and 9/14 are the “multipliers”; the diagonal entries 5, 28/5, and −65/10 are the “pivots”.
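The elimination above is mechanical enough to verify in a few lines of code. Here is a minimal NumPy sketch (the variable names are our own illustration, not part of the slides) that reproduces the multipliers and pivots:

```python
import numpy as np

# Augmented matrix [A | b] for the example above.
M = np.array([[ 5., -1.,  2., 7.],
              [-2.,  6.,  9., 0.],
              [-7.,  5., -3., 5.]])

n = M.shape[0]
for k in range(n - 1):            # loop over pivot columns
    for i in range(k + 1, n):     # rows below the pivot
        m = M[i, k] / M[k, k]     # multiplier: -2/5, -7/5, then 9/14
        M[i, k:] -= m * M[k, k:]  # (eqn i) - m * (eqn k)

print(M)  # upper triangular; pivots 5, 28/5, -65/10 on the diagonal
```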

The LU decomposition

If we have more than one right-hand side (as is often the case),

    Ax = bi,   i = 1, 2, . . . , M

we can save the work of the elimination by storing the multipliers used to carry out the row operations.

LU Decomposition

    U = [ 5   −1     2    ]
        [ 0   28/5  49/5  ]
        [ 0   0    −65/10 ]

Store the multipliers in a lower triangular matrix:

    L = [  1     0     0 ]
        [ −2/5   1     0 ]
        [ −7/5   9/14  1 ]

LU Decomposition

The product LU is equal to A:

           L                   U                     A
    [  1     0     0 ] [ 5   −1     2    ]   [  5  −1   2 ]
    [ −2/5   1     0 ] [ 0   28/5  49/5  ] = [ −2   6   9 ]
    [ −7/5   9/14  1 ] [ 0   0    −65/10 ]   [ −7   5  −3 ]

It does not cost us anything extra to store the multipliers. But by doing so, we can now solve many systems involving the matrix A.
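A short sketch of how the factorization can be built in code, storing each multiplier as it is used (the helper lu_factor is our own illustration; it assumes no zero pivots are encountered, a restriction lifted by the row exchanges discussed below):

```python
import numpy as np

def lu_factor(A):
    """Factor A = LU, with L unit lower triangular and U upper triangular.

    Minimal sketch: assumes no zero pivots are encountered.
    """
    U = A.astype(float).copy()
    n = U.shape[0]
    L = np.eye(n)
    for k in range(n - 1):
        for i in range(k + 1, n):
            L[i, k] = U[i, k] / U[k, k]     # store the multiplier
            U[i, k:] -= L[i, k] * U[k, k:]  # eliminate below the pivot
    return L, U

A = np.array([[5., -1., 2.], [-2., 6., 9.], [-7., 5., -3.]])
L, U = lu_factor(A)
print(np.allclose(L @ U, A))  # True
```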

Solution procedure given LU = A

How can we solve a system Ax = b using the LU factorization?

Step 0 : Factor A into LU    (row reduction)
Step 1 : Solve Ly = b        (forward substitution)
Step 2 : Solve Ux = y        (back substitution)

For each right-hand side, we only need on the order of n^2 operations. The expensive part is forming the original LU decomposition.
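Here is a minimal sketch of the two triangular solves, using the L and U from the running example (the helper lu_solve is our own name for it):

```python
import numpy as np

def lu_solve(L, U, b):
    """Solve Ax = b given the factors A = LU."""
    n = len(b)
    y = np.empty(n)
    for i in range(n):                  # Step 1: Ly = b, forward substitution
        y[i] = b[i] - L[i, :i] @ y[:i]  # (L has 1s on its diagonal)
    x = np.empty(n)
    for i in range(n - 1, -1, -1):      # Step 2: Ux = y, back substitution
        x[i] = (y[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x

# Factors from the running example.
L = np.array([[1., 0., 0.], [-2/5, 1., 0.], [-7/5, 9/14, 1.]])
U = np.array([[5., -1., 2.], [0., 28/5, 49/5], [0., 0., -65/10]])
b = np.array([7., 0., 5.])
x = lu_solve(L, U, b)
print(np.allclose(L @ U @ x, b))  # True
```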

Cost of a matrix inverse

An alternative is to compute the matrix inverse A^{-1} and use it to get x = A^{-1} b.

To get a column cj of the matrix A^{-1}, we solve

    A cj = ej

for each column ej of the identity matrix. The total cost is

    ≈ (2/3) n^3 + 2 n^3 = (8/3) n^3 operations

(one LU factorization, plus n forward/back substitution pairs). It therefore costs about 4 times as much to solve by forming the inverse as it does to solve the linear system using Gaussian elimination. Once A^{-1} is available, the cost of the matrix–vector multiply A^{-1} b is about n^2.
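The column-by-column construction looks like this in a minimal NumPy sketch (note that np.linalg.solve re-factors A on every call; in practice one would factor once and reuse the factors, e.g. via scipy.linalg.lu_factor and scipy.linalg.lu_solve):

```python
import numpy as np

A = np.array([[5., -1., 2.], [-2., 6., 9.], [-7., 5., -3.]])
n = A.shape[0]

# Build A^{-1} one column at a time: solve A cj = ej.
Ainv = np.empty((n, n))
for j in range(n):
    ej = np.zeros(n)
    ej[j] = 1.0
    Ainv[:, j] = np.linalg.solve(A, ej)  # re-factors A each call; see note above

print(np.allclose(Ainv @ A, np.eye(n)))  # True
```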

Row exchanges

What if we start with a system that looks like:

    A = [ 0   0    −65/10 ]
        [ 0   28/5  49/5  ]
        [ 5  −1     2     ]

All we need to do is exchange the rows of A and do the decomposition on

    LU = PA

where P is a permutation matrix, i.e.

    P = [ 0 0 1 ]
        [ 0 1 0 ]
        [ 1 0 0 ]
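A quick check of how P acts on A (a minimal sketch; for this particular A, the product PA is already upper triangular, so L is simply the identity):

```python
import numpy as np

A = np.array([[0., 0.,   -65/10],
              [0., 28/5,  49/5 ],
              [5., -1.,    2.  ]])
P = np.array([[0., 0., 1.],
              [0., 1., 0.],
              [1., 0., 0.]])

print(P @ A)  # rows of A in reverse order: the pivots 5, 28/5, -65/10
              # are back on the diagonal
```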

Partial pivoting

We can do row exchanges not just to avoid a zero pivot, but also to make the pivot as large as possible. This is called “partial pivoting”: find the largest entry (in absolute value) in the current column, at or below the diagonal, and do a row exchange to bring it into the pivot position.

One can also do “full pivoting” by looking for the largest pivot in the entire remaining matrix. But this is rarely done.
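A sketch of elimination with partial pivoting, tracking the row exchanges in an index array rather than an explicit matrix P (the helper name is our own; in practice one would use a library routine such as scipy.linalg.lu):

```python
import numpy as np

def lu_partial_pivot(A):
    """Factor PA = LU with partial pivoting.

    Returns perm (row order such that A[perm] = L @ U), L, and U.
    Minimal sketch, not a production implementation.
    """
    A = A.astype(float).copy()
    n = A.shape[0]
    perm = np.arange(n)
    L = np.zeros((n, n))
    for k in range(n - 1):
        # Largest |entry| at or below the diagonal becomes the pivot.
        p = k + np.argmax(np.abs(A[k:, k]))
        if p != k:  # swap the rows of A, the stored multipliers, and the record
            A[[k, p]] = A[[p, k]]
            L[[k, p], :k] = L[[p, k], :k]
            perm[[k, p]] = perm[[p, k]]
        for i in range(k + 1, n):
            L[i, k] = A[i, k] / A[k, k]
            A[i, k:] -= L[i, k] * A[k, k:]
    L += np.eye(n)  # put the unit diagonal on L
    return perm, L, np.triu(A)

A = np.array([[0., 0., -6.5], [0., 5.6, 9.8], [5., -1., 2.]])
perm, L, U = lu_partial_pivot(A)
print(np.allclose(A[perm], L @ U))  # True
```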

Top 10 algorithms

Matrix decompositions are listed as one of the top 10 algorithms of the 20th century.

The Best of the 20th Century: Editors Name Top 10 Algorithms

from SIAM News, Volume 33, Number 4

By Barry A. Cipra

Algos is the Greek word for pain. Algor is Latin, to be cold. Neither is the root for algorithm, which stems instead from al-Khwarizmi, the name of the ninth-century Arab scholar whose book al-jabr wa’l muqabalah devolved into today’s high school algebra textbooks. Al-Khwarizmi stressed the importance of methodical procedures for solving problems. Were he around today, he’d no doubt be impressed by the advances in his eponymous approach.

Some of the very best algorithms of the computer age are highlighted in the January/February 2000 issue of Computing in Science & Engineering, a joint publication of the American Institute of Physics and the IEEE Computer Society. Guest editors Jack Dongarra of the University of Tennessee and Oak Ridge National Laboratory and Francis Sullivan of the Center for Computing Sciences at the Institute for Defense Analyses put together a list they call the “Top Ten Algorithms of the Century.”

“We tried to assemble the 10 algorithms with the greatest influence on the development and practice of science and engineering in the 20th century,” Dongarra and Sullivan write. As with any top-10 list, their selections—and non-selections—are bound to be controversial, they acknowledge. When it comes to picking the algorithmic best, there seems to be no best algorithm.

Without further ado, here’s the CiSE top-10 list, in chronological order. (Dates and names associated with the algorithms should be read as first-order approximations. Most algorithms take shape over time, with many contributors.)

1946: John von Neumann, Stan Ulam, and Nick Metropolis, all at the Los Alamos Scientific Laboratory, cook up the Metropolis algorithm, also known as the Monte Carlo method.

The Metropolis algorithm aims to obtain approximate solutions to numerical problems with unmanageably many degrees of freedom and to combinatorial problems of factorial size, by mimicking a random process. Given the digital computer’s reputation for deterministic calculation, it’s fitting that one of its earliest applications was the generation of random numbers.

1947: George Dantzig, at the RAND Corporation, creates the simplex method for linear programming.

In terms of widespread application, Dantzig’s algorithm is one of the most successful of all time: Linear programming dominates the world of industry, where economic survival depends on the ability to optimize within budgetary and other constraints. (Of course, the “real” problems of industry are often nonlinear; the use of linear programming is sometimes dictated by the computational budget.) The simplex method is an elegant way of arriving at optimal answers. Although theoretically susceptible to exponential delays, the algorithm in practice is highly efficient—which in itself says something interesting about the nature of computation.

1950: Magnus Hestenes, Eduard Stiefel, and Cornelius Lanczos, all from the Institute for Numerical Analysis at the National Bureau of Standards, initiate the development of Krylov subspace iteration methods.

These algorithms address the seemingly simple task of solving equations of the form Ax = b. The catch, of course, is that A is a huge n × n matrix, so that the algebraic answer x = b/A is not so easy to compute. (Indeed, matrix “division” is not a particularly useful concept.) Iterative methods—such as solving equations of the form Kx_{i+1} = Kx_i + b − Ax_i with a simpler matrix K that’s ideally “close” to A—lead to the study of Krylov subspaces. Named for the Russian mathematician Nikolai Krylov, Krylov subspaces are spanned by powers of a matrix applied to an initial “remainder” vector r_0 = b − Ax_0. Lanczos found a nifty way to generate an orthogonal basis for such a subspace when the matrix is symmetric. Hestenes and Stiefel proposed an even niftier method, known as the conjugate gradient method, for systems that are both symmetric and positive definite. Over the last 50 years, numerous researchers have improved and extended these algorithms. The current suite includes techniques for non-symmetric systems, with acronyms like GMRES and Bi-CGSTAB. (GMRES and Bi-CGSTAB premiered in SIAM Journal on Scientific and Statistical Computing, in 1986 and 1992, respectively.)

1951: Alston Householder of Oak Ridge National Laboratory formalizes the decompositional approach to matrix computations.

The ability to factor matrices into triangular, diagonal, orthogonal, and other special forms has turned out to be extremely useful. The decompositional approach has enabled software developers to produce flexible and efficient matrix packages. It also facilitates the analysis of rounding errors, one of the big bugbears of numerical linear algebra. (In 1961, James Wilkinson of the National Physical Laboratory in London published a seminal paper in the Journal of the ACM, titled “Error Analysis of Direct Methods of Matrix Inversion,” based on the LU decomposition of a matrix as a product of lower and upper triangular factors.)

1957: John Backus leads a team at IBM in developing the Fortran optimizing compiler.

The creation of Fortran may rank as the single most important event in the history of computer programming: Finally, scientists …

