AEM Theory


Contents

0 Solving Linear Equation Systems with the Gauss-Algorithm

1 Linear Algebra and Vector Spaces
    1.1 Vector Spaces
        1.1.1 Vector Spaces
        1.1.2 Linear Independence
        1.1.3 Dimension and Basis
        1.1.4 Scalar Product
        1.1.5 Orthonormal Systems
        1.1.6 Norms
    1.2 Matrices and Linear Maps
        1.2.1 Matrices
        1.2.2 Linear Maps
        1.2.3 Linear Equations
        1.2.4 Inverse Map and Inverse Matrix
        1.2.5 Changing the Basis
        1.2.6 Some Special Linear Maps in R2
        1.2.7 Examples
    1.3 Operations with Matrices
        1.3.1 Matrix Algebra
        1.3.2 Scalar Product
        1.3.3 Homogeneous Coordinates
        1.3.4 Norms
    1.4 Gauss Algorithm and LU-Decomposition
        1.4.1 Numerical Stability
        1.4.2 Special Operations
        1.4.3 Properties of C(k, l; α), D(k; α) and F(k, l)
        1.4.4 Standard Algorithm
        1.4.5 LU-Decomposition
        1.4.6 Example
        1.4.7 Summary of LU-Decomposition
        1.4.8 Example of LU-Decomposition
        1.4.9 Solving a Linear Equation System
        1.4.10 Short Form
        1.4.11 Example
    1.5 Eigenvalues and Eigenvectors
        1.5.1 Definition and Properties
        1.5.2 More Properties
        1.5.3 Lemma
        1.5.4 Theorem: Schur Form
        1.5.5 Consequences
        1.5.6 Jordan Form
        1.5.7 Example
    1.6 Special Properties of Symmetric Matrices
        1.6.1 Properties of Symmetric and Hermitian Matrices
        1.6.2 Orthogonal Matrices
    1.7 Singular Value Decomposition (SVD)
        1.7.1 Preparations
        1.7.2 Existence and Construction of the SVD
    1.8 Generalized Inverses
        1.8.1 Special Case: A Injective
    1.9 Applications to Linear Equation Systems
        1.9.1 Errors
        1.9.2 Numerical Rank Deficiency
        1.9.3 Application: Best-Fit Functions
    1.10 Symmetric Matrices and Quadratic Forms
    1.11 QR-Decomposition
    1.12 Numerics of Eigenvalues

2 Ordinary Differential Equations
    2.1 General Definitions
    2.2 Linear Differential Equations with Constant Coefficients
        2.2.1 Inhomogeneous Equations
    2.3 Linear Differential Equations of Higher Order
        2.3.1 General Case
        2.3.2 ODEs with Constant Coefficients
        2.3.3 Special Inhomogeneities

3 Calculus in Several Variables
    3.1 Differential Calculus in Rn
        3.1.1 Definitions
        3.1.2 Examples and Properties of Open and Closed Sets
        3.1.3 Main Rule for Vector-Valued Functions
        3.1.4 Definition: Limits and Continuous Functions
        3.1.5 Definition: Partial Derivatives
        3.1.6 Theorem of H. A. Schwarz
        3.1.7 Definition: Derivative of f
        3.1.8 Higher Derivatives
        3.1.9 Examples
        3.1.10 Directional Derivative, Gâteaux Derivative
        3.1.11 Rules
    3.2 Inverse and Implicit Functions
        3.2.1 Inverse Function Theorem
        3.2.2 Application: Newton's Method
        3.2.3 Implicit Function Theorem
    3.3 Taylor Expansions
        3.3.1 Nabla Operator
        3.3.2 Construction of Taylor Expansions
        3.3.3 Taylor's Theorem
        3.3.4 Calculation in the Two-Dimensional Case
    3.4 Extreme Values
        3.4.1 Definition
        3.4.2 Necessary Criterion
        3.4.3 Sufficient Criterion
        3.4.4 Saddle Points

4 Integral Transforms
    4.1 Laplace Transform
        4.1.1 Method of Calculation
        4.1.2 Convolution
        4.1.3 Some Important Examples
        4.1.4 Solution of Initial Value Problems
    4.2 Fourier Series
        4.2.1 Theorem
        4.2.2 Definition
        4.2.3 Theorem
        4.2.4 Properties of the Coefficients
        4.2.5 Real Form of the Fourier Series
    4.3 Fourier Transform
        4.3.1 Definition
        4.3.2 Inverse Transform
        4.3.3 Convolution
        4.3.4 Rules
        4.3.5 Sine and Cosine Transform
        4.3.6 More Properties
        4.3.7 Calculation of the Fourier Transform
        4.3.8 Gauss Functions
        4.3.9 Consequences
        4.3.10 Definition: Dirac Sequence
        4.3.11 Main Property of Dirac Sequences
        4.3.12 Delta Distribution

5 Stability of Ordinary Differential Equations
    5.1 Remarks
    5.2 Definition
    5.3 Flow-Box Theorem
    5.4 Remarks
    5.5 Theorem: Linear Case
    5.6 Linearisation
    5.7 Poincaré-Ljapunov Theorem
    5.8 Example
    5.9 Ljapunov Functions
        5.9.1 Definition
        5.9.2 Theorem


0 Solving Linear Equation Systems with the Gauss-Algorithm

A linear equation system with m equations and n unknowns is given by

    a11 x1 + a12 x2 + · · · + a1n xn = b1
    ...
    am1 x1 + am2 x2 + · · · + amn xn = bm

Omitting the plus-signs and the variables this will be written down in the short form

    a11  a12  ...  a1n | b1
    ...  ...  ...  ... | ...
    am1  am2  ...  amn | bm

In case of a homogeneous equation system (all bj are equal to zero) the last column is omitted, too.

Allowed operations are

    - multiply a row with a number unequal to zero
    - exchange two rows
    - add a multiple of a row to another row

The exchange of columns is only allowed if there is a row 0 added that contains the names of the variables.


Naturally the last column containing the bj-values must not be exchanged with other columns.

The simplest form of the Gauss-Algorithm is to perform these steps:

m1 Try to get a 1 into the upper left corner. If this is not possible, the algorithm stops.

m2 By adding suitable multiples of the first row to the rows below (and above) generate zeroes in the rest of the column.

m3 Repeat the process in the subscheme without the first row and column.

In the end (possibly after exchanging rows and columns) one has

    x_j1  x_j2  ...  x_jk   x_j(k+1)  ...  x_jn
      1     0   ...    0       *      ...    *   | c1
      0     1   ...    0       *      ...    *   | c2
     ...   ...  ...   ...     ...     ...   ...  | ...
      0     0   ...    1       *      ...    *   | ck
      0     0   ...    0       0      ...    0   | c(k+1)
     ...   ...  ...   ...     ...     ...   ...  | ...
      0     0   ...    0       0      ...    0   | cm

The first row contains the names of the variables.

The number k is called the rank of the equation system. The following cases are possible:

(i) At least one of the values c(k+1), ..., cm is unequal to zero. Then the system is not solvable.

(ii) If k = n = m then the system is uniquely solvable with x_j1 = c1, ..., x_jn = cn.

(iii) If k < n and c(k+1) = · · · = cm = 0, then we can take the last n − k variables x_j(k+1) to x_jn as parameters in the solution. With this the values of x_j1 to x_jk are uniquely determined for each choice of the parameters.


    Example

    2x1 + 6x2        + 2x4 = 10
     x1 + 3x2 +  x3  + 2x4 =  7
    3x1 + 9x2 + 4x3        = 16
    3x1 + 9x2 +  x3  + 4x4 = 17

or

    x1  x2  x3  x4
     2   6   0   2 | 10
     1   3   1   2 |  7
     3   9   4   0 | 16
     3   9   1   4 | 17

m1 Exchange rows 1 and 2.

    x1  x2  x3  x4
     1   3   1   2 |  7
     2   6   0   2 | 10
     3   9   4   0 | 16
     3   9   1   4 | 17

m2 Add row 1 multiplied by (−2) to row 2 and multiplied by (−3) to rows 3 and 4. This results in

    x1  x2  x3  x4
     1   3   1   2 |  7
     0   0  −2  −2 | −4
     0   0   1  −6 | −5
     0   0  −2  −2 | −4

m3 Now swap columns 2 and 4.

    x1  x4  x3  x2
     1   2   1   3 |  7
     0  −2  −2   0 | −4
     0  −6   1   0 | −5
     0  −2  −2   0 | −4

m4 Add row 2 to row 1, row 2 multiplied by (−3) to row 3 and multiplied by (−1) to row 4. Then divide row 2 by (−2).


    x1  x4  x3  x2
     1   0  −1   3 |  3
     0   1   1   0 |  2
     0   0   7   0 |  7
     0   0   0   0 |  0

m5 Leave row 4 away, divide row 3 by 7, add row 3 to row 1 and subtract it from row 2. Then we reach the final form

    x1  x4  x3  x2
     1   0   0   3 |  4
     0   1   0   0 |  1
     0   0   1   0 |  1

m6 The system is solvable. The variables behind the columns that do not belong to the identity matrix are parameters; here this applies to x2. With x2 = t one sees x1 = 4 − 3t, x4 = 1 and x3 = 1. So we can write the general solution as follows:

    ( x1 )   ( 4 − 3t )   ( 4 )       ( −3 )
    ( x2 ) = (    t   ) = ( 0 ) + t · (  1 )
    ( x3 )   (    1   )   ( 1 )       (  0 )
    ( x4 )   (    1   )   ( 1 )       (  0 )
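This elimination can be checked mechanically. The following small SymPy sketch (added here, it is not part of the original notes) row-reduces the augmented matrix of the example and returns the same one-parameter solution family.

```python
import sympy as sp

# Augmented matrix of the example system (columns x1, x2, x3, x4 | right side).
M = sp.Matrix([
    [2, 6, 0, 2, 10],
    [1, 3, 1, 2,  7],
    [3, 9, 4, 0, 16],
    [3, 9, 1, 4, 17],
])

R, pivots = M.rref()          # reduced scheme, as reached after step m5
print(pivots)                 # (0, 2, 3): x1, x3, x4 are pivot variables, x2 is free
print(R)

x1, x2, x3, x4 = sp.symbols('x1 x2 x3 x4')
print(sp.linsolve((M[:, :4], M[:, 4]), x1, x2, x3, x4))
# {(4 - 3*x2, x2, 1, 1)}  ->  x = (4 - 3t, t, 1, 1)
```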


1 Linear Algebra and Vector Spaces

    1.1 Vector spaces

    1.1.1 Vector Spaces

    1.1.1.1 Definition

A real vector-space (short: VS) is a set in which two operations, addition and multiplication by scalars, are defined and where the following rules hold (u, v and w are elements of the vector-space, α and β are real numbers):

(i) u + v = v + u,   u + (v + w) = (u + v) + w

(ii) There is a zero-vector 0 with v + 0 = v.

(iii) For each v there is a vector −v with v + (−v) = 0.

(iv) (α + β)v = αv + βv,   (αβ)v = α(βv),   α(v + w) = αv + αw,   1 · v = v

If one admits complex scalars, one gets a complex vector-space instead of a real VS.

The elements of the vector-space are called vectors. The elements of the field R or C are called scalars. Often there is no difference whether one has R or C as field. In this case we use K as a symbol.


    1.1.1.2 Definition: Subspace

Let V be a VS and U ⊆ V. U is called a subspace of V if U is itself a VS with the operations induced by V. This is fulfilled iff (short for if and only if) U contains for each pair of elements x and y the sum x + y and all vectors of the form αx with α ∈ K. This property is called closedness under sums and scalar multiples.

V always has the trivial subspaces V and {0}.

    1.1.2 Linear Independence

    1.1.2.1 Definition

Let v1 to vk be vectors and λ1, ..., λk ∈ K. The expression λ1 v1 + · · · + λk vk is called a linear combination. The numbers λj are called coefficients. Please notice that a linear combination is always a finite sum, even in infinite-dimensional spaces.

The vectors v1 to vk are linearly dependent (l.d.) if there are coefficients λ1 to λk with λ1 v1 + · · · + λk vk = 0 and not all of the λi are zero. If this is not the case, the vectors are called linearly independent (l.i.).

Therefore, if v1 to vk are linearly independent and λ1 v1 + · · · + λk vk = 0, it follows that λ1 = λ2 = · · · = λk = 0.

On the other hand, if v1 to vk are l.d., then it is possible to write λ1 v1 + · · · + λk vk = 0 with at least one of the λj ≠ 0, say λ1 ≠ 0. Then one has

    v1 = −(1/λ1)(λ2 v2 + · · · + λk vk),

so one of the vectors is a linear combination of the others.


    1.1.2.2 Criteria for Linear Dependence

A single vector is linearly dependent iff it is the zero-vector.

Two vectors u and v are linearly dependent iff they lie on a straight line through zero, or iff one of them is a multiple of the other.

Three vectors u, v and w are linearly dependent iff they lie in a plane through zero, or if one of them is a linear combination of the others. In R3 there is a criterion with the volume of the parallelepiped spanned by these vectors:

    v1, v2, v3 l.d.  ⇔  the volume vanishes  ⇔  det(v1, v2, v3) = 0

k vectors v1 to vk are linearly dependent iff the rank of the matrix with the columns v1 to vk is less than k (rank will be explained later).

More than n vectors in Kn are always linearly dependent.

Criterion for n vectors in Kn: v1 to vn are linearly dependent ⇔ det(v1, ..., vn) = 0.

    1.1.3 Dimension and Basis

    1.1.3.1 Definition: Span, Dimension and Basis

    Let V be a vector space.

(i) Let M ⊆ V be a (finite or infinite) non-empty subset of V. The set of all linear combinations is called the span of M,

    span M = { λ1 v1 + · · · + λm vm | m ∈ N, λj ∈ K, vj ∈ M }.

The span is always a subspace.

(ii) If there is a system M of n vectors in V, so that V is the span of M, and there is no such system consisting of less than n vectors, then V has the dimension n.


    If there is no finite set M with span M = V, V is said to be

    infinite-dimensional.

(iii) A set M = {v1, v2, ..., vn} ⊆ V is called a basis of V iff every vector v ∈ V has a unique representation v = λ1 v1 + · · · + λn vn.

    1.1.3.2 Remarks

    (i) If V has dimension n, then every basis consists of n elements.

    (ii) If V has dimension n, then every linearly independent set of n

    vectors forms a basis.

    (iii) The elements of a basis are always linearly independent.

    1.1.3.3 Coordinates

Let M = {v1, v2, ..., vn} ⊆ V be a basis of V. For each v ∈ V there is a unique representation v = λ1 v1 + · · · + λn vn. The numbers (λ1, ..., λn) are called the coordinates of v with respect to M. The vector (λ1, ..., λn)T (always a column!) is called the coordinate vector of v.


    1.1.4 Scalar Product

    1.1.4.1 Complex scalar product

Let V be a complex vector space. A scalar product is a mapping V × V → C, (v, w) ↦ <v, w>, with the properties

(i) <αu + βv, w> = α<u, w> + β<v, w> for α, β ∈ C, u, v, w ∈ V (linearity in the first argument)

(ii) <u, αv + βw> = conj(α)<u, v> + conj(β)<u, w> for α, β ∈ C, u, v, w ∈ V (anti-linearity in the second argument)

(iii) <u, v> = conj(<v, u>)

(iv) <u, u> ≥ 0 and <u, u> = 0 ⇔ u = 0 (positive definiteness)

In particular the scalar product of a vector with itself is always real and non-negative.

1.1.4.2 Real scalar product

If V is a real vector space, the same properties shall hold with a real-valued scalar product, α, β ∈ R and (naturally) without complex conjugation.

1.1.4.3 Standard scalar product

The standard real resp. complex scalar product of two vectors in Kn is defined by

    v · w = <v, w> := v1 w1 + · · · + vn wn                      for v, w ∈ Rn
    v · w = <v, w> := v1 conj(w1) + · · · + vn conj(wn)          for v, w ∈ Cn

    In this case we define


(i) ||u|| := √<u, u> is the length or (euclidean) norm of the vector u (also denoted by |u|).

(ii) The angle φ ∈ [0, π] of two non-zero vectors u, v ∈ Rn is defined by

    cos φ = <u, v> / (||u|| ||v||).

    1.1.5 Orthonormal Systems

With the Kronecker symbol δij (δij = 1 for i = j, δij = 0 for i ≠ j) we define

1.1.5.1 Definition

(i) Two vectors having scalar product zero are called orthogonal or perpendicular.

(ii) A set of vectors {vi} with <vi, vj> = δij is called an orthonormal system (ONS). A basis that is an ONS is called an orthonormal basis (ONB).

    1.1.5.2 Lemma

    ONS are linearly independent.

The importance of an ONB lies in the following theorem, which allows an expansion of a given vector in the basis with the aid of scalar products:

1.1.5.3 Expansion Theorem

Let v1, ..., vk be an ONS and V the span of these vectors.


(i) If u ∈ V, then the following holds:

    u = <u, v1> v1 + <u, v2> v2 + · · · + <u, vk> vk = Σ_{j=1..k} <u, vj> vj

(ii) For V ⊆ U and u ∈ U there exists a decomposition u = u1 + u2 with u1 ∈ V and <u1, u2> = 0. u1 is called the orthogonal projection of u, and the map u ↦ u1 is the orthogonal projection onto V.

    1.1.5.4 Gram-Schmidt Orthonormalisation Process

Let u1, ..., uk be a set of vectors in which at least one non-zero vector exists.

m1 Choose u1 ≠ 0, let v1 = u1 and set w1 = (1/||v1||) v1.

m2 If w1 to w(j−1) are already constructed, let

    vj = uj − <uj, w1> w1 − · · · − <uj, w(j−1)> w(j−1) = uj − Σ_{i=1..j−1} <uj, wi> wi.

Then span{u1, ..., uj} = span{v1, ..., vj} and

    <vj, v1> = · · · = <vj, v(j−1)> = <vj, u1> = · · · = <vj, u(j−1)> = 0.

In manual computations it is often easier to use the vi instead of the wi:

    vj = uj − (<uj, v1>/<v1, v1>) v1 − · · · − (<uj, v(j−1)>/<v(j−1), v(j−1)>) v(j−1)
       = uj − Σ_{i=1..j−1} (<uj, vi>/||vi||²) vi.

As the vj will be normed later, it is allowed to substitute vj by a multiple. With this technique one can sometimes avoid the use of fractions.

m3 If vj ≠ 0 then let wj = (1/||vj||) vj and go on with m2. If one is calculating with vj instead of wj this step can be carried out at the end.

If vj = 0 then uj was linearly dependent on u1 to u(j−1). In this case uj is deleted from the starting set of vectors and the algorithm goes on with the next vector.

If the ui are linearly independent this case cannot occur.
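As an illustration, here is a minimal NumPy sketch of the process just described (added here, not part of the original notes). It uses the wi-variant of step m2 and drops dependent vectors as in step m3; the function name gram_schmidt is chosen for this sketch only.

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalise the given vectors; dependent vectors are dropped (step m3)."""
    basis = []
    for u in vectors:
        v = u.astype(float).copy()
        for w in basis:                       # v_j = u_j - sum_i <u_j, w_i> w_i
            v -= np.dot(u, w) * w
        norm = np.linalg.norm(v)
        if norm > tol:                        # v_j != 0: normalise and keep it
            basis.append(v / norm)
    return np.array(basis)

W = gram_schmidt([np.array([1., 1., 0.]),
                  np.array([1., 0., 1.]),
                  np.array([2., 1., 1.])])    # the third vector is dependent
print(np.round(W @ W.T, 12))                  # 2x2 identity: an ONS of two vectors
```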

    1.1.6 Norms

    1.1.6.1 Definition

A norm on a vector space V is a function ||·|| : V → R, x ↦ ||x|| ∈ R, with the following properties:

(i) ||x|| ≥ 0 and ||x|| = 0 ⇔ x = 0 (definiteness)

(ii) ||λx|| = |λ| ||x|| (homogeneity)

(iii) ||x + y|| ≤ ||x|| + ||y|| (triangle inequality)

1.1.6.2 Examples

(i) The euclidean norm on Kn: ||x||2 = |x| = √<x, x>

(ii) The 1-norm: ||x||1 = |x1| + |x2| + · · · + |xn|

(iii) The ∞-norm: ||x||∞ = max{|x1|, |x2|, ..., |xn|}

(iv) On C([a, b]) we define ||f||2 := ( ∫_a^b |f(x)|² dx )^(1/2)

Remark: In (i)–(iii) we have ||ek|| = 1.


    1.1.6.3 Lemma: Cauchy-Schwarz and Minkowski inequalities

Let <·, ·> be a real or complex scalar product, i.e. <·, ·> is linear in the first argument and <u, v> = conj(<v, u>) with <u, u> = 0 ⇔ u = 0.

(i) |<u, v>| ≤ <u, u>^(1/2) <v, v>^(1/2)

(ii) ||u|| := <u, u>^(1/2) is a norm; in particular ||u + v|| ≤ ||u|| + ||v||.

(i) is called the Cauchy-Schwarz inequality, (ii) is the Minkowski inequality.

1.1.6.4 Comparison of norms

It is easy to see that ||x||∞ ≤ ||x||2 ≤ ||x||1 ≤ n ||x||∞ holds. Therefore one can define: a sequence xk approaches zero if the real sequence ||xk|| has the limit zero, and the choice of the norm doesn't make a difference. Naturally xk → x ⇔ (xk − x) → 0 ⇔ ||xk − x|| → 0.
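A quick numerical check of these inequalities (added, not from the notes), using NumPy's norm routines on an arbitrarily chosen vector:

```python
import numpy as np

x = np.array([3., -4., 1., 0.])
n_inf = np.linalg.norm(x, np.inf)              # max |x_k|      = 4
n_2   = np.linalg.norm(x)                      # euclidean norm = sqrt(26)
n_1   = np.linalg.norm(x, 1)                   # sum of |x_k|   = 8
print(n_inf <= n_2 <= n_1 <= len(x) * n_inf)   # True
```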

    1.2 Matrices and Linear Maps

    1.2.1 Matrices

    1.2.1.1 Definition

In most cases it is sufficient to regard a matrix as a rectangular scheme consisting of column-vectors:

    A = (aij), i = 1..m, j = 1..n

        ( a11  a12  ...  a1n )
      = ( a21  a22  ...  a2n )  =  ( a1  a2  ...  an )    with column-vectors a1, ..., an.
        ( ...  ...  ...  ... )
        ( am1  am2  ...  amn )

A matrix with an equal number of rows and columns is called a square matrix.


    1.2.1.2 Special types of square matrices

    Identity matrix En or In (also E or I):       Diagonal matrix:
        ( 1  0  ...  0 )                              ( d1  0  ...  0  )
        ( 0  1  ...  0 )                              ( 0   d2 ...  0  )
        ( ... ... ... ...)                            ( ... ... ... ...)
        ( 0  0  ...  1 )                              ( 0   0  ...  dn )

    Lower triangular matrix: all entries above the diagonal are zero.
    Upper triangular matrix: all entries below the diagonal are zero.

Two matrices of the same size can be added by adding all entries. A matrix is multiplied by a scalar λ by multiplying each entry by λ:

    (A + B)ij = aij + bij,        (λA)ij = λ aij.

    1.2.1.3 Multiplication of Matrices and Vectors

Let A be a matrix with k columns and let b be an element of Kk.

The product of the matrix A and the vector b = (b1, ..., bk)T is the linear combination of the column-vectors of A with the coefficients b1 to bk:

    A b = ( a1  ...  ak ) (b1, ..., bk)T = b1 a1 + · · · + bk ak

The matrix A is multiplied with the matrix B by decomposing B into column-vectors and forming the corresponding matrix-vector products. These products are written down in order:

    A ( b1  ...  bk ) = ( A b1  ...  A bk )

So the matrix-product C = AB is calculated in concrete situations entry by entry: the entry cij comes from row i of A and column j of B,

    cij = ai1 b1j + ai2 b2j + · · · + ain bnj = Σ_{k=1..n} aik bkj.

On the other hand, if you define matrices as an (ordered) collection of row-vectors, the product of a row-vector b with A (in this order, bA) consists of a linear combination of the rows of A with coefficients in b. Observe the order of multiplication!

This leads to: the product AB of the matrices A and B is a matrix in which

    - the k-th column is a linear combination of the columns of A with coefficients in the k-th column of B,

    - the k-th row is a linear combination of the rows of B with coefficients in the k-th row of A.
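A small NumPy illustration of these two readings of the matrix product (added here, not part of the notes):

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 6.], [7., 8.]])

# column 0 of AB: combination of the columns of A, coefficients from column 0 of B
col0 = B[0, 0] * A[:, 0] + B[1, 0] * A[:, 1]
# row 0 of AB: combination of the rows of B, coefficients from row 0 of A
row0 = A[0, 0] * B[0, :] + A[0, 1] * B[1, :]

print(np.allclose(col0, (A @ B)[:, 0]))   # True
print(np.allclose(row0, (A @ B)[0, :]))   # True
```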

    1.2.2 Linear Maps

    1.2.2.1 Definition: Linear map

Let U and V be vector-spaces. A map L : U → V is called linear if for all x, y ∈ U and α, β ∈ K the following equation holds:

    L(αx + βy) = αL(x) + βL(y)

If u1, ..., un is a basis of U then L is completely determined by its action on the basis:

    Lu = L( Σ_{i=1..n} αi ui ) = Σ_{i=1..n} αi L(ui)

Suppose that V has a basis v1, ..., vm. Then each Lui has a representation Lui = Σ_{j=1..m} aji vj. The matrix A = (aji), j = 1..m, i = 1..n, is called the matrix associated to the linear map L. Note that this matrix depends not only on L itself, but also on the choice of the bases in U and V.

Resuming this for the special case U = Kn and V = Km with the standard bases we have: the matrix of the linear map L : U → V has in the k-th column the image of ek.

On the other hand every matrix with n rows and m columns defines a linear map Km → Kn through L(x) := Ax.

    1.2.2.2 Definition: Rank

    The rank of a matrix is the rank of the corresponding homogeneous

    equation system defined in chapter 0.


    1.2.2.3 Rank theorem

Let A be a matrix. Then the maximum number of linearly independent columns is equal to the maximum number of linearly independent rows.

A matrix whose rank equals the minimum of its numbers of rows and columns is said to have full rank.

    1.2.2.4 Definition: Multilinear Maps

(i) Let U1, ..., Un and V be vector spaces. A map

    L : U1 × U2 × · · · × Un → V,   (u1, u2, ..., un) ↦ L(u1, ..., un) ∈ V

is called multilinear if L is linear in each component, i.e. L is linear in each uj if one fixes all other uk.

(ii) Most important case: U1 = · · · = Un.

For n = 2 we have bilinear maps. They are called symmetric if L(u, v) = L(v, u) and hermitian if L(u, v) = conj(L(v, u)).

A multilinear map with the property

    L(..., uj, ..., uk, ...) = −L(..., uk, ..., uj, ...)

is called alternating.

Properties of alternating maps:

(iii) (1) L(..., u, ..., u, ...) = 0

(2) If one of the vectors is a linear combination of the others, L(...) = 0.

(3) For U1 = · · · = Un = Kn and V = K there is exactly one alternating multilinear L with

    L(e1, ..., en) = 1.


In this case we have L(u1, ..., un) ≠ 0 ⇔ u1, ..., un linearly independent.

This L is called the determinant, L(u1, ..., un) = det(u1, ..., un) (and it is the well-known determinant with the usual properties).

(iv) Application: Cramer's Rule

Let a1, ..., an ∈ Kn be a basis, A = [a1, ..., an] an n × n-matrix and b ∈ Kn. Then the equation system Ax = b is uniquely solvable with

    xj = det Aj / det A,

where Aj is A with aj replaced by b (a small numerical sketch follows below).
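The following NumPy lines apply Cramer's rule literally to an arbitrarily chosen 2 × 2 system (added for illustration only):

```python
import numpy as np

A = np.array([[2., 1.], [1., 3.]])     # an invertible example matrix
b = np.array([3., 5.])

x = np.empty_like(b)
for j in range(A.shape[0]):
    Aj = A.copy()
    Aj[:, j] = b                       # A_j: column j replaced by the right side
    x[j] = np.linalg.det(Aj) / np.linalg.det(A)

print(x, np.allclose(A @ x, b))        # [0.8 1.4] True
```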

    1.2.3 Linear Equations

    1.2.3.1 Some Definitions

A linear map L : U → K is called a linear functional.

Let L : U → V be linear. Then L is called

    - an epimorphism, if L is surjective,
    - an isomorphism, if L is bijective (one-to-one and onto),
    - an endomorphism, if U = V,
    - an automorphism, if U = V and L is bijective.

The rank of L is the dimension of the range of L in V. As the range is spanned by the column-vectors of the matrix representation, the rank of L is the rank of the corresponding matrix.

    1.2.3.2 Definition: Linear equation

Let L : U → V be a linear map and b ∈ V. An equation Lx = b is called a linear equation. For b = 0 the equation is called homogeneous, otherwise inhomogeneous. The set of all solutions of the homogeneous equation is called the kernel of L, written ker L.

From now on we assume that L is represented by the matrix A.

    1.2.3.3 Immediate Properties

(i) The kernel is a subspace of U.

(ii) For the homogeneous equation the dimension formula holds:

    dim ker L = dim U − rank L.

That means that one can choose n − k parameters freely in the solution of the equation Lx = 0 (n = dim U, k = rank L).

(iii) The general solution of the inhomogeneous equation is obtained by adding one particular solution to all solutions of the homogeneous equation.

(iv) The inhomogeneous equation is solvable iff the rank of A is equal to the rank of the extended matrix (A|b).

(v) For square n × n-matrices A the following statements are equivalent: the inhomogeneous equation is solvable for each right side b ⇔ the homogeneous equation is uniquely solvable ⇔ det A ≠ 0 ⇔ A has rank n ⇔ ker A = {0} ⇔ A⁻¹ exists (A⁻¹ is defined below).

    1.2.4 Inverse map and Inverse Matrix

    Let Lx = b be a linear equation that is uniquely solvable for all b.

Then the map b ↦ x is well defined, and this map is called L⁻¹, the inverse map of L.


    1.2.4.1 Consequences

(i) L⁻¹ is a linear map from V to U.

(ii) In the finite dimensional case the matrix associated to L must be square.

Let A be an n × n square matrix with rank n. Then each equation system Ax = b is uniquely solvable. The matrix B = [v1, ..., vn] containing the solutions Avj = ej is called the inverse of A, A⁻¹ = B.

A is called regular or invertible.

    1.2.4.2 Properties

    A⁻¹A = AA⁻¹ = E

From now on we restrict ourselves to the case that the linear map is defined between Rn and Rm or between Cn and Cm.

    1.2.4.3 Correspondences between Linear Maps and Matrices

    Linear map L                      Matrix A
    Application to a vector L(x)      Matrix-vector multiplication Ax
    Identity map I(x) = x             Identity matrix E with Ex = x
    Zero map O(x) = 0                 Zero matrix 0 with 0x = 0
    Composition L1 ∘ L2               Matrix multiplication A1 A2
    Inverse map L⁻¹                   Inverse matrix A⁻¹


    1.2.5 Changing the Basis

At the beginning of the section it was mentioned that the matrix of a given map L : Kn → Km contains in its columns the coordinates of the images of the basis of Kn with respect to the basis of Km. Now we can ask how the matrix changes when we choose other bases in Kn or Km.

1.2.5.1 Coordinates with Respect to a Basis

Let u1, ..., un be a basis of Kn. Then the matrix U = (u1 ... un) is invertible. To obtain the coordinates a of a point x with respect to u1, ..., un we write

    x = Ua  ⇔  a = U⁻¹x.

If v1, ..., vn is another basis of Kn we have, with V = (v1 ... vn),

    x = Ua = Vb  ⇔  b = V⁻¹Ua  ⇔  a = U⁻¹Vb.

1.2.5.2 Matrix and Change of Coordinates

This uses the same method as in the paragraph above: x ∈ Kn has the representations x = Ua = Vb and y ∈ Km has the representations y = Wc = Zd.

Let A be the matrix of L with respect to the bases U and W. Using the last paragraph we have

    L(x) = y  ⇔  Aa = c  ⇔  AU⁻¹Vb = W⁻¹Zd  ⇔  Z⁻¹W A U⁻¹V b = d.

A special case is the change of basis of an endomorphism: with W = U and Z = V the last formula reduces to

    Aa = c  ⇔  V⁻¹U A U⁻¹V b = d  ⇔  (V⁻¹U) A (V⁻¹U)⁻¹ b = d.

In the even more special case U = W = E we have

    Aa = c  ⇔  V⁻¹AV b = d.
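A short NumPy check of the last formula (an added illustration with an arbitrarily chosen matrix and basis, not from the notes):

```python
import numpy as np

A = np.array([[2., 1.], [0., 3.]])       # matrix of an endomorphism in the standard basis
V = np.array([[1., 1.], [0., 1.]])       # columns v1, v2: another basis of R^2

x = np.array([2., 3.])                   # a point, standard coordinates
b = np.linalg.solve(V, x)                # its coordinates w.r.t. V  (x = V b)

A_V = np.linalg.solve(V, A @ V)          # V^{-1} A V: matrix of the same map w.r.t. V
d = A_V @ b                              # coordinates of L(x) w.r.t. V
print(np.allclose(V @ d, A @ x))         # True: both describe the same vector
```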

  • 8/2/2019 A Em Theory

    27/124

    AEM 1- 18

    1.2.6 Some Special Linear Maps in R2

(i) Identity and zero maps E and 0.

(ii) Homogeneous scaling

    λE = ( λ  0 )
         ( 0  λ )

(iii) Rotation with the angle φ:

    ( cos φ  −sin φ )
    ( sin φ   cos φ )

(iv) Shears such as

    ( 1  1 )
    ( 0  1 )

(v) Reflections. Let ||a|| = 1 and g be the straight line through 0 with direction a. The reflection at g has the matrix

    Sg = ( 2a1² − 1    2a1a2   ) = 2aaT − E
         (  2a1a2     2a2² − 1 )

(vi) The reflection at zero has the matrix

    −E = ( −1   0 )
         (  0  −1 )
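The reflection formula in (v) is easy to verify numerically; the following NumPy lines (added here, not part of the notes) check the characteristic properties of Sg for one unit vector a:

```python
import numpy as np

a = np.array([np.cos(0.3), np.sin(0.3)])      # unit vector spanning the line g
Sg = 2.0 * np.outer(a, a) - np.eye(2)         # S_g = 2 a a^T - E

print(np.allclose(Sg @ a, a))                 # points on g stay fixed
print(np.allclose(Sg @ Sg, np.eye(2)))        # reflecting twice is the identity
print(np.isclose(np.linalg.det(Sg), -1.0))    # a reflection has determinant -1
```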


    1.2.7 Examples

[Figure: eight pictures, numbered 1 to 8, showing a test shape and its image under eight example maps of the plane, each given by its 2 × 2 matrix; among them the identity, the scalings diag(1.5, 0.5) and diag(0.75, 1), the shear with rows (1, 1) and (0, 1), diagonal reflection matrices, and a matrix whose four entries all have the same absolute value a.]

    1.3 Operations with matrices

The transpose AT of a matrix A is the matrix with columns and rows exchanged. The transpose of an m × n-matrix is an n × m-matrix. For square matrices this means that everything is mirrored at the main diagonal.

The adjoint A* of a (complex) matrix A is constructed by replacing all entries of the transpose by their complex conjugates.

A square matrix is called symmetric if it is equal to its transpose. It is called self-adjoint or hermitian if it is equal to its adjoint. For real matrices these terms coincide.

A matrix with AT = −A is called skew-symmetric; if A* = −A, A is called skew-hermitian.


Often it is useful to regard vectors as matrices with one column and n rows. The numbers in R or C correspond to the 1 × 1-matrices.

    1.3.1 Matrix-algebra

    A + B = B + A         λ(A + B) = λA + λB        (A + B) + C = A + (B + C)
    (A + B)C = AC + BC    A(B + C) = AB + AC        (AB)C = A(BC)

Attention! In general AB ≠ BA.

Let A and B be invertible n × n-matrices. Then AB is invertible and the following rules hold:

    (AB)⁻¹ = B⁻¹A⁻¹       (λA)⁻¹ = (1/λ) A⁻¹
    AE = EA = A           A0 = 0A = 0
    (A⁻¹)⁻¹ = A           (AT)T = A                 (A*)* = A
    (A + B)T = AT + BT    (λA)T = λ AT              (AB)T = BT AT
    (A + B)* = A* + B*    (λA)* = conj(λ) A*        (AB)* = B* A*
    (A⁻¹)T = (AT)⁻¹       (A⁻¹)* = (A*)⁻¹

    1.3.1.1 Block Matrices

If a matrix is divided into blocks by horizontal or vertical lines one can calculate with these blocks as if they were entries in a common matrix (exception: determinants!). The blocks have to fit in size. Example:

    ( A1  A2 ) ( B1  0  )   ( A1B1 + A2   A2B2 )
    ( A3  A4 ) ( Ek  B2 ) = ( A3B1 + A4   A4B2 )

Here 0 denotes a matrix consisting only of zeroes and Ek a k × k identity matrix.


    1.3.2 Scalar Product

The role of the transpose resp. adjoint matrix becomes clearer if we regard the scalar product as a matrix product:

    <u, v> = Σ_{i=1..n} ui conj(vi) = v* u     (complex case)
    <u, v> = Σ_{i=1..n} ui vi       = vT u     (real case).

So we have

    <Au, v> = vT Au = (AT v)T u = <u, AT v>,

and analogously in the complex case <Au, v> = <u, A*v>.

This property characterizes the transpose matrix: let <Au, v> = <u, Bv> for all u, v ∈ Rn. If one chooses u = ei and v = ej one has <Aei, ej> = aji and <ei, Bej> = bij, so B = AT.

    1.3.3 Homogeneous Coordinates

With matrix multiplication one can describe rotations, stretchings, shearings or reflections (and combinations of these), but as the origin always remains fixed, translations are not possible. This difficulty can be overcome by using homogeneous coordinates. Homogeneous coordinates in R3 consist of four coordinates, where the fourth coordinate must not be zero. A point (x, y, z) ∈ R3 is represented by any vector of the form [ax, ay, az, a]T. Especially [x, y, z, 1]T is a representant of [x, y, z]T.

Then we have the following correspondences:

    cartesian coordinates        homogeneous coordinates

    x = (x1, x2, x3)T            y = (x1, x2, x3, 1)T  or  y = (ax1, ax2, ax3, a)T

    x ↦ Ax                       y ↦ By  with  B = ( A       0 )   (A in the upper left 3 × 3 block,
                                                   ( 0 0 0   1 )    last row (0, 0, 0, 1))

    x ↦ x + v                    y ↦ By  with  B = ( 1 0 0 v1 )
                                                   ( 0 1 0 v2 )
                                                   ( 0 0 1 v3 )
                                                   ( 0 0 0 1  )

    1.3.4 Norms

Definition of norms of linear maps. Let U and V be normed vector spaces and let L(U, V) denote the vector space of all linear maps from U to V. A norm on L(U, V) is a real-valued function with the following properties: if A, B ∈ L(U, V) then

(i) ||A|| ≥ 0 and ||A|| = 0 ⇔ A = 0, the zero-map (definiteness)

(ii) ||λA|| = |λ| ||A|| (homogeneity)

(iii) ||A + B|| ≤ ||A|| + ||B|| (triangle inequality)

(iv) ||AB|| ≤ ||A|| ||B||

In the finite dimensional case linear maps are represented by matrices, and the norm is called a matrix-norm. Other notation: operator-norm.

In general, a vector-norm ||·||a and a matrix-norm ||·||b are compatible if for each vector x and each matrix A the inequality ||Ax||a ≤ ||A||b ||x||a holds. The norm-definition below produces compatible matrix-norms.

Definition. Let ||·||i be a (vector-)norm in Kn and A be an n × n matrix. We define the matrix-norm ||A||i generated by ||·||i by

    ||A||i = max{ ||Ax||i : ||x||i = 1 } = max{ ||Ax||i : ||x||i ≤ 1 }.

Then one has ||A||i = min{ C : for all x ∈ U one has ||Ax||i ≤ C ||x||i }. The norms generated by the vector-norms ||·||1 and ||·||∞ above are denoted by the same symbol.

Lemma

(i) ||A||1 = max_{1≤j≤n} Σ_{i=1..n} |aij| (largest column sum)

(ii) ||A||2 is the first (and largest) singular value of A (will be defined later)

(iii) ||A||∞ = max_{1≤i≤n} Σ_{j=1..n} |aij| (largest row sum)

(iv) ||A||S = ( Σ_{i,j=1..n} |aij|² )^(1/2) is compatible with ||·||2 (Frobenius norm).
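All four norms of the lemma are available in NumPy; the following lines (added, not from the notes) evaluate them for an arbitrarily chosen matrix:

```python
import numpy as np

A = np.array([[1., -2.], [3., 4.]])

print(np.linalg.norm(A, 1))        # 6.0  = largest column sum  (|-2| + |4|)
print(np.linalg.norm(A, np.inf))   # 7.0  = largest row sum     (|3| + |4|)
print(np.linalg.norm(A, 2))        # largest singular value, about 5.12
print(np.linalg.norm(A, 'fro'))    # Frobenius norm sqrt(1 + 4 + 9 + 16) = sqrt(30)
```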


    1.4 Gauss Algorithm and LU-Decomposition

    1.4.1 Numerical Stability

    We will study some small equation systems and the effect of rounding

    errors onto the solutions.

Example system:

    10⁻⁴ x + y = 1
         x + y = 2

Solution with the Gauss algorithm, exact calculation:

    ( 1/10000  1 | 1 )    ( 1/10000    1   |    1    )    ( 1/10000  1 |     1     )
    (    1     1 | 2 ) →  (    0     −9999 |  −9998  ) →  (    0     1 | 9998/9999 )

    ( 1/10000  0 | 1/9999    )    ( 1  0 | 10000/9999 )
    (    0     1 | 9998/9999 ) →  ( 0  1 |  9998/9999 )

and so x ≈ 1 and y ≈ 1.

Now the same calculation with three significant digits, i.e. all numbers are rounded to the nearest number of the form x = 0.abc · 10^p:

    ( 0.0001  1 | 1 )    ( 0.0001     1    |    1    )    ( 0.0001  1 | 1 )
    (   1     1 | 2 ) →  (   0     −10000  | −10000  ) →  (   0     1 | 1 )

    ( 0.0001  0 | 0 )    ( 1  0 | 0 )       x = 0
    (   0     1 | 1 ) →  ( 0  1 | 1 )       y = 1

This solution is unusable.

This can be avoided by pivoting: choose the entry in the first column below the diagonal (the diagonal included) with the largest absolute value and put it into the diagonal by exchanging rows. Then continue with the Gauss algorithm.

If A is invertible then the pivot elements are unequal to zero. This results in the following:

    ( 0.0001  1 | 1 )    (   1     1 | 2 )    ( 1    1    |   2    )    ( 1  1 | 2 )    ( 1  0 | 1 )
    (   1     1 | 2 ) →  ( 0.0001  1 | 1 ) →  ( 0  0.9999 | 0.9998 ) →  ( 0  1 | 1 ) →  ( 0  1 | 1 )

so x = 1 and y = 1.

Other problems may arise. Example two is example one after multiplying row 1 by 20000. Again the calculations use three significant digits.

    ( 2  20000 | 20000 )    ( 2   20000  |  20000  )    ( 2  20000 | 20000 )    ( 2  0 | 0 )     x = 0
    ( 1    1   |   2   ) →  ( 0  −10000  | −10000  ) →  ( 0    1   |   1   ) →  ( 0  1 | 1 )     y = 1

So this solution is unusable, too.

This effect can be avoided by equilibration. This means that each equation is multiplied with a factor so that the sum of the absolute values of the row, Σ_{k=1..n} |aik|, is equal to one.

Applying this one gets

    ( 2  20000 | 20000 )    ( 2/20002  20000/20002 | 20000/20002 )    ( 0.0001   1  | 1 )
    ( 1    1   |   2   ) →  (   1/2        1/2     |      1      ) →  (  0.5    0.5 | 1 )

Then pivoting gives

    (  0.5    0.5 | 1 )    ( 0.5  0.5 | 1 )     x = 1
    ( 0.0001   1  | 1 ) →  (  0    1  | 1 )     y = 1

Conclusion: pivoting and equilibration can help to avoid problems caused by rounding errors.
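The effect can be reproduced in a few lines of Python. The sketch below (added here; the helper names chop and solve_2x2 are mine) imitates three-significant-digit arithmetic and repeats the first example with and without pivoting:

```python
import numpy as np

def chop(x, sig=3):
    """Round x to `sig` significant digits, imitating three-digit arithmetic."""
    return float(np.format_float_scientific(x, precision=sig - 1))

def solve_2x2(a, b, e, c, d, f, pivot):
    """Eliminate in  [[a, b | e], [c, d | f]]  using chopped arithmetic."""
    if pivot and abs(c) > abs(a):
        a, b, e, c, d, f = c, d, f, a, b, e   # exchange the two rows
    m  = chop(c / a)                          # elimination factor
    d2 = chop(d - chop(m * b))
    f2 = chop(f - chop(m * e))
    y  = chop(f2 / d2)
    x  = chop(chop(e - chop(b * y)) / a)
    return x, y

print(solve_2x2(1e-4, 1, 1, 1, 1, 2, pivot=False))  # (0.0, 1.0)  -- unusable
print(solve_2x2(1e-4, 1, 1, 1, 1, 2, pivot=True))   # (1.0, 1.0)  -- correct
```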


    1.4.2 Special Operations

We always assume that the sizes of the matrices fit so that the products can be performed.

Let α be a real or complex number and k ≠ l. Now define the following n × n square matrices:

Definition

    C(k, l; α) = (cij), i, j = 1..n,  with  cij = 1 for i = j,   α for i = k, j = l,   0 otherwise

    D(k; α)    = (dij), i, j = 1..n,  with  dij = 1 for i = j and i ≠ k,   α for i = j = k,   0 otherwise

    F(k, l)    = (fij), i, j = 1..n,  with  fij = 1 for i = j and i ≠ k and i ≠ l,
                                            1 for i = k, j = l,   1 for i = l, j = k,   0 otherwise

Decompose A into column-vectors ai and row-vectors bj.

Multiplication from the left side does operations with rows:

    C(k, l; α) A : row k is replaced by bk + α bl, all other rows stay unchanged
    D(k; α) A    : row k is replaced by α bk
    F(k, l) A    : rows k and l are exchanged

Multiplication from the right side does operations with columns:


    A C(k, l; α) : column l is replaced by al + α ak, all other columns stay unchanged
    A D(k; α)    : column k is replaced by α ak
    A F(k, l)    : columns k and l are exchanged

Observe that multiplication with C(k, l; α) from the right changes column l while multiplication from the left changes row k.

1.4.3 Properties of C(k, l; α), D(k; α) and F(k, l)

(i) C(k, l; α)⁻¹ = C(k, l; −α)

(ii) C(k, l; α) C(k, m; β) = C(k, m; β) C(k, l; α)

(iii) C(k, l; 0) = E

(iv) For α ≠ 0 we have D(k; α)⁻¹ = D(k; 1/α)

(v) F(k, l)⁻¹ = F(k, l) = F(k, l)T

    1.4.4 Standard Algorithm

Standard operations in the Gauss algorithm are

(i) adding row l multiplied by α to row k

(ii) multiplying row k by α ≠ 0

(iii) exchanging rows k and l.

These operations can be described with the aid of the fundamental matrices C(k, l; α), D(k; α) and F(k, l). To see this we write the system Ax = b as an augmented matrix S = (A|b). Then the operations (i) to (iii) from above are

(i) multiply S with C(k, l; α) from the left

(ii) multiply S with D(k; α) from the left

(iii) multiply S with F(k, l) from the left.


As all appearing matrices are invertible we see that the Gauss algorithm gives equivalent transformations and so preserves the set of solutions.

If the system Ax = b is uniquely solvable it is sufficient to reach an upper triangular form

    ( d11   *   ...   *  )
    (  0   d22  ...   *  )
    ( ...  ...  ...  ... )
    (  0    0   ...  dnn )

From the last equation one can read off directly the value of xn, and by substituting the already determined variables the solution is calculated recursively from the bottom to the top.

    1.4.5 LU-Decomposition

The LU-decomposition

    - is an effective method when solving many equation systems with the same left side,

    - decomposes a given square matrix A as A = P L U with

(i) P a permutation matrix, i.e. P has exactly one 1 in each column and row, and all other entries are zero,

(ii) L a lower triangular matrix,

(iii) U an upper triangular matrix.

1.4.5.1 Description of the Algorithm - simple case with P = E

The algorithm consists of a series of transformations of the matrix A. With L0 = E, U0 = A we calculate

    A = EA = L0 U0 = · · · = Lk Uk = · · · = Ln Un =: LU.

The matrices Lk and Uk have the block structure sketched below:


    Lk : lower triangular with ones on the diagonal; from column k+1 on it coincides with the
         identity matrix, i.e. only the first k columns contain entries below the diagonal.

    Uk : the first k columns are already in upper triangular form with diagonal entries u1, ..., uk
         and zeroes below; the remaining columns are not yet processed.

m1 We start with A = L(k−1) U(k−1). Let U(k−1) = (uij).

In this simple case we assume that z := ukk ≠ 0.

To each row from row k+1 to the last in U(k−1) we add row k multiplied by λj := −ujk/ukk. This results in zeroes in column k from row k+1 to the bottom.

These actions expressed with matrices: U(k−1) is multiplied from the left with C(j, k; λj).

Recall the facts that the inverse of C(j, k; λj) is C(j, k; −λj) and that the matrices C(j, k; α) and C(i, k; β) commute. So we have

    A = L(k−1) U(k−1)
      = L(k−1) C(k+1, k; −λ(k+1)) C(k+1, k; λ(k+1)) · · · C(n, k; −λn) C(n, k; λn) U(k−1)


      = [ L(k−1) C(k+1, k; −λ(k+1)) · · · C(n, k; −λn) ] [ C(k+1, k; λ(k+1)) · · · C(n, k; λn) U(k−1) ]
      =: Lk Uk.

How is Lk built from L(k−1)?

The action of the matrices C(j, k; −λj) is adding multiples of the columns k+1 to n to column k. Obviously only column k is changed by this process, and it contains in the places k+1 to n the negatives of the factors used in the transformation of U(k−1).

As an example we write down L1 in the case U0 = A = (aij):

    L1 = (    1      0  ...  0 )
         ( a21/a11   1  ...  0 )
         (   ...    ... ... ...)
         ( an1/a11   0  ...  1 )

m2 Recursively now repeat step m1.

When the algorithm ends we have A = LU with L a lower triangular matrix with ones on the diagonal and U an upper triangular matrix with diagonal entries u11, ..., unn. The uii are non-zero.

m3 Now we have

    Ax = LUx = L(Ux) = Ly = b    with    y := Ux.

(i) Ly = b is solved recursively beginning with the first component of y.


(ii) Ux = y is solved recursively beginning with the last component of x.

1.4.5.2 Remark

    det A = det L · det U = u11 · · · unn.

    1.4.6 Example

Let

    A = ( 1  2  4 )        b = ( 3 )
        ( 2  3  8 )            ( 6 )
        ( 1  3  1 )            ( 0 )

Solve Ax = b.

m1 Start with the LU-decomposition of A.

    [L0|U0] = ( 1 0 0 | 1 2 4 )    [L1|U1] = ( 1 0 0 | 1  2  4 )    [L2|U2] = ( 1  0 0 | 1  2  4 )
              ( 0 1 0 | 2 3 8 )              ( 2 1 0 | 0 −1  0 )              ( 2  1 0 | 0 −1  0 )
              ( 0 0 1 | 1 3 1 )              ( 1 0 1 | 0  1 −3 )              ( 1 −1 1 | 0  0 −3 )

m2 Solve Ly = b.

Line by line one has y1 = 3, y2 = 0 and y3 = −3.

m3 Solve Ux = y.

Line by line (from the bottom to the top) one has x3 = 1, x2 = 0 and x1 = −1, so x = (−1, 0, 1)T.
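The decomposition can be checked numerically. The snippet below (added, not part of the notes) verifies L·U = A and repeats the two triangular solves with SciPy's solve_triangular:

```python
import numpy as np
from scipy.linalg import solve_triangular

A = np.array([[1., 2., 4.], [2., 3., 8.], [1., 3., 1.]])
b = np.array([3., 6., 0.])

L = np.array([[1., 0., 0.], [2., 1., 0.], [1., -1., 1.]])   # from step m1
U = np.array([[1., 2., 4.], [0., -1., 0.], [0., 0., -3.]])
print(np.allclose(L @ U, A))                  # True: the decomposition is correct

y = solve_triangular(L, b, lower=True)        # forward substitution: [ 3.  0. -3.]
x = solve_triangular(U, y, lower=False)       # back substitution:    [-1.  0.  1.]
print(x, np.allclose(A @ x, b))
```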


    1.4.6.1 LU-decomposition, general case

This general case brings two extensions:

    - A may be singular,
    - pivoting is possible.

Now we construct a decomposition A = P L U. We start with P0 := E, L0 := E and U0 := A.

If the element z in U(k−1) is zero and the rest of column k below it contains only zeroes, too, then the matrix A is singular. In this case let Uk := U(k−1) and Lk := L(k−1). We will get an LU-decomposition of A with some diagonal elements of U being zero. This can only happen if A is singular.

If in a row l > k of column k there is an entry with a larger absolute value, then exchange rows k and l of U(k−1). This is a multiplication of U(k−1) from the left with F(k, l). Remembering F(k, l)F(k, l) = E we get

    A = P(k−1) L(k−1) U(k−1) = [ P(k−1) L(k−1) F(k, l) ] [ F(k, l) U(k−1) ] =: [ P(k−1) L(k−1) F(k, l) ] Ũ(k−1).

The matrix Ũ(k−1) is U(k−1) with rows l and k exchanged and therefore has a non-zero element in position z.

The action of right multiplication with F(k, l) on L(k−1) is interchanging columns k and l. As these columns consist of zeroes with only one 1 in each case, this can be undone by interchanging the rows k and l, i.e. multiplying L(k−1) with F(k, l) from the left. But doing so interchanges the first k − 1 positions of these rows too, so that one has to undo this.

Resuming, this step of the algorithm is: set Pk := P(k−1) F(k, l), and L̃(k−1) is L(k−1) with the first k − 1 entries of the rows k and l interchanged.


[Diagram: sketch of the pivoting step, indicating which parts of P(k−1), L(k−1) and U(k−1) are affected by the exchange of rows/columns k and l.]

Now construct Uk and Lk as in the simple case from Ũ(k−1) and L̃(k−1) and get A = Pk Lk Uk.

In the end we have P⁻¹ = PT. As P is a product of matrices F(k, l) and F(k, l)⁻¹ = F(k, l)T, this is true for P, too, because of: let AT = A⁻¹ and BT = B⁻¹. Then (AB)T = BT AT = B⁻¹A⁻¹ = (AB)⁻¹.

    1.4.7 Summary of LU-decomposition

Solving a linear equation system Ax = b with LU-decomposition consists of the following steps:

m1 Start with P0 = L0 = En, U0 = A.

m2 For each k from 1 to n perform:

Exchanging rows: Ũ(k−1) is U(k−1) with rows k and l > k exchanged, L̃(k−1) is L(k−1) where the first k − 1 entries in rows k and l are exchanged (only if k > 1), and exchanging columns k and l in P(k−1) gives Pk. If you skip this step just put Pk := P(k−1), L̃(k−1) := L(k−1) and Ũ(k−1) := U(k−1).

Adding multiples of row k to the rows below: adding in Ũ(k−1) the λl-fold row k to the rows l with l > k gives Uk, and Lk is L̃(k−1) with entries −λl in row l of column k.

With P := Pn, L := Ln and U := Un this gives the decomposition A = P L U. In case of different right sides bj in the equation system, this step has to be carried out only once.

m3 Solve P z = b by z = PT b.

m4 Solve Ly = z recursively starting with y1.

m5 Solve Ux = y recursively starting with xn.

At an arbitrary point you can make a crosscheck whether you made mistakes during the calculation: Pk Lk Uk and Pk L̃(k−1) Ũ(k−1) must always be equal to A.

    1.4.7.1 Remarks

(i) The first step in the LU-decomposition can be used to do pivoting, i.e. you can always put the entry with the largest absolute value into the diagonal pivot position. This results in higher numerical stability.

(ii) P arises from the identity-matrix by interchanging rows. Therefore it is not necessary to write down the complete matrix. One only has to keep track of which coordinates are interchanged.
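The summary above translates almost line by line into code. The following NumPy sketch (added for illustration; the function name plu is mine, not from the notes) performs the row exchanges and eliminations of step m2 and returns P, L, U with A = P L U:

```python
import numpy as np

def plu(A):
    """LU-decomposition with row pivoting, returning P, L, U with A = P @ L @ U."""
    n = A.shape[0]
    U = A.astype(float).copy()
    L = np.eye(n)
    P = np.eye(n)
    for k in range(n - 1):
        l = k + np.argmax(np.abs(U[k:, k]))    # row of the largest pivot candidate
        if l != k:
            U[[k, l]] = U[[l, k]]              # exchange rows of U
            L[[k, l], :k] = L[[l, k], :k]      # ... and the already filled part of L
            P[:, [k, l]] = P[:, [l, k]]        # ... and the columns of P
        if U[k, k] == 0.0:                     # whole column is zero: A is singular
            continue
        factors = U[k + 1:, k] / U[k, k]
        L[k + 1:, k] = factors
        U[k + 1:] -= np.outer(factors, U[k])
    return P, L, U

A = np.array([[6., 5., 3., 10.], [3., 7., 3., 5.], [12., 4., 4., 4.], [0., 12., 0., 8.]])
P, L, U = plu(A)
print(np.allclose(P @ L @ U, A))               # True
```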

    1.4.8 Example of LU-Decomposition

    A = (  6   5   3  10 )
        (  3   7   3   5 )
        ( 12   4   4   4 )
        (  0  12   0   8 )

[P0|L0|U0] = [E|E|A]:

    1 0 0 0 | 1   0   0   0 |  6   5   3  10
    0 1 0 0 | 0   1   0   0 |  3   7   3   5
    0 0 1 0 | 0   0   1   0 | 12   4   4   4
    0 0 0 1 | 0   0   0   1 |  0  12   0   8

[P1|L0|U0] (rows 1 and 3 exchanged):

    0 0 1 0 | 1   0   0   0 | 12   4   4   4
    0 1 0 0 | 0   1   0   0 |  3   7   3   5
    1 0 0 0 | 0   0   1   0 |  6   5   3  10
    0 0 0 1 | 0   0   0   1 |  0  12   0   8

[P1|L1|U1]:

    0 0 1 0 | 1   0   0   0 | 12   4   4   4
    0 1 0 0 | 1/4 1   0   0 |  0   6   4   4
    1 0 0 0 | 1/2 0   1   0 |  0   3   1  12
    0 0 0 1 | 0   0   0   1 |  0  12   0   8

[P2|L1|U1] (rows 2 and 4 exchanged):

    0 0 1 0 | 1   0   0   0 | 12   4   4   4
    0 0 0 1 | 0   1   0   0 |  0  12   0   8
    1 0 0 0 | 1/2 0   1   0 |  0   3   1  12
    0 1 0 0 | 1/4 0   0   1 |  0   6   4   4

[P2|L2|U2]:

    0 0 1 0 | 1   0   0   0 | 12   4   4   4
    0 0 0 1 | 0   1   0   0 |  0  12   0   8
    1 0 0 0 | 1/2 1/4 1   0 |  0   0   1  10
    0 1 0 0 | 1/4 1/2 0   1 |  0   0   4   8

[P3|L̃2|Ũ2] (rows 3 and 4 exchanged):

    0 0 0 1 | 1   0   0   0 | 12   4   4   4
    0 0 1 0 | 0   1   0   0 |  0  12   0   8
    1 0 0 0 | 1/4 1/2 1   0 |  0   0   4   8
    0 1 0 0 | 1/2 1/4 0   1 |  0   0   1  10

[P3|L3|U3]:

    0 0 0 1 | 1   0   0   0 | 12   4   4   4
    0 0 1 0 | 0   1   0   0 |  0  12   0   8
    1 0 0 0 | 1/4 1/2 1   0 |  0   0   4   8
    0 1 0 0 | 1/2 1/4 1/4 1 |  0   0   0   8

A = P L U = P3 L3 U3 with

    P = ( 0 0 0 1 )    L = (  1    0    0   0 )    U = ( 12   4   4   4 )
        ( 0 0 1 0 )        (  0    1    0   0 )        (  0  12   0   8 )
        ( 1 0 0 0 )        ( 1/4  1/2   1   0 )        (  0   0   4   8 )
        ( 0 1 0 0 )        ( 1/2  1/4  1/4  1 )        (  0   0   0   8 )

    1.4.9 Solving a Linear Equation System

Ax = b with

    A = (  6   5   3  10 )        b = ( 10 )
        (  3   7   3   5 )            ( 14 )
        ( 12   4   4   4 )            (  8 )
        (  0  12   0   8 )            (  8 )

m1 Solve P z = b:

    z = PT b = ( 0 0 1 0 ) ( 10 )   (  8 )
               ( 0 0 0 1 ) ( 14 ) = (  8 )
               ( 0 1 0 0 ) (  8 )   ( 14 )
               ( 1 0 0 0 ) (  8 )   ( 10 )

m2 Solve Ly = z, i.e.

    (  1    0    0   0 ) ( y1 )   (  8 )
    (  0    1    0   0 ) ( y2 ) = (  8 )
    ( 1/4  1/2   1   0 ) ( y3 )   ( 14 )
    ( 1/2  1/4  1/4  1 ) ( y4 )   ( 10 )

Line by line one has y1 = 8, y2 = 8, 2 − 4 + y3 = 14 ⇒ y3 = 16 and 4 + 2 − 4 + y4 = 10 ⇒ y4 = 8.

m3 Solve Ux = y, i.e.

    ( 12   4   4   4 ) ( x1 )   (  8 )
    (  0  12   0   8 ) ( x2 ) = (  8 )
    (  0   0   4   8 ) ( x3 )   ( 16 )
    (  0   0   0   8 ) ( x4 )   (  8 )

Line by line (from the bottom to the top) one has 8x4 = 8 ⇒ x4 = 1, 4x3 + 8 = 16 ⇒ x3 = 2, 12x2 + 8 = 8 ⇒ x2 = 0 and 12x1 − 8 + 4 = 8 ⇒ x1 = 1, so x = (1, 0, 2, 1)T.

    1.4.10 Short Form

    (i) Use the zeroes in the U-matrix to store the elements below the

    diagonal of the L-matrix.

    Divide these areas of the U-matrix by a line.

(ii) Instead of the P-matrix use a vector (initially p = [1 2 3 4]T) containing the numbers of the rows of the right-side vector b.

    Then a pivoting operation results in exchanging whole rows in U and p.

    1.4.11 Example

[P0|L0|U0] = [E|E|A]:

     6   5   3  10 | 1
     3   7   3   5 | 2
    12   4   4   4 | 3
     0  12   0   8 | 4

[P1|L0|U0]:

    12   4   4   4 | 3
     3   7   3   5 | 2
     6   5   3  10 | 1
     0  12   0   8 | 4

[P1|L1|U1]:

     12    4   4   4 | 3
    1/4    6   4   4 | 2
    1/2    3   1  12 | 1
      0   12   0   8 | 4

[P2|L1|U1]:

     12    4   4   4 | 3
      0   12   0   8 | 4
    1/2    3   1  12 | 1
    1/4    6   4   4 | 2

[P2|L2|U2]:

     12    4    4   4 | 3
      0   12    0   8 | 4
    1/2  1/4    1  10 | 1
    1/4  1/2    4   8 | 2

[P3|L̃2|Ũ2]:

     12    4    4   4 | 3
      0   12    0   8 | 4
    1/4  1/2    4   8 | 2
    1/2  1/4    1  10 | 1

[P3|L3|U3]:

     12    4    4   4 | 3
      0   12    0   8 | 4
    1/4  1/2    4   8 | 2
    1/2  1/4  1/4   8 | 1

Decompose this and put the L- and U-parts into the right form:

    L = (  1    0    0   0 )        U = ( 12   4   4   4 )
        (  0    1    0   0 )            (  0  12   0   8 )
        ( 1/4  1/2   1   0 )            (  0   0   4   8 )
        ( 1/2  1/4  1/4  1 )            (  0   0   0   8 )

In z = PT b one has

    b = (b1, b2, b3, b4)T = (10, 14, 8, 8)T,   so   z = (b3, b4, b2, b1)T = (8, 8, 14, 10)T

and the rest is as above.

If one wants P explicitly, one has from p: P = [e3, e4, e2, e1].


    1.5 Eigenvalues and Eigenvectors

1.5.1 Definition and properties

Let A be a square matrix.

(i) If λ ∈ C and v ≠ 0 is a vector with Av = λv, then v is called an eigenvector of A to the eigenvalue λ.

(ii) We have: Av = λv with v ≠ 0 ⇔ there is a vector v ≠ 0 with (A − λE)v = 0 ⇔ the kernel of A − λE is non-trivial ⇔ A − λE is not regular ⇔ det(A − λE) = 0.

As det(A − λE) is a polynomial of degree n in λ, we define: p(λ) = det(A − λE) is called the characteristic polynomial of A. Therefore a (complex) number λ is an eigenvalue of A if λ is a zero of the characteristic polynomial.

(iii) A has at least one eigenvalue and at least one eigenvector to each eigenvalue.

(iv) If λ is a k-fold zero of p, then o(λ) = k is called the algebraic multiplicity of λ. The geometric multiplicity of λ is the dimension of the kernel of A − λE, that is the dimension of the eigenspace of A and λ.

(v) A vector v is called a generalized eigenvector of k-th order to λ if the following holds: (A − λE)^k v = 0, but (A − λE)^(k−1) v ≠ 0.

(vi) Because of (A − λE)⁰ v = Ev = v the eigenvectors are just the generalized eigenvectors of first order. If v is a generalized eigenvector of k-th order then (A − λE)v is a generalized eigenvector of order (k − 1).


    1.5.2 More properties

(i) Let C = P A P⁻¹. Then A and C have the same characteristic polynomial.

(ii) If v is a (generalized) eigenvector of A then P v is a (generalized) eigenvector of C (of the same order).

(iii) Let A be a square k × k-matrix with the property that the diagonal and everything below the diagonal is zero. Then A^k = 0.

(iv) Let A be an (upper or lower) triangular matrix. Then the eigenvalues of A are the diagonal elements.

This shows that eigenvalues are properties of the linear map rather than of the representing matrix.

    1.5.3 Lemma

Let C be an m × m-matrix. Then there exists an invertible m × m-matrix P so that

    S = P⁻¹ C P = ( λ  *  ...  * )
                  ( 0  *  ...  * )
                  ( .. ..  ..  ..)
                  ( 0  *  ...  * )

where λ is an eigenvalue of C.

1.5.4 Theorem: Schur Form

Let A be an n × n-matrix. Then there exists an invertible matrix P and an upper triangular matrix U with A = P U P⁻¹.

U has the same characteristic polynomial as A, so the diagonal of U contains the eigenvalues of A with the same multiplicities.


    1.5.5 Consequences

    (i)

    Always 1

    ()

    o()

    n holds.

    If () < o() then for sufficient large k the dimension ofthe kernel of (A E)k is equal to the algebraic multiplicityo().

    (ii) The generalized eigenspace to is the span of all generalized

    eigenvectors to . Its dimension is o(), i.e. there are in total

    as many linearly independent generalized eigenvectors to as the

    order of as a zero of the characteristic polynomial.In particular for a simple zero of the characteristic polynomial we

    have: there is a one-dimensional eigenspace and there are no gen-

    eralized eigenvectors of higher order.

    (iii) (generalized) eigenvectors to distinct eigenvalues are linearly inde-

    pendent.

    (iv) A real matrix is called (real) diagonalisable, if(1) the characteristic polynomial has only real zeroes

    (2) for each zero the algebraic and the geometric multiplicity are

    equal.

    This means that there is a basis of theRn consisting of eigenvectors

    of A resp. that there are no generalized eigenvectors of higher

    order.

    (v) Accordingly a complex matrix is called complex diagonalisable if

    for every eigenvalue the algebraic and geometric multiplicity are

    the same.

    (vi) The spectrum of A is the set of eigenvalues, denoted by (A).

    1.5.6 Jordan-FormIf is an eigenvalue of the matrix A and v is a corresponding eigenvector,

    then Av = v.


    If v is a generalized eigenvector of order k+ 1 then u = (A E)v is ageneralized eigenvector of order k. In this case we have Av = v + u.

    Putting these two cases together we get the important theorem on theJordan-form of a matrix:

    1.5.6.1 Jordan-Form

    Let L be an endomorphism of C^n. Then there exists a basis of C^n so that in this basis L has a block-matrix representation

    J =
    [ J1  0   ⋯  0  ]
    [ 0   J2  ⋱  ⋮  ]
    [ ⋮   ⋱   ⋱  0  ]
    [ 0   ⋯   0  Jp ]

    where

    Jr =
    [ λr  1   0   ⋯  0  ]
    [ 0   λr  1   ⋱  ⋮  ]
    [ ⋮   ⋱   ⋱   ⋱  0  ]
    [ ⋮        ⋱  λr 1  ]
    [ 0   ⋯   ⋯   0  λr ]

    The numbers λr are (not necessarily distinct) eigenvalues. The blocks Jr are called Jordan blocks.

    If Jr has the size k and u1, ..., uk are the basis vectors associated to the block Jr, then we have

    Lu1 = λr u1, and for 2 ≤ s ≤ k: Lus = λr us + u_{s−1}.   (∗)

    That means that u1 is an eigenvector and the us are generalized eigenvectors of order s. The (ordered) set u1, ..., uk is called a Jordan chain.

    Now let u_{1,1}, ..., u_{1,k1}; u_{2,1}, ..., u_{2,k2}; ...; u_{p,1}, ..., u_{p,kp} be the Jordan chains associated with the Jordan blocks J1, ..., Jp. The matrix

    U = [u_{1,1} ⋯ u_{1,k1} ⋯ u_{p,1} ⋯ u_{p,kp}]

    fulfills

    AU = UJ  ⟺  A = UJU^{-1}  ⟺  J = U^{-1}AU,

    where A is the matrix of L with respect to the original basis. This is easily seen by looking at the column vectors in the products, because this is just the equation (∗) in each column.
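
    As a quick cross-check of this theorem (not part of the original notes), a small sympy sketch: the matrix below is a made-up example with a double eigenvalue and only a one-dimensional eigenspace, so its Jordan form consists of a single 2×2 Jordan block. sympy's jordan_form returns the chain matrix (called P there, U above) and J.

    ```python
    from sympy import Matrix

    # Made-up 2x2 example: double eigenvalue 3, only one eigenvector,
    # hence one 2x2 Jordan block.
    A = Matrix([[5, 4],
                [-1, 1]])

    P, J = A.jordan_form()        # returns (P, J) with A = P*J*P**-1
    print(J)                      # Matrix([[3, 1], [0, 3]])
    print(P * J * P.inv() == A)   # True: the chain vectors transform A into J
    ```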


    1.5.6.2 Remark

    If each Jr has the size 1, then there exists a basis of eigenvectors and there are no generalized eigenvectors of order greater than one. In this case the matrix is diagonalisable.


    1.5.7 Example

    A :=
    [ 2 0 1 0 0 0 0 0 0 0 ]
    [ 1 2 0 0 0 0 0 0 0 0 ]
    [ 0 0 2 0 0 0 0 0 0 0 ]
    [ 0 0 0 2 0 0 0 1 0 0 ]
    [ 0 0 0 0 2 0 0 0 1 0 ]
    [ 0 0 0 1 0 2 0 0 0 0 ]
    [ 0 0 0 0 0 0 2 0 0 0 ]
    [ 0 0 0 0 0 0 0 2 0 0 ]
    [ 0 0 0 0 0 0 0 0 2 0 ]
    [ 0 0 0 0 0 0 0 0 0 2 ]

    p(λ) = (2 − λ)^10, so 2 is a 10-fold eigenvalue of A.

    B := A − 2E =
    [ 0 0 1 0 0 0 0 0 0 0 ]
    [ 1 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 1 0 0 ]
    [ 0 0 0 0 0 0 0 0 1 0 ]
    [ 0 0 0 1 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]

    B² =
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 1 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 1 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]

    Furthermore B³ = 0.
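
    A short numerical cross-check (a numpy sketch, not part of the original notes) of the kernel dimensions used below: dim ker B = 5, dim ker B² = 8, dim ker B³ = 10.

    ```python
    import numpy as np

    # B = A - 2E from the example above
    B = np.zeros((10, 10))
    B[0, 2] = B[1, 0] = B[3, 7] = B[4, 8] = B[5, 3] = 1

    for k in (1, 2, 3):
        Bk = np.linalg.matrix_power(B, k)
        print(k, 10 - np.linalg.matrix_rank(Bk))   # dim ker B^k: prints 5, 8, 10
    ```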


    [Diagram: the nested kernels {0} = ker B^0 ⊂ ker B ⊂ ker B² ⊂ ker B³ = R^10 with dimensions r0 = 0, r1 = 5, r2 = 8, r3 = 10; the complementary spaces U1, U2, U3 (with ker B^k = ker B^{k−1} ⊕ Uk) have dimensions s1 = 5, s2 = 3, s3 = 2.]

    One has

    v ∈ ker B^k  ⟺  B^k v = 0  ⟺  B^{k−1}(Bv) = 0  ⟺  Bv ∈ ker B^{k−1}.

    So B is injective on U3 and on U2, mapping them towards U2 resp. U1.

    [Diagram: the Jordan chains b31 → b21 → b11 and b32 → b22 → b12 under B, the chain b23 → b13, and the single vectors b14, b15, arranged in the spaces U3, U2, U1.]

    Choose a basis b31 and b32 of U3.

    From this define

    (i) Bb31 = b21, Bb21 = b11 and Bb11 = 0. (Jordan chain of length 3)

    (ii) Bb32 = b22, Bb22 = b12 and Bb12 = 0. (Jordan chain of length 3)

    In the 3-dimensional space U2 the vectors b21 and b22 are completed to a basis by b23. So one has

    (iii) Bb23 = b13, Bb13 = 0. (Jordan chain of length 2)


    In the end the vectors in U1 that are already determined are completed

    to a basis.

    (iv) Bb14 = 0 (Jordan chain of length 1)

    (v) Bb15 = 0 (Jordan chain of length 1)

    With this the map B is uniquely described in the basis vectors bij.

    If one observes

    Bv = 0  ⟺  (A − λE)v = 0  ⟺  Av = λv
    Bv = w  ⟺  (A − λE)v = w  ⟺  Av = λv + w,

    one has, with the basis ordered as b11, b21, b31, b12, b22, b32, b13, b23, b14 and b15, the following matrix representation of A (where λ = 2 is the eigenvalue):

    J :=
    [ λ 1 0 0 0 0 0 0 0 0 ]
    [ 0 λ 1 0 0 0 0 0 0 0 ]
    [ 0 0 λ 0 0 0 0 0 0 0 ]
    [ 0 0 0 λ 1 0 0 0 0 0 ]
    [ 0 0 0 0 λ 1 0 0 0 0 ]
    [ 0 0 0 0 0 λ 0 0 0 0 ]
    [ 0 0 0 0 0 0 λ 1 0 0 ]
    [ 0 0 0 0 0 0 0 λ 0 0 ]
    [ 0 0 0 0 0 0 0 0 λ 0 ]
    [ 0 0 0 0 0 0 0 0 0 λ ]

    J is the (better: a) Jordan form of the map A.

    Gather the vectors b11, b21, ..., b15 (in this order) into a matrix C. Then it follows AC = CJ, so A = CJC^{-1}.


    Calculation with numbers

    U1 is the kernel of B. It consists of all vectors having a zero in positions 1, 3, 4, 8 and 9. Because in general there is no canonical choice of bases we describe U1 as

    U1 = [e2 − e5, e2 + e5, e6 − e2, e7 − e2, e10 − e2].

    The kernel of B² consists of all vectors having a zero in positions 3 and 8. So U1 is completed by

    U2 = [e1 + e4, e1 − e4, e1 + e9]

    to a basis of ker B².

    ker B³ consists of all vectors. So we choose

    U3 := [e3, e8].

    Now construct the Jordan chains:

    Be3 = e1, Be1 = e2, Be2 = 0; these are b31, b21 and b11.

    Be8 = e4, Be4 = e6, Be6 = 0; these are b32, b22 and b12.

    These are the chains of length 3.

    In U2 we have to complete the images of the vectors of U3 (e1 and e4) to a basis. So we choose b23 = e1 + e9 and build the next Jordan chain:

    B(e1 + e9) = e2 + e5, B(e2 + e5) = 0; these are b23 and b13.

    In U1 the span of e2, e6 and e2 + e5 has to be completed to a basis. Therefore we choose

    b14 = e10 − e2 and b15 = e7 − e2.


    With this we have: in the basis b11, b21, b31, b12, b22, b32, b13, b23, b14 and b15 the map A has the form J stated above.

    Here we have C = (e2, e1, e3, e6, e4, e8, e2 + e5, e1 + e9, e10 − e2, e7 − e2) and so

    C =
    [ 0  1  0  0  0  0  0  1  0  0 ]
    [ 1  0  0  0  0  0  1  0 −1 −1 ]
    [ 0  0  1  0  0  0  0  0  0  0 ]
    [ 0  0  0  0  1  0  0  0  0  0 ]
    [ 0  0  0  0  0  0  1  0  0  0 ]
    [ 0  0  0  1  0  0  0  0  0  0 ]
    [ 0  0  0  0  0  0  0  0  0  1 ]
    [ 0  0  0  0  0  1  0  0  0  0 ]
    [ 0  0  0  0  0  0  0  1  0  0 ]
    [ 0  0  0  0  0  0  0  0  1  0 ]

    and

    C^{-1} =
    [ 0  1  0  0 −1  0  1  0  0  1 ]
    [ 1  0  0  0  0  0  0  0 −1  0 ]
    [ 0  0  1  0  0  0  0  0  0  0 ]
    [ 0  0  0  0  0  1  0  0  0  0 ]
    [ 0  0  0  1  0  0  0  0  0  0 ]
    [ 0  0  0  0  0  0  0  1  0  0 ]
    [ 0  0  0  0  1  0  0  0  0  0 ]
    [ 0  0  0  0  0  0  0  0  1  0 ]
    [ 0  0  0  0  0  0  0  0  0  1 ]
    [ 0  0  0  0  0  0  1  0  0  0 ]

    The Jordan theorem now yields A = CJC^{-1} and J = C^{-1}AC.


    1.5.7.1 Algorithm

    We look for the Jordan form and transformation matrices of an endomorphism A on R^n (or C^n), so A = CJC^{-1}.

    (i) Calculate p(λ) = det(A − λE) and find all zeroes. These are the eigenvalues.

    (ii) For each eigenvalue λ perform the following process:

    m1 For λ construct B := A − λE and determine the spaces Ui, until the dimension of the kernel of B^k (this is equal to the sum of the dimensions of the Ui) is equal to the algebraic multiplicity of λ.

    This is done iteratively: first find (with the aid of the Gauss algorithm) a basis of the kernel of B. This is U1.

    Then compute B² and find a basis of its kernel by completing the basis of U1 by other vectors. These completing vectors form a basis of U2.

    Now find a basis of U3 by completing the basis of ker B2 by

    some vectors to a basis of ker B3 and so on.

    m2 Now construct the Jordan chains:

    the basis of U3 (in general: Uk with the highest k) is mapped by B into U2 and then completed to a basis of U2 by vectors that have been computed in m1.

    This basis is mapped by B; and the images are completed to

    a basis of U1.

    Each j-tuple v, Bv, ..., B^{j−1}v of basis vectors with a starting vector v ∈ Uj forms a Jordan chain of length j.

    m3 When in total as many basis vectors as the algebraic multiplicity o(λ) are found, the work is done for this eigenvalue.


    (iii) Each Jordan chain v, Bv, ..., B^{j−1}v is written down in reverse order (so starting with the eigenvector) B^{j−1}v, B^{j−2}v, ..., v and gathered into the matrix C.

    In the Jordan matrix J each chain corresponds to a Jordan block of size j×j having the form

    J(j, λ) =
    [ λ  1  0  ⋯  0 ]
    [ 0  λ  1  ⋱  ⋮ ]
    [ ⋮  ⋱  ⋱  ⋱  0 ]
    [ 0  ⋯  0  λ  1 ]
    [ 0  ⋯  0  0  λ ]

    with the eigenvalue λ. The Jordan matrix J is then a block diagonal matrix consisting of the single Jordan blocks.

    1.6 Special Properties of Symmetric Matrices

    A matrix is called orthogonal iff the columns form an orthonormal basis. Equivalently one can say

    A^T = A^{-1} or A^T A = A A^T = En.

    In the complex case a matrix is called unitary if

    A* = A^{-1} or A* A = A A* = En.

    The importance of these notions lies in the fact that for arbitrary vectors

    v and w and an orthogonal or unitary matrix A the following holds:

    ‖Av‖ = ‖v‖ and <Av, Aw> = <A^T A v, w> = <v, w>.

    An orthogonal transformation changes neither angles nor lengths. The proof of these facts is given below.

    This subsection contains facts about symmetric or hermitian matrices.

    Recall that a real matrix is called symmetric if A^T = A, and a complex matrix is called hermitian if A* = A. For real matrices these definitions coincide.

    The following statements are formulated for the complex case, because

    the (more important) real case is contained in it.


    1.6.1 Properties of Symmetric and Hermitian Matrices

    Let A be a hermitian n×n-matrix.

    (i) The eigenvalues of A are real.

    (ii) If λ ≠ μ are eigenvalues and v1 and v2 are eigenvectors to λ resp. μ, then <v1, v2> = 0.

    (iii) For each eigenvalue the geometrical and the algebraic multiplicity

    are equal.

    (iv) There exists an ON-basis of eigenvectors of A.

    (v) There is a unitary matrix U and a real diagonal matrix D with A = UDU*. (Remember: U unitary ⟺ U* = U^{-1}.)
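
    A small numerical illustration of (iv) and (v) for a real symmetric example (a numpy sketch; the matrix is made up):

    ```python
    import numpy as np

    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])    # made-up symmetric matrix

    eigvals, U = np.linalg.eigh(A)     # eigh: decomposition for symmetric/hermitian matrices
    D = np.diag(eigvals)               # real eigenvalues on the diagonal

    print(np.allclose(U @ U.T, np.eye(3)))   # True: columns of U form an ON-basis
    print(np.allclose(U @ D @ U.T, A))       # True: A = U D U^T
    ```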

    1.6.2 Orthogonal Matrices

    A square matrix is called orthogonal (or unitary in the complex case) if A^T A = E resp. A* A = E. As the real case is more important, we restrict our further results to this case. The complex case can be proved analogously.

    1.6.2.1 Properties of Orthogonal Matrices

    The following statements are equivalent:

    (i) A is orthogonal.

    (ii) A^T = A^{-1}.

    (iii) The columns of A form an orthonormal basis.

    (iv) The rows of A form an orthonormal basis.

    (v) For v, w ∈ R^n we have <v, w> = <Av, Aw>.

    (vi) For each v ∈ R^n we have ‖Av‖ = ‖v‖.
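
    As an illustration of these equivalent properties (a sketch, not part of the notes), a plane rotation matrix:

    ```python
    import numpy as np

    t = 0.7                                      # arbitrary angle
    Q = np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]])      # rotation matrix, orthogonal

    v = np.array([1.0, 2.0])
    w = np.array([-3.0, 0.5])

    print(np.allclose(Q.T @ Q, np.eye(2)))                        # Q^T Q = E
    print(np.isclose(v @ w, (Q @ v) @ (Q @ w)))                   # <v, w> = <Qv, Qw>
    print(np.isclose(np.linalg.norm(Q @ v), np.linalg.norm(v)))   # ||Qv|| = ||v||
    ```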


    1.6.2.2 Further Properties

    Let A be orthogonal.

    (i) For v , w one has


    (ix) Let v_i and v_j be elements of the ON-basis of the eigenspace of A^T A to λ ≠ 0. Then we have

    λ δ_ij = λ <v_i, v_j> = <λ v_i, v_j> = <A^T A v_i, v_j> = <A v_i, A v_j>.

    This shows that the A v_i form an orthogonal system, and hence the dimension of the eigenspace of A A^T to λ must be greater than or equal to the dimension of the corresponding eigenspace of A^T A.

    By symmetry it follows that these two numbers are equal.

    1.7.2 Existence and Construction of the SVD

    1.7.2.1 Theorem

    Let A be an m×n-matrix. Then there exist an orthogonal n×n-matrix V, an orthogonal m×m-matrix U, and an m×n-matrix S = (s_ij) with s_ii ≥ 0 so that

    A = USV^T.

    The matrix S = (s_ij) is a matrix of diagonal type, i.e. for i ≠ j one has s_ij = 0.

    1.7.2.2 Algorithm

    m1 Form B = A^T A. This is an n×n-matrix.

    m2 Compute the eigenvalues of B. These are non-negative and are numbered in the sequence λ1 ≥ λ2 ≥ ... ≥ λk > λ_{k+1} = ... = λn = 0. The fact that k is the rank of the matrix A (and the rank of A^T A too) can be used as a crosscheck.

    m3 Find an ON-basis v1, ..., vn of R^n, where v_i is an eigenvector to the eigenvalue λ_i. V := [v1, ..., vn] then is an orthogonal matrix (V^T = V^{-1}).


    m4 The singular values of A are defined as s_i = √λ_i. The matrix S = (s_ij) is a matrix of diagonal type, i.e. for i ≠ j one has s_ij = 0. S has the same shape as A, i.e. n columns and m rows. The elements on the diagonal are given by the singular values: s_ii = s_i.

    m5 For i ≤ k define the vectors u_i = (1/s_i) A v_i. They form an orthonormal system. Complete these vectors to an ON-basis u1, ..., um of R^m and gather them into the matrix U = [u1, ..., um].

    m6 The singular value decomposition of A is

    A = USV^T.
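
    For comparison, numpy computes this decomposition directly; a sketch with a made-up 3×2 matrix (note that numpy returns V^T rather than V):

    ```python
    import numpy as np

    A = np.array([[3.0, 1.0],
                  [2.0, 2.0],
                  [1.0, 3.0]])          # made-up example

    U, s, Vt = np.linalg.svd(A)         # s holds the singular values s_1 >= s_2 >= ...
    S = np.zeros(A.shape)
    S[:len(s), :len(s)] = np.diag(s)    # S has the same shape as A

    print(np.allclose(U @ S @ Vt, A))                                   # A = U S V^T
    print(np.allclose(np.sqrt(np.linalg.eigvalsh(A.T @ A))[::-1], s))   # s_i = sqrt(lambda_i)
    ```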

    1.7.2.3 Remark

    In many cases the vectors in V and U belonging to the eigenvalue zero are not needed. In this case the corresponding entries are denoted by stars (∗) and are not explicitly calculated. This is called the simplified version of the SVD.

    1.7.2.4 Further Properties

    If A = USV^T is the SVD of A, then A^T has the SVD A^T = V S^T U^T. If A is invertible, then A^{-1} = V S^{-1} U^T.

    1.8 Generalized Inverses

    The singular value decomposition can be used to construct approximate

    solutions of (possibly) non-square linear equation systems.

    Given an m×n-matrix A and a vector b ∈ R^m we are looking for a vector x ∈ R^n so that the norm

    ‖Ax − b‖₂ = min!


    Substituting the SVD of A and remembering that for the orthogonal matrix U the matrix U^T = U^{-1} is orthogonal, too, with ‖u‖ = ‖U^T u‖ for each u ∈ R^m, we get

    ‖Ax − b‖₂ = ‖USV^T x − b‖₂ = ‖U^T U S V^T x − U^T b‖₂ = ‖S (V^T x) − (U^T b)‖₂ =: ‖Sz − d‖₂   (∗∗)

    with z := V^T x and d := U^T b.

    The solutions of this minimization problem are given by

    z_j = (1/s_j) d_j for j = 1, ..., k,   z_j arbitrary for j > k.

    As V is orthogonal we get all solutions x as

    x = Vz = Σ_{j=1}^{k} (1/s_j) d_j v_j + Σ_{j=k+1}^{n} z_j v_j.

    Because V is orthogonal, the norm of x is given by (Σ_{j=1}^{n} z_j²)^{1/2}. Therefore the solution with the smallest norm is

    x⁺ = Vz = Σ_{j=1}^{k} (1/s_j) d_j v_j.

    This solution is called the pseudo-normal solution. One sees that the mapping b ↦ x⁺ is given by the matrix A⁺ := V S⁺ U^T with the diagonal-type matrix S⁺ := (σ_i δ_ij), where σ_i is defined by

    σ_i = 1/s_i for i ≤ k,   σ_i = 0 for i > k.

    1.8.0.5 Definition

    The matrix A⁺ defined in this way is called the generalized inverse or Moore-Penrose inverse of A.
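
    numpy provides this matrix as np.linalg.pinv, which is computed from the SVD; a short sketch with a made-up overdetermined system, comparing A⁺b with the least-squares solution:

    ```python
    import numpy as np

    A = np.array([[1.0, 0.0],
                  [1.0, 1.0],
                  [1.0, 2.0]])           # made-up 3x2 example
    b = np.array([1.0, 0.0, 2.0])

    A_plus = np.linalg.pinv(A)           # Moore-Penrose inverse, computed via the SVD
    x_plus = A_plus @ b                  # pseudo-normal solution of ||Ax - b|| = min

    x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
    print(np.allclose(x_plus, x_lstsq))  # True: both give the minimum-norm least-squares solution
    ```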


    1.8.0.6 Further Properties

    We have (A^T)⁺ = (A⁺)^T.

    1.8.1 Special case: A injective

    If A is injective then A has the rank n and the pseudo-normal solution

    of every equation Ax = b is unique. Furthermore, in this case A^T A is invertible (because rank A = n and the rank of A^T A is equal to the rank of A).

    In this case we can calculate x⁺ without explicit construction of the SVD: using A^T A = V S^T U^T U S V^T = V S^T S V^T we get from the equation (∗∗) above:

    S V^T x⁺ = U^T b  ⟹  V S^T S V^T x⁺ = V S^T U^T b.

    Since V S^T S V^T = A^T A and V S^T U^T = A^T, this means

    A^T A x⁺ = A^T b  ⟹  x⁺ = (A^T A)^{-1} A^T b.

    So in this case

    A⁺ = (A^T A)^{-1} A^T.

    If one wants only x⁺ it is sufficient to solve A^T A x⁺ = A^T b.
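
    A minimal sketch of this shortcut (made-up injective matrix), solving the normal equations instead of forming the SVD:

    ```python
    import numpy as np

    A = np.array([[1.0, 2.0],
                  [0.0, 1.0],
                  [1.0, 0.0]])           # injective: rank 2 (made-up example)
    b = np.array([1.0, 2.0, 3.0])

    # Normal equations: A^T A x = A^T b
    x_plus = np.linalg.solve(A.T @ A, A.T @ b)

    print(np.allclose(x_plus, np.linalg.pinv(A) @ b))   # True: same as A^+ b
    ```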


    1.9 Applications to linear equation systems

    1.9.1 Errors

    1.9.1.1 Introductory example

    Ax = b with

    A =
    [ 2  3  4     ]
    [ 2  3  4.001 ]
    [ 3  4  5     ]
    and b = (1, 1, 1)^T.

    One easily sees that A is invertible. The solution x is uniquely determined: the exact solution is x = (−1, 1, 0)^T.

    On the other hand, y = (−0.5, 0, 0.5)^T is not far from being a solution, because Ay = (1, 1.0005, 1)^T is very close to b. From this one sees that the given equation system is very unstable with respect to perturbations.

    If one calculates the solution of the slightly perturbed system Ax1 = b1 with b1 = (1, 0.9, 1)^T, one gets x1 = (−101.0000, 201.0000, −100.0000)^T.
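
    A short numpy check of this behaviour (a sketch): the large condition number of A explains the sensitivity.

    ```python
    import numpy as np

    A = np.array([[2.0, 3.0, 4.0],
                  [2.0, 3.0, 4.001],
                  [3.0, 4.0, 5.0]])
    b  = np.array([1.0, 1.0, 1.0])
    b1 = np.array([1.0, 0.9, 1.0])

    x  = np.linalg.solve(A, b)     # approx (-1, 1, 0)
    x1 = np.linalg.solve(A, b1)    # approx (-101, 201, -100)

    print(x, x1)
    print(np.linalg.cond(A))       # very large (order 10^4), hence the instability
    ```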

    1.9.1.2 Theorem

    Let x be the solution of Ax = b. If we compare the solution x + Δx of the disturbed system A(x + Δx) = b + Δb with x, we get for the relative error

    ‖Δx‖ / ‖x‖ ≤ ‖A‖ ‖A^{-1}‖ · ‖Δb‖ / ‖b‖.

    The number κ(A) = cond A = ‖A‖ ‖A^{-1}‖ is called the condition of A. With a little more effort it is possible to prove:


    Theorem. If x is the solution of Ax = b and x + Δx the solution of (A + ΔA)(x + Δx) = b + Δb, then the following estimate for the relative error holds:

    ‖Δx‖ / ‖x‖ ≤ κ(A) / (1 − κ(A) ‖ΔA‖/‖A‖) · ( ‖Δb‖/‖b‖ + ‖ΔA‖/‖A‖ ).

    For small values of ‖ΔA‖ the right-hand side is approximately equal to

    κ(A) ( ‖Δb‖/‖b‖ + ‖ΔA‖/‖A‖ ).

    1.9.2 Numerical Rank Deficiency

    Numerical rank deficiency appears if a matrix is close to another matrix with smaller rank. This leads to a very large condition number: small variations in the initial data of Ax = b lead to large variations in the result x.

    The SVD of the matrix A from the introductory example above is A = USV^T with the singular values s1 ≈ 10, s2 ≈ 0.4 and s3 ≈ 1/3000. To avoid these effects one can proceed as follows:

    m1 Decompose A = USV^T.

    m2 The matrix S1 is built out of S by replacing all entries smaller than a given threshold by zero, and A1 = U S1 V^T.

    This is reasonable: one can prove that entries in S that are smaller than the machine accuracy multiplied by the Frobenius norm of the matrix have no influence on the result.

    m3 Instead of the solutions of Ax = b, find the pseudo-normal solutions of A1 x = b with

    x⁺ = A1⁺ b = V S1⁺ U^T b.


    In the example one has A = USV^T with

    S =
    [ 10.3873  0       0      ]
    [  0       0.3338  0      ]
    [  0       0       0.0003 ]

    and orthogonal matrices U and V. We change the third singular value to zero and get

    S1 =
    [ 10.3873  0       0 ]
    [  0       0.3338  0 ]
    [  0       0       0 ]
    and S1⁺ =
    [ 0.0963  0       0 ]
    [ 0       2.9961  0 ]
    [ 0       0       0 ].

    Then

    A1⁺ =
    [ −1.1633  −1.1674   1.8314 ]
    [ −0.1669  −0.1676   0.3342 ]
    [  0.8316   0.8344  −1.1662 ]

    and

    x⁺ = A1⁺ (1, 1, 1)^T = (−0.4992, −0.0002, 0.4997)^T,
    x1⁺ = A1⁺ (1, 0.9, 1)^T = (−0.3825, 0.0165, 0.4163)^T.

    In the original problem we have

    Ax⁺ = (0.9998, 0.9999, 1.0001)^T and Ax1⁺ ≈ (0.9499, 0.9499, 1.0003)^T.
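
    A numpy sketch of this truncation procedure; it reproduces the numbers above up to rounding.

    ```python
    import numpy as np

    A = np.array([[2.0, 3.0, 4.0],
                  [2.0, 3.0, 4.001],
                  [3.0, 4.0, 5.0]])

    U, s, Vt = np.linalg.svd(A)
    s_plus = np.where(s > 1e-3, 1.0 / s, 0.0)    # drop the tiny singular value s3 ~ 0.0003
    A1_plus = Vt.T @ np.diag(s_plus) @ U.T       # A1^+ = V S1^+ U^T

    print(A1_plus @ np.array([1.0, 1.0, 1.0]))   # ~ (-0.4992, -0.0002, 0.4997)
    print(A1_plus @ np.array([1.0, 0.9, 1.0]))   # ~ (-0.3825,  0.0165, 0.4163)
    ```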


    1.9.3 Application: Best Fit Functions

    Other name: Gauss method of least squares

    1.9.3.1 Most important case: best fit straight line

    Starting point are n > 2 pairs of coordinates (x_i, y_i), such that at least two different x-values occur.

    We search for a line y = ax + b with the property that the quadratic error

    Σ_{i=1}^{n} ((a x_i + b) − y_i)²

    is as small as possible.

    The solution of this problem is the pseudo-normal solution of

    b + a x_1 = y_1
          ⋮
    b + a x_n = y_n,

    or A (b, a)^T = y with

    A =
    [ 1  x_1 ]
    [ ⋮   ⋮  ]
    [ 1  x_n ]
    and y = (y_1, ..., y_n)^T.

    As the matrix is injective, the solution is obtained with the aid of the transposed matrix:

    (b, a)^T = (A^T A)^{-1} A^T y.

    The coefficient of correlation r measures the quality of the approximation. We always have |r| ≤ 1, and for r = ±1 the line goes through all points.


    Algorithm

    All sums run from i = 1 to n.

    m1 Δ = n Σ x_i² − (Σ x_i)²

    m2 The best fit straight line y = ax + b has the coefficients

    a = (1/Δ) (n Σ x_i y_i − Σ x_i · Σ y_i)   and   b = (1/Δ) (Σ x_i² · Σ y_i − Σ x_i y_i · Σ x_i).

    m3 r = (n Σ x_i y_i − Σ x_i · Σ y_i) / √( (n Σ x_i² − (Σ x_i)²) · (n Σ y_i² − (Σ y_i)²) )
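
    A small sketch with made-up data, evaluating these formulas and cross-checking against the pseudo-normal solution from numpy:

    ```python
    import numpy as np

    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])      # made-up data points
    y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])
    n = len(x)

    delta = n * np.sum(x**2) - np.sum(x)**2
    a = (n * np.sum(x*y) - np.sum(x) * np.sum(y)) / delta
    b = (np.sum(x**2) * np.sum(y) - np.sum(x*y) * np.sum(x)) / delta
    r = (n * np.sum(x*y) - np.sum(x) * np.sum(y)) / np.sqrt(
            (n * np.sum(x**2) - np.sum(x)**2) * (n * np.sum(y**2) - np.sum(y)**2))

    # Cross-check: pseudo-normal solution of A (b, a)^T = y
    A = np.column_stack([np.ones(n), x])
    b_ls, a_ls = np.linalg.lstsq(A, y, rcond=None)[0]
    print(a, b, r)
    print(np.isclose(a, a_ls), np.isclose(b, b_ls))
    ```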

    Second method

    Find the mean values x̄ = (1/n) Σ_{k=1}^{n} x_k and ȳ = (1/n) Σ_{k=1}^{n} y_k. Shift the coordinate system so that x̄ and ȳ are the new origin, by replacing x_k by x_k − x̄ resp. y_k by y_k − ȳ. Then the best fit straight line is given by

    y = v x with v = ( Σ_{k=1}^{n} x_k y_k ) / ( Σ_{k=1}^{n} x_k² )

    and

    r = ( Σ_{k=1}^{n} x_k y_k ) / ( (Σ_{k=1}^{n} x_k²)^{1/2} (Σ_{k=1}^{n} y_k²)^{1/2} ) = <x, y> / (‖x‖ ‖y‖).


    Here it is easy to see that the coefficient of correlation describes the relative error of the approximation:

    ( Σ_{k=1}^{n} (v x_k − y_k)² ) / ( Σ_{k=1}^{n} y_k² ) = 1 − r².

    1.9.3.2 General problem

    Let (x_i, y_i), i = 1, ..., n, be n pairs of data. Furthermore let f_1, ..., f_k be k < n functions. We look for a linear combination f(x) = Σ_{j=1}^{k} α_j f_j(x) of the f_j so that the sum of the squares of the deviations of f(x_i) from y_i becomes minimal:

    F = Σ_{i=1}^{n} (f(x_i) − y_i)² = Σ_{i=1}^{n} ( Σ_{j=1}^{k} α_j f_j(x_i) − y_i )² = min!

    Solution: Solve A a = y in the sense of the pseudo-normal solution. Here a = (α_1, ..., α_k)^T contains the coefficients we look for,

    A =
    [ f_1(x_1)  f_2(x_1)  ⋯  f_k(x_1) ]
    [ f_1(x_2)  f_2(x_2)  ⋯  f_k(x_2) ]
    [    ⋮         ⋮       ⋱     ⋮    ]
    [ f_1(x_n)  f_2(x_n)  ⋯  f_k(x_n) ]

    and y = (y_1, y_2, ..., y_n)^T.
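
    A sketch fitting, for example, f(x) = α1 + α2 x + α3 sin x to made-up data via the design matrix above:

    ```python
    import numpy as np

    x = np.linspace(0.0, 5.0, 20)                               # made-up sample points
    y = 2.0 + 0.5*x + 1.5*np.sin(x) + 0.1*np.random.randn(20)   # noisy data

    funcs = [lambda t: np.ones_like(t), lambda t: t, np.sin]    # f_1, f_2, f_3
    A = np.column_stack([f(x) for f in funcs])                  # A_ij = f_j(x_i)

    alpha, *_ = np.linalg.lstsq(A, y, rcond=None)               # pseudo-normal solution of A a = y
    print(alpha)                                                # approximately (2.0, 0.5, 1.5)
    ```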


    1.10 Symmetric Matrices and Quadratic Forms

    A quadratic form on R^n is a map of the form

    x = (x_1, ..., x_n)^T ↦ Q(x) = Σ_{i,j=1}^{n} c_ij x_i x_j.

    The c_ij are real numbers with c_ij = c_ji. With the symmetric matrix C = (c_ij), i, j = 1, ..., n, this is written as

    Q(x) = x^T C x.

    Conversely, Q is the quadratic form that belongs to C.

    Let C = UDU^T with a real diagonal matrix D containing the eigenvalues of C and an orthogonal matrix U. Then

    Q_C(x) = x^T C x = x^T U D U^T x = (U^T x)^T D (U^T x).

    If the columns of U are the (ON-)vectors u_1, ..., u_n, then U^T x contains the coefficients of x in this basis. If these are denoted by y_1, ..., y_n, then with y = U^T x one has

    Q_C(x) = y^T D y = Σ_{k=1}^{n} λ_k y_k².

    From this one sees immediately, e.g., that Q_C(x) is positive for all non-zero vectors iff all eigenvalues of C are positive.

    This leads to the definition:

    A quadratic form is called

    positive definite
    if Q(x) > 0 for x ≠ 0  ⟺  λ > 0 for all eigenvalues λ of C,


    positive semidefinite
    if Q(x) ≥ 0 for all x  ⟺  λ ≥ 0 for all eigenvalues λ of C,

    negative definite
    if Q(x) < 0 for x ≠ 0  ⟺  λ < 0 for all eigenvalues λ of C,

    negative semidefinite
    if Q(x) ≤ 0 for all x  ⟺  λ ≤ 0 for all eigenvalues λ of C,

    definite
    if Q is negative or positive definite,

    indefinite
    if there are x and y with Q(x) < 0 < Q(y)  ⟺  the matrix C has positive and negative eigenvalues.

    (Dangerous) notation: C positive definite: C > 0, C positive semidefinite: C ≥ 0, C negative (semi)definite: C < 0 (C ≤ 0).

    A symmetric matrix is called positive/negative (semi)definite or indefinite if this is true for the corresponding quadratic form.

    Remark

    A is positive [semi]definite  ⟺  −A is negative [semi]definite.

    Hurwitz Criterion

    The Hurwitz Criterion is useful to determine the definiteness of a matrix

    without calculating the eigenvalues.


    In the symmetric n×n-matrix A one forms, starting from the left upper corner, submatrices of the sizes 1, 2, ..., n. The determinants of these submatrices are called D_1 to D_n (the leading principal minors). We have D_1 = a_11, D_2 = a_11 a_22 − a_12 a_21, and finally D_n is the determinant of A. Then the following holds:

    [Figure: the matrix A = (a_ij) with the nested leading submatrices of sizes 1, 2, ..., n marked in its upper left corner.]

    (i) D_1 > 0, D_2 > 0, D_3 > 0, D_4 > 0 etc. ⟺ A pos. definite. (D_k > 0)

    (ii) D_1 < 0, D_2 > 0, D_3 < 0, D_4 > 0 etc. ⟺ A neg. definite. ((−1)^k D_k > 0)

    (iii) A pos. semidefinite ⟹ D_1 ≥ 0, D_2 ≥ 0, D_3 ≥ 0, D_4 ≥ 0 etc. (D_k ≥ 0)

    (iv) A neg. semidefinite ⟹ D_1 ≤ 0, D_2 ≥ 0, D_3 ≤ 0, D_4 ≥ 0 etc. ((−1)^k D_k ≥ 0)

    (v) If neither (iii) nor (iv) holds, A is indefinite.

    In particular, A is indefinite if D_k < 0 holds for an even number k. Please pay attention to the fact that A may be indefinite even if always D_k ≥ 0 or (−1)^k D_k ≥ 0 holds; in this case at least one D_k has to be zero.

    Quadratic Completion

    Another possibility to determine the definiteness of a quadratic form is

    quadratic completion. The method is explained with the example

    Q(x) = x² + 4xy + 2xz + 8y² + 16yz + 9z².

    m1 Choose one variable x_j with a non-vanishing coefficient of x_j². Here we choose x. If such a choice is impossible, the quadratic form is indefinite.


    m2 Gather all terms that contain x:

    Q(x) = (x² + 4xy + 2xz) + (8y² + 16yz + 9z²)

    m3 Use the following to complete the square:
    (a + b + c + d + ⋯)² = a² + b² + c² + d² + ⋯ + 2(ab + ac + ad + ⋯ + bc + bd + ⋯ + cd + ⋯)

    Q(x) = (x + 2y + z)² + ⋯

    m4 Subtract the terms of the square that are not contained in the first bracket of step m2:

    Q(x) = (x + 2y + z)² + (−4y² − z² − 4yz) + (8y² + 16yz + 9z²)
         = (x + 2y + z)² + (4y² + 12yz + 8z²)

    m5 Now the second bracket contains no x. Continue with m1 applied to the second bracket. Choose y.

    m6 Q(x) = (x + 2y + z)² + (4y² + 12yz) + 8z²
            = (x + 2y + z)² + (2y + 3z)² − 9z² + 8z²
            = (x + 2y + z)² + (2y + 3z)² − z².

    This is a sum of squares with two plus signs and one minus sign. This means that the corresponding matrix has two positive and one negative eigenvalue, and Q is indefinite.

    Further examples

    Q(x) = x² + 4xy + 2xz + 8y² + 16yz + 10z² is positive semidefinite, and

    Q(x) = x² + 4xy + 2xz + 8y² + 16yz + 11z² is positive definite.
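
    A numerical cross-check of all three variants via the eigenvalues of the corresponding symmetric matrices (a numpy sketch):

    ```python
    import numpy as np

    def eigenvalues_of_Q(c_zz):
        # Symmetric matrix of Q = x^2 + 4xy + 2xz + 8y^2 + 16yz + c_zz * z^2
        C = np.array([[1.0, 2.0, 1.0],
                      [2.0, 8.0, 8.0],
                      [1.0, 8.0, c_zz]])
        return np.linalg.eigvalsh(C)

    print(eigenvalues_of_Q(9.0))    # one negative eigenvalue  -> indefinite
    print(eigenvalues_of_Q(10.0))   # smallest eigenvalue ~ 0  -> positive semidefinite
    print(eigenvalues_of_Q(11.0))   # all eigenvalues positive -> positive definite
    ```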


    1.11 QR-Decomposition

    Theorem

    Let A be a matrix with m rows and n ≤ m columns. Then there exist an orthogonal matrix Q and an upper triangular matrix R with A = QR.

    Upper triangular means that for R = (r_ij) one has r_ij = 0 for j < i.

    Proof 1 - Jacobi method, Givens rotations

    The case n = 1 or m = 1 is trivial. Now let us first look at the case

    m = 2.

    We are looking for an orthogonal 2×2-matrix Q with A = QR and r_21 = 0.

    Q =
    [ u  −v ]
    [ v   u ]
    with u² + v² = 1,   R =
    [ r11  r12 ]
    [ 0    r22 ]
    and A =
    [ a  b ]
    [ c  d ]

    leads to

    Q^T A = R, i.e.

    [ u   v ] [ a  b ]   [  ua + vc   ub + vd ]   [ r11  r12 ]
    [ −v  u ] [ c  d ] = [ −va + uc  −vb + ud ] = [ 0    r22 ].

    So this can be fulfilled with

    u = a / √(a² + c²) and v = c / √(a² + c²).

    In the case c = 0 one simply takes Q = E2.


    With Q0 = E and R0 = A, for each element below the diagonal an operation is performed:

    Q_i^T R_i = R_{i+1}.

    Here Q_i is the identity matrix except for the entries u, −v, v, u, placed in the two rows and columns belonging to the current diagonal element a and the element c below the diagonal that is to be eliminated (a block structure with identity blocks E_k, E_m, E_p and an embedded 2×2 rotation). In R_{i+1} the position of c has become 0 and the position of a contains r = √(a² + c²).

    From this one sees: the same values of u and v as above eliminate the c-element with an orthogonal matrix Q_i, and the rest of the column that contains a and c is not changed.

    So we have A = Q0R0 = Q0Q1R1 = ⋯ = Q0⋯QkRk =: QR with Q = Q0⋯Qk and R = Rk.
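
    A compact numpy sketch of this Givens strategy (the function name givens_qr and the test matrix are made up):

    ```python
    import numpy as np

    def givens_qr(A):
        """QR decomposition by Givens rotations: returns Q, R with A = Q @ R."""
        m, n = A.shape
        Q = np.eye(m)
        R = A.astype(float)
        for j in range(n):                  # eliminate the entries below the diagonal
            for i in range(j + 1, m):
                a, c = R[j, j], R[i, j]
                if c == 0.0:
                    continue
                r = np.hypot(a, c)
                u, v = a / r, c / r
                G = np.eye(m)               # G plays the role of Q_i
                G[j, j] = u
                G[i, i] = u
                G[j, i] = -v
                G[i, j] = v                 # 2x2 block [[u, -v], [v, u]] in rows/cols j, i
                R = G.T @ R                 # Q_i^T R_i = R_{i+1}: entry (i, j) becomes 0
                Q = Q @ G
        return Q, R

    A = np.array([[3.0, 1.0], [4.0, 2.0], [0.0, 5.0]])
    Q, R = givens_qr(A)
    print(np.allclose(Q @ R, A), np.allclose(np.tril(R, -1), 0))
    ```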

    Proof 2 - Householder Transformations

    The Jacobi method needs n²/2 steps. This method uses only n − 1 steps:

    The idea is to use a series of reflections that map the parts of the columns below the diagonal to zeroes.

    After some steps we have the matrix R_k: its first k − 1 columns are already in upper triangular form, and b_k denotes the lower part of the k-th column that still has to be eliminated.


    The lower part of column k, b_k, shall be mapped onto a multiple of e_k. Let c_k be a vector equal to b_k, but with zeroes in the first k − 1

    positions. So defi