AEM Theory


Contents

0 Solving Linear Equation Systems with the Gauss-Algorithm

1 Linear Algebra and Vector Spaces
    1.1 Vector Spaces
        1.1.1 Vector Spaces
        1.1.2 Linear Independence
        1.1.3 Dimension and Basis
        1.1.4 Scalar Product
        1.1.5 Orthonormal Systems
        1.1.6 Norms
    1.2 Matrices and Linear Maps
        1.2.1 Matrices
        1.2.2 Linear Maps
        1.2.3 Linear Equations
        1.2.4 Inverse Map and Inverse Matrix
        1.2.5 Changing the Basis
        1.2.6 Some Special Linear Maps in R2
        1.2.7 Examples
    1.3 Operations with Matrices
        1.3.1 Matrix Algebra
        1.3.2 Scalar Product
        1.3.3 Homogeneous Coordinates
        1.3.4 Norms
    1.4 Gauss Algorithm and LU-Decomposition
        1.4.1 Numerical Stability
        1.4.2 Special Operations
        1.4.3 Properties of C(k, l; α), D(k; α) and F(k, l)
        1.4.4 Standard Algorithm
        1.4.5 LU-Decomposition
        1.4.6 Example
        1.4.7 Summary of LU-Decomposition
        1.4.8 Example of LU-Decomposition
        1.4.9 Solving a Linear Equation System
        1.4.10 Short Form
        1.4.11 Example
    1.5 Eigenvalues and Eigenvectors
        1.5.1 Definition and Properties
        1.5.2 More Properties
        1.5.3 Lemma
        1.5.4 Theorem: Schur Form
        1.5.5 Consequences
        1.5.6 Jordan Form
        1.5.7 Example
    1.6 Special Properties of Symmetric Matrices
        1.6.1 Properties of Symmetric and Hermitian Matrices
        1.6.2 Orthogonal Matrices
    1.7 Singular Value Decomposition (SVD)
        1.7.1 Preparations
        1.7.2 Existence and Construction of the SVD
    1.8 Generalized Inverses
        1.8.1 Special Case: A Injective
    1.9 Applications to Linear Equation Systems
        1.9.1 Errors
        1.9.2 Numerical Rank Deficiency
        1.9.3 Application: Best-Fit Functions
    1.10 Symmetric Matrices and Quadratic Forms
    1.11 QR-Decomposition
    1.12 Numerics of Eigenvalues

2 Ordinary Differential Equations
    2.1 General Definitions
    2.2 Linear Differential Equations with Constant Coefficients
        2.2.1 Inhomogeneous Equations
    2.3 Linear Differential Equations of Higher Order
        2.3.1 General Case
        2.3.2 ODEs with Constant Coefficients
        2.3.3 Special Inhomogeneities

3 Calculus in Several Variables
    3.1 Differential Calculus in Rn
        3.1.1 Definitions
        3.1.2 Examples and Properties of Open and Closed Sets
        3.1.3 Main Rule for Vector-Valued Functions
        3.1.4 Definition: Limits and Continuous Functions
        3.1.5 Definition: Partial Derivatives
        3.1.6 Theorem of H. A. Schwarz
        3.1.7 Definition: Derivative of f
        3.1.8 Higher Derivatives
        3.1.9 Examples
        3.1.10 Directional Derivative, Gâteaux Derivative
        3.1.11 Rules
    3.2 Inverse and Implicit Functions
        3.2.1 Inverse Function Theorem
        3.2.2 Application: Newton's Method
        3.2.3 Implicit Function Theorem
    3.3 Taylor Expansions
        3.3.1 Nabla Operator
        3.3.2 Construction of Taylor Expansions
        3.3.3 Taylor's Theorem
        3.3.4 Calculation in the Two-Dimensional Case
    3.4 Extreme Values
        3.4.1 Definition
        3.4.2 Necessary Criterion
        3.4.3 Sufficient Criterion
        3.4.4 Saddle Points

4 Integral Transforms
    4.1 Laplace Transform
        4.1.1 Method of Calculation
        4.1.2 Convolution
        4.1.3 Some Important Examples
        4.1.4 Solution of Initial Value Problems
    4.2 Fourier Series
        4.2.1 Theorem
        4.2.2 Definition
        4.2.3 Theorem
        4.2.4 Properties of the Coefficients
        4.2.5 Real Form of the Fourier Series
    4.3 Fourier Transform
        4.3.1 Definition
        4.3.2 Inverse Transform
        4.3.3 Convolution
        4.3.4 Rules
        4.3.5 Sine and Cosine Transform
        4.3.6 More Properties
        4.3.7 Calculation of the Fourier Transform
        4.3.8 Gauss Functions
        4.3.9 Consequences
        4.3.10 Definition: Dirac Sequence
        4.3.11 Main Property of Dirac Sequences
        4.3.12 Delta Distribution

5 Stability of Ordinary Differential Equations
    5.1 Remarks
    5.2 Definition
    5.3 Flow-Box Theorem
    5.4 Remarks
    5.5 Theorem: Linear Case
    5.6 Linearisation
    5.7 Poincaré-Ljapunov Theorem
    5.8 Example
    5.9 Ljapunov Functions
        5.9.1 Definition
        5.9.2 Theorem


0 Solving Linear Equation Systems with the Gauss-Algorithm

A linear equation system with m equations and n unknowns is given by

    a11 x1 + a12 x2 + · · · + a1n xn = b1
    ...
    am1 x1 + am2 x2 + · · · + amn xn = bm

Omitting the plus-signs and the variables this will be written down in the short form

    a11  a12  ...  a1n | b1
    ...  ...  ...  ... | ...
    am1  am2  ...  amn | bm

In case of a homogeneous equation system (all bj are equal to zero) the last column is omitted, too.

Allowed operations are

    - multiply a row with a number unequal to zero
    - exchange two rows
    - add a multiple of a row to another row

The exchange of columns is only allowed if there is a row 0 added that contains the names of the variables.


Naturally the last column containing the bj-values must not be exchanged with other columns.

The simplest form of the Gauss-Algorithm is to perform these steps:

m1 Try to get a 1 into the upper left corner. If this is not possible, the algorithm stops.

m2 By adding suitable multiples of the first row to the rows below (and above) generate zeroes in the rest of the column.

m3 Repeat the process in the subscheme without the first row and column.

In the end (possibly after exchanging rows and columns) one has

    x_j1  x_j2  ...  x_jk   x_j(k+1)  ...  x_jn
      1     0   ...    0       *      ...    *   | c1
      0     1   ...    0       *      ...    *   | c2
     ...   ...  ...   ...     ...     ...   ...  | ...
      0     0   ...    1       *      ...    *   | ck
      0     0   ...    0       0      ...    0   | c(k+1)
     ...   ...  ...   ...     ...     ...   ...  | ...
      0     0   ...    0       0      ...    0   | cm

The first row contains the names of the variables.

The number k is called the rank of the equation system. The following cases are possible:

(i) At least one of the values c(k+1), ..., cm is unequal to zero. Then the system is not solvable.

(ii) If k = n = m then the system is uniquely solvable with x_j1 = c1, ..., x_jn = cn.

(iii) If k < n and c(k+1) = · · · = cm = 0, then we can take the last n − k variables x_j(k+1) to x_jn as parameters in the solution. With this the values of x_j1 to x_jk are uniquely determined for each choice of the parameters.


    Example

    2x1 + 6x2        + 2x4 = 10
     x1 + 3x2 +  x3  + 2x4 =  7
    3x1 + 9x2 + 4x3        = 16
    3x1 + 9x2 +  x3  + 4x4 = 17

or

    x1  x2  x3  x4
     2   6   0   2 | 10
     1   3   1   2 |  7
     3   9   4   0 | 16
     3   9   1   4 | 17

m1 Exchange rows 1 and 2.

    x1  x2  x3  x4
     1   3   1   2 |  7
     2   6   0   2 | 10
     3   9   4   0 | 16
     3   9   1   4 | 17

m2 Add row 1 multiplied by (−2) to row 2 and multiplied by (−3) to rows 3 and 4. This results in

    x1  x2  x3  x4
     1   3   1   2 |  7
     0   0  −2  −2 | −4
     0   0   1  −6 | −5
     0   0  −2  −2 | −4

m3 Now swap columns 2 and 4.

    x1  x4  x3  x2
     1   2   1   3 |  7
     0  −2  −2   0 | −4
     0  −6   1   0 | −5
     0  −2  −2   0 | −4

m4 Add row 2 to row 1, row 2 multiplied by (−3) to row 3 and multiplied by (−1) to row 4. Then divide row 2 by (−2).


    x1  x4  x3  x2
     1   0  −1   3 |  3
     0   1   1   0 |  2
     0   0   7   0 |  7
     0   0   0   0 |  0

m5 Leave row 4 away, divide row 3 by 7, add row 3 to row 1 and subtract it from row 2. Then we reach the final form

    x1  x4  x3  x2
     1   0   0   3 |  4
     0   1   0   0 |  1
     0   0   1   0 |  1

m6 The system is solvable. The variables behind the columns that do not belong to the identity matrix are parameters; here this applies to x2. With x2 = t one sees x1 = 4 − 3t, x4 = 1 and x3 = 1. So we can write the general solution as follows:

    ( x1 )   ( 4 − 3t )   ( 4 )       ( −3 )
    ( x2 ) = (    t   ) = ( 0 ) + t · (  1 )
    ( x3 )   (    1   )   ( 1 )       (  0 )
    ( x4 )   (    1   )   ( 1 )       (  0 )
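This elimination can be checked mechanically. The following small SymPy sketch (added here, it is not part of the original notes) row-reduces the augmented matrix of the example and returns the same one-parameter solution family.

```python
import sympy as sp

# Augmented matrix of the example system (columns x1, x2, x3, x4 | right side).
M = sp.Matrix([
    [2, 6, 0, 2, 10],
    [1, 3, 1, 2,  7],
    [3, 9, 4, 0, 16],
    [3, 9, 1, 4, 17],
])

R, pivots = M.rref()          # reduced scheme, as reached after step m5
print(pivots)                 # (0, 2, 3): x1, x3, x4 are pivot variables, x2 is free
print(R)

x1, x2, x3, x4 = sp.symbols('x1 x2 x3 x4')
print(sp.linsolve((M[:, :4], M[:, 4]), x1, x2, x3, x4))
# {(4 - 3*x2, x2, 1, 1)}  ->  x = (4 - 3t, t, 1, 1)
```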


1 Linear Algebra and Vector Spaces

    1.1 Vector spaces

    1.1.1 Vector Spaces

    1.1.1.1 Definition

A real vector-space (short: VS) is a set in which two operations, addition and multiplication by scalars, are defined and where the following rules hold (u, v and w are elements of the vector-space, α and β are real numbers):

(i) u + v = v + u,   u + (v + w) = (u + v) + w

(ii) There is a zero-vector 0 with v + 0 = v.

(iii) For each v there is a vector −v with v + (−v) = 0.

(iv) (α + β)v = αv + βv,   (αβ)v = α(βv),   α(v + w) = αv + αw,   1 · v = v

If one admits complex scalars, one gets a complex vector-space instead of a real VS.

The elements of the vector-space are called vectors. The elements of the field R or C are called scalars. Often there is no difference whether one has R or C as field. In this case we use K as a symbol.


    1.1.1.2 Definition: Subspace

Let V be a VS and U ⊆ V. U is called a subspace of V if U is itself a VS with the operations induced by V. This is fulfilled iff (short for if and only if) U contains for each pair of elements x and y the sum x + y and all vectors of the form αx with α ∈ K. This property is called closedness under sums and scalar multiples.

V always has the trivial subspaces V and {0}.

    1.1.2 Linear Independence

    1.1.2.1 Definition

Let v1 to vk be vectors and λ1, ..., λk ∈ K. The expression λ1 v1 + · · · + λk vk is called a linear combination. The numbers λj are called coefficients. Please notice that a linear combination is always a finite sum, even in infinite-dimensional spaces.

The vectors v1 to vk are linearly dependent (l.d.) if there are coefficients λ1 to λk with λ1 v1 + · · · + λk vk = 0 and not all of the λi are zero. If this is not the case, the vectors are called linearly independent (l.i.).

Therefore, if v1 to vk are linearly independent and λ1 v1 + · · · + λk vk = 0, it follows that λ1 = λ2 = · · · = λk = 0.

On the other hand, if v1 to vk are l.d., then it is possible to write λ1 v1 + · · · + λk vk = 0 with at least one of the λj ≠ 0, say λ1 ≠ 0. Then one has

    v1 = −(1/λ1)(λ2 v2 + · · · + λk vk),

so one of the vectors is a linear combination of the others.


    1.1.2.2 Criteria for Linear Dependence

A single vector is linearly dependent iff it is the zero-vector.

Two vectors u and v are linearly dependent iff they lie on a straight line through zero, or iff one of them is a multiple of the other.

Three vectors u, v and w are linearly dependent iff they lie in a plane through zero, or if one of them is a linear combination of the others. In R3 there is a criterion with the volume of the parallelepiped spanned by these vectors:

    v1, v2, v3 l.d.  ⇔  the volume vanishes  ⇔  det(v1, v2, v3) = 0

k vectors v1 to vk are linearly dependent iff the rank of the matrix with the columns v1 to vk is less than k (rank will be explained later).

More than n vectors in Kn are always linearly dependent.

Criterion for n vectors in Kn: v1 to vn are linearly dependent ⇔ det(v1, ..., vn) = 0.

    1.1.3 Dimension and Basis

    1.1.3.1 Definition: Span, Dimension and Basis

    Let V be a vector space.

(i) Let M ⊆ V be a (finite or infinite) non-empty subset of V. The set of all linear combinations is called the span of M,

    span M = { λ1 v1 + · · · + λm vm | m ∈ N, λj ∈ K, vj ∈ M }.

The span is always a subspace.

(ii) If there is a system M of n vectors in V, so that V is the span of M, and there is no such system consisting of less than n vectors, then V has the dimension n.


    If there is no finite set M with span M = V, V is said to be

    infinite-dimensional.

(iii) A set M = {v1, v2, ..., vn} ⊆ V is called a basis of V iff every vector v ∈ V has a unique representation v = λ1 v1 + · · · + λn vn.

    1.1.3.2 Remarks

    (i) If V has dimension n, then every basis consists of n elements.

    (ii) If V has dimension n, then every linearly independent set of n

    vectors forms a basis.

    (iii) The elements of a basis are always linearly independent.

    1.1.3.3 Coordinates

Let M = {v1, v2, ..., vn} ⊆ V be a basis of V. For each v ∈ V there is a unique representation v = λ1 v1 + · · · + λn vn. The numbers (λ1, ..., λn) are called the coordinates of v with respect to M. The vector (λ1, ..., λn)T (always a column!) is called the coordinate vector of v.


    1.1.4 Scalar Product

    1.1.4.1 Complex scalar product

Let V be a complex vector space. A scalar product is a mapping V × V → C, (v, w) ↦ <v, w>, with the properties

(i) <αu + βv, w> = α<u, w> + β<v, w> for α, β ∈ C, u, v, w ∈ V (linearity in the first argument)

(ii) <u, αv + βw> = conj(α)<u, v> + conj(β)<u, w> for α, β ∈ C, u, v, w ∈ V (anti-linearity in the second argument)

(iii) <u, v> = conj(<v, u>)

(iv) <u, u> ≥ 0 and <u, u> = 0 ⇔ u = 0 (positive definiteness)

In particular the scalar product of a vector with itself is always real and non-negative.

1.1.4.2 Real scalar product

If V is a real vector space, the same properties shall hold with a real-valued scalar product, α, β ∈ R and (naturally) without complex conjugation.

1.1.4.3 Standard scalar product

The standard real resp. complex scalar product of two vectors in Kn is defined by

    v · w = <v, w> := v1 w1 + · · · + vn wn                      for v, w ∈ Rn
    v · w = <v, w> := v1 conj(w1) + · · · + vn conj(wn)          for v, w ∈ Cn

    In this case we define


(i) ||u|| := √<u, u> is the length or (euclidean) norm of the vector u (also denoted by |u|).

(ii) The angle φ ∈ [0, π] of two non-zero vectors u, v ∈ Rn is defined by

    cos φ = <u, v> / (||u|| ||v||).

    1.1.5 Orthonormal Systems

With the Kronecker symbol δij (δij = 1 for i = j, δij = 0 for i ≠ j) we define

1.1.5.1 Definition

(i) Two vectors having scalar product zero are called orthogonal or perpendicular.

(ii) A set of vectors {vi} with <vi, vj> = δij is called an orthonormal system (ONS). A basis that is an ONS is called an orthonormal basis (ONB).

    1.1.5.2 Lemma

    ONS are linearly independent.

The importance of an ONB lies in the following theorem, which allows an expansion of a given vector in the basis with the aid of scalar products:

1.1.5.3 Expansion Theorem

Let v1, ..., vk be an ONS and V the span of these vectors.


(i) If u ∈ V, then the following holds:

    u = <u, v1> v1 + <u, v2> v2 + · · · + <u, vk> vk = Σ_{j=1..k} <u, vj> vj

(ii) For V ⊆ U and u ∈ U there exists a decomposition u = u1 + u2 with u1 ∈ V and <u1, u2> = 0. u1 is called the orthogonal projection of u, and the map u ↦ u1 is the orthogonal projection onto V.

    1.1.5.4 Gram-Schmidt Orthonormalisation Process

Let u1, ..., uk be a set of vectors in which at least one non-zero vector exists.

m1 Choose u1 ≠ 0, let v1 = u1 and set w1 = (1/||v1||) v1.

m2 If w1 to w(j−1) are already constructed, let

    vj = uj − <uj, w1> w1 − · · · − <uj, w(j−1)> w(j−1) = uj − Σ_{i=1..j−1} <uj, wi> wi.

Then span{u1, ..., uj} = span{v1, ..., vj} and

    <vj, v1> = · · · = <vj, v(j−1)> = <vj, u1> = · · · = <vj, u(j−1)> = 0.

In manual computations it is often easier to use the vi instead of the wi:

    vj = uj − (<uj, v1>/<v1, v1>) v1 − · · · − (<uj, v(j−1)>/<v(j−1), v(j−1)>) v(j−1)
       = uj − Σ_{i=1..j−1} (<uj, vi>/||vi||²) vi.

As the vj will be normed later, it is allowed to substitute vj by a multiple. With this technique one can sometimes avoid the use of fractions.

m3 If vj ≠ 0 then let wj = (1/||vj||) vj and go on with m2. If one is calculating with vj instead of wj this step can be carried out at the end.

If vj = 0 then uj was linearly dependent on u1 to u(j−1). In this case uj is deleted from the starting set of vectors and the algorithm goes on with the next vector.

If the ui are linearly independent this case cannot occur.
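As an illustration, here is a minimal NumPy sketch of the process just described (added here, not part of the original notes). It uses the wi-variant of step m2 and drops dependent vectors as in step m3; the function name gram_schmidt is chosen for this sketch only.

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalise the given vectors; dependent vectors are dropped (step m3)."""
    basis = []
    for u in vectors:
        v = u.astype(float).copy()
        for w in basis:                       # v_j = u_j - sum_i <u_j, w_i> w_i
            v -= np.dot(u, w) * w
        norm = np.linalg.norm(v)
        if norm > tol:                        # v_j != 0: normalise and keep it
            basis.append(v / norm)
    return np.array(basis)

W = gram_schmidt([np.array([1., 1., 0.]),
                  np.array([1., 0., 1.]),
                  np.array([2., 1., 1.])])    # the third vector is dependent
print(np.round(W @ W.T, 12))                  # 2x2 identity: an ONS of two vectors
```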

    1.1.6 Norms

    1.1.6.1 Definition

A norm on a vector space V is a function ||·|| : V → R, x ↦ ||x|| ∈ R, with the following properties:

(i) ||x|| ≥ 0 and ||x|| = 0 ⇔ x = 0 (definiteness)

(ii) ||λx|| = |λ| ||x|| (homogeneity)

(iii) ||x + y|| ≤ ||x|| + ||y|| (triangle inequality)

1.1.6.2 Examples

(i) The euclidean norm on Kn: ||x||2 = |x| = √<x, x>

(ii) The 1-norm: ||x||1 = |x1| + |x2| + · · · + |xn|

(iii) The ∞-norm: ||x||∞ = max{|x1|, |x2|, ..., |xn|}

(iv) On C([a, b]) we define ||f||2 := ( ∫_a^b |f(x)|² dx )^(1/2)

Remark: In (i)–(iii) we have ||ek|| = 1.


    1.1.6.3 Lemma: Cauchy-Schwarz and Minkowski inequalities

Let <·, ·> be a real or complex scalar product, i.e. <·, ·> is linear in the first argument and <u, v> = conj(<v, u>) with <u, u> = 0 ⇔ u = 0.

(i) |<u, v>| ≤ <u, u>^(1/2) <v, v>^(1/2)

(ii) ||u|| := <u, u>^(1/2) is a norm; in particular ||u + v|| ≤ ||u|| + ||v||.

(i) is called the Cauchy-Schwarz inequality, (ii) is the Minkowski inequality.

1.1.6.4 Comparison of norms

It is easy to see that ||x||∞ ≤ ||x||2 ≤ ||x||1 ≤ n ||x||∞ holds. Therefore one can define: a sequence xk approaches zero if the real sequence ||xk|| has the limit zero, and the choice of the norm doesn't make a difference. Naturally xk → x ⇔ (xk − x) → 0 ⇔ ||xk − x|| → 0.
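A quick numerical check of these inequalities (added, not from the notes), using NumPy's norm routines on an arbitrarily chosen vector:

```python
import numpy as np

x = np.array([3., -4., 1., 0.])
n_inf = np.linalg.norm(x, np.inf)              # max |x_k|      = 4
n_2   = np.linalg.norm(x)                      # euclidean norm = sqrt(26)
n_1   = np.linalg.norm(x, 1)                   # sum of |x_k|   = 8
print(n_inf <= n_2 <= n_1 <= len(x) * n_inf)   # True
```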

    1.2 Matrices and Linear Maps

    1.2.1 Matrices

    1.2.1.1 Definition

In most cases it is sufficient to regard a matrix as a rectangular scheme consisting of column-vectors:

    A = (aij), i = 1..m, j = 1..n

        ( a11  a12  ...  a1n )
      = ( a21  a22  ...  a2n )  =  ( a1  a2  ...  an )    with column-vectors a1, ..., an.
        ( ...  ...  ...  ... )
        ( am1  am2  ...  amn )

A matrix with an equal number of rows and columns is called a square matrix.


    1.2.1.2 Special types of square matrices

    Identity matrix En or In (also E or I):       Diagonal matrix:
        ( 1  0  ...  0 )                              ( d1  0  ...  0  )
        ( 0  1  ...  0 )                              ( 0   d2 ...  0  )
        ( ... ... ... ...)                            ( ... ... ... ...)
        ( 0  0  ...  1 )                              ( 0   0  ...  dn )

    Lower triangular matrix: all entries above the diagonal are zero.
    Upper triangular matrix: all entries below the diagonal are zero.

Two matrices of the same size can be added by adding all entries. A matrix is multiplied by a scalar λ by multiplying each entry by λ:

    (A + B)ij = aij + bij,        (λA)ij = λ aij.

    1.2.1.3 Multiplication of Matrices and Vectors

Let A be a matrix with k columns and let b be an element of Kk.

The product of the matrix A and the vector b = (b1, ..., bk)T is the linear combination of the column-vectors of A with the coefficients b1 to bk:

    A b = ( a1  ...  ak ) (b1, ..., bk)T = b1 a1 + · · · + bk ak

The matrix A is multiplied with the matrix B by decomposing B into column-vectors and forming the corresponding matrix-vector products. These products are written down in order:

    A ( b1  ...  bk ) = ( A b1  ...  A bk )

So the matrix-product C = AB is calculated in concrete situations entry by entry: the entry cij comes from row i of A and column j of B,

    cij = ai1 b1j + ai2 b2j + · · · + ain bnj = Σ_{k=1..n} aik bkj.

On the other hand, if you define matrices as an (ordered) collection of row-vectors, the product of a row-vector b with A (in this order, bA) consists of a linear combination of the rows of A with coefficients in b. Observe the order of multiplication!

This leads to: the product AB of the matrices A and B is a matrix in which

    - the k-th column is a linear combination of the columns of A with coefficients in the k-th column of B,

    - the k-th row is a linear combination of the rows of B with coefficients in the k-th row of A.
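A small NumPy illustration of these two readings of the matrix product (added here, not part of the notes):

```python
import numpy as np

A = np.array([[1., 2.], [3., 4.]])
B = np.array([[5., 6.], [7., 8.]])

# column 0 of AB: combination of the columns of A, coefficients from column 0 of B
col0 = B[0, 0] * A[:, 0] + B[1, 0] * A[:, 1]
# row 0 of AB: combination of the rows of B, coefficients from row 0 of A
row0 = A[0, 0] * B[0, :] + A[0, 1] * B[1, :]

print(np.allclose(col0, (A @ B)[:, 0]))   # True
print(np.allclose(row0, (A @ B)[0, :]))   # True
```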

    1.2.2 Linear Maps

    1.2.2.1 Definition: Linear map

Let U and V be vector-spaces. A map L : U → V is called linear if for all x, y ∈ U and α, β ∈ K the following equation holds:

    L(αx + βy) = αL(x) + βL(y)

If u1, ..., un is a basis of U then L is completely determined by its action on the basis:

    Lu = L( Σ_{i=1..n} αi ui ) = Σ_{i=1..n} αi L(ui)

Suppose that V has a basis v1, ..., vm. Then each Lui has a representation Lui = Σ_{j=1..m} aji vj. The matrix A = (aji), j = 1..m, i = 1..n, is called the matrix associated to the linear map L. Note that this matrix depends not only on L itself, but also on the choice of the bases in U and V.

Resuming this for the special case U = Kn and V = Km with the standard bases we have: the matrix of the linear map L : U → V has in the k-th column the image of ek.

On the other hand every matrix with n rows and m columns defines a linear map Km → Kn through L(x) := Ax.

    1.2.2.2 Definition: Rank

    The rank of a matrix is the rank of the corresponding homogeneous

    equation system defined in chapter 0.


    1.2.2.3 Rank theorem

Let A be a matrix. Then the maximum number of linearly independent columns is equal to the maximum number of linearly independent rows.

A matrix whose rank equals the minimum of its numbers of rows and columns is said to have full rank.

    1.2.2.4 Definition: Multilinear Maps

(i) Let U1, ..., Un and V be vector spaces. A map

    L : U1 × U2 × · · · × Un → V,   (u1, u2, ..., un) ↦ L(u1, ..., un) ∈ V

is called multilinear if L is linear in each component, i.e. L is linear in each uj if one fixes all other uk.

(ii) Most important case: U1 = · · · = Un.

For n = 2 we have bilinear maps. They are called symmetric if L(u, v) = L(v, u) and hermitian if L(u, v) = conj(L(v, u)).

A multilinear map with the property

    L(..., uj, ..., uk, ...) = −L(..., uk, ..., uj, ...)

is called alternating.

Properties of alternating maps:

(iii) (1) L(..., u, ..., u, ...) = 0

(2) If one of the vectors is a linear combination of the others, L(...) = 0.

(3) For U1 = · · · = Un = Kn and V = K there is exactly one alternating multilinear L with

    L(e1, ..., en) = 1.


In this case we have L(u1, ..., un) ≠ 0 ⇔ u1, ..., un linearly independent.

This L is called the determinant, L(u1, ..., un) = det(u1, ..., un) (and it is the well-known determinant with the usual properties).

(iv) Application: Cramer's Rule

Let a1, ..., an ∈ Kn be a basis, A = [a1, ..., an] an n × n-matrix and b ∈ Kn. Then the equation system Ax = b is uniquely solvable with

    xj = det Aj / det A,

where Aj is A with aj replaced by b (a small numerical sketch follows below).
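The following NumPy lines apply Cramer's rule literally to an arbitrarily chosen 2 × 2 system (added for illustration only):

```python
import numpy as np

A = np.array([[2., 1.], [1., 3.]])     # an invertible example matrix
b = np.array([3., 5.])

x = np.empty_like(b)
for j in range(A.shape[0]):
    Aj = A.copy()
    Aj[:, j] = b                       # A_j: column j replaced by the right side
    x[j] = np.linalg.det(Aj) / np.linalg.det(A)

print(x, np.allclose(A @ x, b))        # [0.8 1.4] True
```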

    1.2.3 Linear Equations

    1.2.3.1 Some Definitions

A linear map L : U → K is called a linear functional.

Let L : U → V be linear. Then L is called

    - an epimorphism, if L is surjective,
    - an isomorphism, if L is bijective (one-to-one and onto),
    - an endomorphism, if U = V,
    - an automorphism, if U = V and L is bijective.

The rank of L is the dimension of the range of L in V. As the range is spanned by the column-vectors of the matrix representation, the rank of L is the rank of the corresponding matrix.

    1.2.3.2 Definition: Linear equation

Let L : U → V be a linear map and b ∈ V. An equation Lx = b is called a linear equation. For b = 0 the equation is called homogeneous, otherwise inhomogeneous. The set of all solutions of the homogeneous equation is called the kernel of L, written ker L.

From now on we assume that L is represented by the matrix A.

    1.2.3.3 Immediate Properties

(i) The kernel is a subspace of U.

(ii) For the homogeneous equation the dimension formula holds:

    dim ker L = dim U − rank L.

That means that one can choose n − k parameters freely in the solution of the equation Lx = 0 (n = dim U, k = rank L).

(iii) The general solution of the inhomogeneous equation is obtained by adding one particular solution to all solutions of the homogeneous equation.

(iv) The inhomogeneous equation is solvable iff the rank of A is equal to the rank of the extended matrix (A|b).

(v) For square n × n-matrices A the following statements are equivalent: the inhomogeneous equation is solvable for each right side b ⇔ the homogeneous equation is uniquely solvable ⇔ det A ≠ 0 ⇔ A has rank n ⇔ ker A = {0} ⇔ A⁻¹ exists (A⁻¹ is defined below).

    1.2.4 Inverse map and Inverse Matrix

    Let Lx = b be a linear equation that is uniquely solvable for all b.

Then the map b ↦ x is well defined, and this map is called L⁻¹, the inverse map of L.


    1.2.4.1 Consequences

(i) L⁻¹ is a linear map from V to U.

(ii) In the finite dimensional case the matrix associated to L must be square.

Let A be an n × n square matrix with rank n. Then each equation system Ax = b is uniquely solvable. The matrix B = [v1, ..., vn] containing the solutions Avj = ej is called the inverse of A, A⁻¹ = B.

A is called regular or invertible.

    1.2.4.2 Properties

    A⁻¹A = AA⁻¹ = E

From now on we restrict ourselves to the case that the linear map is defined between Rn and Rm or between Cn and Cm.

    1.2.4.3 Correspondences between Linear Maps and Matrices

    Linear map L                      Matrix A
    Application to a vector L(x)      Matrix-vector multiplication Ax
    Identity map I(x) = x             Identity matrix E with Ex = x
    Zero map O(x) = 0                 Zero matrix 0 with 0x = 0
    Composition L1 ∘ L2               Matrix multiplication A1 A2
    Inverse map L⁻¹                   Inverse matrix A⁻¹


    1.2.5 Changing the Basis

At the beginning of the section it was mentioned that the matrix of a given map L : Kn → Km contains in its columns the coordinates of the images of the basis of Kn with respect to the basis of Km. Now we can ask how the matrix changes when we choose other bases in Kn or Km.

1.2.5.1 Coordinates with Respect to a Basis

Let u1, ..., un be a basis of Kn. Then the matrix U = (u1 ... un) is invertible. To obtain the coordinates a of a point x with respect to u1, ..., un we write

    x = Ua  ⇔  a = U⁻¹x.

If v1, ..., vn is another basis of Kn we have, with V = (v1 ... vn),

    x = Ua = Vb  ⇔  b = V⁻¹Ua  ⇔  a = U⁻¹Vb.

1.2.5.2 Matrix and Change of Coordinates

This uses the same method as in the paragraph above: x ∈ Kn has the representations x = Ua = Vb and y ∈ Km has the representations y = Wc = Zd.

Let A be the matrix of L with respect to the bases U and W. Using the last paragraph we have

    L(x) = y  ⇔  Aa = c  ⇔  AU⁻¹Vb = W⁻¹Zd  ⇔  Z⁻¹W A U⁻¹V b = d.

A special case is the change of basis of an endomorphism: with W = U and Z = V the last formula reduces to

    Aa = c  ⇔  V⁻¹U A U⁻¹V b = d  ⇔  (V⁻¹U) A (V⁻¹U)⁻¹ b = d.

In the even more special case U = W = E we have

    Aa = c  ⇔  V⁻¹AV b = d.
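A short NumPy check of the last formula (an added illustration with an arbitrarily chosen matrix and basis, not from the notes):

```python
import numpy as np

A = np.array([[2., 1.], [0., 3.]])       # matrix of an endomorphism in the standard basis
V = np.array([[1., 1.], [0., 1.]])       # columns v1, v2: another basis of R^2

x = np.array([2., 3.])                   # a point, standard coordinates
b = np.linalg.solve(V, x)                # its coordinates w.r.t. V  (x = V b)

A_V = np.linalg.solve(V, A @ V)          # V^{-1} A V: matrix of the same map w.r.t. V
d = A_V @ b                              # coordinates of L(x) w.r.t. V
print(np.allclose(V @ d, A @ x))         # True: both describe the same vector
```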

  • 8/2/2019 A Em Theory

    27/124

    AEM 1- 18

    1.2.6 Some Special Linear Maps in R2

(i) Identity and zero maps E and 0.

(ii) Homogeneous scaling

    λE = ( λ  0 )
         ( 0  λ )

(iii) Rotation with the angle φ:

    ( cos φ  −sin φ )
    ( sin φ   cos φ )

(iv) Shears such as

    ( 1  1 )
    ( 0  1 )

(v) Reflections. Let ||a|| = 1 and g be the straight line through 0 with direction a. The reflection at g has the matrix

    Sg = ( 2a1² − 1    2a1a2   ) = 2aaT − E
         (  2a1a2     2a2² − 1 )

(vi) The reflection at zero has the matrix

    −E = ( −1   0 )
         (  0  −1 )
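The reflection formula in (v) is easy to verify numerically; the following NumPy lines (added here, not part of the notes) check the characteristic properties of Sg for one unit vector a:

```python
import numpy as np

a = np.array([np.cos(0.3), np.sin(0.3)])      # unit vector spanning the line g
Sg = 2.0 * np.outer(a, a) - np.eye(2)         # S_g = 2 a a^T - E

print(np.allclose(Sg @ a, a))                 # points on g stay fixed
print(np.allclose(Sg @ Sg, np.eye(2)))        # reflecting twice is the identity
print(np.isclose(np.linalg.det(Sg), -1.0))    # a reflection has determinant -1
```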


    1.2.7 Examples

[Figure: eight pictures, numbered 1 to 8, showing a test shape and its image under eight example maps of the plane, each given by its 2 × 2 matrix; among them the identity, the scalings diag(1.5, 0.5) and diag(0.75, 1), the shear with rows (1, 1) and (0, 1), diagonal reflection matrices, and a matrix whose four entries all have the same absolute value a.]

    1.3 Operations with matrices

The transpose AT of a matrix A is the matrix with columns and rows exchanged. The transpose of an m × n-matrix is an n × m-matrix. For square matrices this means that everything is mirrored at the main diagonal.

The adjoint A* of a (complex) matrix A is constructed by replacing all entries of the transpose by their complex conjugates.

A square matrix is called symmetric if it is equal to its transpose. It is called self-adjoint or hermitian if it is equal to its adjoint. For real matrices these terms coincide.

A matrix with AT = −A is called skew-symmetric; if A* = −A, A is called skew-hermitian.


Often it is useful to regard vectors as matrices with one column and n rows. The numbers in R or C correspond to the 1 × 1-matrices.

    1.3.1 Matrix-algebra

    A + B = B + A         λ(A + B) = λA + λB        (A + B) + C = A + (B + C)
    (A + B)C = AC + BC    A(B + C) = AB + AC        (AB)C = A(BC)

Attention! In general AB ≠ BA.

Let A and B be invertible n × n-matrices. Then AB is invertible and the following rules hold:

    (AB)⁻¹ = B⁻¹A⁻¹       (λA)⁻¹ = (1/λ) A⁻¹
    AE = EA = A           A0 = 0A = 0
    (A⁻¹)⁻¹ = A           (AT)T = A                 (A*)* = A
    (A + B)T = AT + BT    (λA)T = λ AT              (AB)T = BT AT
    (A + B)* = A* + B*    (λA)* = conj(λ) A*        (AB)* = B* A*
    (A⁻¹)T = (AT)⁻¹       (A⁻¹)* = (A*)⁻¹

    1.3.1.1 Block Matrices

If a matrix is divided into blocks by horizontal or vertical lines one can calculate with these blocks as if they were entries in a common matrix (exception: determinants!). The blocks have to fit in size. Example:

    ( A1  A2 ) ( B1  0  )   ( A1B1 + A2   A2B2 )
    ( A3  A4 ) ( Ek  B2 ) = ( A3B1 + A4   A4B2 )

Here 0 denotes a matrix consisting only of zeroes and Ek a k × k identity matrix.


    1.3.2 Scalar Product

The role of the transpose resp. adjoint matrix becomes clearer if we regard the scalar product as a matrix product:

    <u, v> = Σ_{i=1..n} ui conj(vi) = v* u     (complex case)
    <u, v> = Σ_{i=1..n} ui vi       = vT u     (real case).

So we have

    <Au, v> = vT Au = (AT v)T u = <u, AT v>,

and analogously in the complex case <Au, v> = <u, A*v>.

This property characterizes the transpose matrix: let <Au, v> = <u, Bv> for all u, v ∈ Rn. If one chooses u = ei and v = ej one has <Aei, ej> = aji and <ei, Bej> = bij, so B = AT.

    1.3.3 Homogeneous Coordinates

With matrix multiplication one can describe rotations, stretchings, shearings or reflections (and combinations of these), but as the origin always remains fixed, translations are not possible. This difficulty can be overcome by using homogeneous coordinates. Homogeneous coordinates in R3 consist of four coordinates, where the fourth coordinate must not be zero. A point (x, y, z) ∈ R3 is represented by any vector of the form [ax, ay, az, a]T. Especially [x, y, z, 1]T is a representant of [x, y, z]T.

Then we have the following correspondences:

    cartesian coordinates        homogeneous coordinates

    x = (x1, x2, x3)T            y = (x1, x2, x3, 1)T  or  y = (ax1, ax2, ax3, a)T

    x ↦ Ax                       y ↦ By  with  B = ( A       0 )   (A in the upper left 3 × 3 block,
                                                   ( 0 0 0   1 )    last row (0, 0, 0, 1))

    x ↦ x + v                    y ↦ By  with  B = ( 1 0 0 v1 )
                                                   ( 0 1 0 v2 )
                                                   ( 0 0 1 v3 )
                                                   ( 0 0 0 1  )

    1.3.4 Norms

Definition of norms of linear maps. Let U and V be normed vector spaces and let L(U, V) denote the vector space of all linear maps from U to V. A norm on L(U, V) is a real-valued function with the following properties: if A, B ∈ L(U, V) then

(i) ||A|| ≥ 0 and ||A|| = 0 ⇔ A = 0, the zero-map (definiteness)

(ii) ||λA|| = |λ| ||A|| (homogeneity)

(iii) ||A + B|| ≤ ||A|| + ||B|| (triangle inequality)

(iv) ||AB|| ≤ ||A|| ||B||

In the finite dimensional case linear maps are represented by matrices, and the norm is called a matrix-norm. Other notation: operator-norm.

In general, a vector-norm ||·||a and a matrix-norm ||·||b are compatible if for each vector x and each matrix A the inequality ||Ax||a ≤ ||A||b ||x||a holds. The norm-definition below produces compatible matrix-norms.

Definition. Let ||·||i be a (vector-)norm in Kn and A be an n × n matrix. We define the matrix-norm ||A||i generated by ||·||i by

    ||A||i = max{ ||Ax||i : ||x||i = 1 } = max{ ||Ax||i : ||x||i ≤ 1 }.

Then one has ||A||i = min{ C : for all x ∈ U one has ||Ax||i ≤ C ||x||i }. The norms generated by the vector-norms ||·||1 and ||·||∞ above are denoted by the same symbol.

Lemma

(i) ||A||1 = max_{1≤j≤n} Σ_{i=1..n} |aij| (largest column sum)

(ii) ||A||2 is the first (and largest) singular value of A (will be defined later)

(iii) ||A||∞ = max_{1≤i≤n} Σ_{j=1..n} |aij| (largest row sum)

(iv) ||A||S = ( Σ_{i,j=1..n} |aij|² )^(1/2) is compatible with ||·||2 (Frobenius norm).
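All four norms of the lemma are available in NumPy; the following lines (added, not from the notes) evaluate them for an arbitrarily chosen matrix:

```python
import numpy as np

A = np.array([[1., -2.], [3., 4.]])

print(np.linalg.norm(A, 1))        # 6.0  = largest column sum  (|-2| + |4|)
print(np.linalg.norm(A, np.inf))   # 7.0  = largest row sum     (|3| + |4|)
print(np.linalg.norm(A, 2))        # largest singular value, about 5.12
print(np.linalg.norm(A, 'fro'))    # Frobenius norm sqrt(1 + 4 + 9 + 16) = sqrt(30)
```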


    1.4 Gauss Algorithm and LU-Decomposition

    1.4.1 Numerical Stability

    We will study some small equation systems and the effect of rounding

    errors onto the solutions.

Example system:

    10⁻⁴ x + y = 1
         x + y = 2

Solution with the Gauss algorithm, exact calculation:

    ( 1/10000  1 | 1 )    ( 1/10000    1   |    1    )    ( 1/10000  1 |     1     )
    (    1     1 | 2 ) →  (    0     −9999 |  −9998  ) →  (    0     1 | 9998/9999 )

    ( 1/10000  0 | 1/9999    )    ( 1  0 | 10000/9999 )
    (    0     1 | 9998/9999 ) →  ( 0  1 |  9998/9999 )

and so x ≈ 1 and y ≈ 1.

Now the same calculation with three significant digits, i.e. all numbers are rounded to the nearest number of the form x = 0.abc · 10^p:

    ( 0.0001  1 | 1 )    ( 0.0001     1    |    1    )    ( 0.0001  1 | 1 )
    (   1     1 | 2 ) →  (   0     −10000  | −10000  ) →  (   0     1 | 1 )

    ( 0.0001  0 | 0 )    ( 1  0 | 0 )       x = 0
    (   0     1 | 1 ) →  ( 0  1 | 1 )       y = 1

This solution is unusable.

This can be avoided by pivoting: choose the entry in the first column below the diagonal (the diagonal included) with the largest absolute value and put it into the diagonal by exchanging rows. Then continue with the Gauss algorithm.

If A is invertible then the pivot elements are unequal to zero. This results in the following:

    ( 0.0001  1 | 1 )    (   1     1 | 2 )    ( 1    1    |   2    )    ( 1  1 | 2 )    ( 1  0 | 1 )
    (   1     1 | 2 ) →  ( 0.0001  1 | 1 ) →  ( 0  0.9999 | 0.9998 ) →  ( 0  1 | 1 ) →  ( 0  1 | 1 )

so x = 1 and y = 1.

Other problems may arise. Example two is example one after multiplying row 1 by 20000. Again the calculations use three significant digits.

    ( 2  20000 | 20000 )    ( 2   20000  |  20000  )    ( 2  20000 | 20000 )    ( 2  0 | 0 )     x = 0
    ( 1    1   |   2   ) →  ( 0  −10000  | −10000  ) →  ( 0    1   |   1   ) →  ( 0  1 | 1 )     y = 1

So this solution is unusable, too.

This effect can be avoided by equilibration. This means that each equation is multiplied with a factor so that the sum of the absolute values of the row, Σ_{k=1..n} |aik|, is equal to one.

Applying this one gets

    ( 2  20000 | 20000 )    ( 2/20002  20000/20002 | 20000/20002 )    ( 0.0001   1  | 1 )
    ( 1    1   |   2   ) →  (   1/2        1/2     |      1      ) →  (  0.5    0.5 | 1 )

Then pivoting gives

    (  0.5    0.5 | 1 )    ( 0.5  0.5 | 1 )     x = 1
    ( 0.0001   1  | 1 ) →  (  0    1  | 1 )     y = 1

Conclusion: pivoting and equilibration can help to avoid problems caused by rounding errors.
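The effect can be reproduced in a few lines of Python. The sketch below (added here; the helper names chop and solve_2x2 are mine) imitates three-significant-digit arithmetic and repeats the first example with and without pivoting:

```python
import numpy as np

def chop(x, sig=3):
    """Round x to `sig` significant digits, imitating three-digit arithmetic."""
    return float(np.format_float_scientific(x, precision=sig - 1))

def solve_2x2(a, b, e, c, d, f, pivot):
    """Eliminate in  [[a, b | e], [c, d | f]]  using chopped arithmetic."""
    if pivot and abs(c) > abs(a):
        a, b, e, c, d, f = c, d, f, a, b, e   # exchange the two rows
    m  = chop(c / a)                          # elimination factor
    d2 = chop(d - chop(m * b))
    f2 = chop(f - chop(m * e))
    y  = chop(f2 / d2)
    x  = chop(chop(e - chop(b * y)) / a)
    return x, y

print(solve_2x2(1e-4, 1, 1, 1, 1, 2, pivot=False))  # (0.0, 1.0)  -- unusable
print(solve_2x2(1e-4, 1, 1, 1, 1, 2, pivot=True))   # (1.0, 1.0)  -- correct
```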


    1.4.2 Special Operations

We always assume that the sizes of the matrices fit so that the products can be performed.

Let α be a real or complex number and k ≠ l. Now define the following n × n square matrices:

Definition

    C(k, l; α) = (cij), i, j = 1..n,  with  cij = 1 for i = j,   α for i = k, j = l,   0 otherwise

    D(k; α)    = (dij), i, j = 1..n,  with  dij = 1 for i = j and i ≠ k,   α for i = j = k,   0 otherwise

    F(k, l)    = (fij), i, j = 1..n,  with  fij = 1 for i = j and i ≠ k and i ≠ l,
                                            1 for i = k, j = l,   1 for i = l, j = k,   0 otherwise

Decompose A into column-vectors ai and row-vectors bj.

Multiplication from the left side does operations with rows:

    C(k, l; α) A : row k is replaced by bk + α bl, all other rows stay unchanged
    D(k; α) A    : row k is replaced by α bk
    F(k, l) A    : rows k and l are exchanged

Multiplication from the right side does operations with columns:


    A C(k, l; α) : column l is replaced by al + α ak, all other columns stay unchanged
    A D(k; α)    : column k is replaced by α ak
    A F(k, l)    : columns k and l are exchanged

Observe that multiplication with C(k, l; α) from the right changes column l while multiplication from the left changes row k.

1.4.3 Properties of C(k, l; α), D(k; α) and F(k, l)

(i) C(k, l; α)⁻¹ = C(k, l; −α)

(ii) C(k, l; α) C(k, m; β) = C(k, m; β) C(k, l; α)

(iii) C(k, l; 0) = E

(iv) For α ≠ 0 we have D(k; α)⁻¹ = D(k; 1/α)

(v) F(k, l)⁻¹ = F(k, l) = F(k, l)T

    1.4.4 Standard Algorithm

Standard operations in the Gauss algorithm are

(i) adding row l multiplied by α to row k

(ii) multiplying row k by α ≠ 0

(iii) exchanging rows k and l.

These operations can be described with the aid of the fundamental matrices C(k, l; α), D(k; α) and F(k, l). To see this we write the system Ax = b as an augmented matrix S = (A|b). Then the operations (i) to (iii) from above are

(i) multiply S with C(k, l; α) from the left

(ii) multiply S with D(k; α) from the left

(iii) multiply S with F(k, l) from the left.


As all appearing matrices are invertible we see that the Gauss algorithm gives equivalent transformations and so preserves the set of solutions.

If the system Ax = b is uniquely solvable it is sufficient to reach an upper triangular form

    ( d11   *   ...   *  )
    (  0   d22  ...   *  )
    ( ...  ...  ...  ... )
    (  0    0   ...  dnn )

From the last equation one can read off directly the value of xn, and by substituting the already determined variables the solution is calculated recursively from the bottom to the top.

    1.4.5 LU-Decomposition

The LU-decomposition

    - is an effective method when solving many equation systems with the same left side,

    - decomposes a given square matrix A as A = P L U with

(i) P a permutation matrix, i.e. P has exactly one 1 in each column and row, and all other entries are zero,

(ii) L a lower triangular matrix,

(iii) U an upper triangular matrix.

1.4.5.1 Description of the Algorithm - simple case with P = E

The algorithm consists of a series of transformations of the matrix A. With L0 = E, U0 = A we calculate

    A = EA = L0 U0 = · · · = Lk Uk = · · · = Ln Un =: LU.

The matrices Lk and Uk have the block structure sketched below:


    Lk : lower triangular with ones on the diagonal; from column k+1 on it coincides with the
         identity matrix, i.e. only the first k columns contain entries below the diagonal.

    Uk : the first k columns are already in upper triangular form with diagonal entries u1, ..., uk
         and zeroes below; the remaining columns are not yet processed.

m1 We start with A = L(k−1) U(k−1). Let U(k−1) = (uij).

In this simple case we assume that z := ukk ≠ 0.

To each row from row k+1 to the last in U(k−1) we add row k multiplied by λj := −ujk/ukk. This results in zeroes in column k from row k+1 to the bottom.

These actions expressed with matrices: U(k−1) is multiplied from the left with C(j, k; λj).

Recall the facts that the inverse of C(j, k; λj) is C(j, k; −λj) and that the matrices C(j, k; α) and C(i, k; β) commute. So we have

    A = L(k−1) U(k−1)
      = L(k−1) C(k+1, k; −λ(k+1)) C(k+1, k; λ(k+1)) · · · C(n, k; −λn) C(n, k; λn) U(k−1)


      = [ L(k−1) C(k+1, k; −λ(k+1)) · · · C(n, k; −λn) ] [ C(k+1, k; λ(k+1)) · · · C(n, k; λn) U(k−1) ]
      =: Lk Uk.

How is Lk built from L(k−1)?

The action of the matrices C(j, k; −λj) is adding multiples of the columns k+1 to n to column k. Obviously only column k is changed by this process, and it contains in the places k+1 to n the negatives of the factors used in the transformation of U(k−1).

As an example we write down L1 in the case U0 = A = (aij):

    L1 = (    1      0  ...  0 )
         ( a21/a11   1  ...  0 )
         (   ...    ... ... ...)
         ( an1/a11   0  ...  1 )

m2 Recursively now repeat step m1.

When the algorithm ends we have A = LU with L a lower triangular matrix with ones on the diagonal and U an upper triangular matrix with diagonal entries u11, ..., unn. The uii are non-zero.

m3 Now we have

    Ax = LUx = L(Ux) = Ly = b    with    y := Ux.

(i) Ly = b is solved recursively beginning with the first component of y.


(ii) Ux = y is solved recursively beginning with the last component of x.

1.4.5.2 Remark

    det A = det L · det U = u11 · · · unn.

    1.4.6 Example

Let

    A = ( 1  2  4 )        b = ( 3 )
        ( 2  3  8 )            ( 6 )
        ( 1  3  1 )            ( 0 )

Solve Ax = b.

m1 Start with the LU-decomposition of A.

    [L0|U0] = ( 1 0 0 | 1 2 4 )    [L1|U1] = ( 1 0 0 | 1  2  4 )    [L2|U2] = ( 1  0 0 | 1  2  4 )
              ( 0 1 0 | 2 3 8 )              ( 2 1 0 | 0 −1  0 )              ( 2  1 0 | 0 −1  0 )
              ( 0 0 1 | 1 3 1 )              ( 1 0 1 | 0  1 −3 )              ( 1 −1 1 | 0  0 −3 )

m2 Solve Ly = b.

Line by line one has y1 = 3, y2 = 0 and y3 = −3.

m3 Solve Ux = y.

Line by line (from the bottom to the top) one has x3 = 1, x2 = 0 and x1 = −1, so x = (−1, 0, 1)T.
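The decomposition can be checked numerically. The snippet below (added, not part of the notes) verifies L·U = A and repeats the two triangular solves with SciPy's solve_triangular:

```python
import numpy as np
from scipy.linalg import solve_triangular

A = np.array([[1., 2., 4.], [2., 3., 8.], [1., 3., 1.]])
b = np.array([3., 6., 0.])

L = np.array([[1., 0., 0.], [2., 1., 0.], [1., -1., 1.]])   # from step m1
U = np.array([[1., 2., 4.], [0., -1., 0.], [0., 0., -3.]])
print(np.allclose(L @ U, A))                  # True: the decomposition is correct

y = solve_triangular(L, b, lower=True)        # forward substitution: [ 3.  0. -3.]
x = solve_triangular(U, y, lower=False)       # back substitution:    [-1.  0.  1.]
print(x, np.allclose(A @ x, b))
```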


    1.4.6.1 LU-decomposition, general case

This general case brings two extensions:

    - A may be singular,
    - pivoting is possible.

Now we construct a decomposition A = P L U. We start with P0 := E, L0 := E and U0 := A.

If the element z in U(k−1) is zero and the rest of column k below it contains only zeroes, too, then the matrix A is singular. In this case let Uk := U(k−1) and Lk := L(k−1). We will get an LU-decomposition of A with some diagonal elements of U being zero. This can only happen if A is singular.

If in a row l > k of column k there is an entry with a larger absolute value, then exchange rows k and l of U(k−1). This is a multiplication of U(k−1) from the left with F(k, l). Remembering F(k, l)F(k, l) = E we get

    A = P(k−1) L(k−1) U(k−1) = [ P(k−1) L(k−1) F(k, l) ] [ F(k, l) U(k−1) ] =: [ P(k−1) L(k−1) F(k, l) ] Ũ(k−1).

The matrix Ũ(k−1) is U(k−1) with rows l and k exchanged and therefore has a non-zero element in position z.

The action of right multiplication with F(k, l) on L(k−1) is interchanging columns k and l. As these columns consist of zeroes with only one 1 in each case, this can be undone by interchanging the rows k and l, i.e. multiplying L(k−1) with F(k, l) from the left. But doing so interchanges the first k − 1 positions of these rows too, so that one has to undo this.

Resuming, this step of the algorithm is: set Pk := P(k−1) F(k, l), and L̃(k−1) is L(k−1) with the first k − 1 entries of the rows k and l interchanged.


[Diagram: sketch of the pivoting step, indicating which parts of P(k−1), L(k−1) and U(k−1) are affected by the exchange of rows/columns k and l.]

Now construct Uk and Lk as in the simple case from Ũ(k−1) and L̃(k−1) and get A = Pk Lk Uk.

In the end we have P⁻¹ = PT. As P is a product of matrices F(k, l) and F(k, l)⁻¹ = F(k, l)T, this is true for P, too, because of: let AT = A⁻¹ and BT = B⁻¹. Then (AB)T = BT AT = B⁻¹A⁻¹ = (AB)⁻¹.

    1.4.7 Summary of LU-decomposition

Solving a linear equation system Ax = b with LU-decomposition consists of the following steps:

m1 Start with P0 = L0 = En, U0 = A.

m2 For each k from 1 to n perform:

Exchanging rows: Ũ(k−1) is U(k−1) with rows k and l > k exchanged, L̃(k−1) is L(k−1) where the first k − 1 entries in rows k and l are exchanged (only if k > 1), and exchanging columns k and l in P(k−1) gives Pk. If you skip this step just put Pk := P(k−1), L̃(k−1) := L(k−1) and Ũ(k−1) := U(k−1).

Adding multiples of row k to the rows below: adding in Ũ(k−1) the λl-fold row k to the rows l with l > k gives Uk, and Lk is L̃(k−1) with entries −λl in row l of column k.

With P := Pn, L := Ln and U := Un this gives the decomposition A = P L U. In case of different right sides bj in the equation system, this step has to be carried out only once.

m3 Solve P z = b by z = PT b.

m4 Solve Ly = z recursively starting with y1.

m5 Solve Ux = y recursively starting with xn.

At an arbitrary point you can make a crosscheck whether you made mistakes during the calculation: Pk Lk Uk and Pk L̃(k−1) Ũ(k−1) must always be equal to A.

    1.4.7.1 Remarks

(i) The first step in the LU-decomposition can be used to do pivoting, i.e. you can always put the entry with the largest absolute value into the diagonal pivot position. This results in higher numerical stability.

(ii) P arises from the identity-matrix by interchanging rows. Therefore it is not necessary to write down the complete matrix. One only has to keep track of which coordinates are interchanged.
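The summary above translates almost line by line into code. The following NumPy sketch (added for illustration; the function name plu is mine, not from the notes) performs the row exchanges and eliminations of step m2 and returns P, L, U with A = P L U:

```python
import numpy as np

def plu(A):
    """LU-decomposition with row pivoting, returning P, L, U with A = P @ L @ U."""
    n = A.shape[0]
    U = A.astype(float).copy()
    L = np.eye(n)
    P = np.eye(n)
    for k in range(n - 1):
        l = k + np.argmax(np.abs(U[k:, k]))    # row of the largest pivot candidate
        if l != k:
            U[[k, l]] = U[[l, k]]              # exchange rows of U
            L[[k, l], :k] = L[[l, k], :k]      # ... and the already filled part of L
            P[:, [k, l]] = P[:, [l, k]]        # ... and the columns of P
        if U[k, k] == 0.0:                     # whole column is zero: A is singular
            continue
        factors = U[k + 1:, k] / U[k, k]
        L[k + 1:, k] = factors
        U[k + 1:] -= np.outer(factors, U[k])
    return P, L, U

A = np.array([[6., 5., 3., 10.], [3., 7., 3., 5.], [12., 4., 4., 4.], [0., 12., 0., 8.]])
P, L, U = plu(A)
print(np.allclose(P @ L @ U, A))               # True
```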

    1.4.8 Example of LU-Decomposition

    A = (  6   5   3  10 )
        (  3   7   3   5 )
        ( 12   4   4   4 )
        (  0  12   0   8 )

[P0|L0|U0] = [E|E|A]:

    1 0 0 0 | 1   0   0   0 |  6   5   3  10
    0 1 0 0 | 0   1   0   0 |  3   7   3   5
    0 0 1 0 | 0   0   1   0 | 12   4   4   4
    0 0 0 1 | 0   0   0   1 |  0  12   0   8

[P1|L0|U0] (rows 1 and 3 exchanged):

    0 0 1 0 | 1   0   0   0 | 12   4   4   4
    0 1 0 0 | 0   1   0   0 |  3   7   3   5
    1 0 0 0 | 0   0   1   0 |  6   5   3  10
    0 0 0 1 | 0   0   0   1 |  0  12   0   8

[P1|L1|U1]:

    0 0 1 0 | 1   0   0   0 | 12   4   4   4
    0 1 0 0 | 1/4 1   0   0 |  0   6   4   4
    1 0 0 0 | 1/2 0   1   0 |  0   3   1  12
    0 0 0 1 | 0   0   0   1 |  0  12   0   8

[P2|L1|U1] (rows 2 and 4 exchanged):

    0 0 1 0 | 1   0   0   0 | 12   4   4   4
    0 0 0 1 | 0   1   0   0 |  0  12   0   8
    1 0 0 0 | 1/2 0   1   0 |  0   3   1  12
    0 1 0 0 | 1/4 0   0   1 |  0   6   4   4

[P2|L2|U2]:

    0 0 1 0 | 1   0   0   0 | 12   4   4   4
    0 0 0 1 | 0   1   0   0 |  0  12   0   8
    1 0 0 0 | 1/2 1/4 1   0 |  0   0   1  10
    0 1 0 0 | 1/4 1/2 0   1 |  0   0   4   8

[P3|L̃2|Ũ2] (rows 3 and 4 exchanged):

    0 0 0 1 | 1   0   0   0 | 12   4   4   4
    0 0 1 0 | 0   1   0   0 |  0  12   0   8
    1 0 0 0 | 1/4 1/2 1   0 |  0   0   4   8
    0 1 0 0 | 1/2 1/4 0   1 |  0   0   1  10

[P3|L3|U3]:

    0 0 0 1 | 1   0   0   0 | 12   4   4   4
    0 0 1 0 | 0   1   0   0 |  0  12   0   8
    1 0 0 0 | 1/4 1/2 1   0 |  0   0   4   8
    0 1 0 0 | 1/2 1/4 1/4 1 |  0   0   0   8

A = P L U = P3 L3 U3 with

    P = ( 0 0 0 1 )    L = (  1    0    0   0 )    U = ( 12   4   4   4 )
        ( 0 0 1 0 )        (  0    1    0   0 )        (  0  12   0   8 )
        ( 1 0 0 0 )        ( 1/4  1/2   1   0 )        (  0   0   4   8 )
        ( 0 1 0 0 )        ( 1/2  1/4  1/4  1 )        (  0   0   0   8 )

    1.4.9 Solving a Linear Equation System

Ax = b with

    A = (  6   5   3  10 )        b = ( 10 )
        (  3   7   3   5 )            ( 14 )
        ( 12   4   4   4 )            (  8 )
        (  0  12   0   8 )            (  8 )

m1 Solve P z = b:

    z = PT b = ( 0 0 1 0 ) ( 10 )   (  8 )
               ( 0 0 0 1 ) ( 14 ) = (  8 )
               ( 0 1 0 0 ) (  8 )   ( 14 )
               ( 1 0 0 0 ) (  8 )   ( 10 )

m2 Solve Ly = z, i.e.

    (  1    0    0   0 ) ( y1 )   (  8 )
    (  0    1    0   0 ) ( y2 ) = (  8 )
    ( 1/4  1/2   1   0 ) ( y3 )   ( 14 )
    ( 1/2  1/4  1/4  1 ) ( y4 )   ( 10 )

Line by line one has y1 = 8, y2 = 8, 2 − 4 + y3 = 14 ⇒ y3 = 16 and 4 + 2 − 4 + y4 = 10 ⇒ y4 = 8.

m3 Solve Ux = y, i.e.

    ( 12   4   4   4 ) ( x1 )   (  8 )
    (  0  12   0   8 ) ( x2 ) = (  8 )
    (  0   0   4   8 ) ( x3 )   ( 16 )
    (  0   0   0   8 ) ( x4 )   (  8 )

Line by line (from the bottom to the top) one has 8x4 = 8 ⇒ x4 = 1, 4x3 + 8 = 16 ⇒ x3 = 2, 12x2 + 8 = 8 ⇒ x2 = 0 and 12x1 − 8 + 4 = 8 ⇒ x1 = 1, so x = (1, 0, 2, 1)T.

    1.4.10 Short Form

    (i) Use the zeroes in the U-matrix to store the elements below the

    diagonal of the L-matrix.

    Divide these areas of the U-matrix by a line.

(ii) Instead of the P-matrix use a vector (initially p = [1 2 3 4]T) containing the numbers of the rows of the right-side vector b.

    Then a pivoting operation results in exchanging whole rows in U and p.

    1.4.11 Example

[P0|L0|U0] = [E|E|A]:

     6   5   3  10 | 1
     3   7   3   5 | 2
    12   4   4   4 | 3
     0  12   0   8 | 4

[P1|L0|U0]:

    12   4   4   4 | 3
     3   7   3   5 | 2
     6   5   3  10 | 1
     0  12   0   8 | 4

[P1|L1|U1]:

     12    4   4   4 | 3
    1/4    6   4   4 | 2
    1/2    3   1  12 | 1
      0   12   0   8 | 4

[P2|L1|U1]:

     12    4   4   4 | 3
      0   12   0   8 | 4
    1/2    3   1  12 | 1
    1/4    6   4   4 | 2

[P2|L2|U2]:

     12    4    4   4 | 3
      0   12    0   8 | 4
    1/2  1/4    1  10 | 1
    1/4  1/2    4   8 | 2

[P3|L̃2|Ũ2]:

     12    4    4   4 | 3
      0   12    0   8 | 4
    1/4  1/2    4   8 | 2
    1/2  1/4    1  10 | 1

[P3|L3|U3]:

     12    4    4   4 | 3
      0   12    0   8 | 4
    1/4  1/2    4   8 | 2
    1/2  1/4  1/4   8 | 1

Decompose this and put the L- and U-parts into the right form:

    L = (  1    0    0   0 )        U = ( 12   4   4   4 )
        (  0    1    0   0 )            (  0  12   0   8 )
        ( 1/4  1/2   1   0 )            (  0   0   4   8 )
        ( 1/2  1/4  1/4  1 )            (  0   0   0   8 )

In z = PT b one has

    b = (b1, b2, b3, b4)T = (10, 14, 8, 8)T,   so   z = (b3, b4, b2, b1)T = (8, 8, 14, 10)T

and the rest is as above.

If one wants P explicitly, one has from p: P = [e3, e4, e2, e1].


    1.5 Eigenvalues and Eigenvectors

1.5.1 Definition and properties

Let A be a square matrix.

(i) If λ ∈ C and v ≠ 0 is a vector with Av = λv, then v is called an eigenvector of A to the eigenvalue λ.

(ii) We have: Av = λv with v ≠ 0 ⇔ there is a vector v ≠ 0 with (A − λE)v = 0 ⇔ the kernel of A − λE is non-trivial ⇔ A − λE is not regular ⇔ det(A − λE) = 0.

As det(A − λE) is a polynomial of degree n in λ, we define: p(λ) = det(A − λE) is called the characteristic polynomial of A. Therefore a (complex) number λ is an eigenvalue of A if λ is a zero of the characteristic polynomial.

(iii) A has at least one eigenvalue and at least one eigenvector to each eigenvalue.

(iv) If λ is a k-fold zero of p, then o(λ) = k is called the algebraic multiplicity of λ. The geometric multiplicity of λ is the dimension of the kernel of A − λE, that is the dimension of the eigenspace of A and λ.

(v) A vector v is called a generalized eigenvector of k-th order to λ if the following holds: (A − λE)^k v = 0, but (A − λE)^(k−1) v ≠ 0.

(vi) Because of (A − λE)⁰ v = Ev = v the eigenvectors are just the generalized eigenvectors of first order. If v is a generalized eigenvector of k-th order then (A − λE)v is a generalized eigenvector of order (k − 1).


    1.5.2 More properties

(i) Let C = P A P⁻¹. Then A and C have the same characteristic polynomial.

(ii) If v is a (generalized) eigenvector of A then P v is a (generalized) eigenvector of C (of the same order).

(iii) Let A be a square k × k-matrix with the property that the diagonal and everything below the diagonal is zero. Then A^k = 0.

(iv) Let A be an (upper or lower) triangular matrix. Then the eigenvalues of A are the diagonal elements.

This shows that eigenvalues are properties of the linear map rather than of the representing matrix.

    1.5.3 Lemma

Let C be an m × m-matrix. Then there exists an invertible m × m-matrix P so that

    S = P⁻¹ C P = ( λ  *  ...  * )
                  ( 0  *  ...  * )
                  ( .. ..  ..  ..)
                  ( 0  *  ...  * )

where λ is an eigenvalue of C.

1.5.4 Theorem: Schur Form

Let A be an n × n-matrix. Then there exists an invertible matrix P and an upper triangular matrix U with A = P U P⁻¹.

U has the same characteristic polynomial as A, so the diagonal of U contains the eigenvalues of A with the same multiplicities.


    1.5.5 Consequences

    (i)

    Always 1

    ()

    o()

    n holds.

    If () < o() then for sufficient large k the dimension ofthe kernel of (A E)k is equal to the algebraic multiplicityo().

    (ii) The generalized eigenspace to is the span of all generalized

    eigenvectors to . Its dimension is o(), i.e. there are in total

    as many linearly independent generalized eigenvectors to as the

    order of as a zero of the characteristic polynomial.In particular for a simple zero of the characteristic polynomial we

    have: there is a one-dimensional eigenspace and there are no gen-

    eralized eigenvectors of higher order.

    (iii) (generalized) eigenvectors to distinct eigenvalues are linearly inde-

    pendent.

    (iv) A real matrix is called (real) diagonalisable, if(1) the characteristic polynomial has only real zeroes

    (2) for each zero the algebraic and the geometric multiplicity are

    equal.

    This means that there is a basis of theRn consisting of eigenvectors

    of A resp. that there are no generalized eigenvectors of higher

    order.

    (v) Accordingly a complex matrix is called complex diagonalisable if

    for every eigenvalue the algebraic and geometric multiplicity are

    the same.

    (vi) The spectrum of A is the set of eigenvalues, denoted by (A).

    1.5.6 Jordan-FormIf is an eigenvalue of the matrix A and v is a corresponding eigenvector,

    then Av = v.


    If v is a generalized eigenvector of order k+ 1 then u = (A E)v is ageneralized eigenvector of order k. In this case we have Av = v + u.

    Putting these two cases together we get the important theorem on theJordan-form of a matrix:

    1.5.6.1 Jordan-Form

    Let L be an endomorphism of C^n. Then there exists a basis of C^n so that in this basis L has a block-matrix representation

    J =
    [ J1  0   ⋯  0  ]
    [ 0   J2  ⋱  ⋮  ]
    [ ⋮   ⋱   ⋱  0  ]
    [ 0   ⋯   0  Jp ]

    where

    Jr =
    [ λr  1   0   ⋯  0  ]
    [ 0   λr  1   ⋱  ⋮  ]
    [ ⋮   ⋱   ⋱   ⋱  0  ]
    [ ⋮        ⋱  λr 1  ]
    [ 0   ⋯   ⋯   0  λr ]

    The numbers λr are (not necessarily distinct) eigenvalues. The blocks Jr are called Jordan blocks.

    If Jr has the size k and u1, ..., uk are the basis vectors associated to the block Jr, then we have

    Lu1 = λr u1, and for 2 ≤ s ≤ k: Lus = λr us + u_{s−1}.   (∗)

    That means that u1 is an eigenvector and the us are generalized eigenvectors of order s. The (ordered) set u1, ..., uk is called a Jordan chain.

    Now let u_{1,1}, ..., u_{1,k1}; u_{2,1}, ..., u_{2,k2}; ...; u_{p,1}, ..., u_{p,kp} be the Jordan chains associated with the Jordan blocks J1, ..., Jp. The matrix

    U = [u_{1,1} ⋯ u_{1,k1} ⋯ u_{p,1} ⋯ u_{p,kp}]

    fulfills

    AU = UJ  ⟺  A = UJU^{-1}  ⟺  J = U^{-1}AU,

    where A is the matrix of L with respect to the original basis. This is easily seen by looking at the column vectors in the products, because this is just the equation (∗) in each column.
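
    As a quick cross-check of this theorem (not part of the original notes), a small sympy sketch: the matrix below is a made-up example with a double eigenvalue and only a one-dimensional eigenspace, so its Jordan form consists of a single 2×2 Jordan block. sympy's jordan_form returns the chain matrix (called P there, U above) and J.

    ```python
    from sympy import Matrix

    # Made-up 2x2 example: double eigenvalue 3, only one eigenvector,
    # hence one 2x2 Jordan block.
    A = Matrix([[5, 4],
                [-1, 1]])

    P, J = A.jordan_form()        # returns (P, J) with A = P*J*P**-1
    print(J)                      # Matrix([[3, 1], [0, 3]])
    print(P * J * P.inv() == A)   # True: the chain vectors transform A into J
    ```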


    1.5.6.2 Remark

    If each Jr has the size 1, then there exists a basis of eigenvectors and there are no generalized eigenvectors of order greater than one. In this case the matrix is diagonalisable.


    1.5.7 Example

    A :=
    [ 2 0 1 0 0 0 0 0 0 0 ]
    [ 1 2 0 0 0 0 0 0 0 0 ]
    [ 0 0 2 0 0 0 0 0 0 0 ]
    [ 0 0 0 2 0 0 0 1 0 0 ]
    [ 0 0 0 0 2 0 0 0 1 0 ]
    [ 0 0 0 1 0 2 0 0 0 0 ]
    [ 0 0 0 0 0 0 2 0 0 0 ]
    [ 0 0 0 0 0 0 0 2 0 0 ]
    [ 0 0 0 0 0 0 0 0 2 0 ]
    [ 0 0 0 0 0 0 0 0 0 2 ]

    p(λ) = (2 − λ)^10, so 2 is a 10-fold eigenvalue of A.

    B := A − 2E =
    [ 0 0 1 0 0 0 0 0 0 0 ]
    [ 1 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 1 0 0 ]
    [ 0 0 0 0 0 0 0 0 1 0 ]
    [ 0 0 0 1 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]

    B² =
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 1 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 1 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]
    [ 0 0 0 0 0 0 0 0 0 0 ]

    Furthermore B³ = 0.
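
    A short numerical cross-check (a numpy sketch, not part of the original notes) of the kernel dimensions used below: dim ker B = 5, dim ker B² = 8, dim ker B³ = 10.

    ```python
    import numpy as np

    # B = A - 2E from the example above
    B = np.zeros((10, 10))
    B[0, 2] = B[1, 0] = B[3, 7] = B[4, 8] = B[5, 3] = 1

    for k in (1, 2, 3):
        Bk = np.linalg.matrix_power(B, k)
        print(k, 10 - np.linalg.matrix_rank(Bk))   # dim ker B^k: prints 5, 8, 10
    ```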


    [Diagram: the nested kernels {0} = ker B^0 ⊂ ker B ⊂ ker B² ⊂ ker B³ = R^10 with dimensions r0 = 0, r1 = 5, r2 = 8, r3 = 10; the complementary spaces U1, U2, U3 (with ker B^k = ker B^{k−1} ⊕ Uk) have dimensions s1 = 5, s2 = 3, s3 = 2.]

    One has

    v ∈ ker B^k  ⟺  B^k v = 0  ⟺  B^{k−1}(Bv) = 0  ⟺  Bv ∈ ker B^{k−1}.

    So B is injective on U3 and on U2, mapping them towards U2 resp. U1.

    [Diagram: the Jordan chains b31 → b21 → b11 and b32 → b22 → b12 under B, the chain b23 → b13, and the single vectors b14, b15, arranged in the spaces U3, U2, U1.]

    Choose a basis b31 and b32 of U3.

    From this define

    (i) Bb31 = b21, Bb21 = b11 and Bb11 = 0. (Jordan chain of length 3)

    (ii) Bb32 = b22, Bb22 = b12 and Bb12 = 0. (Jordan chain of length 3)

    In the 3-dimensional space U2 the vectors b21 and b22 are completed to a basis by b23. So one has

    (iii) Bb23 = b13, Bb13 = 0. (Jordan chain of length 2)


    In the end the vectors in U1 that are already determined are completed

    to a basis.

    (iv) Bb14 = 0 (Jordan chain of length 1)

    (v) Bb15 = 0 (Jordan chain of length 1)

    With this the map B is uniquely described in the basis vectors bij.

    If one observes

    Bv = 0  ⟺  (A − λE)v = 0  ⟺  Av = λv
    Bv = w  ⟺  (A − λE)v = w  ⟺  Av = λv + w,

    one has, with the basis ordered as b11, b21, b31, b12, b22, b32, b13, b23, b14 and b15, the following matrix representation of A (where λ = 2 is the eigenvalue):

    J :=
    [ λ 1 0 0 0 0 0 0 0 0 ]
    [ 0 λ 1 0 0 0 0 0 0 0 ]
    [ 0 0 λ 0 0 0 0 0 0 0 ]
    [ 0 0 0 λ 1 0 0 0 0 0 ]
    [ 0 0 0 0 λ 1 0 0 0 0 ]
    [ 0 0 0 0 0 λ 0 0 0 0 ]
    [ 0 0 0 0 0 0 λ 1 0 0 ]
    [ 0 0 0 0 0 0 0 λ 0 0 ]
    [ 0 0 0 0 0 0 0 0 λ 0 ]
    [ 0 0 0 0 0 0 0 0 0 λ ]

    J is the (better: a) Jordan form of the map A.

    Gather the vectors b11, b21, ..., b15 (in this order) into a matrix C. Then it follows AC = CJ, so A = CJC^{-1}.


    Calculation with numbers

    U1 is the kernel of B. It consists of all vectors having a zero in positions 1, 3, 4, 8 and 9. Because in general there is no canonical choice of bases we describe U1 as

    U1 = [e2 − e5, e2 + e5, e6 − e2, e7 − e2, e10 − e2].

    The kernel of B² consists of all vectors having a zero in positions 3 and 8. So U1 is completed by

    U2 = [e1 + e4, e1 − e4, e1 + e9]

    to a basis of ker B².

    ker B³ consists of all vectors. So we choose

    U3 := [e3, e8].

    Now construct the Jordan chains:

    Be3 = e1, Be1 = e2, Be2 = 0; these are b31, b21 and b11.

    Be8 = e4, Be4 = e6, Be6 = 0; these are b32, b22 and b12.

    These are the chains of length 3.

    In U2 we have to complete the images of the vectors of U3 (e1 and e4) to a basis. So we choose b23 = e1 + e9 and build the next Jordan chain:

    B(e1 + e9) = e2 + e5, B(e2 + e5) = 0; these are b23 and b13.

    In U1 the span of e2, e6 and e2 + e5 has to be completed to a basis. Therefore we choose

    b14 = e10 − e2 and b15 = e7 − e2.


    With this we have: in the basis b11, b21, b31, b12, b22, b32, b13, b23, b14 and b15 the map A has the form J stated above.

    Here we have C = (e2, e1, e3, e6, e4, e8, e2 + e5, e1 + e9, e10 − e2, e7 − e2) and so

    C =
    [ 0  1  0  0  0  0  0  1  0  0 ]
    [ 1  0  0  0  0  0  1  0 −1 −1 ]
    [ 0  0  1  0  0  0  0  0  0  0 ]
    [ 0  0  0  0  1  0  0  0  0  0 ]
    [ 0  0  0  0  0  0  1  0  0  0 ]
    [ 0  0  0  1  0  0  0  0  0  0 ]
    [ 0  0  0  0  0  0  0  0  0  1 ]
    [ 0  0  0  0  0  1  0  0  0  0 ]
    [ 0  0  0  0  0  0  0  1  0  0 ]
    [ 0  0  0  0  0  0  0  0  1  0 ]

    and

    C^{-1} =
    [ 0  1  0  0 −1  0  1  0  0  1 ]
    [ 1  0  0  0  0  0  0  0 −1  0 ]
    [ 0  0  1  0  0  0  0  0  0  0 ]
    [ 0  0  0  0  0  1  0  0  0  0 ]
    [ 0  0  0  1  0  0  0  0  0  0 ]
    [ 0  0  0  0  0  0  0  1  0  0 ]
    [ 0  0  0  0  1  0  0  0  0  0 ]
    [ 0  0  0  0  0  0  0  0  1  0 ]
    [ 0  0  0  0  0  0  0  0  0  1 ]
    [ 0  0  0  0  0  0  1  0  0  0 ]

    The Jordan theorem now yields A = CJC^{-1} and J = C^{-1}AC.


    1.5.7.1 Algorithm

    We look for the Jordan form and transformation matrices of an endomorphism A on R^n (or C^n), so A = CJC^{-1}.

    (i) Calculate p(λ) = det(A − λE) and find all zeroes. These are the eigenvalues.

    (ii) For each eigenvalue λ perform the following process:

    m1 For λ construct B := A − λE and determine the spaces Ui, until the dimension of the kernel of B^k (this is equal to the sum of the dimensions of the Ui) is equal to the algebraic multiplicity of λ.

    This is done iteratively: first find (with the aid of the Gauss algorithm) a basis of the kernel of B. This is U1.

    Then compute B² and find a basis of its kernel by completing the basis of U1 by other vectors. These completing vectors form a basis of U2.

    Now find a basis of U3 by completing the basis of ker B2 by

    some vectors to a basis of ker B3 and so on.

    m2 Now construct the Jordan chains:

    the basis of U3 (in general: Uk with the highest k) is mapped by B into U2 and then completed to a basis of U2 by vectors that have been computed in m1.

    This basis is mapped by B; and the images are completed to

    a basis of U1.

    Each j-tuple v, Bv, ..., B^{j−1}v of basis vectors with a starting vector v ∈ Uj forms a Jordan chain of length j.

    m3 When in total as many basis vectors as the algebraic multiplicity o(λ) are found, the work is done for this eigenvalue.


    (iii) Each Jordan chain v, Bv, ..., B^{j−1}v is written down in reverse order (so starting with the eigenvector) B^{j−1}v, B^{j−2}v, ..., v and gathered into the matrix C.

    In the Jordan matrix J each chain corresponds to a Jordan block of size j×j having the form

    J(j, λ) =
    [ λ  1  0  ⋯  0 ]
    [ 0  λ  1  ⋱  ⋮ ]
    [ ⋮  ⋱  ⋱  ⋱  0 ]
    [ 0  ⋯  0  λ  1 ]
    [ 0  ⋯  0  0  λ ]

    with the eigenvalue λ. The Jordan matrix J is then a block diagonal matrix consisting of the single Jordan blocks.

    1.6 Special Properties of Symmetric Matrices

    A matrix is called orthogonal iff the columns form an orthonormal basis. Equivalently one can say

    A^T = A^{-1} or A^T A = A A^T = En.

    In the complex case a matrix is called unitary if

    A* = A^{-1} or A* A = A A* = En.

    The importance of these notions lies in the fact that for arbitrary vectors

    v and w and an orthogonal or unitary matrix A the following holds:

    ‖Av‖ = ‖v‖ and <Av, Aw> = <A^T A v, w> = <v, w>.

    An orthogonal transformation changes neither angles nor lengths. The proof of these facts is given below.

    This subsection contains facts about symmetric or hermitian matrices.

    Recall that a real matrix is called symmetric if A^T = A, and a complex matrix is called hermitian if A* = A. For real matrices these definitions coincide.

    The following statements are formulated for the complex case, because

    the (more important) real case is contained in it.


    1.6.1 Properties of Symmetric and Hermitian Matrices

    Let A be a hermitian n×n-matrix.

    (i) The eigenvalues of A are real.

    (ii) If λ ≠ μ are eigenvalues and v1 and v2 are eigenvectors to λ resp. μ, then <v1, v2> = 0.

    (iii) For each eigenvalue the geometrical and the algebraic multiplicity

    are equal.

    (iv) There exists an ON-basis of eigenvectors of A.

    (v) There is a unitary matrix U and a real diagonal matrix D with A = UDU*. (Remember: U unitary ⟺ U* = U^{-1}.)
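
    A small numerical illustration of (iv) and (v) for a real symmetric example (a numpy sketch; the matrix is made up):

    ```python
    import numpy as np

    A = np.array([[2.0, 1.0, 0.0],
                  [1.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])    # made-up symmetric matrix

    eigvals, U = np.linalg.eigh(A)     # eigh: decomposition for symmetric/hermitian matrices
    D = np.diag(eigvals)               # real eigenvalues on the diagonal

    print(np.allclose(U @ U.T, np.eye(3)))   # True: columns of U form an ON-basis
    print(np.allclose(U @ D @ U.T, A))       # True: A = U D U^T
    ```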

    1.6.2 Orthogonal Matrices

    A square matrix is called orthogonal (or unitary in the complex case) if A^T A = E resp. A* A = E. As the real case is more important, we restrict our further results to this case. The complex case can be proved analogously.

    1.6.2.1 Properties of Orthogonal Matrices

    The following statements are equivalent:

    (i) A is orthogonal.

    (ii) A^T = A^{-1}.

    (iii) The columns of A form an orthonormal basis.

    (iv) The rows of A form an orthonormal basis.

    (v) For v, w ∈ R^n we have <v, w> = <Av, Aw>.

    (vi) For each v ∈ R^n we have ‖Av‖ = ‖v‖.
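
    As an illustration of these equivalent properties (a sketch, not part of the notes), a plane rotation matrix:

    ```python
    import numpy as np

    t = 0.7                                      # arbitrary angle
    Q = np.array([[np.cos(t), -np.sin(t)],
                  [np.sin(t),  np.cos(t)]])      # rotation matrix, orthogonal

    v = np.array([1.0, 2.0])
    w = np.array([-3.0, 0.5])

    print(np.allclose(Q.T @ Q, np.eye(2)))                        # Q^T Q = E
    print(np.isclose(v @ w, (Q @ v) @ (Q @ w)))                   # <v, w> = <Qv, Qw>
    print(np.isclose(np.linalg.norm(Q @ v), np.linalg.norm(v)))   # ||Qv|| = ||v||
    ```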


    1.6.2.2 Further Properties

    Let A be orthogonal.

    (i) For v , w one has


    (ix) Let v_i and v_j be elements of the ON-basis of the eigenspace of A^T A to λ ≠ 0. Then we have

    λ δ_ij = λ <v_i, v_j> = <λ v_i, v_j> = <A^T A v_i, v_j> = <A v_i, A v_j>.

    This shows that the A v_i form an orthogonal system, and hence the dimension of the eigenspace of A A^T to λ must be greater than or equal to the dimension of the corresponding eigenspace of A^T A.

    By symmetry it follows that these two numbers are equal.

    1.7.2 Existence and Construction of the SVD

    1.7.2.1 Theorem

    Let A be an m×n-matrix. Then there exist an orthogonal n×n-matrix V, an orthogonal m×m-matrix U, and an m×n-matrix S = (s_ij) with s_ii ≥ 0 so that

    A = USV^T.

    The matrix S = (s_ij) is a matrix of diagonal type, i.e. for i ≠ j one has s_ij = 0.

    1.7.2.2 Algorithm

    m1 Form B = A^T A. This is an n×n-matrix.

    m2 Compute the eigenvalues of B. These are non-negative and are numbered in the sequence λ1 ≥ λ2 ≥ ... ≥ λk > λ_{k+1} = ... = λn = 0. The fact that k is the rank of the matrix A (and the rank of A^T A too) can be used as a crosscheck.

    m3 Find an ON-basis v1, ..., vn of R^n, where v_i is an eigenvector to the eigenvalue λ_i. V := [v1, ..., vn] then is an orthogonal matrix (V^T = V^{-1}).


    m4 The singular values of A are defined as s_i = √λ_i. The matrix S = (s_ij) is a matrix of diagonal type, i.e. for i ≠ j one has s_ij = 0. S has the same shape as A, i.e. n columns and m rows. The elements on the diagonal are given by the singular values: s_ii = s_i.

    m5 For i ≤ k define the vectors u_i = (1/s_i) A v_i. They form an orthonormal system. Complete these vectors to an ON-basis u1, ..., um of R^m and gather them into the matrix U = [u1, ..., um].

    m6 The singular value decomposition of A is

    A = USV^T.
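
    For comparison, numpy computes this decomposition directly; a sketch with a made-up 3×2 matrix (note that numpy returns V^T rather than V):

    ```python
    import numpy as np

    A = np.array([[3.0, 1.0],
                  [2.0, 2.0],
                  [1.0, 3.0]])          # made-up example

    U, s, Vt = np.linalg.svd(A)         # s holds the singular values s_1 >= s_2 >= ...
    S = np.zeros(A.shape)
    S[:len(s), :len(s)] = np.diag(s)    # S has the same shape as A

    print(np.allclose(U @ S @ Vt, A))                                   # A = U S V^T
    print(np.allclose(np.sqrt(np.linalg.eigvalsh(A.T @ A))[::-1], s))   # s_i = sqrt(lambda_i)
    ```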

    1.7.2.3 Remark

    In many cases the vectors in V and U belonging to the eigenvalue zero are not needed. In this case the corresponding entries are denoted by stars (∗) and are not explicitly calculated. This is called the simplified version of the SVD.

    1.7.2.4 Further Properties

    If A = USV^T is the SVD of A, then A^T has the SVD A^T = V S^T U^T. If A is invertible, then A^{-1} = V S^{-1} U^T.

    1.8 Generalized Inverses

    The singular value decomposition can be used to construct approximate

    solutions of (possibly) non-square linear equation systems.

    Given an m×n-matrix A and a vector b ∈ R^m we are looking for a vector x ∈ R^n so that the norm

    ‖Ax − b‖₂ = min!


    Substituting the SVD of A and remembering that for the orthogonal matrix U the matrix U^T = U^{-1} is orthogonal, too, with ‖u‖ = ‖U^T u‖ for each u ∈ R^m, we get

    ‖Ax − b‖₂ = ‖USV^T x − b‖₂ = ‖U^T U S V^T x − U^T b‖₂ = ‖S (V^T x) − (U^T b)‖₂ =: ‖Sz − d‖₂   (∗∗)

    with z := V^T x and d := U^T b.

    The solutions of this minimization problem are given by

    z_j = (1/s_j) d_j for j = 1, ..., k,   z_j arbitrary for j > k.

    As V is orthogonal we get all solutions x as

    x = Vz = Σ_{j=1}^{k} (1/s_j) d_j v_j + Σ_{j=k+1}^{n} z_j v_j.

    Because V is orthogonal, the norm of x is given by (Σ_{j=1}^{n} z_j²)^{1/2}. Therefore the solution with the smallest norm is

    x⁺ = Vz = Σ_{j=1}^{k} (1/s_j) d_j v_j.

    This solution is called the pseudo-normal solution. One sees that the mapping b ↦ x⁺ is given by the matrix A⁺ := V S⁺ U^T with the diagonal-type matrix S⁺ := (σ_i δ_ij), where σ_i is defined by

    σ_i = 1/s_i for i ≤ k,   σ_i = 0 for i > k.

    1.8.0.5 Definition

    The matrix A⁺ defined in this way is called the generalized inverse or Moore-Penrose inverse of A.
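
    numpy provides this matrix as np.linalg.pinv, which is computed from the SVD; a short sketch with a made-up overdetermined system, comparing A⁺b with the least-squares solution:

    ```python
    import numpy as np

    A = np.array([[1.0, 0.0],
                  [1.0, 1.0],
                  [1.0, 2.0]])           # made-up 3x2 example
    b = np.array([1.0, 0.0, 2.0])

    A_plus = np.linalg.pinv(A)           # Moore-Penrose inverse, computed via the SVD
    x_plus = A_plus @ b                  # pseudo-normal solution of ||Ax - b|| = min

    x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
    print(np.allclose(x_plus, x_lstsq))  # True: both give the minimum-norm least-squares solution
    ```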


    1.8.0.6 Further Properties

    We have (A^T)⁺ = (A⁺)^T.

    1.8.1 Special case: A injective

    If A is injective then A has the rank n and the pseudo-normal solution

    of every equation Ax = b is unique. Furthermore, in this case A^T A is invertible (because rank A = n and the rank of A^T A is equal to the rank of A).

    In this case we can calculate x⁺ without explicit construction of the SVD: using A^T A = V S^T U^T U S V^T = V S^T S V^T we get from the equation (∗∗) above:

    S V^T x⁺ = U^T b  ⟹  V S^T S V^T x⁺ = V S^T U^T b.

    Since V S^T S V^T = A^T A and V S^T U^T = A^T, this means

    A^T A x⁺ = A^T b  ⟹  x⁺ = (A^T A)^{-1} A^T b.

    So in this case

    A⁺ = (A^T A)^{-1} A^T.

    If one wants only x⁺ it is sufficient to solve A^T A x⁺ = A^T b.
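
    A minimal sketch of this shortcut (made-up injective matrix), solving the normal equations instead of forming the SVD:

    ```python
    import numpy as np

    A = np.array([[1.0, 2.0],
                  [0.0, 1.0],
                  [1.0, 0.0]])           # injective: rank 2 (made-up example)
    b = np.array([1.0, 2.0, 3.0])

    # Normal equations: A^T A x = A^T b
    x_plus = np.linalg.solve(A.T @ A, A.T @ b)

    print(np.allclose(x_plus, np.linalg.pinv(A) @ b))   # True: same as A^+ b
    ```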


    1.9 Applications to linear equation systems

    1.9.1 Errors

    1.9.1.1 Introductory example

    Ax = b with

    A =
    [ 2  3  4     ]
    [ 2  3  4.001 ]
    [ 3  4  5     ]
    and b = (1, 1, 1)^T.

    One easily sees that A is invertible. The solution x is uniquely determined: the exact solution is x = (−1, 1, 0)^T.

    On the other hand, y = (−0.5, 0, 0.5)^T is not far from being a solution, because Ay = (1, 1.0005, 1)^T is very close to b. From this one sees that the given equation system is very unstable with respect to perturbations.

    If one calculates the solution of the slightly perturbed system Ax1 = b1 with b1 = (1, 0.9, 1)^T, one gets x1 = (−101.0000, 201.0000, −100.0000)^T.
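
    A short numpy check of this behaviour (a sketch): the large condition number of A explains the sensitivity.

    ```python
    import numpy as np

    A = np.array([[2.0, 3.0, 4.0],
                  [2.0, 3.0, 4.001],
                  [3.0, 4.0, 5.0]])
    b  = np.array([1.0, 1.0, 1.0])
    b1 = np.array([1.0, 0.9, 1.0])

    x  = np.linalg.solve(A, b)     # approx (-1, 1, 0)
    x1 = np.linalg.solve(A, b1)    # approx (-101, 201, -100)

    print(x, x1)
    print(np.linalg.cond(A))       # very large (order 10^4), hence the instability
    ```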

    1.9.1.2 Theorem

    Let x be the solution of Ax = b. If we compare the solution x + Δx of the disturbed system A(x + Δx) = b + Δb with x, we get for the relative error

    ‖Δx‖ / ‖x‖ ≤ ‖A‖ ‖A^{-1}‖ · ‖Δb‖ / ‖b‖.

    The number κ(A) = cond A = ‖A‖ ‖A^{-1}‖ is called the condition of A. With a little more effort it is possible to prove:


    Theorem. If x is the solution of Ax = b and x + Δx the solution of (A + ΔA)(x + Δx) = b + Δb, then the following estimate for the relative error holds:

    ‖Δx‖ / ‖x‖ ≤ κ(A) / (1 − κ(A) ‖ΔA‖/‖A‖) · ( ‖Δb‖/‖b‖ + ‖ΔA‖/‖A‖ ).

    For small values of ‖ΔA‖ the right-hand side is approximately equal to

    κ(A) ( ‖Δb‖/‖b‖ + ‖ΔA‖/‖A‖ ).

    1.9.2 Numerical Rank Deficiency

    Numerical rank deficiency appears if a matrix is close to another matrix with smaller rank. This leads to a very large condition number: small variations in the initial data of Ax = b lead to large variations in the result x.

    The SVD of the matrix A from the introductory example above is A = USV^T with the singular values s1 ≈ 10, s2 ≈ 0.4 and s3 ≈ 1/3000. To avoid these effects one can proceed as follows:

    m1 Decompose A = USV^T.

    m2 The matrix S1 is built out of S by replacing all entries smaller than a given threshold by zero, and A1 = U S1 V^T.

    This is reasonable: one can prove that entries in S that are smaller than the machine accuracy multiplied by the Frobenius norm of the matrix have no influence on the result.

    m3 Instead of the solutions of Ax = b, find the pseudo-normal solutions of A1 x = b with

    x⁺ = A1⁺ b = V S1⁺ U^T b.


    In the example one has A = USV^T with

    S =
    [ 10.3873  0       0      ]
    [  0       0.3338  0      ]
    [  0       0       0.0003 ]

    and orthogonal matrices U and V. We change the third singular value to zero and get

    S1 =
    [ 10.3873  0       0 ]
    [  0       0.3338  0 ]
    [  0       0       0 ]
    and S1⁺ =
    [ 0.0963  0       0 ]
    [ 0       2.9961  0 ]
    [ 0       0       0 ].

    Then

    A1⁺ =
    [ −1.1633  −1.1674   1.8314 ]
    [ −0.1669  −0.1676   0.3342 ]
    [  0.8316   0.8344  −1.1662 ]

    and

    x⁺ = A1⁺ (1, 1, 1)^T = (−0.4992, −0.0002, 0.4997)^T,
    x1⁺ = A1⁺ (1, 0.9, 1)^T = (−0.3825, 0.0165, 0.4163)^T.

    In the original problem we have

    Ax⁺ = (0.9998, 0.9999, 1.0001)^T and Ax1⁺ ≈ (0.9499, 0.9499, 1.0003)^T.
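
    A numpy sketch of this truncation procedure; it reproduces the numbers above up to rounding.

    ```python
    import numpy as np

    A = np.array([[2.0, 3.0, 4.0],
                  [2.0, 3.0, 4.001],
                  [3.0, 4.0, 5.0]])

    U, s, Vt = np.linalg.svd(A)
    s_plus = np.where(s > 1e-3, 1.0 / s, 0.0)    # drop the tiny singular value s3 ~ 0.0003
    A1_plus = Vt.T @ np.diag(s_plus) @ U.T       # A1^+ = V S1^+ U^T

    print(A1_plus @ np.array([1.0, 1.0, 1.0]))   # ~ (-0.4992, -0.0002, 0.4997)
    print(A1_plus @ np.array([1.0, 0.9, 1.0]))   # ~ (-0.3825,  0.0165, 0.4163)
    ```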


    1.9.3 Application: Best Fit Functions

    Other name: Gauss method of least squares

    1.9.3.1 Most important case: best fit straight line

    Starting point are n > 2 pairs of coordinates (x_i, y_i), such that at least two different x-values occur.

    We search for a line y = ax + b with the property that the quadratic error

    Σ_{i=1}^{n} ((a x_i + b) − y_i)²

    is as small as possible.

    The solution of this problem is the pseudo-normal solution of

    b + a x_1 = y_1
          ⋮
    b + a x_n = y_n,

    or A (b, a)^T = y with

    A =
    [ 1  x_1 ]
    [ ⋮   ⋮  ]
    [ 1  x_n ]
    and y = (y_1, ..., y_n)^T.

    As the matrix is injective, the solution is obtained with the aid of the transposed matrix:

    (b, a)^T = (A^T A)^{-1} A^T y.

    The coefficient of correlation r measures the quality of the approximation. We always have |r| ≤ 1, and for r = ±1 the line goes through all points.


    Algorithm

    All sums run from i = 1 to n.

    m1 Δ = n Σ x_i² − (Σ x_i)²

    m2 The best fit straight line y = ax + b has the coefficients

    a = (1/Δ) (n Σ x_i y_i − Σ x_i · Σ y_i)   and   b = (1/Δ) (Σ x_i² · Σ y_i − Σ x_i y_i · Σ x_i).

    m3 r = (n Σ x_i y_i − Σ x_i · Σ y_i) / √( (n Σ x_i² − (Σ x_i)²) · (n Σ y_i² − (Σ y_i)²) )
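
    A small sketch with made-up data, evaluating these formulas and cross-checking against the pseudo-normal solution from numpy:

    ```python
    import numpy as np

    x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])      # made-up data points
    y = np.array([1.1, 1.9, 3.2, 3.8, 5.1])
    n = len(x)

    delta = n * np.sum(x**2) - np.sum(x)**2
    a = (n * np.sum(x*y) - np.sum(x) * np.sum(y)) / delta
    b = (np.sum(x**2) * np.sum(y) - np.sum(x*y) * np.sum(x)) / delta
    r = (n * np.sum(x*y) - np.sum(x) * np.sum(y)) / np.sqrt(
            (n * np.sum(x**2) - np.sum(x)**2) * (n * np.sum(y**2) - np.sum(y)**2))

    # Cross-check: pseudo-normal solution of A (b, a)^T = y
    A = np.column_stack([np.ones(n), x])
    b_ls, a_ls = np.linalg.lstsq(A, y, rcond=None)[0]
    print(a, b, r)
    print(np.isclose(a, a_ls), np.isclose(b, b_ls))
    ```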

    Second method

    Find the mean values x̄ = (1/n) Σ_{k=1}^{n} x_k and ȳ = (1/n) Σ_{k=1}^{n} y_k. Shift the coordinate system so that x̄ and ȳ are the new origin, by replacing x_k by x_k − x̄ resp. y_k by y_k − ȳ. Then the best fit straight line is given by

    y = v x with v = ( Σ_{k=1}^{n} x_k y_k ) / ( Σ_{k=1}^{n} x_k² )

    and

    r = ( Σ_{k=1}^{n} x_k y_k ) / ( (Σ_{k=1}^{n} x_k²)^{1/2} (Σ_{k=1}^{n} y_k²)^{1/2} ) = <x, y> / (‖x‖ ‖y‖).


    Here it is easy to see that the coefficient of correlation describes the relative error of the approximation:

    ( Σ_{k=1}^{n} (v x_k − y_k)² ) / ( Σ_{k=1}^{n} y_k² ) = 1 − r².

    1.9.3.2 General problem

    Let (x_i, y_i), i = 1, ..., n, be n pairs of data. Furthermore let f_1, ..., f_k be k < n functions. We look for a linear combination f(x) = Σ_{j=1}^{k} α_j f_j(x) of the f_j so that the sum of the squares of the deviations of f(x_i) from y_i becomes minimal:

    F = Σ_{i=1}^{n} (f(x_i) − y_i)² = Σ_{i=1}^{n} ( Σ_{j=1}^{k} α_j f_j(x_i) − y_i )² = min!

    Solution: Solve A a = y in the sense of the pseudo-normal solution. Here a = (α_1, ..., α_k)^T contains the coefficients we look for,

    A =
    [ f_1(x_1)  f_2(x_1)  ⋯  f_k(x_1) ]
    [ f_1(x_2)  f_2(x_2)  ⋯  f_k(x_2) ]
    [    ⋮         ⋮       ⋱     ⋮    ]
    [ f_1(x_n)  f_2(x_n)  ⋯  f_k(x_n) ]

    and y = (y_1, y_2, ..., y_n)^T.
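
    A sketch fitting, for example, f(x) = α1 + α2 x + α3 sin x to made-up data via the design matrix above:

    ```python
    import numpy as np

    x = np.linspace(0.0, 5.0, 20)                               # made-up sample points
    y = 2.0 + 0.5*x + 1.5*np.sin(x) + 0.1*np.random.randn(20)   # noisy data

    funcs = [lambda t: np.ones_like(t), lambda t: t, np.sin]    # f_1, f_2, f_3
    A = np.column_stack([f(x) for f in funcs])                  # A_ij = f_j(x_i)

    alpha, *_ = np.linalg.lstsq(A, y, rcond=None)               # pseudo-normal solution of A a = y
    print(alpha)                                                # approximately (2.0, 0.5, 1.5)
    ```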


    1.10 Symmetric Matrices and Quadratic Forms

    A quadratic form on R^n is a map of the form

    x = (x_1, ..., x_n)^T ↦ Q(x) = Σ_{i,j=1}^{n} c_ij x_i x_j.

    The c_ij are real numbers with c_ij = c_ji. With the symmetric matrix C = (c_ij), i, j = 1, ..., n, this is written as

    Q(x) = x^T C x.

    Conversely, Q is the quadratic form that belongs to C.

    Let C = UDU^T with a real diagonal matrix D containing the eigenvalues of C and an orthogonal matrix U. Then

    Q_C(x) = x^T C x = x^T U D U^T x = (U^T x)^T D (U^T x).

    If the columns of U are the (ON-)vectors u_1, ..., u_n, then U^T x contains the coefficients of x in this basis. If these are denoted by y_1, ..., y_n, then with y = U^T x one has

    Q_C(x) = y^T D y = Σ_{k=1}^{n} λ_k y_k².

    From this one sees immediately, e.g., that Q_C(x) is positive for all non-zero vectors iff all eigenvalues of C are positive.

    This leads to the definition:

    A quadratic form is called

    positive definite
    if Q(x) > 0 for x ≠ 0  ⟺  λ > 0 for all eigenvalues λ of C,


    positive semidefinite
    if Q(x) ≥ 0 for all x  ⟺  λ ≥ 0 for all eigenvalues λ of C,

    negative definite
    if Q(x) < 0 for x ≠ 0  ⟺  λ < 0 for all eigenvalues λ of C,

    negative semidefinite
    if Q(x) ≤ 0 for all x  ⟺  λ ≤ 0 for all eigenvalues λ of C,

    definite
    if Q is negative or positive definite,

    indefinite
    if there are x and y with Q(x) < 0 < Q(y)  ⟺  the matrix C has positive and negative eigenvalues.

    (Dangerous) notation: C positive definite: C > 0, C positive semidefinite: C ≥ 0, C negative (semi)definite: C < 0 (C ≤ 0).

    A symmetric matrix is called positive/negative (semi)definite or indefinite if this is true for the corresponding quadratic form.

    Remark

    A is positive [semi]definite  ⟺  −A is negative [semi]definite.

    Hurwitz Criterion

    The Hurwitz Criterion is useful to determine the definiteness of a matrix

    without calculating the eigenvalues.


    In the symmetric n×n-matrix A one forms, starting from the left upper corner, submatrices of the sizes 1, 2, ..., n. The determinants of these submatrices are called D_1 to D_n (the leading principal minors). We have D_1 = a_11, D_2 = a_11 a_22 − a_12 a_21, and finally D_n is the determinant of A. Then the following holds:

    [Figure: the matrix A = (a_ij) with the nested leading submatrices of sizes 1, 2, ..., n marked in its upper left corner.]

    (i) D_1 > 0, D_2 > 0, D_3 > 0, D_4 > 0 etc. ⟺ A pos. definite. (D_k > 0)

    (ii) D_1 < 0, D_2 > 0, D_3 < 0, D_4 > 0 etc. ⟺ A neg. definite. ((−1)^k D_k > 0)

    (iii) A pos. semidefinite ⟹ D_1 ≥ 0, D_2 ≥ 0, D_3 ≥ 0, D_4 ≥ 0 etc. (D_k ≥ 0)

    (iv) A neg. semidefinite ⟹ D_1 ≤ 0, D_2 ≥ 0, D_3 ≤ 0, D_4 ≥ 0 etc. ((−1)^k D_k ≥ 0)

    (v) If neither (iii) nor (iv) holds, A is indefinite.

    In particular, A is indefinite if D_k < 0 holds for an even number k. Please pay attention to the fact that A may be indefinite even if always D_k ≥ 0 or (−1)^k D_k ≥ 0 holds; in this case at least one D_k has to be zero.

    Quadratic Completion

    Another possibility to determine the definiteness of a quadratic form is

    quadratic completion. The method is explained with the example

    Q(x) = x² + 4xy + 2xz + 8y² + 16yz + 9z².

    m1 Choose one variable x_j with a non-vanishing coefficient of x_j². Here we choose x. If such a choice is impossible, the quadratic form is indefinite.


    m2 Gather all terms that contain x:

    Q(x) = (x² + 4xy + 2xz) + (8y² + 16yz + 9z²)

    m3 Use the following to complete the square:
    (a + b + c + d + ⋯)² = a² + b² + c² + d² + ⋯ + 2(ab + ac + ad + ⋯ + bc + bd + ⋯ + cd + ⋯)

    Q(x) = (x + 2y + z)² + ⋯

    m4 Subtract the terms of the square that are not contained in the first bracket of step m2:

    Q(x) = (x + 2y + z)² + (−4y² − z² − 4yz) + (8y² + 16yz + 9z²)
         = (x + 2y + z)² + (4y² + 12yz + 8z²)

    m5 Now the second bracket contains no x. Continue with m1 applied to the second bracket. Choose y.

    m6 Q(x) = (x + 2y + z)² + (4y² + 12yz) + 8z²
            = (x + 2y + z)² + (2y + 3z)² − 9z² + 8z²
            = (x + 2y + z)² + (2y + 3z)² − z².

    This is a sum of squares with two plus signs and one minus sign. This means that the corresponding matrix has two positive and one negative eigenvalue, and Q is indefinite.

    Further examples

    Q(x) = x² + 4xy + 2xz + 8y² + 16yz + 10z² is positive semidefinite, and

    Q(x) = x² + 4xy + 2xz + 8y² + 16yz + 11z² is positive definite.
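
    A numerical cross-check of all three variants via the eigenvalues of the corresponding symmetric matrices (a numpy sketch):

    ```python
    import numpy as np

    def eigenvalues_of_Q(c_zz):
        # Symmetric matrix of Q = x^2 + 4xy + 2xz + 8y^2 + 16yz + c_zz * z^2
        C = np.array([[1.0, 2.0, 1.0],
                      [2.0, 8.0, 8.0],
                      [1.0, 8.0, c_zz]])
        return np.linalg.eigvalsh(C)

    print(eigenvalues_of_Q(9.0))    # one negative eigenvalue  -> indefinite
    print(eigenvalues_of_Q(10.0))   # smallest eigenvalue ~ 0  -> positive semidefinite
    print(eigenvalues_of_Q(11.0))   # all eigenvalues positive -> positive definite
    ```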


    1.11 QR-Decomposition

    Theorem

    Let A be a matrix with m rows and n ≤ m columns. Then there exist an orthogonal matrix Q and an upper triangular matrix R with A = QR.

    Upper triangular means that for R = (r_ij) one has r_ij = 0 for j < i.

    Proof 1 - Jacobi method, Givens rotations

    The case n = 1 or m = 1 is trivial. Now let us first look at the case

    m = 2.

    We are looking for an orthogonal 2×2-matrix Q with A = QR and r_21 = 0.

    Q =
    [ u  −v ]
    [ v   u ]
    with u² + v² = 1,   R =
    [ r11  r12 ]
    [ 0    r22 ]
    and A =
    [ a  b ]
    [ c  d ]

    leads to

    Q^T A = R, i.e.

    [ u   v ] [ a  b ]   [  ua + vc   ub + vd ]   [ r11  r12 ]
    [ −v  u ] [ c  d ] = [ −va + uc  −vb + ud ] = [ 0    r22 ].

    So this can be fulfilled with

    u = a / √(a² + c²) and v = c / √(a² + c²).

    In the case c = 0 one simply takes Q = E2.


    With Q0 = E and R0 = A, for each element below the diagonal an operation is performed:

    Q_i^T R_i = R_{i+1}.

    Here Q_i is the identity matrix except for the entries u, −v, v, u, placed in the two rows and columns belonging to the current diagonal element a and the element c below the diagonal that is to be eliminated (a block structure with identity blocks E_k, E_m, E_p and an embedded 2×2 rotation). In R_{i+1} the position of c has become 0 and the position of a contains r = √(a² + c²).

    From this one sees: the same values of u and v as above eliminate the c-element with an orthogonal matrix Q_i, and the rest of the column that contains a and c is not changed.

    So we have A = Q0R0 = Q0Q1R1 = ⋯ = Q0⋯QkRk =: QR with Q = Q0⋯Qk and R = Rk.
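
    A compact numpy sketch of this Givens strategy (the function name givens_qr and the test matrix are made up):

    ```python
    import numpy as np

    def givens_qr(A):
        """QR decomposition by Givens rotations: returns Q, R with A = Q @ R."""
        m, n = A.shape
        Q = np.eye(m)
        R = A.astype(float)
        for j in range(n):                  # eliminate the entries below the diagonal
            for i in range(j + 1, m):
                a, c = R[j, j], R[i, j]
                if c == 0.0:
                    continue
                r = np.hypot(a, c)
                u, v = a / r, c / r
                G = np.eye(m)               # G plays the role of Q_i
                G[j, j] = u
                G[i, i] = u
                G[j, i] = -v
                G[i, j] = v                 # 2x2 block [[u, -v], [v, u]] in rows/cols j, i
                R = G.T @ R                 # Q_i^T R_i = R_{i+1}: entry (i, j) becomes 0
                Q = Q @ G
        return Q, R

    A = np.array([[3.0, 1.0], [4.0, 2.0], [0.0, 5.0]])
    Q, R = givens_qr(A)
    print(np.allclose(Q @ R, A), np.allclose(np.tril(R, -1), 0))
    ```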

    Proof 2 - Householder Transformations

    The Jacobi method needs n²/2 steps. This method uses only n − 1 steps:

    The idea is to use a series of reflections that map the parts of the columns below the diagonal to zeroes.

    After some steps we have the matrix R_k: its first k − 1 columns are already in upper triangular form, and b_k denotes the lower part of the k-th column that still has to be eliminated.


    The lower part of column k, b_k, shall be mapped onto a multiple of e_k. Let c_k be a vector equal to b_k, but with zeroes in the first k − 1

    positions. So defi