linear algebra slides - ws.amsi.org.au · london millenium bridge, wobbling (compare tacoma...

Linear Algebra Slides

Linda Stals
email: [email protected]

Winter School 2017

(AMSI) Linear Algebra Slides, Winter School 2017

Why Worry?

References

Numerical Linear Algebra by Lloyd N. Trefethen and David Bau, III. SIAM, 1997.

Applied Numerical Linear Algebra by James W. Demmel. SIAM, 1997.


Arithmetic Disasters

http://ta.twi.tudelft.nl/nw/users/vuik/wi211/disasters.html

Patriot Missile Failure.
Explosion of the Ariane 5.
EURO page: Conversion Arithmetics.
The Vancouver Stock Exchange.
Rounding error changes Parliament makeup.
The sinking of the Sleipner A offshore platform.
Tacoma bridge failure (wrong design).
Collection of Software Bugs.



Poor Programming

http://www5.in.tum.de/~huckle/bugse.html

Hammer throwing, London Olympics (software would not accept an athlete’s result because it was exactly the same as the previous athlete’s result, 2012).
Loss of the Mars Climate Orbiter (mixture of lb and kg, 1999).
Green Party convention fails (through a rounding error and erroneous use of Excel, the wrong number of delegates is computed, 2002).
London Millennium Bridge, wobbling (compare Tacoma Bridge) (simulation fails because of wrong estimates for pedestrian forces, 2000).
Vancouver Stock Exchange Index (rounding error, 1983).
Shutdown of nuclear reactors (use of wrong norm in CAD system, 1979).
Ozone hole ignored until 1985 (software had to set aside data points that deviated greatly from expected measurements).



Sample Variance Example

Definition (Two pass sample variance calculation)

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad s_n^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2.$$

Definition (One pass sample variance calculation)

$$s_n^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n} x_i^2 - \frac{1}{n}\left(\sum_{i=1}^{n} x_i\right)^2\right).$$

The one-pass formula subtracts two large, nearly equal numbers, so it can give badly wrong values in the presence of round-off errors.


Example (One pass method)

I wrote a C program to calculate the sample variance of the three numbers {784318, 784319, 784320}. In single precision floating point, the variance calculated by the one-pass method was -65536. The sample variance from the two-pass method was 1 (the correct answer). When I tried double precision, both methods gave an answer of 1.
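The experiment can be imitated without a C compiler. The sketch below is not the author's program: it emulates single precision by rounding every intermediate result to IEEE binary32 with the standard `struct` module. The helper names (`f32`, `exact`, `one_pass`, `two_pass`) and the order of operations inside the one-pass formula are assumptions made for illustration; a different order of operations could produce a different wrong answer.

```python
import struct

def f32(v):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", v))[0]

def exact(v):
    """No extra rounding: keep ordinary double precision."""
    return v

def two_pass(xs, rnd):
    """Two-pass sample variance, rounding each operation with rnd."""
    n = len(xs)
    total = 0.0
    for x in xs:
        total = rnd(total + x)
    mean = rnd(total / n)
    acc = 0.0
    for x in xs:
        d = rnd(x - mean)
        acc = rnd(acc + rnd(d * d))
    return rnd(acc / (n - 1))

def one_pass(xs, rnd):
    """One-pass sample variance, rounding each operation with rnd."""
    n = len(xs)
    total = 0.0
    total_sq = 0.0
    for x in xs:
        total = rnd(total + x)
        total_sq = rnd(total_sq + rnd(x * x))
    correction = rnd(rnd(total * total) / n)
    return rnd(rnd(total_sq - correction) / (n - 1))

data = [784318.0, 784319.0, 784320.0]
for name, rnd in (("single", f32), ("double", exact)):
    print(name, two_pass(data, rnd), one_pass(data, rnd))
```

With this particular ordering the emulated single precision run gives 1 for the two-pass method and -65536 for the one-pass method, while both methods give 1 in double precision. The squares are about 6.15e11, where neighbouring single precision numbers are 65536 apart, so the final subtraction is left with nothing but rounding noise.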


Some Sources of Error

Definition (Truncation Error)

Truncation (or discretisation) error is the difference between the true result and the result that the algorithm would give if exact arithmetic were used, e.g. the error from truncating an infinite series.

Definition (Rounding Error)

The rounding error is the difference between the results obtained by a particular algorithm using exact arithmetic and the results obtained by the same algorithm using finite precision.


Example (Simpson’s Rule)

Consider, for example, Simpson’s Rule. We know that it is an $O(h^5)$ method, and saying that it is $O(h^5)$ gives information about the truncation error. But when we solve such problems on a computer, floating point arithmetic introduces extra sources of error.
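The truncation error can be observed directly. The sketch below assumes the $O(h^5)$ statement refers to the error on a single pair of subintervals; summing over roughly $1/h$ such pairs on a fixed interval gives a composite error of $O(h^4)$, so halving $h$ should reduce the error by a factor of about 16. The test integral and the function name `simpson` are illustrative choices, not taken from the slides.

```python
import math

def simpson(f, a, b, n):
    """Composite Simpson's rule on [a, b] with n subintervals (n even)."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate between weight 4 (odd i) and 2 (even i).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

exact_value = 2.0  # integral of sin(t) over [0, pi]
errors = {n: abs(simpson(math.sin, 0.0, math.pi, n) - exact_value)
          for n in (8, 16)}
ratio = errors[8] / errors[16]
print(errors, ratio)
```

At these step sizes the error is almost entirely truncation error. If $h$ were made small enough, the rounding errors of the floating point sums would eventually dominate and the ratio would stop following the factor of 16.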


Floating Point Arithmetic

Due to rounding error, arithmetic operations on computers are not (always) exact.

Definition (Floating Point Arithmetic)

We shall denote an evaluation of an expression in floating point arithmetic by $\operatorname{fl}$. If $\circ$ represents one of the basic arithmetic operations $+, -, \times, /$ then

$$\operatorname{fl}(x \circ y) = (x \circ y)(1 + \delta), \qquad |\delta| \le u,$$

where $u$ is the ‘unit roundoff’.

The round-off error is then $|x \circ y - \operatorname{fl}(x \circ y)|$.
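The model can be checked on a machine with IEEE double precision, where $u = 2^{-53}$ under round to nearest. The sketch below uses exact rational arithmetic from the standard `fractions` module to measure $\delta$ for each basic operation; the operands 0.1 and 0.3 and the helper name `relative_rounding_error` are arbitrary illustrative choices.

```python
import operator
from fractions import Fraction

U = Fraction(1, 2**53)  # unit roundoff, assuming IEEE double with round to nearest

def relative_rounding_error(x, y, op):
    """Return |delta| in fl(x op y) = (x op y)(1 + delta), measured exactly."""
    computed = Fraction(op(x, y))             # the rounded floating point result
    exact = op(Fraction(x), Fraction(y))      # the exact rational result
    return abs((computed - exact) / exact)

operations = (operator.add, operator.sub, operator.mul, operator.truediv)
deltas = [relative_rounding_error(0.1, 0.3, op) for op in operations]
for op, delta in zip(operations, deltas):
    print(op.__name__, float(delta), delta <= U)
```

`Fraction(x)` converts a float to the exact rational number it stores, so the comparison involves no further rounding.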


Example (Inner product)

Consider the inner product

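$$s_n = x^T y = x_1 y_1 + \cdots + x_n y_n. \tag{1}$$

Let's assume that we are summing from left to right. Define the partial sum $s_i$ by $s_i = x_1 y_1 + x_2 y_2 + \cdots + x_i y_i$, and write $\hat{s}_i$ for its computed value. Now,

$$\hat{s}_1 := \operatorname{fl}(x_1 y_1) = x_1 y_1 (1 + \delta_1).$$

$$\begin{aligned}
\hat{s}_2 &:= \operatorname{fl}(\hat{s}_1 + \operatorname{fl}(x_2 y_2)) \\
&= \operatorname{fl}(\hat{s}_1 + x_2 y_2 (1 + \delta_2)) \\
&= (\hat{s}_1 + x_2 y_2 (1 + \delta_2))(1 + \delta_3) \\
&= (x_1 y_1 (1 + \delta_1) + x_2 y_2 (1 + \delta_2))(1 + \delta_3) \\
&= x_1 y_1 (1 + \delta_1)(1 + \delta_3) + x_2 y_2 (1 + \delta_2)(1 + \delta_3),
\end{aligned}$$

where $|\delta_i| \le u$.

Drop the subscripts and let $1 + \delta_i \equiv 1 \pm \delta$. Then,

$$\begin{aligned}
\hat{s}_3 &:= \operatorname{fl}(\hat{s}_2 + \operatorname{fl}(x_3 y_3)) \\
&= (\hat{s}_2 + x_3 y_3 (1 \pm \delta))(1 \pm \delta) \\
&= \left(x_1 y_1 (1 \pm \delta)^2 + x_2 y_2 (1 \pm \delta)^2 + x_3 y_3 (1 \pm \delta)\right)(1 \pm \delta) \\
&= x_1 y_1 (1 \pm \delta)^3 + x_2 y_2 (1 \pm \delta)^3 + x_3 y_3 (1 \pm \delta)^2
\end{aligned}$$

and in general,

$$\hat{s}_n = x_1 y_1 (1 \pm \delta)^n + x_2 y_2 (1 \pm \delta)^n + x_3 y_3 (1 \pm \delta)^{n-1} + \cdots + x_n y_n (1 \pm \delta)^2.$$

Finally, by using the lemma given below we get

$$\hat{s}_n = x_1 y_1 (1 + \theta_n) + x_2 y_2 (1 + \theta'_n) + x_3 y_3 (1 + \theta_{n-1}) + \cdots + x_n y_n (1 + \theta_2),$$

where $|\theta_n| \le nu/(1 - nu)$. In other words $\hat{s}_n = x^T \hat{y}$, where $\hat{y}$ is the vector

$$\hat{y} = \left(y_1 (1 + \theta_n),\; y_2 (1 + \theta'_n),\; y_3 (1 + \theta_{n-1}),\; \ldots,\; y_n (1 + \theta_2)\right)^T.$$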


Lemma

If $|\delta_i| \le u$ and $p_i = \pm 1$ for $i = 1, \ldots, n$, and $nu < 1$, then

$$\prod_{i=1}^{n} (1 + \delta_i)^{p_i} = 1 + \theta_n,$$

where

$$|\theta_n| \le \frac{nu}{1 - nu} =: \gamma_n.$$


Forward and Backward Errors

Forward Errors: Relative and Absolute Errors.

Backward Errors: What is $\hat{x}$ such that $f(\hat{x}) = \hat{f}(x)$?

[Figure: the input $x$ is mapped to the exact result $y = f(x)$; the computed result $\hat{y} = f(x + \delta x)$ is the exact result for the perturbed input $x + \delta x$. The backward error is the distance between $x$ and $x + \delta x$; the forward error is the distance between $y$ and $\hat{y}$.]


For example,

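$$\hat{s}_n = x_1 y_1 (1 + \theta_n) + x_2 y_2 (1 + \theta'_n) + x_3 y_3 (1 + \theta_{n-1}) + \cdots + x_n y_n (1 + \theta_2)$$

is a backward error result. $\hat{s}_n$ is the exact result for the perturbed set of data points

$$x_1, x_2, \ldots, x_n, \quad y_1 (1 + \theta_n),\; y_2 (1 + \theta'_n),\; \ldots,\; y_n (1 + \theta_2).$$

The forward error is

$$|x^T y - \operatorname{fl}(x^T y)| \le \gamma_n \sum_{i=1}^{n} |x_i y_i|.$$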
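The forward error bound can be tested numerically. The sketch below assumes IEEE double precision ($u = 2^{-53}$), sums from left to right as in the derivation, and uses exact rational arithmetic to measure the true error. The test vectors and the names `gamma` and `fl_dot` are illustrative assumptions.

```python
from fractions import Fraction

U = Fraction(1, 2**53)  # unit roundoff, assuming IEEE double with round to nearest

def gamma(n):
    """gamma_n = n u / (1 - n u), evaluated exactly."""
    return n * U / (1 - n * U)

def fl_dot(x, y):
    """Left-to-right inner product in double precision."""
    s = 0.0
    for xi, yi in zip(x, y):
        s += xi * yi
    return s

# Arbitrary test data with mixed signs, so that some cancellation occurs.
x = [0.1 * (i + 1) for i in range(50)]
y = [(-1) ** i / (i + 3) for i in range(50)]

exact_dot = sum(Fraction(a) * Fraction(b) for a, b in zip(x, y))
abs_sum = sum(abs(Fraction(a) * Fraction(b)) for a, b in zip(x, y))
forward_error = abs(Fraction(fl_dot(x, y)) - exact_dot)
bound = gamma(len(x)) * abs_sum
print(float(forward_error), float(bound))
```

The measured error is usually far below the bound, because the bound allows every rounding error to take its worst value with the same sign.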


Example (Exponential)

$$f(x) = e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots$$

$$\hat{f}(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!}.$$

If $x = 1$, then to seven significant figures,

$$f(x) = 2.718282 \quad\text{and}\quad \hat{f}(x) = 2.666667.$$

Furthermore, $\hat{x} = \log(2.666667) = 0.980829$.

So, the forward error is $|f(x) - \hat{f}(x)| = 0.051615$ and the backward error is $|x - \hat{x}| = 0.019171$.
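These two numbers are easy to reproduce. In the short sketch below, `f_hat` is an assumed name for the truncated series, and `x_hat` is the input for which the exact exponential returns the approximate result.

```python
import math

def f_hat(x):
    """Taylor polynomial of e^x truncated after the cubic term."""
    return 1 + x + x**2 / 2 + x**3 / 6

x = 1.0
forward_error = abs(math.exp(x) - f_hat(x))
x_hat = math.log(f_hat(x))        # exp(x_hat) equals f_hat(x)
backward_error = abs(x - x_hat)
print(round(forward_error, 6), round(backward_error, 6))
```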


Conditioning

How sensitive is the solution to perturbations in the data?

insensitive ←→ well conditioned.

sensitive ←→ ill-conditioned.

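Conditioning relates forward and backward errors. If we include a perturbation $\delta x$ in the data, the relative error is

$$\begin{aligned}
\frac{|f(x + \delta x) - f(x)|}{|f(x)|}
&= \frac{1}{|f(x)|} \, \frac{|f(x + \delta x) - f(x)|}{|\delta x|} \, |\delta x| \\
&\approx \frac{1}{|f(x)|} \, |f'(x)| \, |\delta x| \\
&= \frac{|f'(x)| \, |x|}{|f(x)|} \, \frac{|\delta x|}{|x|} \\
&= \text{Condition Number} \times \frac{|\delta x|}{|x|}.
\end{aligned}$$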


Definition (Condition Number)

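$$\begin{aligned}
\text{Condition Number}
&= \frac{|\text{relative change in solution}|}{|\text{relative change in input data}|} \\
&= \frac{|(f(\hat{x}) - f(x)) / f(x)|}{|(\hat{x} - x) / x|} \\
&\approx \frac{|(\hat{x} - x) f'(x) / f(x)|}{|(\hat{x} - x) / x|} \\
&= \left| \frac{x f'(x)}{f(x)} \right|.
\end{aligned}$$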


Example (Logarithm)

For example, consider $f(x) = \ln(x)$, then

$$c = \text{Condition Number} = \frac{|1/x| \, |x|}{|\ln(x)|} = \frac{1}{|\ln(x)|},$$

which is large if $x \approx 1$.

Suppose $x = 2$ and set $\hat{x} = 2.02$ so that the relative input error is 0.01. Then $|f(\hat{x}) - f(x)| / |f(x)|$ is 0.0143553.
If $x = 1.5$ and $\hat{x} = 1.515$ (relative input error remains 0.01), then $|f(\hat{x}) - f(x)| / |f(x)|$ is 0.0245405.
If we set $x = 1.01$, moving closer to 1, and have $\hat{x} = 1.0201$, then $|f(\hat{x}) - f(x)| / |f(x)|$ is 1.00000.
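The three cases can be recomputed in a few lines. The function names below are illustrative; the second printed column is the first-order prediction, condition number times relative input error, for comparison with the measured relative change.

```python
import math

def relative_output_change(x, rel_input_change=0.01):
    """Relative change in ln(x) when x is perturbed by the given relative amount."""
    x_hat = x * (1 + rel_input_change)
    return abs(math.log(x_hat) - math.log(x)) / abs(math.log(x))

def condition_number(x):
    """Condition number of f(x) = ln(x), i.e. 1 / |ln(x)|."""
    return 1 / abs(math.log(x))

for x in (2.0, 1.5, 1.01):
    measured = relative_output_change(x)
    predicted = condition_number(x) * 0.01
    print(x, measured, predicted)
```

The prediction is only a first-order estimate, so it agrees closely with the measured change at $x = 2$ and $x = 1.5$ but only roughly at $x = 1.01$, where the perturbation is no longer small compared with the distance to 1.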


Horner's Method

Let

$$P_{n-1}(t) = x_0 + x_1 t + \cdots + x_{n-1} t^{n-1}$$

and set $a_{n-1} = x_{n-1}$. If $a_k = x_k + a_{k+1} t_0$ for $k = n-2, n-3, \ldots, 0$ then

$$a_0 = P_{n-1}(t_0).$$

Moreover, if

$$Q_{n-2}(t) = a_{n-1} t^{n-2} + a_{n-2} t^{n-3} + \cdots + a_2 t + a_1$$

then

$$P_{n-1}(t) = (t - t_0) Q_{n-2}(t) + a_0.$$


Algorithm

Horner’s Method
p = x_n
for i = n − 1 down to 0 do
    p = t ∗ p + x_i
end for

Let's apply the algorithm to $p(t) = (t - 2)^9 = t^9 - 18t^8 + 144t^7 - 672t^6 + 2016t^5 - 4032t^4 + 5376t^3 - 4608t^2 + 2304t - 512$.
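A minimal Python sketch of this experiment follows. It evaluates the expanded polynomial with Horner's method on a grid around $t = 2$ and compares the result with the factored form $(t - 2)^9$. The grid of 801 points (rather than 8000) and the names `horner`, `COEFFS` and `worst` are assumptions made to keep the example short.

```python
def horner(coeffs, t):
    """Evaluate a polynomial by Horner's method; coeffs[i] multiplies t**i."""
    p = coeffs[-1]
    for c in reversed(coeffs[:-1]):
        p = t * p + c
    return p

# Coefficients of (t - 2)^9, lowest degree first.
COEFFS = [-512.0, 2304.0, -4608.0, 5376.0, -4032.0,
          2016.0, -672.0, 144.0, -18.0, 1.0]

# Equidistant points on [1.92, 2.08].
points = [1.92 + 0.16 * k / 800 for k in range(801)]
worst = max(abs(horner(COEFFS, t) - (t - 2.0) ** 9) for t in points)
print(worst)
```

Away from the root the two forms agree well in a relative sense, but near $t = 2$ the true value is at most about $10^{-10}$ in magnitude, so rounding errors of that size in the Horner recurrence are as large as the answer itself.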


[Figure: plot of $p(t)$ using $(t - 2)^9$, evaluated at 8000 equidistant points; horizontal axis from 1.92 to 2.08, vertical axis from −14e−11 to 14e−11.]


[Figure: plot of $p(t)$ using Horner's method, evaluated at 8000 equidistant points; horizontal axis from 1.92 to 2.08, vertical axis from −14e−11 to 18e−11.]


Recall

$$\text{Condition Number} = \left| \frac{t f'(t)}{f(t)} \right| = \left| \frac{t \times 9 (t - 2)^8}{(t - 2)^9} \right| = \left| \frac{9t}{t - 2} \right|,$$

so we would expect this to be ill-conditioned around $t = 2$. That is what we have seen in this case.
Note that the condition number depends on the problem, not on the method that is used.


Let's rewrite Horner's method as

Algorithm

Horner’s Method - Rewrite
p_n = x_n
for i = n − 1 down to 0 do
    p_i = t ∗ p_{i+1} + x_i
end for


Now use floating point arithmetic to give

Algorithm

Horner’s Method - fp
p_n = x_n
for i = n − 1 down to 0 do
    p_i = ((t ∗ p_{i+1})(1 + δ_i) + x_i)(1 + δ′_i)
end for

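where $|\delta_i|, |\delta'_i| \le u$. Expanding that out we get

$$\hat{p}(t) = \sum_{i=0}^{n-1} \left[ (1 + \delta'_i) \prod_{j=0}^{i-1} (1 + \delta_j)(1 + \delta'_j) \right] x_i t^i + \left[ \prod_{j=0}^{n-1} (1 + \delta_j)(1 + \delta'_j) \right] x_n t^n.$$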


This can be simplified to give

$$\hat{p}(t) = \sum_{i=0}^{n} (1 + 2\theta_i) x_i t^i = \sum_{i=0}^{n} \hat{x}_i t^i,$$

where $\hat{x}_i = (1 + 2\theta_i) x_i$ and $|\theta_i| \le iu/(1 - iu) \le nu/(1 - nu) = \gamma_n$.
So the computed solution $\hat{p}(t)$ is the exact value of a slightly perturbed polynomial with coefficients $\hat{x}_i$. This is a backward stable method and the relative backward error is at most $2\gamma_n$.
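A backward error result of this kind implies a forward error bound: if every coefficient is perturbed by a relative amount of at most $\epsilon$, the value changes by at most $\epsilon \sum_i |x_i| |t|^i$. The sketch below checks this for $(t - 2)^9$ using exact rational arithmetic. As an assumption of this sketch, it takes $\epsilon = \gamma_{2n}$, a slightly looser constant than $2\gamma_n$ that follows from counting the at most $2n$ rounding factors multiplying each coefficient. IEEE double precision is assumed and the helper names are illustrative.

```python
from fractions import Fraction

U = Fraction(1, 2**53)  # unit roundoff, assuming IEEE double with round to nearest

def gamma(k):
    """gamma_k = k u / (1 - k u), evaluated exactly."""
    return k * U / (1 - k * U)

def horner(coeffs, t):
    """Evaluate a polynomial by Horner's method; coeffs[i] multiplies t**i."""
    p = coeffs[-1]
    for c in reversed(coeffs[:-1]):
        p = t * p + c
    return p

# Coefficients of (t - 2)^9, lowest degree first.
COEFFS = [-512.0, 2304.0, -4608.0, 5376.0, -4032.0,
          2016.0, -672.0, 144.0, -18.0, 1.0]
n = len(COEFFS) - 1  # polynomial degree

def error_and_bound(t):
    """Return the exact forward error of Horner's method at t and its bound."""
    tq = Fraction(t)
    exact = sum(Fraction(c) * tq**i for i, c in enumerate(COEFFS))
    error = abs(Fraction(horner(COEFFS, t)) - exact)
    bound = gamma(2 * n) * sum(abs(Fraction(c)) * abs(tq)**i
                               for i, c in enumerate(COEFFS))
    return error, bound

for t in (1.95, 2.0, 2.03):
    error, bound = error_and_bound(t)
    print(t, float(error), float(bound))
```

Near $t = 2$ the bound is much larger than $|p(t)|$ itself, which is consistent with a backward stable method applied to an ill-conditioned problem.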


Stability

Recall that the relative error is defined to be

$$\frac{\|\hat{f}(x) - f(x)\|}{\|f(x)\|}.$$

Definition (Stable)

We say that an algorithm $\hat{f}$ is stable if for each $x \in X$

$$\frac{\|\hat{f}(x) - f(\hat{x})\|}{\|f(\hat{x})\|} = O(u)$$

for some $\hat{x}$ with

$$\frac{\|\hat{x} - x\|}{\|x\|} = O(u).$$

u is the unit roundoff.


Backward and Forward Stability

Definition (Backward Stable)

We say that an algorithm $\hat{f}$ is backward stable if for each $x \in X$

$$\hat{f}(x) = f(\hat{x})$$

for some $\hat{x}$ with

$$\frac{\|\hat{x} - x\|}{\|x\|} = O(u).$$


Definition (Forward Stable)

A method is forward stable if it gives forward errors with similar magnitude (taking the condition number into account) to those produced by a backward stable method. That is,

$$\text{forward error} \lesssim \text{backward error} \times \text{condition number}.$$


Inner Product

Recall that

$$\operatorname{fl}(x^* y) = (x + \Delta x)^* y = x^* (y + \Delta y),$$

where

$$\|\Delta x\| \le \gamma_n \|x\| \quad\text{and}\quad \|\Delta y\| \le \gamma_n \|y\|,$$

with $\gamma_n = nu/(1 - nu)$. Hence the inner product is backward stable.


Outer Product

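The outer product $A = xy^*$ is stable but not backward stable. The computed matrix will in general not have rank exactly one, so it cannot be written as $(x + \Delta x)(y + \Delta y)^*$ for any perturbed vectors.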


Backward Stability and Relative Error

Theorem
Suppose a backward stable algorithm is applied to solve a problem $f : X \to Y$ with condition number $\kappa$. Then the relative errors satisfy

$$\frac{\|\hat{f}(x) - f(x)\|}{\|f(x)\|} = O(\kappa(x) u).$$


Matrix Norms

Definition (Matrix Norms)

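Given a vector norm, we can define the corresponding matrix norm as follows:

$$\|A\| = \max_{\|x\| \ne 0} \frac{\|Ax\|}{\|x\|}.$$

These norms are subordinate to the vector norms.
For the 1-norm and $\infty$-norm these simplify to the maximum absolute column sum and the maximum absolute row sum:

$$\|A\|_1 = \max_j \sum_{i=1}^{n} |a_{ij}|, \qquad \|A\|_\infty = \max_i \sum_{j=1}^{n} |a_{ij}|.$$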
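As a small illustration of the two simplified formulas, the sketch below computes both norms for a matrix stored as a list of rows. The function names `norm_1` and `norm_inf` and the example matrix are illustrative choices.

```python
def norm_1(A):
    """Matrix 1-norm: maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

def norm_inf(A):
    """Matrix infinity-norm: maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

A = [[1.0, -2.0],
     [3.0, 4.0]]

# Absolute column sums are 4 and 6; absolute row sums are 3 and 7.
print(norm_1(A), norm_inf(A))
```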


Condition Number of a matrix

Let $b$ be fixed and consider the problem of computing $x = A^{-1} b$, where $A$ is square and nonsingular.

Definition (Condition Number)

The condition number of this problem with respect to perturbations in $A$ is

$$\kappa = \|A\| \, \|A^{-1}\| = \kappa(A).$$

If $\| \cdot \| = \| \cdot \|_2$, then $\|A\| = \sigma_1$ and $\|A^{-1}\| = 1/\sigma_m$, where $\sigma_1$ is the maximum singular value and $\sigma_m$ is the minimum singular value. So

$$\kappa(A) = \frac{\sigma_1}{\sigma_m}.$$

For a rectangular matrix $A \in \mathbb{C}^{m \times n}$ of full rank, $m \ge n$,

$$\kappa = \|A\| \, \|A^{+}\| = \kappa(A).$$
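For a real $2 \times 2$ matrix the singular values have a closed form, which allows a short sketch of $\kappa(A) = \sigma_1/\sigma_m$ without a linear algebra library. The squared singular values are the eigenvalues of $A^T A$; their sum is the sum of the squared entries of $A$ and their product is $\det(A)^2$. The function name `cond_2` and the test matrices are illustrative assumptions, and the approach does not extend beyond the $2 \times 2$ case.

```python
import math

def cond_2(A):
    """2-norm condition number of a real nonsingular 2x2 matrix."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    # Trace of A^T A, equal to sigma_1^2 + sigma_2^2.
    trace = a * a + b * b + c * c + d * d
    # sigma_1^2 is the larger root of s^2 - trace * s + det^2 = 0.
    disc = max(trace * trace - 4 * det * det, 0.0)
    sigma_1 = math.sqrt((trace + math.sqrt(disc)) / 2)
    # sigma_1 * sigma_2 = |det|; this avoids cancellation in the smaller root.
    sigma_2 = abs(det) / sigma_1
    return sigma_1 / sigma_2

well_conditioned = [[2.0, 0.0], [0.0, 0.5]]
nearly_singular = [[1.0, 1.0], [1.0, 1.0001]]
print(cond_2(well_conditioned), cond_2(nearly_singular))
```

The diagonal matrix has singular values 2 and 0.5, so its condition number is 4. The second matrix has almost parallel rows and a condition number of roughly $4 \times 10^4$, which means that solving a system with it can lose about four to five significant digits.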

