new s˝ren h˝jsgaard department of mathematical sciences aalborg...

24
Numerical computations – with a view towards R Søren Højsgaard Department of Mathematical Sciences Aalborg University, Denmark October 8, 2012 Printed: October 8, 2012 File: numComp-slides.tex

Upload: others

Post on 03-Sep-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

Numerical computations – with a view towards R

Søren Højsgaard

Department of Mathematical Sciences

Aalborg University, Denmark

October 8, 2012

Printed: October 8, 2012 File: numComp-slides.tex

Page 2: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

2

Contents

1 Computer arithmetic is not exact 3

2 Floating point arithmetic 42.1 Addition and subtraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52.2 The Relative Machine Precision . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62.3 Floating-Point Precision . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82.4 Error in Floating-Point Computations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

3 Example: A polynomial 10

4 Example: Numerical derivatives 14

5 Example: The Sample Variance 16

6 Example: Solving a quadratic equation 18

7 Example: Computing the exponential 19

8 Ressources 24

Page 3: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

3

1 Computer arithmetic is not exact

The following R statement appears to give the correct answer.

R> 0.3 - 0.2

[1] 0.1

But all is not as it seems.

R> 0.3 - 0.2 == 0.1

[1] FALSE

The difference between the values being compared is small, but important.

R> (0.3 - 0.2) - 0.1

[1] -2.775558e-17

Page 4: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

4

2 Floating point arithmetic

Real numbers “do not exist” in computers. Numbers in computers are represented

in floating point form

s× be

where s is the significand, b is the base and e is the exponent.

R has “numeric” (floating points) and “integer” numbers

R> class(1)

[1] "numeric"

R> class(1L)

[1] "integer"

Page 5: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

5

2.1 Addition and subtraction

Let s be a 7–digit number

A simple method to add floating-point numbers is to first represent them with the

same exponent.

123456.7 = 1.234567 * 10^5

101.7654 = 1.017654 * 10^2 = 0.001017654 * 10^5

So the true result is

(1.234567 + 0.001017654) * 10^5 = 1.235584654 * 10^5

But the approximate result the computer would give is (the last digits (654) are

lost)

1.235585 * 10^5 (final sum: 123558.5)

In extreme cases, the sum of two non-zero numbers may be equal to one of them

Quiz: how to sum a sequence of numbers x1, . . . , xn to obtain large accuracy?

Page 6: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

6

2.2 The Relative Machine Precision

The accuracy of a floating-point system is measured by the relative machine

precision or machine epsilon.

This is the smallest positive value which can be added to 1 to produce a value

different from 1.

A machine epsilon of 10−7 indicates that there are roughly 7 decimal digits of

precision in the numeric values stored and manipulated by the computer.

It is easy to write a program to determine the relative machine precision

R> .Machine$double.eps

[1] 2.220446e-16

Page 7: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

7

R> macheps <- function(){

eps <- 1

while(1+eps/2 != 1)

eps <- eps / 2

eps

}

R> macheps()

[1] 2.220446e-16

Page 8: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

8

2.3 Floating-Point Precision

The preceding program shows that there are roughly 16 decimal digits of precision

to R arithmetic.

It is possible to see the effects of this limited precision directly.

R> a = 12345678901234567890

R> print(a, digits=20)

[1] 12345678901234567168

The effects of finite precision show up in the results of calculations.

Page 9: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

9

2.4 Error in Floating-Point Computations

Numbers are accurate to about 15 significant digits.

Subtraction of positive values is one place where the finite precision of

floating-point arithmetic is a potential problem.

R> x = 1+1.234567890e-10

R> print(x, digits = 20)

[1] 1.0000000001234568003

R> y = x - 1

R> print(y, digits = 20)

[1] 1.234568003383174073e-10

There are 16 correct digits in x, but only 6 correct digits in y.

Subtraction of nearly equal quantities (known as near cancellation) is a major

source of inaccuracy in numerical calculations and requires special care.

Page 10: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

10

3 Example: A polynomial

The function

f(x) = x7 − 7x6 + 21x5 − 35x4 + 35x3 − 21x2 + 7x− 1

is a 7th degree polynomial, and its graph should appear very smooth.

To check this we can compute and graph the function over a range of values

R> x = seq(.988, 1.012, by = 0.0001)

R> y = x^7 - 7*x^6 + 21*x^5 - 35*x^4 + 35*x^3 - 21*x^2 + 7*x - 1

R> plot(x, y, type = "l")

Page 11: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

11

0.990 0.995 1.000 1.005 1.010

−4e

−14

0e+

004e

−14

x

y

Not very smooth!

Page 12: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

12

To see where the cancellation error comes split the polynomial into individual terms

and see what happens when we sum them.

R> x = .99

R> y = c(x^7, - 7*x^6, + 21*x^5, - 35*x^4, 35*x^3, - 21*x^2, + 7*x, - 1)

R> y

[1] 0.9320653 -6.5903610 19.9707910 -33.6208604 33.9604650 -20.5821000

[7] 6.9300000 -1.0000000

R> cumsum(y)

[1] 9.320653e-01 -5.658296e+00 1.431250e+01 -1.930837e+01 1.465210e+01

[6] -5.930000e+00 1.000000e+00 -1.521006e-14

It is the last subtraction (of 1) which causes the catastrophic cancellation and loss

of accuracy.

We can reformulate the problem by noticing that

f(x) = x7 − 7x6 + 21x5 − 35x4 + 35x3 − 21x2 + 7x− 1 = (x− 1)7

Notice that although we are still getting cancellation, when 1 is subtracted from

values close to 1, we are only losing a few digits of accuracy.

Page 13: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

13

The difference is apparent in the plot.

R> x = seq(.988, 1.012, by = 0.0001)

R> y = (x - 1)^7

R> plot(x, y, type = "l")

0.990 0.995 1.000 1.005 1.010

−3e

−14

0e+

003e

−14

x

y

Page 14: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

14

4 Example: Numerical derivatives

The derivative f ′(x) may be approximated by

f ′(x) ≈f(x + h/2)− f(x− h/2)

h, h small

For small h we get near cancellation errors. A generic R function is

R> numDeriv <- function(f, x, h=1e-8){

(f(x+h/2)-f(x-h/2))/h

}

Find derivative of exponential at x = 1:

R> g <- function(x){exp(x)}

R> print(numDeriv(g, 1), digits=20)

[1] 2.7182818218562943002

R> print(exp(1), digits=20)

[1] 2.7182818284590450908

Page 15: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

15

Try range of h values:

R> hvec <- seq(5e-9, 1e-6, 1e-8)

R> dvec <- numDeriv(g, 1, hvec)

R> plot(hvec, (dvec-exp(1))/exp(1), type='l')R> abline(h=0, col='red')

0e+00 2e−07 4e−07 6e−07 8e−07 1e−06

−5e

−09

5e−

09

hvec

(dve

c −

exp

(1))

/exp

(1)

Page 16: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

16

5 Example: The Sample Variance

Consider the formula for the sample variance

1

n− 1

n∑i=1

(xi − x̄)2 =1

n− 1

( n∑i=1

x2i − nx̄2

)

The left-hand side of this equation provides a much better computational

procedure for finding the sample variance than the right-hand side.

If the mean of xi is far from 0, then∑

i x2i and nx̄2 will be large and nearly equal

to each other. The relative error which results from applying the right-hand side

formula can be very large.

There can, of course, be loss of accuracy using the formula on the left, but it is not

nearly as severe.

Page 17: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

17

R> sdval <- 5

R> x <- rnorm(10, mean=1000000, sd=sdval)

R> x.bar <- mean(x)

R> n <- length(x)

R> lhs <- sum((x-x.bar)^2)/(n-1)

R> rhs <- (sum(x^2)-n*x.bar^2)/(n-1)

R> print(lhs, digits=10)

[1] 29.51578653

R> print(rhs, digits=10)

[1] 29.51584201

R> print(rhs-lhs, digits=10)

[1] 5.548186877e-05

Page 18: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

18

6 Example: Solving a quadratic equation

Consider solving

ax2 + bx + c = 0

Letting D = b2 − 4ac, the roots are

r1 =−b +

√D

2a, r2 =

−b−√D

2a

If b2 � ac then√D =

√b2 − 4ac ≈ |b|.

If b > 0 then −b +√D involves a near cancellation (same for −b−

√D if b < 0).

Rewrite the problem: Multiply numerator and dominator of r1 by −b +√D (and

of r2 by −b +√D) by to obtain

r1 =2c

−b−√D

, r2 =2c

−b +√D

Page 19: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

19

7 Example: Computing the exponential

The exponential function is defined by the power series

exp(x) =n∑

n=0

xn

n!

Letting tn = xn

n!we have tn+1 = tn

xn+1

so these terms must eventually become

small.

One strategy for summing the series is to accumulate terms of the series until the

terms become so small that they do not change the sum.

Page 20: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

20

R> expf <- function(x){

n <- 0

term <- 1

ans <- 1

while(abs(term)> .Machine$double.eps) {

n = n + 1

term = term * x / n

ans <- ans + term

}

ans

}

Page 21: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

21

Compare expf() with R’s built in function. For positive values, the results are

good:

R> (expf(1) - exp(1))/exp(1)

[1] 1.633713e-16

R> (expf(20) - exp(20))/exp(20)

[1] -1.228543e-16

For negative values less so:

R> (expf(-1) - exp(-1))/exp(-1)

[1] 3.017899e-16

R> (expf(-20) - exp(-20))/exp(-20)

[1] 1.727543

Why?

Page 22: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

22

When x < 0 the terms in

exp(x) =n∑

n=0

xn

n!

alternate in sign.

For large negative x values, the value returned by the function is small while at

least some of the terms are large.

Hence, at some point, there has to be near-cancellation of the accumulated sum

and the next term of the series. This is the source of the large relative error.

Notice: When the argument to expf() is positive, all the terms of the series are

positive and there is no cancellation.

Page 23: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

23

There is an easy remedy:

Since exp(−x) = 1/ exp(x) it is for negative x better to compute the result as

exp(x) = 1/ exp(−x)

R> expf <- function(x){

if (x<0) {

1/expf(-x)

} else {

n <- 0; term <- 1; ans <- 1

while(abs(term)> .Machine$double.eps) {

n = n + 1

term = term * x / n

ans <- ans + term

}

ans

}

}

R> (expf(-1) - exp(-1))/exp(-1)

[1] -1.50895e-16

R> (expf(-20) - exp(-20))/exp(-20)

[1] 2.006596e-16

Page 24: New S˝ren H˝jsgaard Department of Mathematical Sciences Aalborg …people.math.aau.dk/.../2012-ASC/day1/numComp-slides.pdf · 2012. 10. 8. · Numerical computations { with a view

24

8 Ressources

What Every Computer Scientist Should Know About Floating-Point Arithmetic:

http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html