
Post on 20-Dec-2015


Real Number Representation

(Lecture 25 of the Introduction to Computer Programming series)

Dr Damian Conway
Room 132
Building 26

Some Terminology

• All digits in a number following any leading zeros are significant digits:

12.345 -0.12345 0.00012345

Some Terminology

• The scientific notation for real numbers is:

mantissa × base^exponent

Some Terminology

• The mantissa is always normalized between 1 and the base (i.e. exactly one significant figure before the point):

Normalized                Unnormalized
2.9979 × 10^8             2997.9 × 10^5
B.139FC × 16^12           B1.39FC × 16^11
1.0110110101 × 2^-3       0.010110110101 × 2^-1

Some Terminology

• The precision of a number is how many digits (or bits) we use to represent it.

• For example:

3
3.14
3.1415926
3.1415926535897932384626433832795028

Representing numbers

• A real number n is represented by a floating-point approximation n*

• The computer uses 32 bits (or more) to store each approximation.

• It needs to store the mantissa, the sign of the mantissa, and the exponent (with its sign).

Representing numbers

• So it has to allocate some of its 32 bits to each task.

• The standard way to do this (specified by IEEE standard 754) is:

bit 31 | bits 30 ... 23 | bits 22 ... 0

Representing numbers

• 23 bits for the mantissa;

• 1 bit for the mantissa's sign (i.e. the mantissa is signed magnitude);

• The remaining 8 bits for the exponent.

bit 31 | bits 30 ... 23 | bits 22 ... 0
 sign  |    exponent    |   mantissa


Representing the mantissa

• Since the mantissa has to be in the range 1 ≤ mantissa < base, if we use base 2 the digit before the binary point has to be a 1.

• So we don't have to worry about storing it!

• That way we get 24 bits of precision using only 23 bits.

Representing the mantissa

• Those 24 bits of precision are equivalent to a little over 7 decimal digits:

24 / log2(10) ≈ 7.2

Representing the mantissa

• Suppose we want to represent π:

3.1415926535897932384626433832795.....

• That means that we can only represent it as:

3.141592 (if we truncate)
3.141593 (if we round)

Representing the mantissa

• Even if the computer appears to give you more than seven decimal places, only the first seven are meaningful.

• For example:

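#include <math.h>
#include <stdio.h>

int main(void)
{
    /* asin(1) is pi/2; storing the result in a float rounds it to 24 bits */
    float pi = 2 * asin(1);
    printf("%.35f\n", pi);
    return 0;
}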

Representing the mantissa

• On my machine this prints out:

3.1415927419125732000000000000000000


Representing the exponent

• The exponent is represented as an excess-127 number.

• That is:

00000000   –127
00000001   –126
   ...
01111111      0
10000000     +1
   ...
11111111   +128

Representing the exponent

• However, the IEEE standard restricts exponents to the range:

–126 ≤ exponent ≤ +127

• The exponents –127 and +128 have special meanings (basically, zero and infinity respectively)

Floating point overflow

• Just like the integer representations in the previous lecture, floating point representations can overflow:

  9.999999 × 10^127
+ 1.111111 × 10^127
--------------------
  1.1111110 × 10^128

• The exponent +128 is above the allowed range, so the result overflows and cannot be stored.

               

Floating point underflow

• But floating point numbers can also get too small:

  1.000000 × 10^-126
÷ 2.000000 × 10^0
--------------------
  5.000000 × 10^-127

• The exponent –127 is below the allowed range, so the result underflows and is stored as 0.

Floating point addition

• Five steps to add two floating point numbers:
– Express them with the same exponent (denormalize)
– Add the mantissas
– Adjust the mantissa to one digit/bit before the point (renormalize)
– Round or truncate to required precision
– Check for overflow/underflow

Floating point addition example

x = 9.876 × 10^7
y = 1.357 × 10^6

1. Same exponents:
x = 9.876 × 10^7
y = 0.1357 × 10^7

2. Add mantissas:
x+y = 10.0117 × 10^7

3. Renormalize sum:
x+y = 1.00117 × 10^8

4. Truncate or round:
x+y = 1.001 × 10^8

5. Check overflow and underflow:
x+y = 1.001 × 10^8

Floating point addition example 2

x = 3.506 × 10^-5
y = -3.497 × 10^-5

1. Same exponents:
x = 3.506 × 10^-5
y = -3.497 × 10^-5

2. Add mantissas:
x+y = 0.009 × 10^-5

3. Renormalize sum:
x+y = 9.000 × 10^-8

4. Truncate or round:
x+y = 9.000 × 10^-8 (no change)

5. Check overflow and underflow:
x+y = 9.000 × 10^-8

Question: should we believe these zeroes?

Floating point multiplication

• Five steps to multiply two floating point numbers:
– Multiply mantissas
– Add exponents
– Renormalize mantissa
– Round or truncate to required precision
– Check for overflow/underflow

Floating point multiplication example

x = 9.001 × 10^5
y = 8.001 × 10^-3

1 & 2. Multiply mantissas/add exponents:
x × y = 72.017001 × 10^2

3. Renormalize product:
x × y = 7.2017001 × 10^3

4. Truncate or round:
x × y = 7.201 × 10^3 (truncated)
x × y = 7.202 × 10^3 (rounded)

5. Check overflow and underflow:
x × y = 7.202 × 10^3

Limitations

• Floating-point representations only approximate real numbers.

• The normal laws of arithmetic don't always hold (even less often than for integer representations).

• For example, associativity is not guaranteed:

Limitations

x = 3.002 × 10^3
y = -3.000 × 10^3
z = 6.531 × 10^0

x+y = 2.000 × 10^0
(x+y)+z = 8.531 × 10^0

y+z = -2.993 × 10^3
x+(y+z) = 0.009 × 10^3 = 9.000 × 10^0

So (x+y)+z = 8.531 × 10^0, but x+(y+z) = 9.000 × 10^0.

Limitations

• Consider the other laws of arithmetic:
– Commutativity (additive and multiplicative)
– Associativity
– Distributivity
– Identity (additive and multiplicative)

• Spend some time working out which ones (if any!) always hold for floating-point numbers.

Reading (for the very keen)

Goldberg, D., What Every Computer Scientist Should Know About Floating-Point Arithmetic, ACM Computing Surveys, Vol.23, No.1, March 1991.