
CSE 246: Computer Arithmetic Algorithms and Hardware Design

Instructor: Prof. Chung-Kuan Cheng

Winter 2004

Lecture 9


Topics:

Floating Point Numbers (IEEE P754)

Standard

Operations

Exceptional Situations

Rounding Modes


Standard

2^32 values, typically (32-bit word)

Goal: large dynamic range

Dynamic range = largest # / smallest #

If the range is too large, there are holes between representable #'s


Standard

ulp (unit in the last place): difference between two consecutive values of the significand.

3 parts: x = ±s × b^e

Sign bit (±)

8-bit exponent (e)

Significand (s)


Standard

Single-precision bit layout: ± a1a2a3a4a5a6a7a8 b1b2b3…b22b23

1.* normalized number, 0.* denormalized number

Exponent field a1…a8    Value
0                       ±0.b1b2b3…b22b23 × 2^-126
1                       ±1.b1b2b3…b22b23 × 2^-126
2 … 253                 ±1.b1b2b3…b22b23 × 2^(exponent - 127)
254                     ±1.b1b2b3…b22b23 × 2^127
255                     ±∞ if bi = 0 for all i = 1,2,…,23; NaN otherwise

NaN: Not a Number
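The decoding rules for the exponent field can be sketched in Python. This is a minimal illustration, not part of the lecture: `decode_single` and `hardware_single` are hypothetical helper names, and the second one uses the standard `struct` module only to cross-check against the machine's own single-precision decoding.

```python
import struct

def decode_single(bits: int) -> float:
    """Decode a 32-bit pattern by hand (hypothetical helper)."""
    sign = -1.0 if (bits >> 31) & 1 else 1.0
    exponent = (bits >> 23) & 0xFF       # a1..a8
    fraction = bits & 0x7FFFFF           # b1..b23
    if exponent == 255:
        # all-ones exponent: infinity if the fraction is zero, NaN otherwise
        return sign * float("inf") if fraction == 0 else float("nan")
    if exponent == 0:
        # denormalized: 0.b1...b23 x 2^-126
        return sign * (fraction / 2**23) * 2.0**-126
    # normalized: 1.b1...b23 x 2^(exponent - 127)
    return sign * (1 + fraction / 2**23) * 2.0**(exponent - 127)

def hardware_single(bits: int) -> float:
    """Reinterpret the same 32 bits as an IEEE single via struct."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]
```

For example, `decode_single(0x3F800000)` gives 1.0 (exponent field 127, fraction 0), and both helpers agree on any finite pattern.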


Standard

0.01 × 2^-3 = 0.001 × 2^-2

Same number, so normalize to remove redundancy

Smallest number: 0.00…01 × 2^-126 = 1.0 × 2^-23 × 2^-126

= 1 × 2^-149

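1.1101111001110011100101

For normalized #'s, the difference between 2 consecutive #'s is small compared to their magnitudes.

0.0001

0.0010

For these unnormalized values, consecutive #'s differ by a factor of 2, which is large compared to their magnitudes.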
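The spacing claim can be checked numerically. The sketch below is an illustration under assumptions: `bits_to_single` and `relative_gap` are hypothetical helpers, and adjacent bit patterns are used as a stand-in for "consecutive numbers".

```python
import struct

def bits_to_single(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE single (hypothetical helper)."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

def relative_gap(bits: int) -> float:
    """Gap to the next representable number, divided by the number itself."""
    x = bits_to_single(bits)
    nxt = bits_to_single(bits + 1)
    return (nxt - x) / x

# Normalized 1.0: the next number is only 2^-23 larger in relative terms.
gap_normalized = relative_gap(0x3F800000)

# Smallest denormalized number 2^-149: the next number is 2 times larger.
gap_denormalized = relative_gap(0x00000001)
```

Here `gap_normalized` is 2^-23, while `gap_denormalized` is 1.0, meaning the gap is as large as the number itself.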


Standard - Examples

s.eeeeeeee nnnnnnnnnnnnnnnnnnnnnnn   (sign . exponent fraction)

0.00000000 00000000000000000000000 = 0.000…0 × 2^-126

1.00000000 00000000000000000000000 = -0

0.00000001 00000000000000000000000 = 1.000…0 × 2^-126 (minimal normalized #)

0.00000001 00000000000000000000001 = 1.000…1 × 2^-126

.

.

.

0.01111111 00000000000000000000001 = 1.000…1 × 2^0

0.10000000 00000000000000000000001 = 1.000…1 × 2^1


Standard - Examples (cont.)

0.11111110 00000000000000000000001 = 1.000…1 × 2^127

0.11111110 11111111111111111111111 = 1.111…1 × 2^127 (normalized maximum)

0.11111111 00000000000000000000000 = +∞

Nmin = 1.0 × 2^-126

Nmax = (2 - 2^-23) × 2^127
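As a quick check of Nmin and Nmax, the two extreme normalized bit patterns can be reinterpreted as singles. This is a sketch only: `bits_to_single` is a hypothetical helper built on the standard `struct` module, and the dynamic range is computed in double precision.

```python
import struct

def bits_to_single(bits: int) -> float:
    """Reinterpret a 32-bit pattern as an IEEE single (hypothetical helper)."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

# exponent field 00000001, fraction all zeros: minimal normalized number
n_min = bits_to_single(0x00800000)

# exponent field 11111110, fraction all ones: normalized maximum
n_max = bits_to_single(0x7F7FFFFF)

# dynamic range = largest # / smallest # (normalized numbers only)
dynamic_range = n_max / n_min
```

Both values match the closed forms: `n_min` equals 2^-126 and `n_max` equals (2 - 2^-23) × 2^127.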


Double Floating Point

Bit layout: ± a1a2…a11 b1b2…b52

Exponent field a1…a11    Value
000…00                   ±0.b1b2…b52 × 2^-1022
000…01                   ±1.b1b2…b52 × 2^-1022
.
.
.
011…11                   ±1.b1b2…b52 × 2^0
100…00                   ±1.b1b2…b52 × 2^1
.
.
.
111…10                   ±1.b1b2…b52 × 2^1023
111…11                   ±∞ if bi = 0 for all i = 1,2,…,52; NaN otherwise
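Python's `float` is an IEEE double on common platforms, so the double-precision limits can be checked directly. The sketch below assumes such a platform; `bits_to_double` is a hypothetical helper, and `sys.float_info` is the standard library's description of the native float type.

```python
import struct
import sys

def bits_to_double(bits: int) -> float:
    """Reinterpret a 64-bit pattern as an IEEE double (hypothetical helper)."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]

# exponent field 000...01, fraction zero: minimal normalized double
d_min = bits_to_double(0x0010000000000000)

# exponent field 111...10, fraction all ones: maximal normalized double
d_max = bits_to_double(0x7FEFFFFFFFFFFFFF)

# exponent field 000...00, fraction 000...01: smallest denormalized double
d_tiny = bits_to_double(0x0000000000000001)

# exponent field 111...11, fraction zero: +infinity
d_inf = bits_to_double(0x7FF0000000000000)
```

On such a platform `d_min` and `d_max` coincide with `sys.float_info.min` and `sys.float_info.max`, and `d_tiny` is 2^-52 × 2^-1022 = 2^-1074.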


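Overflow/Underflow

[Figure: number line from Nmin to Nmax. Representable numbers are denser near Nmin and sparser near Nmax. Overflow: magnitude beyond Nmax. Underflow: magnitude below Nmin.]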


Addition/Multiplication

(±s1 × b^e1) + (±s2 × b^e2) = ±s × b^e

= ±s1 × b^e1 ± (s2 / b^(e1-e2)) × b^e1

= (±s1 ± s2 / b^(e1-e2)) × b^e1

(±s1 × b^e1) × (±s2 × b^e2) = ±(s1 × s2) × b^(e1+e2)
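The two identities above can be sketched with exact rational arithmetic. This is an illustration under assumptions: `fp_add` and `fp_mul` are hypothetical names, signs are carried inside the significands, and no normalization or rounding is performed.

```python
from fractions import Fraction

def fp_add(s1, e1, s2, e2, b=2):
    """(s1 x b^e1) + (s2 x b^e2): align the smaller exponent to the larger one."""
    if e1 < e2:
        # swap so that e1 is the larger exponent
        s1, e1, s2, e2 = s2, e2, s1, e1
    # s = s1 + s2 / b^(e1 - e2); result is s x b^e1 (not yet normalized)
    s = s1 + s2 / Fraction(b) ** (e1 - e2)
    return s, e1

def fp_mul(s1, e1, s2, e2):
    """(s1 x b^e1) x (s2 x b^e2) = (s1 x s2) x b^(e1 + e2)."""
    return s1 * s2, e1 + e2

# 1.101 x 2^3 (= 13) plus 1.111 x 2^2 (= 7.5)
s, e = fp_add(Fraction(13, 8), 3, Fraction(15, 8), 2)
total = s * 2**e
```

With these inputs `s` is 41/16 and `e` is 3, so `total` is 20.5. A real adder would follow this step with normalization and rounding.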


Exceptions

a/0 = +∞ if a > 0

a/∞ = 0 if a is finite

a·0 = 0 if a is finite

a·∞ = ∞ if a > 0

0·∞ = invalid operation (NaN)

0/0 = invalid operation (NaN)

NaN op a = NaN

a + ∞ = ∞ if a is finite

∞ - ∞ = NaN
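Several of these rules can be observed with Python floats. This is a sketch with one caveat: Python raises `ZeroDivisionError` for float division by zero instead of returning ∞ or NaN, so the a/0 and 0/0 rows are left out. `special_value_results` is a hypothetical helper name.

```python
import math

def special_value_results() -> dict:
    """Evaluate a few operations on infinity and NaN with a = 3.0."""
    inf = math.inf
    a = 3.0
    return {
        "a / inf": a / inf,          # 0
        "a * inf": a * inf,          # inf, since a > 0
        "a + inf": a + inf,          # inf
        "0 * inf": 0.0 * inf,        # invalid operation: NaN
        "inf - inf": inf - inf,      # invalid operation: NaN
        "NaN op a": math.nan + a,    # NaN propagates
    }

results = special_value_results()
```

NaN compares unequal to everything, including itself, so `math.isnan` is the reliable way to test the last three entries.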


Rounding Mode

Adder output = Cout z1 z0 . z-1 z-2 … z-l G R S

G: guard bit

R: round bit

S: sticky bit, OR of all bits below bit R

  1.101 × 2^3

+ 1.110 × 2^3

 11.011 × 2^3

1.1011 × 2^4   Normalize: need to round or ...
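The G, R, and S definitions can be sketched as a small function over the bits that fall to the right of the last kept position z-l. This is an illustration only: `grs_bits` is a hypothetical name, and the input is a bit string written most significant bit first.

```python
def grs_bits(extra_bits: str) -> tuple:
    """Return (guard, round, sticky) for the bits below the last kept position."""
    padded = extra_bits + "00"            # make sure G and R positions exist
    guard = int(padded[0])                # first bit below z-l
    round_bit = int(padded[1])            # second bit below z-l
    sticky = int("1" in padded[2:])       # OR of all bits below bit R
    return guard, round_bit, sticky

# 1.1011 x 2^4 with 3 fraction bits kept: one extra bit, "1"
example = grs_bits("1")
```

In the addition example the single leftover bit becomes the guard bit, so `example` is `(1, 0, 0)`.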


Rounding

  1.110 × 2^3

- 1.101 × 2^3

  0.001 × 2^3

  1.000 × 2^0   (normalize)

  1.101 × 2^3

- 1.111 × 2^2

  1.101  × 2^3

- 0.1111 × 2^3   (the last bit of the aligned operand is the guard bit)

  0.1011 × 2^3

  1.011 × 2^2
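The role of the guard bit in the second subtraction (13 - 7.5 = 5.5) can be sketched with integer significands. This is a hypothetical model, not the lecture's datapath: `aligned_subtract` keeps a chosen number of extra low-order bits when the smaller operand is shifted right, and simply drops anything shifted past them.

```python
def aligned_subtract(sig_a: int, sig_b: int, shift: int, guard_bits: int) -> float:
    """Subtract 3-fraction-bit significands after shifting sig_b right by `shift`.

    sig_a and sig_b are integers in units of 2^-3 (0b1101 means 1.101).
    Returns the significand difference as a float.
    """
    a = sig_a << guard_bits
    b = (sig_b << guard_bits) >> shift    # bits shifted past the guard positions are lost
    return (a - b) / (1 << (3 + guard_bits))

# 1.101 x 2^3 - 1.111 x 2^2: the second operand is shifted right by 1
with_guard = aligned_subtract(0b1101, 0b1111, shift=1, guard_bits=1) * 2**3
without_guard = aligned_subtract(0b1101, 0b1111, shift=1, guard_bits=0) * 2**3
```

With one guard bit the result is the exact 5.5; without it the shifted-out bit is lost and the result comes out as 6.0.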


Rounding

Round to the nearest even

Toward 0: 1.1011

Toward +∞: 1.1100

Toward -∞: 1.1011
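The four modes can be sketched with exact fractions. The slide does not show the value being rounded, so the input 1.10111 used below is an assumption chosen to be consistent with the three directed results listed; `round_to_bits` and the mode names are hypothetical.

```python
from fractions import Fraction
import math

def round_to_bits(x: Fraction, frac_bits: int, mode: str) -> Fraction:
    """Round x to `frac_bits` binary fraction bits under the given mode."""
    scaled = x * 2**frac_bits
    lo, hi = math.floor(scaled), math.ceil(scaled)
    if mode == "toward_zero":
        n = lo if scaled >= 0 else hi
    elif mode == "toward_plus_inf":
        n = hi
    elif mode == "toward_minus_inf":
        n = lo
    elif mode == "nearest_even":
        diff = scaled - lo
        if diff < Fraction(1, 2):
            n = lo
        elif diff > Fraction(1, 2):
            n = hi
        else:
            n = lo if lo % 2 == 0 else hi   # tie: pick the even neighbor
    else:
        raise ValueError(mode)
    return Fraction(n, 2**frac_bits)

x = Fraction(0b110111, 2**5)                # assumed input 1.10111
rounded = {m: round_to_bits(x, 4, m) for m in
           ("toward_zero", "toward_plus_inf", "toward_minus_inf", "nearest_even")}
```

For this assumed input the directed modes give 1.1011, 1.1100, and 1.1011, and the tie in round-to-nearest-even resolves to 1.1100 because its last bit is 0.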


Conventional Rounding Error

Value      Rounded    Rounding error (ulp)
1.10100 -> 1.101       0
1.10101 -> 1.101      -0.25
1.10110 -> 1.110      +0.5
1.10111 -> 1.110      +0.25

Average error = 0.5/4 = 0.125
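The average can be reproduced by enumerating the four cases. This sketch assumes that "conventional rounding" means round half up and measures the error in ulps of the 3-bit result; `round_half_up` and `conventional_errors` are hypothetical names.

```python
from fractions import Fraction
import math

FRAC_BITS = 3    # bits kept after the binary point
EXTRA_BITS = 2   # bits discarded by rounding

def round_half_up(x: Fraction) -> Fraction:
    """Round to FRAC_BITS fraction bits, ties going up."""
    scaled = x * 2**FRAC_BITS
    return Fraction(math.floor(scaled + Fraction(1, 2)), 2**FRAC_BITS)

def conventional_errors() -> list:
    """Errors (rounded - exact), in ulps, for 1.101 followed by 00, 01, 10, 11."""
    ulp = Fraction(1, 2**FRAC_BITS)
    errors = []
    for tail in range(2**EXTRA_BITS):
        numerator = (0b1101 << EXTRA_BITS) | tail
        x = Fraction(numerator, 2**(FRAC_BITS + EXTRA_BITS))
        errors.append((round_half_up(x) - x) / ulp)
    return errors

errors = conventional_errors()
average_error = sum(errors) / len(errors)
```

The errors come out as 0, -1/4, +1/2, +1/4, so `average_error` is 1/8. The nonzero average is the bias caused by always rounding the tie upward.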
