lecture11 assembly language

COMPUTER ORGANIZATION AND ASSEMBLY LANGUAGE

Lecture 11

Floating Point (Real) Numbers

We need a way to represent

numbers with fractions, e.g., 3.1416

very small numbers, e.g., .000000001

very large numbers, e.g., $3.15576 \times 10^{9}$

Representation:

sign, exponent, significand: $(-1)^{\text{sign}} \times \text{significand} \times 2^{\text{exponent}}$

more bits for significand gives more accuracy

more bits for exponent increases range

IEEE 754 floating point standard:

single precision: 8 bit exponent, 23 bit significand

IEEE 754 floating-point standard

IEEE Floating Point: $(-1)^{S} \times 1.M \times 2^{E-127}$

Mantissa (sign and magnitude -- S, M)

Exponent (excess 127 binary number -- E)

Field layout: S (1 bit) | E (8 bits) | M (23 bits)

• Example:

– decimal: $-0.75 = -(\tfrac{1}{2} + \tfrac{1}{4})$

– binary: $-0.11_2 = -1.1_2 \times 2^{-1}$

– floating point: exponent field $= -1 + 127 = 126 = 01111110_2$

– IEEE single precision:

1 01111110 10000000000000000000000
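
The field split in this example can be checked mechanically. The following is a minimal sketch, assuming Python's standard `struct` module is used to obtain the 32-bit pattern; the helper name `float_to_fields` is made up for illustration.

```python
import struct

def float_to_fields(x):
    """Split x (rounded to IEEE 754 single precision) into S, E, M bit strings."""
    # '>f' packs x as a big-endian 32-bit float; '>I' re-reads those bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    pattern = f"{bits:032b}"
    return pattern[0], pattern[1:9], pattern[9:]   # 1 + 8 + 23 bits

print(float_to_fields(-0.75))
# ('1', '01111110', '10000000000000000000000')
```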

Remember

For an FP number <S E M>

Value = $(-1)^{S} \times 1.M \times 2^{E-127}$

The first bit is the sign bit

Exponent = E - 127

The mantissa is the part after the binary point

In 1.1100111 the mantissa is 1100111

IEEE FP Numbers

Example: 0 1000 0011 1101 0000 0000 0000 0000 000

S = 0 (positive mantissa), E = 131 so the exponent is 131 - 127 = 4, and M = .1101

the number = $1.1101_2 \times 2^{4} = 11101_2 = 29$

Example: 1 0111 1011 1101 0000 0000 0000 0000 000

S = 1 (negative mantissa), E = 123 so the exponent is 123 - 127 = -4, and M = .1101

the number = $-1.1101_2 \times 2^{-4} = -0.00011101_2 = -(2^{-4} + 2^{-5} + 2^{-6} + 2^{-8})$

= -0.11328125
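
Both decoding examples follow the same rule, which can be sketched as below. The function name `fields_to_value` is hypothetical, and the sketch only covers normalized numbers (E between 1 and 254).

```python
def fields_to_value(s, e, m):
    """Value of a normalized single: (-1)^S * 1.M * 2^(E-127)."""
    sign = -1 if s == "1" else 1
    exponent = int(e, 2) - 127                   # remove the excess-127 bias
    significand = 1 + int(m, 2) / 2 ** len(m)    # restore the hidden leading 1
    return sign * significand * 2.0 ** exponent

print(fields_to_value("0", "10000011", "11010000000000000000000"))  # 29.0
print(fields_to_value("1", "01111011", "11010000000000000000000"))  # -0.11328125
```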

Example: -10.609375

$10.609375 = 1010.100111_2 = 1.010100111_2 \times 2^{3}$

S = 1, E = 3 + 127 = 130 = 1000 0010,

and M = 010100111 ...

FP: 1 1000 0010 0101 0011 1000 0000 0000 000

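Converting the fraction .609375 to binary by repeated doubling (the integer part of each product is the next bit):

.609375 x 2 = 1.21875    bit 1
.21875  x 2 = 0.4375     bit 0
.4375   x 2 = 0.875      bit 0
.875    x 2 = 1.75       bit 1
.75     x 2 = 1.5        bit 1
.5      x 2 = 1.0        bit 1

Reading the bits from top to bottom: $.609375 = .100111_2$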
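
The repeated-doubling procedure used for the fraction .609375 can be written as a short loop. This is an illustrative sketch; `fraction_to_binary` and its `max_bits` cut-off are names chosen here, and exact rational arithmetic is used so that no rounding creeps into the demonstration.

```python
from fractions import Fraction

def fraction_to_binary(frac, max_bits=23):
    """Binary digits of 0 <= frac < 1, found by repeated doubling."""
    frac = Fraction(frac)
    bits = []
    while frac and len(bits) < max_bits:
        frac *= 2
        bit = int(frac)       # the integer part is the next bit
        bits.append(str(bit))
        frac -= bit           # keep only the fractional part
    return "".join(bits)

print(fraction_to_binary("0.609375"))  # 100111
```

Fractions such as 1/10 never reach zero, which is why the loop stops after `max_bits` digits.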

Example: IEEE to decimal conversion

1 0111 1100 1100 0000 0000 0000 0000 000

First convert each individual field to decimal.

The sign bit S is 1

The E field contains 01111100 = 124, so the exponent is 124 - 127 = -3

The mantissa (M) is $0.11000\ldots_2 = 0.75$, so the significand is 1.75

Thus the value is

$-1.75 \times 2^{-3} = -0.21875$

Converting a decimal number to IEEE 754

What is the single-precision representation of 347.625?

First convert the number to binary: $347.625 = 101011011.101_2$.

Normalize the number by shifting the binary point until there is a single 1 to the left:

$101011011.101_2 \times 2^{0} = 1.01011011101_2 \times 2^{8}$

The bits to the right of the binary point, $01011011101_2$, comprise the fractional field f.

The number of times you shifted gives the exponent. In this case, the field e should contain $8 + 127 = 135 = 10000111_2$.

The number is positive, so the sign bit is 0.

The final result is:

0 1000 0111 0101 1011 1010 0000 0000 000
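
The conversion steps (normalize, add the bias, keep the fraction bits) can be sketched as follows. `encode_single` is a hypothetical helper; it assumes a nonzero input that is exactly representable as a normalized single, and it does no rounding and handles no special values.

```python
import math

def encode_single(x):
    """Encode x as 'S EEEEEEEE MMM...' following the manual conversion steps."""
    sign = "1" if x < 0 else "0"
    # frexp gives |x| = m * 2**e with 0.5 <= m < 1; rescale to 1.f * 2**(e-1).
    m, e = math.frexp(abs(x))
    significand, exponent = m * 2, e - 1
    e_field = exponent + 127                       # add the bias
    f_field = int((significand - 1) * 2 ** 23)     # 23 fraction bits
    return f"{sign} {e_field:08b} {f_field:023b}"

print(encode_single(347.625))
# 0 10000111 01011011101000000000000
```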

FP Addition

Compute $9.999 \times 10^{1} + 1.610 \times 10^{-1}$, keeping only 3 digits after the decimal point

1. Make the exponents the same

• $9.999 \times 10^{1} + 0.01610 \times 10^{1}$

2. Add the significands

• $10.015 \times 10^{1}$

3. Normalize

• $1.0015 \times 10^{2}$

4. Round off

• $1.002 \times 10^{2}$
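
The same decimal addition can be reproduced with Python's `decimal` module, which performs the align, add, normalize and round steps internally. This is only a sketch: the 4-digit context and the choice of `ROUND_HALF_UP` are assumptions made to mirror the slide.

```python
from decimal import Decimal, Context, ROUND_HALF_UP

# A context with a 4-digit significand (1 digit before the point, 3 after).
ctx = Context(prec=4, rounding=ROUND_HALF_UP)

total = ctx.add(Decimal("9.999E1"), Decimal("1.610E-1"))
print(total)                         # 100.2
print(total == Decimal("1.002E2"))   # True
```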

FP Addition

Example

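$1.1101_2 \times 2^{3} + 1.001_2 \times 2^{-1}$

$= 1.1101_2 \times 2^{3} + 0.0001001_2 \times 2^{3} = 1.1110001_2 \times 2^{3}$

Steps for X(Xm, Xe) + Y(Ym, Ye):

Shift the smaller operand to align the binary points

compute Ye - Xe; if Ye > Xe, right-shift Xm by Ye - Xe places to get Xm’ (otherwise shift Ym instead)

Add the magnitudes (Xm’ + Ym)

Normalize and round off if necessary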
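
The align, add, normalize sequence can be sketched with exact rational significands. `fp_add` is a name invented for this illustration, and rounding to a fixed number of bits is deliberately left out so that the intermediate values match the hand calculation.

```python
from fractions import Fraction

def fp_add(xm, xe, ym, ye):
    """Return (m, e) with m * 2**e == xm * 2**xe + ym * 2**ye and 1 <= |m| < 2."""
    # Step 1: align binary points by right-shifting the operand with the smaller exponent.
    if xe < ye:
        xm, xe = xm / 2 ** (ye - xe), ye
    else:
        ym, ye = ym / 2 ** (xe - ye), xe
    # Step 2: add the aligned significands.
    m, e = xm + ym, xe
    # Step 3: renormalize (no rounding in this sketch).
    while abs(m) >= 2:
        m, e = m / 2, e + 1
    while 0 < abs(m) < 1:
        m, e = m * 2, e - 1
    return m, e

# 1.1101 x 2^3 + 1.001 x 2^-1, with significands written as exact fractions
print(fp_add(Fraction(0b11101, 16), 3, Fraction(0b1001, 8), -1))
# (Fraction(241, 128), 3), i.e. 1.1110001 x 2^3
```

Signed significands are accepted, so the same routine also covers subtraction.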

FP Subtraction

$0.5_{10} - 0.4375_{10}$

Step 1: Convert into FP representation

$0.5_{10} = 1.000_2 \times 2^{-1}$

$-0.4375_{10} = -1.110_2 \times 2^{-2}$

Step 2: Shift the significand of the smaller number

$-1.110_2 \times 2^{-2} = -0.111_2 \times 2^{-1}$

Step 3: Subtract the significands

$1.000_2 \times 2^{-1} - 0.111_2 \times 2^{-1} = 0.001_2 \times 2^{-1}$

Step 4: Normalize the result, check for over/underflow

$0.001_2 \times 2^{-1} = 1.000_2 \times 2^{-4}$

Step 5: Round off

$1.000_2 \times 2^{-4}$

Step 6: Result

$1.000_2 \times 2^{-4} = 0.0625$

FP Multiply

Compute $1.110 \times 10^{10} \times 9.200 \times 10^{-5}$, assuming 4-digit significands

1. Add the exponents

• Exponent = 10 + (-5) = 5

2. Multiply the significands

• $1.110 \times 9.200 = 10.212000$

3. Normalize

• $1.0212000 \times 10^{6}$

4. Round off

• $1.021 \times 10^{6}$
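
The four multiplication steps can be traced one at a time. The sketch below uses a hypothetical helper `dec_multiply` and assumes round-half-up; it does not renormalize again if rounding carries the significand up to 10.

```python
from decimal import Decimal, ROUND_HALF_UP

def dec_multiply(m1, e1, m2, e2, digits=4):
    """Multiply (m1 x 10^e1) by (m2 x 10^e2), keeping a `digits`-digit significand."""
    e = e1 + e2                              # 1. add the exponents
    m = Decimal(m1) * Decimal(m2)            # 2. multiply the significands
    while m >= 10:                           # 3. normalize to one digit before the point
        m, e = m / 10, e + 1
    quantum = Decimal(10) ** (1 - digits)    # 0.001 for a 4-digit significand
    m = m.quantize(quantum, rounding=ROUND_HALF_UP)   # 4. round off
    return m, e

print(dec_multiply("1.110", 10, "9.200", -5))
# (Decimal('1.021'), 6)
```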

FP Multiply

Assume 4-bit mantissa

compute 0 0111 1110 0101 x 0 0111 1101 1110

add the exponents and adjust the bias: 0111 1110 + 0111 1101 - 0111 1111 = 0111 1100

multiply 1.0101 and 1.1110: the product is 10.0111 0110

normalize and detect underflow or overflow: exponent 0111 1101, mantissa 1.0011 1011 0

round the product; the result is 0 0111 1101 0100

Check in decimal:

$(1.0101_2 \times 2^{-1}) \times (1.1110_2 \times 2^{-2}) = 0.65625 \times 0.46875 = 0.3076171875$

$= 10.01110110_2 \times 2^{-3}$

$= 1.001110110_2 \times 2^{-2} = 0.3076171875$

After rounding to a 4-bit mantissa: 0 0111 1101 0100 $= 1.0100_2 \times 2^{-2} = 0.3125$
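
The binary multiply can be sketched with integer arithmetic on the fields of this toy format (8-bit biased exponent, 4-bit mantissa). `fp_multiply` is a made-up name; the sketch handles positive normalized operands only, rounds ties upward for simplicity, and omits the overflow and underflow checks.

```python
def fp_multiply(e1, m1, e2, m2, mbits=4):
    """Multiply two positive toy floats given as (biased exponent, mantissa field)."""
    bias = 127
    e = e1 + e2 - bias                   # add the exponents, remove one bias
    s1 = (1 << mbits) | m1               # restore the hidden 1: 1.m1 scaled by 2**mbits
    s2 = (1 << mbits) | m2
    p = s1 * s2                          # product, scaled by 2**(2*mbits)
    shift = mbits                        # bits to drop to return to mbits fraction bits
    if p >> (2 * mbits + 1):             # product >= 2: normalize
        e += 1
        shift += 1
    q = (p + (1 << (shift - 1))) >> shift    # round to nearest
    if q >> (mbits + 1):                 # rounding carried up to 10.000...
        q >>= 1
        e += 1
    return e, q & ((1 << mbits) - 1)     # drop the hidden 1 again

e, m = fp_multiply(0b01111110, 0b0101, 0b01111101, 0b1110)
print(f"{e:08b} {m:04b}")                # 01111101 0100
print((1 + m / 16) * 2.0 ** (e - 127))   # 0.3125
```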

IEEE 754 Floating Point Numbers

Special Representation

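Special values are used when the exponent field is 1111 1111 or 0000 0000 (below, e is the number of exponent bits, 8 for single precision)

Type                    Exponent field      Fraction field
Zeroes                  0                   0
Denormalized numbers    0                   nonzero
Normalized numbers      1 to $2^{e} - 2$    any
Infinities              $2^{e} - 1$         0
NaNs                    $2^{e} - 1$         nonzero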
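
The classification table translates directly into a test on the two fields. A minimal sketch, assuming single precision and using `struct` to obtain the bit pattern; `classify_single` is a hypothetical name.

```python
import struct

def classify_single(x):
    """Classify x (as an IEEE 754 single) by its exponent and fraction fields."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    e = (bits >> 23) & 0xFF      # 8-bit exponent field
    f = bits & 0x7FFFFF          # 23-bit fraction field
    if e == 0:
        return "zero" if f == 0 else "denormalized"
    if e == 0xFF:                # 2**8 - 1
        return "infinity" if f == 0 else "NaN"
    return "normalized"

for value in (0.0, -0.0, 1e-40, 1.0, float("inf"), float("nan")):
    print(value, classify_single(value))
```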

IEEE 754 FP Representation

There are two Zeroes

+0 (s is 0) and −0 (s is 1)

There are two Infinities

+∞ (s is 0) and −∞ (s is 1)

The numbers closest to zero are denormalized values:

all 0s in the Exp field and the binary value 1 in the Fraction field

$\pm 2^{-149} \approx \pm 1.4012985 \times 10^{-45}$

Normalized numbers closest to zero (binary value 1 in the Exp field and 0 in the Fraction field)

$\pm 2^{-126} \approx \pm 1.175494351 \times 10^{-38}$

Finite numbers furthest from zero (254 in the Exp field and all 1s in the Fraction field)

$\pm (1 - 2^{-24}) \times 2^{128} \approx \pm 3.4028235 \times 10^{38}$
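
These three extreme values can be confirmed by building the bit patterns directly. The sketch below assumes `struct` for the reinterpretation; `single_from_bits` is a name chosen for illustration.

```python
import struct

def single_from_bits(bits):
    """Interpret a 32-bit integer as an IEEE 754 single."""
    (x,) = struct.unpack(">f", struct.pack(">I", bits))
    return x

smallest_denormal = single_from_bits(0x00000001)   # Exp = 0,   Fraction = 1
smallest_normal = single_from_bits(0x00800000)     # Exp = 1,   Fraction = 0
largest_finite = single_from_bits(0x7F7FFFFF)      # Exp = 254, Fraction = all 1s

print(smallest_denormal == 2.0 ** -149)                    # True
print(smallest_normal == 2.0 ** -126)                      # True
print(largest_finite == (1 - 2.0 ** -24) * 2.0 ** 128)     # True
```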

Inaccuracy in FPs

Exponent   Minimum      Maximum           Precision
0          1            1.999999880791    1.19209289551e-7
1          2            3.99999976158     2.38418579102e-7
2          4            7.99999952316     4.76837158203e-7
10         1024         2047.99987793     1.220703125e-4
11         2048         4095.99975586     2.44140625e-4
23         8388608      16777215          1
24         16777216     33554430          2
127        1.7014e38    3.4028e38         2.02824096037e31
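
Each row of the table follows from the 23-bit fraction: within one exponent, consecutive single-precision values are $2^{\text{exponent}-23}$ apart. A short sketch that regenerates the rows (`single_range` is a hypothetical helper, and Python's double-precision floats are used only as a calculator here):

```python
def single_range(exponent):
    """(minimum, maximum, gap) of single-precision values with this unbiased exponent."""
    gap = 2.0 ** (exponent - 23)             # weight of the last fraction bit
    minimum = 2.0 ** exponent                # fraction field all 0s
    maximum = 2.0 ** (exponent + 1) - gap    # fraction field all 1s
    return minimum, maximum, gap

for exponent in (0, 1, 2, 10, 11, 23, 24, 127):
    print(exponent, *single_range(exponent))
```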

• Double precision arithmetic

• Double word – 64-bits

• 11 bit exponent, 52 bit mantissa

FP numbers

Limited precision: addition is not associative

$(-1.5 \times 10^{38} + 1.5 \times 10^{38}) + 1 = 0 + 1 = 1$

$-1.5 \times 10^{38} + (1.5 \times 10^{38} + 1) = -1.5 \times 10^{38} + 1.5 \times 10^{38} = 0$
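
The two groupings can be tried directly. Note the assumption: Python floats are double precision rather than single, but the gap between neighbouring values near $1.5 \times 10^{38}$ is still far larger than 1, so the same effect appears.

```python
big = 1.5e38

left = (-big + big) + 1.0     # the large terms cancel first, then 1 is added
right = -big + (big + 1.0)    # the 1 is absorbed by big before the cancellation

print(left, right)   # 1.0 0.0
```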

A view of FP representation (assume 4-bit precision)

There is a big gap between two large consecutive numbers:

gap = $2^{-23} \times 2^{e-127}$

Add denormal numbers (exponent = 0, mantissa > 0) to fill the gap between 0 and $2^{-126}$

[Figure: number line marking 0, $2^{-126}$, $2^{-125}$, $2^{-124}$, with the denormal numbers lying between 0 and $2^{-126}$]

The Patriot Missile Failure

On February 25, 1991, during the Gulf War, an American Patriot Missile battery in Dhahran, Saudi Arabia, failed to track and intercept an incoming Iraqi Scud missile. The Scud struck an American Army barracks, killing 28 soldiers and injuring around 100 other people.

http://www.ima.umn.edu/~arnold/disasters/patriot.html

The Patriot Missile Failure - I

Time was counted in tenths of a second

In 24-bit precision arithmetic

With truncation instead of rounding

1/10 = 0.0001100110011001100110011001100... (binary, non-terminating)

The Patriot stored

0.00011001100110011001100

Error

0.0000000000000000000000011001100... (binary)

≈ 0.000000095 decimal

The Patriot Missile Failure

The Patriot battery had been up for more than 100 hours

Accumulated error:

$0.000000095 \times 100 \times 60 \times 60 \times 10 \approx 0.34$ seconds

A Scud travels at about 1,676 meters per second, and so covers more than half a kilometer in this time

That is far enough to put the Scud completely outside the range the Patriot was tracking
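
The error figures can be recomputed exactly. This is a sketch under one stated assumption: the stored value keeps 23 bits after the binary point, which is the number of digits in the stored pattern shown on the earlier slide. Variable names are chosen for illustration.

```python
from fractions import Fraction

FRACTION_BITS = 23                 # assumption: digits kept after the binary point
tenth = Fraction(1, 10)

# Truncate 1/10 to FRACTION_BITS binary places (chop, do not round).
stored = Fraction(int(tenth * 2 ** FRACTION_BITS), 2 ** FRACTION_BITS)
error_per_tick = tenth - stored

ticks = 100 * 60 * 60 * 10         # tenths of a second in 100 hours
drift_seconds = float(error_per_tick * ticks)
miss_metres = 1676 * drift_seconds # Scud speed in metres per second

print(float(error_per_tick))       # about 9.5e-08
print(drift_seconds)               # about 0.34
print(miss_metres)                 # more than 500
```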

Explosion of the Ariane 5

On June 4, 1996 an unmanned Ariane 5 rocket launched by the European Space Agency exploded just forty seconds after its lift-off from Kourou, French Guiana. The rocket was on its first voyage, after a decade of development costing $7 billion. The destroyed rocket and its cargo were valued at $500 million. A board of inquiry investigated the causes of the explosion and in two weeks issued a report. It turned out that the cause of the failure was a software error in the inertial reference system. Specifically, a 64-bit floating point number relating to the horizontal velocity of the rocket with respect to the platform was converted to a 16-bit signed integer. The number was larger than 32,767, the largest integer storable in a 16-bit signed integer, and thus the conversion failed.
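
The kind of range check that such a conversion needs can be sketched as follows. This is illustrative only: it is not the flight software, the function names are hypothetical, and the velocity value is made up.

```python
INT16_MIN, INT16_MAX = -32768, 32767

def fits_int16(x):
    """True if the float x, truncated toward zero, fits in a 16-bit signed integer."""
    return INT16_MIN <= int(x) <= INT16_MAX

def to_int16_checked(x):
    """Convert a float to a 16-bit signed integer value, refusing out-of-range input."""
    if not fits_int16(x):
        raise OverflowError(f"{x!r} does not fit in a 16-bit signed integer")
    return int(x)

horizontal_velocity = 40000.0      # made-up value larger than 32,767
try:
    to_int16_checked(horizontal_velocity)
except OverflowError as exc:
    print("conversion failed:", exc)
```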

The Number Agenda

Unsigned Numbers

Representation,

Addition, Subtraction

Multiplication, Division

Signed Numbers

Representation


Multiplication, Division

Floating Point Numbers (for very large or very small numbers and fractions)

Representation


Multiplication

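Unsigned: $(d_{31} d_{30} \ldots d_2 d_1 d_0)_2 = d_{31} \times 2^{31} + d_{30} \times 2^{30} + \ldots + d_2 \times 2^{2} + d_1 \times 2^{1} + d_0 \times 2^{0}$

Signed: $(d_{31} d_{30} \ldots d_2 d_1 d_0)_2 = (-1) \times d_{31} \times 2^{31} + d_{30} \times 2^{30} + \ldots + d_2 \times 2^{2} + d_1 \times 2^{1} + d_0 \times 2^{0}$

Floating point: $(d_{31} d_{30} \ldots d_2 d_1 d_0)_2 = (-1)^{d_{31}} \times (1.d_{22} \ldots d_0)_2 \times 2^{(d_{30} \ldots d_{23})_2 - 127}$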
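
The three readings of one 32-bit pattern can be compared side by side. A minimal sketch using `struct`; the helper name `interpretations` and the sample pattern `0xBF400000` (the single-precision encoding of -0.75) are chosen for illustration.

```python
import struct

def interpretations(bits):
    """Read the same 32 bits as unsigned, two's-complement signed, and IEEE 754 single."""
    raw = struct.pack(">I", bits)
    (unsigned,) = struct.unpack(">I", raw)
    (signed,) = struct.unpack(">i", raw)
    (real,) = struct.unpack(">f", raw)
    return unsigned, signed, real

print(interpretations(0xBF400000))
# (3208642560, -1086324736, -0.75)
```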
