lecture11 assembly language
COMPUTER ORGANIZATION
AND ASSEMBLY LANGUAGE
Lecture 11
Floating Point (Real) Numbers
We need a way to represent
numbers with fractions, e.g., 3.1416
very small numbers, e.g., .000000001
very large numbers, e.g., 3.15576 × 10^9
Representation:
sign, exponent, significand: (-1)^sign × significand × 2^exponent
more bits for significand gives more accuracy
more bits for exponent increases range
IEEE 754 floating point standard:
single precision: 8 bit exponent, 23 bit significand
IEEE 754 floating-point standard
Field layout: S (1 bit) | E (8 bits) | M (23 bits)
IEEE Floating Point: (-1)^S × 1.M × 2^(E-127)
Mantissa (sign and magnitude: S, M)
Exponent (excess-127 binary number: E)
• Example:
– decimal: -0.75 = -(½ + ¼)
– binary: -0.11 = -1.1 × 2^-1
– floating point: exponent field = -1 + 127 = 126 = 01111110
– IEEE single precision:
1 01111110 10000000000000000000000
Remember
For an FP number <S, E, M>
Value = (-1)^S × 1.M × 2^(E-127)
The first bit is the sign bit
Exponent = E - 127
The mantissa is the part after the binary point
For 1.1100111, the mantissa is 1100111
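The rule above can be checked mechanically. The following is a minimal sketch, not part of the original slides: the helper name `decode_single` is made up for illustration, and it assumes a normalized number (E between 1 and 254).

```python
def decode_single(bits: int):
    """Split a 32-bit pattern into S, E, M and apply (-1)^S x 1.M x 2^(E-127).

    Sketch for normalized numbers only (E between 1 and 254).
    """
    s = (bits >> 31) & 0x1        # 1 sign bit
    e = (bits >> 23) & 0xFF       # 8 exponent bits, excess-127
    m = bits & 0x7FFFFF           # 23 mantissa bits (after the binary point)
    significand = 1 + m / 2**23   # restore the implicit leading 1
    value = (-1) ** s * significand * 2.0 ** (e - 127)
    return s, e, m, value


# The -0.75 example: 1 01111110 10000000000000000000000
s, e, m, value = decode_single(0b1_01111110_10000000000000000000000)
print(s, e, format(m, "023b"), value)
```

Zeroes, de-normalized numbers, infinities and NaNs need separate handling and are deliberately left out of this sketch.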
IEEE FP Numbers
Example: 0 1000 0011 1101 0000 0000 0000 0000 000
S = 0 (positive mantissa), E = 131 so the exponent is 131 - 127 = 4, and M = .1101
the number = 1.1101 × 2^4 = 11101 (binary) = 29
Example: 1 0111 1011 1101 0000 0000 0000 0000 000
S = 1 (negative mantissa), E = 123 so the exponent is 123 - 127 = -4, and M = .1101
the number = -1.1101 × 2^-4 = -0.00011101 (binary) = -(2^-4 + 2^-5 + 2^-6 + 2^-8)
= -0.11328125
Example: -10.609375
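10.609375 = (1010.100111)_2 = 1.010100111 × 2^3
S = 1, E = 3 + 127 = 130 = 1000 0010,
and M = 010100111 ...
FP: 1 1000 0010 0101 0011 1000 0000 0000 000
Fraction bits of 0.609375, found by repeated multiplication by 2 (the integer part of each product is the next bit):
0.609375 × 2 = 1.21875 → 1
0.21875 × 2 = 0.4375 → 0
0.4375 × 2 = 0.875 → 0
0.875 × 2 = 1.75 → 1
0.75 × 2 = 1.5 → 1
0.5 × 2 = 1.0 → 1
Reading the bits from top to bottom: 0.609375 = (0.100111)_2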
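The repeated-doubling procedure used for the fraction part can be sketched in a few lines. The function name `fraction_to_binary` and the 23-bit cut-off are assumptions made for this illustration, not something the slides define.

```python
def fraction_to_binary(frac: float, max_bits: int = 23) -> str:
    """Return the bits after the binary point of a fraction 0 <= frac < 1.

    Each doubling moves the next bit into the integer position.
    Stops when the fraction reaches 0 or after max_bits bits.
    """
    bits = []
    while frac != 0 and len(bits) < max_bits:
        frac *= 2
        bit = int(frac)        # integer part is the next binary digit
        bits.append(str(bit))
        frac -= bit            # keep only the fractional part
    return "".join(bits)


print(fraction_to_binary(0.609375))  # bits of the example above
```

Fractions such as 0.1 never reach 0, so the `max_bits` limit is what ends the loop in that case.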
Example: IEEE-to-decimal conversion
1 0111 1100 1100 0000 0000 0000 0000 000
First convert each individual field to decimal.
The sign bit S is 1
The E field contains 01111100 = 124
The mantissa (M) is 0.11000… = 0.75
Thus the value is
-1.75 × 2^(124-127) = -1.75 × 2^-3 = -0.21875
Converting a decimal number to IEEE 754
What is the single-precision representation of 347.625?
First convert the number to binary: 347.625 = (101011011.101)_2.
Normalize the number by shifting the binary point until there is a single 1 to the left:
101011011.101 × 2^0 = 1.01011011101 × 2^8
The bits to the right of the binary point, (01011011101)_2, comprise the fractional field f.
The number of times you shifted gives the exponent. In this case, the field e should contain 8 + 127 = 135 = (10000111)_2.
The number is positive, so the sign bit is 0.
The final result is:
0 1000 0111 0101 1011 1010 0000 0000 000
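A hand conversion like this one can be cross-checked by letting the machine produce the bit pattern. This is a small sketch using Python's standard `struct` module; the helper name `encode_single` is an assumption for illustration.

```python
import struct


def encode_single(x: float) -> str:
    """Return the IEEE 754 single-precision fields of x as 'S EEEEEEEE MMM...'."""
    # Pack as a 32-bit float, then reinterpret the same 4 bytes as an unsigned int.
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return f"{sign:01b} {exponent:08b} {fraction:023b}"


print(encode_single(347.625))
print(encode_single(-10.609375))
```

Both values are exactly representable in single precision, so the output matches the hand-derived fields bit for bit.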
FP Addition
Compute 9.999 × 10^1 + 1.610 × 10^-1, assuming only 3 places after the decimal point
1. Make the exponents the same
• 9.999 × 10^1 + 0.01610 × 10^1
2. Add the significands
• 10.015 × 10^1
3. Normalize
• 1.0015 × 10^2
4. Round off
• 1.002 × 10^2
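The same sum can be reproduced with Python's `decimal` module by limiting the precision to 4 significant digits. This is only a sketch under that assumption: `decimal` rounds the exact sum once instead of following the four steps literally, but it arrives at the same result here.

```python
from decimal import Context, Decimal, ROUND_HALF_UP

# A private context with 4 significant digits (1 digit before the point, 3 after).
ctx = Context(prec=4, rounding=ROUND_HALF_UP)

a = Decimal("9.999E1")
b = Decimal("1.610E-1")
result = ctx.add(a, b)   # exact sum is 100.151, rounded to 4 digits

print(result)            # the rounded sum, i.e. 1.002 x 10^2
```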
FP Addition
Example
1.1101 × 2^3 + 1.001 × 2^-1
= 1.1101 × 2^3 + 0.0001001 × 2^3 = 1.1110001 × 2^3
Steps: X(m,e) + Y(m,e)
Shift the smaller operand to align the binary points:
compute Ye - Xe; if Ye > Xe, right-shift Xm by Ye - Xe places to get Xm'
Add the magnitudes (Xm' + Ym)
Normalize and round off if necessary
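The align, add, normalize steps can be sketched with integer mantissas. This is an illustrative simplification, not a full IEEE adder: `fp_add` is a made-up name, both operands are assumed positive, bits shifted out during alignment are simply truncated, and no rounding is performed.

```python
def fp_add(xm: int, xe: int, ym: int, ye: int, frac_bits: int):
    """Add two positive values m x 2^e.

    Mantissas are integers holding 1.fff... with frac_bits fraction bits.
    Returns (mantissa, exponent) in the same format.
    """
    if xe > ye:
        # Make X the operand with the smaller exponent.
        xm, xe, ym, ye = ym, ye, xm, xe
    shift = ye - xe
    xm_aligned = xm >> shift          # align binary points (truncates low bits)
    total = xm_aligned + ym           # add magnitudes
    exp = ye
    while total >= (2 << frac_bits):  # normalize if the sum reached 10.xxx
        total >>= 1
        exp += 1
    return total, exp


# 1.1101 x 2^3 + 1.001 x 2^-1, using 7 fraction bits so nothing is lost
mant, exp = fp_add(0b1_1101000, 3, 0b1_0010000, -1, frac_bits=7)
print(format(mant, "b"), exp)
```

With 7 fraction bits the shifted operand 0.0001001 fits exactly, so the sketch reproduces the slide's result without any rounding step.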
FP Subtraction
(0.5)_10 - (0.4375)_10
Step 1: Convert into FP representation
(0.5)_10 = 1.000 × 2^-1
-(0.4375)_10 = -1.110 × 2^-2
Step 2: Shift the significand of the smaller number
-1.110 × 2^-2 = -0.111 × 2^-1
Step 3: Subtract the significands
1.000 × 2^-1 - 0.111 × 2^-1 = 0.001 × 2^-1
Step 4: Normalize the result, check for over/underflow
0.001 × 2^-1 = 1.000 × 2^-4
Step 5: Round off
1.000 × 2^-4
Step 6: Result
1.000 × 2^-4 = 0.0625
FP Multiply
Compute (1.110 × 10^10) × (9.200 × 10^-5), assuming 4-digit significands
1. Add the exponents
• Exponent = 10 + (-5) = 5
2. Multiply the significands
• 1.110 × 9.200 = 10.212000
3. Normalize
• 10.212000 × 10^5 = 1.0212000 × 10^6
4. Round off
• 1.021 × 10^6
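The four multiplication steps can be followed literally with scaled integers. The sketch below is an illustration under stated assumptions: `dec_multiply` is a hypothetical helper, significands are passed as integers scaled by 10^3 (so 1.110 is 1110), inputs are assumed normalized, and rounding is round-half-up.

```python
def dec_multiply(sig_a: int, exp_a: int, sig_b: int, exp_b: int, digits: int = 4):
    """Multiply two decimal FP numbers with `digits`-digit significands.

    Significands are integers scaled by 10^(digits-1); inputs are assumed normalized.
    """
    scale = 10 ** (digits - 1)
    exp = exp_a + exp_b                  # 1. add the exponents
    product = sig_a * sig_b              # 2. multiply (now scaled by scale^2)
    divisor = scale                      # brings the product back to one scale factor
    if product >= 10 * scale * scale:    # 3. normalize: product is 10.xxx or more
        divisor *= 10
        exp += 1
    sig = (product + divisor // 2) // divisor   # 4. round half up
    if sig >= 10 * scale:                # rounding carried into a new digit
        sig //= 10
        exp += 1
    return sig, exp


print(dec_multiply(1110, 10, 9200, -5))  # significand 1021 means 1.021
```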
FP Multiply
Assume 4-bit mantissa
compute 0 0111 1110 0101 × 0 0111 1101 1110
add the exponents and adjust the bias: 0111 1110 + 0111 1101 - 0111 1111
= 0111 1100
multiply 1.0101 and 1.1110: the product is 10.0111 0110
normalize and detect underflow or overflow: exponent: 0111 1101, mantissa: 1.0011 1011 0
round the product; the result is 0 0111 1101 0100
Check in decimal:
(1.0101 × 2^-1) × (1.1110 × 2^-2) = 0.65625 × 0.46875 = 0.3076171875
= 10.01110110 × 2^-3
= 1.001110110 × 2^-2 = 0.3076171875
≈ 0 0111 1101 0100 = 0.3125 (after rounding to a 4-bit mantissa)
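The binary multiply steps can be sketched the same way on the exponent and mantissa fields. This is a simplified illustration: `fp_multiply` is a made-up name, signs are ignored (both operands positive), rounding is round-half-up rather than the IEEE default of ties-to-even, and overflow or underflow of the exponent field is not checked.

```python
BIAS = 127


def fp_multiply(exp_a: int, frac_a: int, exp_b: int, frac_b: int, frac_bits: int = 4):
    """Multiply two positive normalized numbers given as (biased exponent, fraction)."""
    sig_a = (1 << frac_bits) | frac_a     # restore the hidden 1: 1.ffff
    sig_b = (1 << frac_bits) | frac_b
    exp = exp_a + exp_b - BIAS            # add exponents, remove one bias
    product = sig_a * sig_b               # has 2 * frac_bits fraction bits
    shift = frac_bits                     # bits to drop to get back to frac_bits
    if product >= (2 << (2 * frac_bits)): # product is 10.xxx or 11.xxx: normalize
        shift += 1
        exp += 1
    sig = (product + (1 << (shift - 1))) >> shift   # round half up
    if sig >= (2 << frac_bits):           # rounding overflowed to 10.000
        sig >>= 1
        exp += 1
    return exp, sig & ((1 << frac_bits) - 1)        # drop the hidden 1 again


exp, frac = fp_multiply(0b01111110, 0b0101, 0b01111101, 0b1110)
print(format(exp, "08b"), format(frac, "04b"))
```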
IEEE 754 Floating Point Numbers
Special Representation
When exponent is 1111 1111 or 0000 0000
Type                     Exponent        Fraction
Zeroes                   0               0
De-normalized numbers    0               nonzero
Normalized numbers       1 to 2^e - 2    any
Infinities               2^e - 1         0
NaNs                     2^e - 1         nonzero
(e is the number of exponent bits; e = 8 for single precision)
IEEE 754 FP Representation
There are two Zeroes
+0 (s is 0) and −0 (s is 1)
There are two Infinities
+∞ (s is 0) and −∞ (s is 1)
Numbers closest to zero are denormalized values:
all 0s in the Exp field and the binary value 1 in the Fraction field
±2^-149 ≈ ±1.4012985 × 10^-45
Normalized numbers closest to zero (binary value 1 in the Exp field and 0 in the Fraction field)
±2^-126 ≈ ±1.175494351 × 10^-38
Finite numbers furthest from zero (254 in the Exp field and all 1s in the Fraction field)
±(1 - 2^-24) × 2^128 ≈ ±3.4028235 × 10^38
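These special patterns can be inspected directly by building the 32-bit pattern and letting the machine interpret it. A minimal sketch using the standard `struct` module follows; the helper name `from_bits` is an assumption for illustration.

```python
import math
import struct


def from_bits(bits: int) -> float:
    """Interpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


smallest_denormal = from_bits(0x00000001)   # Exp = 0,   Fraction = 1
smallest_normal = from_bits(0x00800000)     # Exp = 1,   Fraction = 0
largest_finite = from_bits(0x7F7FFFFF)      # Exp = 254, Fraction = all 1s
plus_infinity = from_bits(0x7F800000)       # Exp = 255, Fraction = 0
not_a_number = from_bits(0x7FC00000)        # Exp = 255, Fraction nonzero
minus_zero = from_bits(0x80000000)          # sign = 1, everything else 0

print(smallest_denormal, smallest_normal, largest_finite)
print(plus_infinity, not_a_number, minus_zero)
```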
Inaccuracy in FPs
Exponent   Minimum      Maximum           Precision
0          1            1.999999880791    1.19209289551e-7
1          2            3.99999976158     2.38418579102e-7
2          4            7.99999952316     4.76837158203e-7
10         1024         2047.99987793     1.220703125e-4
11         2048         4095.99975586     2.44140625e-4
23         8388608      16777215          1
24         16777216     33554430          2
127        1.7014e38    3.4028e38         2.02824096037e31
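The precision column is the spacing between neighbouring single-precision values, 2^(exponent - 23). The sketch below recomputes it and shows the effect at exponent 24, where odd integers can no longer be stored. The helper names `float32_spacing` and `to_float32` are assumptions for illustration; `to_float32` rounds a Python float (a double) to single precision by packing it with `struct`.

```python
import struct


def float32_spacing(exponent: int) -> float:
    """Gap between adjacent single-precision values in [2^exponent, 2^(exponent+1))."""
    return 2.0 ** (exponent - 23)


def to_float32(x: float) -> float:
    """Round a double to the nearest single-precision value."""
    return struct.unpack(">f", struct.pack(">f", x))[0]


for exponent in (0, 1, 2, 10, 11, 23, 24, 127):
    print(exponent, 2.0 ** exponent, float32_spacing(exponent))

# At exponent 24 the spacing is 2, so 16777217 has no exact representation.
print(to_float32(16777217.0))
```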
• Double precision arithmetic
• Double word – 64-bits
• 11 bit exponent, 52 bit mantissa
FP numbers
Limited precision: (-1.5 × 10^38 + 1.5 × 10^38) + 1 = 0 + 1 = 1
-1.5 × 10^38 + (1.5 × 10^38 + 1) = -1.5 × 10^38 + 1.5 × 10^38 = 0
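The loss of associativity is easy to reproduce. The sketch below uses Python floats, which are double precision rather than single precision; the effect is the same because 1 is still far below the spacing between neighbouring values near 1.5 × 10^38.

```python
big = 1.5e38

left = (-big + big) + 1.0    # the large terms cancel first, then 1 is added
right = -big + (big + 1.0)   # the 1 is absorbed by big before the cancellation

print(left, right)
```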
A view of FP representation (assume 4-bit precision)
There is a big gap between two large consecutive numbers:
gap = 2^-23 × 2^(e-127)
Add de-normal numbers (exponent = 0, mantissa > 0) to fill the gap between 0 and 2^-126
[Figure: number line marked 0, 2^-126, 2^-125, 2^-124, with the de-normal numbers between 0 and 2^-126]
The Patriot Missile Failure
On February 25, 1991, during
the Gulf War, an American
Patriot Missile battery in Dhahran,
Saudi Arabia, failed to track and
intercept an incoming Iraqi Scud
missile. The Scud struck an
American Army barracks, killing
28 soldiers and injuring around
100 other people.
Source: http://www.ima.umn.edu/~arnold/disasters/patriot.html
The Patriot Missile Failure - I
Time was computed in tenths of a second
In 24-bit precision arithmetic
Truncation instead of rounding
1/10 = 0.0001100110011001100110011001100... (binary)
Patriot stored
0.00011001100110011001100
Error
0.0000000000000000000000011001100... (binary)
= 0.000000095 decimal
The Patriot Missile Failure
The Patriot battery had been up for more than 100
hours
Accumulated error (one truncation error per 1/10-second tick):
0.000000095 × 100 × 60 × 60 × 10 = 0.34 seconds
The speed of a Scud is 1,676 meters per second,
so it travels more than half a kilometer in
this time
Completely out of range
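The arithmetic behind these figures can be reproduced exactly with rational numbers. This is a sketch of the calculation on the slide, not of the Patriot software itself; it assumes 23 bits after the binary point, which matches the stored value shown earlier, and the variable names are made up for illustration.

```python
from fractions import Fraction

FRACTION_BITS = 23   # bits kept after the binary point (assumption from the stored value)

exact = Fraction(1, 10)
# Truncate 1/10 to FRACTION_BITS fractional bits (floor, no rounding).
stored = Fraction(2**FRACTION_BITS // 10, 2**FRACTION_BITS)
error_per_tick = exact - stored

ticks = 100 * 60 * 60 * 10          # tenths of a second in 100 hours
drift_seconds = float(error_per_tick * ticks)
distance_m = 1676 * drift_seconds   # Scud speed in meters per second

print(float(error_per_tick), drift_seconds, distance_m)
```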
Explosion of the Ariane 5
On June 4, 1996 an unmanned Ariane 5 rocket launched by the
European Space Agency exploded just forty seconds after its lift-off
from Kourou, French Guiana. The rocket was on its first voyage,
after a decade of development costing $7 billion. The destroyed
rocket and its cargo were valued at $500 million. A board of inquiry
investigated the causes of the explosion and in two weeks issued a
report. It turned out that the cause of the failure was a software
error in the inertial reference system. Specifically a 64 bit floating
point number relating to the horizontal velocity of the rocket with
respect to the platform was converted to a 16 bit signed integer.
The number was larger than 32,767, the largest integer storable in
a 16 bit signed integer, and thus the conversion failed.
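The kind of range check that guards such a conversion can be sketched as follows. This is a generic illustration, not the Ariane 5 code: the function name `to_int16_checked` and the sample values are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def to_int16_checked(x: float) -> int:
    """Convert a float to a 16-bit signed integer, refusing out-of-range values."""
    n = int(x)   # truncates toward zero
    if not INT16_MIN <= n <= INT16_MAX:
        raise OverflowError(f"{x} does not fit in a 16-bit signed integer")
    return n


print(to_int16_checked(1234.9))      # in range
try:
    to_int16_checked(40000.0)        # hypothetical value above 32,767
except OverflowError as err:
    print("conversion rejected:", err)
```

Whether to raise an error, saturate at 32,767, or widen the target type is a design decision; the sketch only shows that the range has to be checked explicitly.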
The Number Agenda
Unsigned Numbers
Representation,
Addition, Subtraction
Multiplication, Division
Signed Numbers
Representation
Addition, Subtraction
Multiplication, Division
Floating Point Numbers (for large values and fractions)
Representation
Addition, Subtraction
Multiplication
Unsigned: (d31 d30 … d2 d1 d0)_2 = d31×2^31 + d30×2^30 + … + d2×2^2 + d1×2^1 + d0×2^0
Signed: (d31 d30 … d2 d1 d0)_2 = (-1)×d31×2^31 + d30×2^30 + … + d2×2^2 + d1×2^1 + d0×2^0
Floating point: (d31 d30 … d2 d1 d0)_2 = (-1)^d31 × 1.d22…d0 × 2^((d30…d23) - 127)