arithmetic for computers - skkuarcs.skku.edu/.../courses/computerarchitectures/03-arithmetic.pdf ·...
Post on 23-Aug-2018
Arithmetic for Computers
Hwansoo Han
Arithmetic for Computers
Operations on integers
  Addition and subtraction
  Multiplication and division
  Dealing with overflow
Floating-point real numbers
  Representation and operations
Integer Addition
Example: 7 + 6
Overflow if result out of range
  Adding a positive (+) and a negative (–) operand: no overflow
  Adding two positive (+) operands: overflow if result sign is 1
  Adding two negative (–) operands: overflow if result sign is 0
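The sign rules for addition can be sketched in a few lines of Python. This is only an illustration, assuming 32-bit two's-complement operands; the names `add32` and `to_signed` are made up for this sketch.

```python
BITS = 32
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)

def to_signed(x):
    """Interpret the low 32 bits of x as a two's-complement integer."""
    x &= MASK
    return x - (1 << BITS) if x & SIGN else x

def add32(a, b):
    """Return (wrapped signed sum, overflow flag) for 32-bit operands."""
    raw = (a + b) & MASK
    # Overflow only when both operands share a sign and the result's sign differs.
    overflow = bool(~(a ^ b) & (a ^ raw) & SIGN)
    return to_signed(raw), overflow
```

For example, `add32(7, 6)` gives `(13, False)`, while adding 1 to the largest positive value wraps to a negative result and sets the flag.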
Integer Subtraction
Add negation of second operand
  Example: 7 – 6 = 7 + (–6)
    +7: 0000 0000 … 0000 0111
    –6: 1111 1111 … 1111 1010
    +1: 0000 0000 … 0000 0001
Overflow if result out of range
  Subtracting two positive (+) or two negative (–) operands: no overflow
  Subtracting a positive (+) from a negative (–) operand: overflow if result sign is 0
  Subtracting a negative (–) from a positive (+) operand: overflow if result sign is 1
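Subtraction by adding the negation might be sketched as follows, again assuming 32-bit patterns. `negate32` and `sub32` are hypothetical helper names, and results are returned as raw bit patterns.

```python
MASK = 0xFFFFFFFF
SIGN = 0x80000000

def negate32(x):
    """Two's-complement negation: invert all bits, then add 1."""
    return (~x + 1) & MASK

def sub32(a, b):
    """Return (32-bit result pattern, overflow flag) for a - b."""
    a &= MASK
    b &= MASK
    raw = (a + negate32(b)) & MASK
    # Overflow only when the operand signs differ and the result sign differs from a's.
    overflow = bool((a ^ b) & (a ^ raw) & SIGN)
    return raw, overflow
```

Here `negate32(6)` yields `0xFFFFFFFA`, the pattern shown for –6 in the example, and `sub32(7, 6)` yields `(1, False)`.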
Dealing with Overflow
Some languages (e.g., C) ignore overflow
  Use MIPS addu, addiu, subu instructions
Other languages (e.g., Ada, Fortran) require raising an exception
  Use MIPS add, addi, sub instructions
  On overflow, invoke exception handler
    Save PC in exception program counter (EPC) register
    Jump to predefined handler address
    mfc0 (move from coprocessor register) instruction can retrieve EPC value, to return after corrective action
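The two policies can be contrasted with a small Python model. It is a sketch only: a Python exception stands in for the MIPS exception handler, saving EPC is not modeled, and the function names are invented for illustration.

```python
def wrap(x, bits=32):
    """Reduce x to a signed two's-complement value of the given width."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x

def add_wrapping(a, b, bits=32):
    """Like addu: keep the low bits and ignore overflow."""
    return wrap(a + b, bits)

def add_trapping(a, b, bits=32):
    """Like add: raise instead of returning a wrapped result."""
    total = a + b
    if total != wrap(total, bits):
        raise OverflowError(f"{a} + {b} does not fit in {bits} bits")
    return total
```

With the largest 32-bit positive value plus 1, `add_wrapping` quietly returns the most negative value, whereas `add_trapping` raises `OverflowError`.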
Arithmetic for Multimedia
Graphics and media processing operates on vectors of 8-bit and 16-bit data
  Use 64-bit adder, with partitioned carry chain
    Operate on 8×8-bit, 4×16-bit, or 2×32-bit vectors
  SIMD (single-instruction, multiple-data)
Saturating operations
  On overflow, result is largest representable value
    vs. 2s-complement modulo arithmetic
  E.g., clipping in audio, saturation in video
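A rough Python model of a partitioned add over eight 8-bit lanes, with and without saturation, is given below. It assumes unsigned lanes (so the largest representable value is 255); the function names are illustrative and do not correspond to any particular SIMD instruction set.

```python
def sat_add_u8(a, b):
    """Unsigned 8-bit saturating add: clamp at 255 instead of wrapping."""
    return min(a + b, 255)

def wrap_add_u8(a, b):
    """Unsigned 8-bit modulo add: keep only the low 8 bits."""
    return (a + b) & 0xFF

def packed_add_u8(x, y, saturate=False):
    """Add eight 8-bit lanes packed into 64-bit words, with no carry between lanes."""
    out = 0
    for lane in range(8):
        shift = 8 * lane
        a = (x >> shift) & 0xFF
        b = (y >> shift) & 0xFF
        s = sat_add_u8(a, b) if saturate else wrap_add_u8(a, b)
        out |= s << shift
    return out
```

Adding 1 to a lane holding `0xFF` wraps that lane to `0x00` in modulo mode and leaves it at `0xFF` in saturating mode; in both cases the neighbouring lane is untouched because the carry chain is cut at the lane boundary.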
Multiplication
Start with long-multiplication approach
multiplicand        1000
multiplier        × 1001
                    1000
                   0000
                  0000
                 1000
product          1001000
Length of product is the sum of operand lengths
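The long-multiplication approach corresponds to a simple shift-and-add loop. The sketch below assumes unsigned operands and uses a hypothetical function name; it processes one multiplier bit per iteration.

```python
def shift_add_multiply(multiplicand, multiplier, bits=4):
    """Multiply two unsigned `bits`-wide integers, one multiplier bit per step."""
    product = 0
    for _ in range(bits):
        if multiplier & 1:          # low multiplier bit is 1: add the multiplicand
            product += multiplicand
        multiplicand <<= 1          # shift the multiplicand left
        multiplier >>= 1            # shift the multiplier right
    return product
```

Calling `shift_add_multiply(0b1000, 0b1001)` reproduces the product `0b1001000` from the example.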
Multiplication Hardware
Figure: multiplication hardware (annotation: initially 0)
Optimized Multiplier
Perform steps in parallel: add/shift
One cycle per partial-product addition
  That’s OK, if frequency of multiplications is low
Figure: optimized multiplier hardware (labels: Multiplier, Shift right, Write, Product)
Faster Multiplier
Uses multiple adders
  Cost/performance tradeoff
Can be pipelined
  Several multiplications performed in parallel
MIPS Multiplication
Two 32-bit registers for product (separately provided)
  HI: most-significant 32 bits
  LO: least-significant 32 bits
Instructions
  mult rs, rt / multu rs, rt
    64-bit product in HI/LO
  mfhi rd / mflo rd
    Move from HI/LO to rd
    Can test HI value to see if product overflows 32 bits
  mul rd, rs, rt
    Least-significant 32 bits of product –> rd
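How a 64-bit product splits across HI and LO can be modeled in Python. The sketch covers only the unsigned case (as for multu); the function name simply mirrors the instruction and is not a real API.

```python
MASK32 = 0xFFFFFFFF

def multu(rs, rt):
    """Return (HI, LO) of the unsigned 64-bit product of two 32-bit values."""
    product = (rs & MASK32) * (rt & MASK32)
    return product >> 32, product & MASK32

hi, lo = multu(0xFFFFFFFF, 2)
overflows_32_bits = hi != 0   # for unsigned operands, a nonzero HI means the product needs more than 32 bits
```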
Division
Check for 0 divisor
Long division approach
  If divisor ≤ dividend bits
    1 bit in quotient, subtract
  Otherwise
    0 bit in quotient, bring down next dividend bit
Restoring division
  Do the subtract, and if remainder goes < 0, add divisor back
Signed division
  Divide using absolute values
  Adjust sign of quotient and remainder as required
                   1001    quotient
divisor  1000 ) 1001010    dividend
               -1000
                   10
                   101
                   1010
                  -1000
                     10    remainder
n-bit operands yield n-bit quotient and remainder
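Restoring division can be sketched as a bit-serial loop. The code below is a minimal model, not the hardware datapath. The sign convention in `signed_divide` (remainder takes the sign of the dividend) is an assumption about what "as required" means, and both function names are illustrative.

```python
def restoring_divide(dividend, divisor, bits=32):
    """Unsigned restoring division; returns (quotient, remainder)."""
    if divisor == 0:
        raise ZeroDivisionError("check for 0 divisor first")
    remainder, quotient = 0, 0
    for i in range(bits - 1, -1, -1):
        # Bring down the next dividend bit.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        remainder -= divisor            # trial subtraction
        if remainder < 0:
            remainder += divisor        # restore; quotient bit is 0
            quotient <<= 1
        else:
            quotient = (quotient << 1) | 1
    return quotient, remainder

def signed_divide(a, b, bits=32):
    """Divide absolute values, then fix signs (assumed: remainder follows the dividend)."""
    q, r = restoring_divide(abs(a), abs(b), bits)
    if (a < 0) != (b < 0):
        q = -q
    if a < 0:
        r = -r
    return q, r
```

`restoring_divide(0b1001010, 0b1000, bits=7)` returns quotient `0b1001` and remainder `0b10`, matching the worked long division.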
Division Hardware
Figure: division hardware (annotations: initially dividend; initially divisor in left half; shift left; write)
Optimized Divider
One cycle per partial-remainder subtraction
Quotient in right half of 64-bit remainder register
Looks a lot like a multiplier!
  Same hardware can be used for both
Faster Division
Cannot use parallel hardware as in multiplier
  Subtraction is conditional on sign of remainder
Faster dividers (e.g. SRT division) generate multiple quotient bits per step
  Guess quotient bits and correct wrong guesses in subsequent steps
  Still require multiple steps
MIPS Division
Use HI/LO registers for result
  HI: 32-bit remainder
  LO: 32-bit quotient
Instructions
  div rs, rt / divu rs, rt
  No overflow or divide-by-0 checking
    Software must perform checks if required
  Use mfhi, mflo to access result
Floating Point
We need a way to represent
  Numbers with fractions, e.g., 3.1416
  Very small numbers, e.g., 0.000000001
  Very large numbers, e.g., 3.15576 × 10^9
Like scientific notation
  –2.34 × 10^56
  +0.002 × 10^–4
  +987.02 × 10^9
In binary
  ±1.xxxxxxx_2 × 2^yyyy
Types float and double in C
Floating Point
Representation
  Sign, exponent, significand: (–1)^sign × significand × 2^exponent
  More bits for significand gives more accuracy
  More bits for exponent increases range
IEEE 754 floating point standard:
  Single precision: 8 bit exponent, 23 bit significand, 1 bit sign
  Double precision: 11 bit exponent, 52 bit significand, 1 bit sign
Fractional Binary Numbers
Representation
  Bits to right of “binary point” represent fractional powers of 2
  Bit string: b_i b_(i–1) … b_2 b_1 b_0 . b_(–1) b_(–2) b_(–3) … b_(–j)
  Bit weights: 2^i, 2^(i–1), …, 4, 2, 1 . 1/2, 1/4, 1/8, …, 2^(–j)
  Represents rational number: sum over k = –j … i of b_k × 2^k
Fractional Binary Numbers (cont’d)
Examples:
  Value      Representation
  5 + 3/4    101.11_2
  2 + 7/8    10.111_2
  0.111111…_2 represents a value just below 1.0
    1/2 + 1/4 + 1/8 + … + 1/2^i + … → 1.0 (notation: 1.0 – ε)
Representable Numbers
  Can only exactly represent numbers of the form x/2^k
  Other numbers have repeating bit representations
  Value      Representation
  1/3        0.0101010101[01]…_2
  1/5        0.001100110011[0011]…_2
  1/10       0.0001100110011[0011]…_2
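The repeating expansions can be generated by repeated doubling. This is a small sketch using exact rational arithmetic; `binary_fraction_bits` is a name invented here.

```python
from fractions import Fraction

def binary_fraction_bits(value, n):
    """Return the first n bits after the binary point of a value in [0, 1)."""
    value = Fraction(value)
    bits = []
    for _ in range(n):
        value *= 2                  # shift the binary point one place right
        bit = int(value >= 1)       # the integer part is the next bit
        bits.append(str(bit))
        value -= bit
    return "".join(bits)
```

For instance, `binary_fraction_bits(Fraction(1, 10), 13)` returns `"0001100110011"`, the prefix listed for 1/10.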
IEEE 754 FP: Normalized Values
Condition: exponent ≠ 000…0 and exponent ≠ 111…1
Significand coded with implied leading 1
  significand = 1.xxx…x_2
  Minimum when fraction bits are 000…0 (significand = 1.0)
  Maximum when fraction bits are 111…1 (significand = 2.0 – ε)
  Get extra leading bit for free
Exponent is “biased” to make sorting easier
  exponent field: 1 to 254 (single precision)
  bias: 127 for single precision (1023 for double precision)
  format: (–1)^sign × (1 + fraction) × 2^(exponent – bias), where fraction = 0.xxx…x_2 is the stored significand field
Field layout: sign | exponent | significand
IEEE 754 FP: Normalized Values (cont’d)
Example for –0.75
  Decimal: –0.75 = –(1/2 + 1/4)
  Binary: –0.11_2 = –1.1_2 × 2^–1
  Floating point exponent = 126 = 01111110_2 (–1 = 126 – 127; 127 is the bias for single precision)
  IEEE single precision: –0.75 = (–1)^1 × (1 + .1000…00000000_2) × 2^(126–127)
  Bit pattern (sign exponent significand): 1 01111110 10000000000000000000000
Another example for 15213.0
  15213_10 = 11101101101101_2 = 1.1101101101101_2 × 2^13
  significand = 1101101101101 (implied leading 1)
  exponent = 140 = 10001100_2 (exponent – bias = 13, bias = 127)
  Bit pattern (sign exponent significand): 0 10001100 11011011011010000000000
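Both encodings can be checked with Python's standard `struct` module, which packs a value as an IEEE 754 single. The helper names below are made up for this sketch; `decode_normalized` applies the normalized-value formula to the raw fields.

```python
import struct

def float32_bit_string(x):
    """Format x as 'sign exponent fraction' using IEEE 754 single precision."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    s = f"{bits:032b}"
    return f"{s[0]} {s[1:9]} {s[9:]}"

def decode_normalized(sign, exponent, fraction, bias=127, frac_bits=23):
    """Apply (-1)^sign * (1 + fraction) * 2^(exponent - bias) to the raw fields."""
    return (-1) ** sign * (1 + fraction / 2**frac_bits) * 2.0 ** (exponent - bias)
```

`float32_bit_string(-0.75)` and `float32_bit_string(15213.0)` return the two bit patterns worked out by hand, and `decode_normalized(0, 140, 0b11011011011010000000000)` gives back `15213.0`.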
IEEE 754 FP: Denormalized Values
Condition: exp = 000…0
Value
  exponent = 1 – bias (e.g. single precision: –126 = 1 – 127)
  significand = 0.xxx…x_2 (no implied leading 1)
Cases
  exp = 000…0, frac = 000…0
    Represents value 0
    Note that there are distinct values +0 and –0 (depending on sign bit)
  exp = 000…0, frac ≠ 000…0
    Numbers very close to 0.0, in the open interval (–1 × 2^–126, 1 × 2^–126)
    “Gradual underflow”: possible numeric values are spaced evenly near 0.0
Number line: normalized | –1.0 × 2^–126 | denormalized (around 0.0) | 1.0 × 2^–126 | normalized
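A few single-precision bit patterns around zero illustrate the even spacing. This sketch reinterprets raw patterns with the standard `struct` module; `from_bits32` is a hypothetical helper name.

```python
import struct

def from_bits32(bits):
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

smallest_denormal = from_bits32(0x00000001)   # exp = 0, frac = 000...01
largest_denormal = from_bits32(0x007FFFFF)    # exp = 0, frac = 111...11
smallest_normal = from_bits32(0x00800000)     # exp = 1, frac = 0
negative_zero = from_bits32(0x80000000)       # sign = 1, exp = 0, frac = 0

# The step from the largest denormal to the smallest normal equals the denormal spacing.
gap = smallest_normal - largest_denormal
```

Under these definitions `smallest_denormal` is 2^–149, `smallest_normal` is 2^–126, and `gap` equals `smallest_denormal`, so there is no jump at the boundary between the two ranges.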
IEEE 754 FP: Special Values
Condition: exp = 111…1
exp = 111…1, frac = 000…0
  Represents value ∞ (infinity)
  Operation that overflows
  Both positive and negative
  e.g., 1.0/0.0 = –1.0/–0.0 = +∞, 1.0/–0.0 = –∞
exp = 111…1, frac ≠ 000…0
  Not-a-Number (NaN)
  Represents case when no numeric value can be determined
  e.g., 0.0/0.0, sqrt(–1), ∞ – ∞
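The special encodings can be inspected from Python by building the bit patterns directly. Note that Python itself raises `ZeroDivisionError` for `1.0/0.0` rather than returning infinity, so this sketch constructs the values from their single-precision patterns; `from_bits32` is an illustrative helper.

```python
import math
import struct

def from_bits32(bits):
    """Reinterpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]

pos_inf = from_bits32(0x7F800000)   # sign 0, exp = 111...1, frac = 0
neg_inf = from_bits32(0xFF800000)   # sign 1, exp = 111...1, frac = 0
nan = from_bits32(0x7FC00000)       # exp = 111...1, frac != 0

inf_minus_inf = pos_inf - pos_inf   # no numeric value can be determined
```

Here `math.isnan(inf_minus_inf)` is true, and `nan != nan` also holds, since a NaN compares unequal to everything including itself.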
IEEE 754 FP: Summary
Single precision       Double precision       Object represented        Category
Exponent  Fraction     Exponent  Fraction
0         0            0         0            0                         denormalized
0         Nonzero      0         Nonzero      denormalized number       denormalized
1-254     Anything     1-2046    Anything     floating-point number     normalized
255       0            2047      0            Infinity                  special
255       Nonzero      2047      Nonzero      NaN (Not-a-Number)        special
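The single-precision half of the summary translates into a short classifier over raw bit patterns. This is a sketch with an invented function name, covering only the 32-bit format.

```python
def classify32(bits):
    """Name the kind of value a 32-bit IEEE 754 single-precision pattern encodes."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent == 0:
        return "zero" if fraction == 0 else "denormalized"
    if exponent == 255:
        return "infinity" if fraction == 0 else "NaN"
    return "normalized"
```

For example, `classify32(0x3F800000)` (the pattern for 1.0) returns `"normalized"` and `classify32(0x7F800000)` returns `"infinity"`.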
Floating Point Addition
Sketch of FP addition
  Align numbers to have the same exponent (the larger exponent)
  Add two significands with their signs and implied leading 1s
  Normalize the sum, checking overflow and underflow
  Round the sum and renormalize if necessary
Example for 0.5 + (–0.4375)
  FP format: 0.5 = 1.000_2 × 2^–1, –0.4375 = –1.110_2 × 2^–2
  Align to the larger exponent: –1.110_2 × 2^–2 = –0.111_2 × 2^–1
  Addition of significands: 1.000_2 + (–0.111_2) = 0.001_2
  Result: 0.001_2 × 2^–1
  Normalize: 1.000_2 × 2^–4
  Checking overflow and underflow: –126 ≤ –4 ≤ 127
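The addition steps can be traced with a toy model that keeps the 4-bit significand as an integer. This is a simplified sketch, not the hardware algorithm: the (sign, magnitude, exponent) tuple format and the function names are assumptions made here, bits shifted out during alignment are simply dropped, and the rounding step is omitted.

```python
def fp_add(a, b, frac_bits=3):
    """Add (sign, magnitude, exponent) triples; value = sign * magnitude / 2**frac_bits * 2**exponent."""
    (sa, ma, ea), (sb, mb, eb) = a, b
    # Step 1: align both operands to the larger exponent.
    exp = max(ea, eb)
    ma >>= exp - ea                 # shifted-out bits are dropped (no guard/round/sticky)
    mb >>= exp - eb
    # Step 2: add the signed significands.
    total = sa * ma + sb * mb
    if total == 0:
        return (1, 0, 0)
    sign = 1 if total > 0 else -1
    mag = abs(total)
    # Step 3: normalize so the leading 1 sits just left of the binary point.
    while mag >= 1 << (frac_bits + 1):
        mag >>= 1
        exp += 1
    while mag < 1 << frac_bits:
        mag <<= 1
        exp -= 1
    if not -126 <= exp <= 127:      # single-precision exponent range
        raise OverflowError("exponent out of range")
    return (sign, mag, exp)

def to_value(t, frac_bits=3):
    """Convert a (sign, magnitude, exponent) triple to a Python float."""
    sign, mag, exp = t
    return sign * mag / 2**frac_bits * 2.0**exp
```

Running `fp_add((1, 0b1000, -1), (-1, 0b1110, -2))` returns `(1, 0b1000, -4)`, that is 1.000_2 × 2^–4, which `to_value` converts to 0.0625.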
FP Adder Hardware
Figure: FP adder hardware (annotated Step 1 through Step 4)
Floating-Point Multiplication
Sketch of FP multiplication
  Add exponents without bias
  Multiply two significands
  Check for underflow and overflow
  Normalize the result
  Round and renormalize if necessary
  Set the sign bit
Example for (1.000_2 × 2^–1) × (–1.110_2 × 2^–2)
  Exponents: –1 + (–2) = –3
  Significand: 1.000_2 × 1.110_2 = 1.110000_2
  Check: –126 ≤ –3 ≤ 127
  Normalize: 1.110_2 × 2^–3
  Sign bit: –1.110_2 × 2^–3
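The multiplication steps can be traced the same way with integer significands. This is a toy sketch: the (sign, magnitude, exponent) tuple format and the name `fp_mul` are assumptions, inputs are taken to be normalized, and the product is truncated rather than rounded.

```python
def fp_mul(a, b, frac_bits=3):
    """Multiply normalized (sign, magnitude, exponent) triples; value = sign * magnitude / 2**frac_bits * 2**exponent."""
    (sa, ma, ea), (sb, mb, eb) = a, b
    exp = ea + eb                       # add the unbiased exponents
    product = ma * mb                   # carries 2 * frac_bits fraction bits
    # A product of two values in [1, 2) lies in [1, 4); renormalize if it reached 2.
    if product >= 1 << (2 * frac_bits + 1):
        product >>= 1
        exp += 1
    if not -126 <= exp <= 127:          # single-precision exponent range
        raise OverflowError("exponent out of range")
    mag = product >> frac_bits          # truncate to frac_bits fraction bits (no rounding)
    sign = sa * sb                      # signs differ -> negative result
    return (sign, mag, exp)
```

`fp_mul((1, 0b1000, -1), (-1, 0b1110, -2))` returns `(-1, 0b1110, -3)`, matching –1.110_2 × 2^–3.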
FP Arithmetic Hardware
FP multiplier is of similar complexity to FP adder
  But uses a multiplier for significands instead of an adder
Operations are somewhat more complicated
  In addition to overflow we can have “underflow”
FP arithmetic hardware usually does
  Addition, subtraction, multiplication, division, reciprocal, square-root
  FP ↔ integer conversion
Operations usually take several cycles
  Can be pipelined
FP Instructions in MIPS
FP hardware is coprocessor 1
  Adjunct processor that extends the ISA
Separate FP registers
  32 single-precision: $f0, $f1, … $f31
  Paired for double-precision: $f0/$f1, $f2/$f3, …
  Release 2 of MIPS ISA supports 32 × 64-bit FP registers
FP instructions operate only on FP registers
  Generally no integer operations on FP data, or vice versa
  More registers with minimal code-size impact
Floating Point Instructions (MIPS)
Single precision and double precision instructions
  Addition: add.s, add.d
  Subtraction: sub.s, sub.d
  Multiplication: mul.s, mul.d
  Division: div.s, div.d
Comparison – sets or clears FP condition code bit
  c.xx.s, c.xx.d (xx can be eq, neq, lt, le, gt, ge)
  e.g., c.lt.s $f3, $f4
Branch on FP condition code true or false
  bc1t, bc1f
  e.g., bc1t targetLabel
FP load and store instructions (from/to coprocessor 1)
  lwc1, ldc1, swc1, sdc1 (w: 32-bit data, d: 64-bit data)
  e.g., ldc1 $f8, 32($sp)
Accurate Arithmetic
IEEE Std 754 specifies additional rounding control
  Extra bits of precision (guard, round, sticky)
  Choice of rounding modes
  Allows programmer to fine-tune numerical behavior of a computation
Not all FP units implement all options
  Most programming languages and FP libraries just use defaults
Trade-off between hardware complexity, performance, and market requirements
Ariane 5
Ariane 5 tragedy (June 4, 1996)
  Exploded 37 seconds after liftoff
  Satellites worth $500 million
Why?
  Computed horizontal velocity as floating point number
  Converted to 16-bit integer
  Careful analysis of Ariane 4 trajectory proved 16-bit is enough
  Reused a module from 10-year-old software
    Overflowed for Ariane 5
    No precise specification for the S/W
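The failure mode is a narrowing conversion without a range check. A guarded conversion might look like the sketch below, which assumes a signed 16-bit target; the function name and limits are illustrative and are not taken from the flight software.

```python
INT16_MIN, INT16_MAX = -2**15, 2**15 - 1

def to_int16_checked(value):
    """Convert a float to a signed 16-bit integer, refusing out-of-range values."""
    n = int(value)                      # truncates toward zero
    if not INT16_MIN <= n <= INT16_MAX:
        raise OverflowError(f"{value} does not fit in a signed 16-bit integer")
    return n
```

A value such as `32767.9` converts to `32767`, while `40000.0` raises `OverflowError` instead of producing a meaningless bit pattern.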
0.10
Gulf War in 1991
  US Patriot missile failed to intercept Iraqi Scud missile
  28 soldiers were killed
Binary representation of 0.10
  Patriot incremented a counter once every 0.10 seconds
  Implemented to multiply the counter value by 0.10 to get actual time
  24-bit binary representation of 0.10 actually was 0.099999904632568359375, which is off by 0.000000095367431640625 (doesn’t seem like much)
  After 100 hours, the time is off by 0.34 seconds
  Enough time for a Scud to travel 500 meters!
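The drift can be reproduced with exact rational arithmetic. The sketch assumes the 24-bit register keeps 23 bits after the binary point and that 0.10 was chopped rather than rounded; that assumption is chosen because it reproduces the stored value quoted above.

```python
from fractions import Fraction

FRAC_BITS = 23   # assumption: 23 fractional bits of the 24-bit register

exact = Fraction(1, 10)
stored = Fraction(int(exact * 2**FRAC_BITS), 2**FRAC_BITS)   # chop, do not round
error_per_tick = exact - stored

ticks = 100 * 60 * 60 * 10          # 100 hours of 0.10-second ticks
drift_seconds = float(error_per_tick * ticks)
```

Under this assumption `error_per_tick` is exactly 0.1 × 2^–20, and `drift_seconds` comes out at roughly 0.34, in line with the figure above.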
Summary
Computer arithmetic is constrained by limited precision
Bit patterns have no inherent meaning
  But standards do exist for number representations
    Two’s complement, IEEE 754 floating point
  Computer instructions determine meaning of the bit patterns
Performance and accuracy are important
  So there are many complexities in real machines
Algorithm choice is important
  May lead to hardware optimizations for both space and time (e.g., multiplication)