Source: csce.uark.edu/.../lecture8-arithmetic-for-computers.pdf · 2020-02-20
TRANSCRIPT
Arithmetic for Computers
Alexander Nelson
February 20, 2020
University of Arkansas - Department of Computer Science and Computer Engineering
Arithmetic for Computers
Computers are digital – discrete
i.e. Cannot represent infinite precision
Computers use finite-length registers and arithmetic units
i.e. Must take care of overflow/underflow
Two fundamental concerns:
1. Operations on integers
2. Operations on floating-point real numbers
1
Operations on Integers
Integer Addition
Comparable to base-10 addition formula
Carry overflowing bits
1 (carry)
87
+ 25
====
112
(00110) (carry)
000111
+000110
========
001101
Overflow if result is out of range
2
Integer Subtraction
Add negation of second operand
Example: 7-6 = 7 +(-6)
carry: 1111 1111 1111 1100
+7: 0000 0000 0000 0111
-6: 1111 1111 1111 1010
===========================
(1)0000 0000 0000 0001
Overflow?
3
Detecting Overflow
When does overflow occur?
• Add two positive values with resulting sign 1
• Add two negative values with resulting sign 0
• Subtract pos. number from neg. with res. sign 0
• Subtract neg. number from pos. with res. sign 1
Did overflow occur in our subtract example?
No – Subtracted positive number from positive number
4
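The four rules above can be checked directly on n-bit two's-complement values; a minimal Python sketch (the function name and bit-width parameter are illustrative):

```python
def add_overflows(a, b, bits=32):
    """Detect signed overflow when adding two `bits`-wide two's-complement
    values: overflow occurs only when both operands share a sign and the
    truncated sum has the opposite sign."""
    mask = (1 << bits) - 1
    total = (a + b) & mask              # hardware-style truncated sum
    sign = 1 << (bits - 1)
    same_sign_in = not ((a ^ b) & sign)  # operands agree in sign
    flipped = bool((a ^ total) & sign)   # sum sign differs from the operands'
    return same_sign_in and flipped
```

With 4-bit operands, `add_overflows(6, 2, bits=4)` reports overflow (6 + 2 = 8 is out of range), while 7 + (−6) does not.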
Dealing with Overflow
Some languages (e.g. C) ignore overflow
i.e. Programmer must deal with it
Compiler uses MIPS addu, addiu, subu instructions
Other languages (e.g. Ada, Fortran) require raising an exception
Use MIPS add, addi, sub instructions
On overflow – invoke an exception handler
• Save PC in exception program counter register
• Jump to predefined handler address
• mfc0 – move from coprocessor reg – retrieve EPC value to
return after correction
5
Trapping Overflow w/o Exception
addu $t0, $t1, $t2 # $t0 = sum, but don’t trap
xor $t3, $t1, $t2 # Check if signs differ
slt $t3, $t3, $zero # $t3 == 1 if signs differ
bne $t3, $zero, NO_OF # Branch depending on results
xor $t3, $t0, $t1 # Signs equal, does sign of sum match?
# $t3 negative if sum sign different
slt $t3, $t3, $zero # $t3 == 1 if sum sign different
bne $t3, $zero, OF # If all signs not same, goto overflow
Try it yourself with 6+2 for 4-bit adder arithmetic
6
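The assembly sequence above can be mirrored step-for-step in Python to watch the sign tests at work; a sketch assuming n-bit two's-complement inputs (the function name is made up here):

```python
def trap_overflow(t1, t2, bits=32):
    """Mirror the addu/xor/slt/bne sequence: compute the truncated sum,
    return it when overflow is impossible, raise when overflow occurred."""
    mask = (1 << bits) - 1
    t0 = (t1 + t2) & mask          # addu: sum, but don't trap
    t3 = (t1 ^ t2) & mask          # xor: sign bit set if operand signs differ
    if t3 >> (bits - 1):           # slt/bne: signs differ -> no overflow possible
        return t0
    t3 = (t0 ^ t1) & mask          # xor: does the sum's sign match the operands'?
    if t3 >> (bits - 1):           # slt/bne: mismatch -> overflow
        raise OverflowError("signed overflow")
    return t0
```

Trying the suggested 6 + 2 with 4-bit arithmetic raises, since the operands are positive but the 4-bit sum 1000 is negative.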
Arithmetic for Multimedia
Graphics/Media processing operates on vectors of 8-bit and 16-bit
data
• Use 64-bit adder w/ partitioned carry chain
• Operate on 8x8-bit, 4x16-bit, or 2x32-bit vectors
• SIMD (single-instruction, multiple-data)
Saturating Operations
• On overflow, result is largest representable value
• Clipping in audio, saturation in video
7
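A saturating operation is just a clamp on the wide result instead of a wrap-around; a sketch for signed 8-bit values:

```python
def saturating_add8(a, b):
    """Saturating signed 8-bit add: on overflow the result sticks at the
    largest or smallest representable value (clipping in audio)."""
    total = a + b
    return max(-128, min(127, total))
```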
Multiplication
How do you multiply?
Consider how you multiplied when you were little!
3 x 5 = 3 + 3 + 3 + 3 + 3 = 15
How many operations?
8
Multiplication
Long-Multiplication Approach, how you multiplied next:
13
x 22
=====
26
+260
=====
286
1000
x 1001
=======
1000
0000
0000
+1000
========
1001000
How many operations?
9
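Long multiplication in binary reduces to one shifted add per 1 bit of the multiplier; a Python sketch:

```python
def shift_add_multiply(multiplicand, multiplier):
    """Binary long multiplication: for each 1 bit of the multiplier, add a
    correspondingly shifted copy of the multiplicand (one partial product
    per bit), assuming non-negative operands."""
    product = 0
    shift = 0
    while multiplier:
        if multiplier & 1:                     # this bit contributes a partial product
            product += multiplicand << shift
        multiplier >>= 1
        shift += 1
    return product
```

This reproduces both worked examples: 13 × 22 = 286 and 1000₂ × 1001₂ = 1001000₂.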
Multiplication in Hardware
10
Optimized Multiplier
Perform steps in parallel: add/shift
One cycle per partial-product addition
OK if frequency of multiplication is low!
11
Faster Multiplier
Use multiple adders!
• Cost/performance tradeoff
Can be pipelined – several multiplications performed in parallel
12
MIPS Multiplication
Two 32-bit registers for product
• HI: Most significant 32-bits
• LO: least-significant 32-bits
Instructions:
• mult rs, rt / multu rs, rt
• 64-bit product in HI/LO
• mfhi rd / mflo rd
• Move from HI/LO to rd
• Can test HI value to see if product overflows 32 bits
• mul rd, rs, rt – least significant 32 bits of product in rd
13
Division
How do you divide?
1. Check for 0 divisor
2. Long division approach
• If divisor ≤ current dividend bits – 1 in quotient, subtract
• Otherwise – 0 bit in quotient, bring down next dividend bit
3. Restoring division – i.e. HW doesn’t know if it can divide, so try then restore if wrong
• Do subtraction, and if remainder goes < 0, add divisor back
4. Signed Division
• Divide using absolute values
• Adjust sign of quotient and remainder as required
14
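Restoring division as described in step 3 can be sketched for unsigned operands (the bit width is a parameter here):

```python
def restoring_divide(dividend, divisor, bits=32):
    """Restoring division: each step brings down one dividend bit, tries the
    subtraction, and adds the divisor back if the remainder went negative."""
    if divisor == 0:
        raise ZeroDivisionError("check for 0 divisor first")
    quotient, remainder = 0, 0
    for i in reversed(range(bits)):
        remainder = (remainder << 1) | ((dividend >> i) & 1)  # bring down next bit
        remainder -= divisor                                  # try the subtraction
        if remainder < 0:
            remainder += divisor                              # restore: it didn't fit
            quotient = quotient << 1                          # 0 in quotient
        else:
            quotient = (quotient << 1) | 1                    # 1 in quotient
    return quotient, remainder
```

Running it on the example that follows, 74 ÷ 8 yields quotient 9 and remainder 2.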
Division
Example:
Divide 74 (1001010₂) by 8 (1000₂): quotient 9, remainder 2
1. 1000 ≤ 1001 – 1 in quotient
2. 1000 > 10 – 0 in quotient
3. 1000 > 101 – 0 in quotient
4. 1000 ≤ 1010 – 1 in quotient
5. Quotient = 1001, Remainder = 10
15
Division in Hardware
Divisor reg, ALU, Remainder all
64 bits wide (remainder should
be 65 for carry out)
16
Optimized Divider
By shifting operands and quotient simultaneously with subtraction,
algorithm & hardware can be refined
Combine quotient register with right half of remainder register
17
Faster Division
Cannot use parallel hardware as in multiplier
Subtraction conditional on sign of remainder
Faster dividers (e.g. SRT division) generate multiple quotient bits
per step
• Still requires multiple steps
18
MIPS Division
Use HI/LO registers for result
• HI: 32-bit remainder
• LO: 32-bit quotient
Instructions:
• div rs, rt & divu rs, rt
• No overflow or divide-by-0 checking!
• Software must perform checks if required
• Use mfhi, mflo to access result
19
Signed Division
It is easier to assume absolute values & restore sign
Caveat: Must keep track of remainder!
Division must always satisfy the following:
Dividend = Quotient × Divisor + Remainder
Consider -7÷2 == -3 remainder -1
Or: -7÷2 = -4 remainder +1
Quotient cannot depend on sign of dividend/divisor:
−(x ÷ y) = (−x)÷ y – Must hold
Enforce that dividend and remainder must have the same signs
20
Signed Division
Takeaway:
Correct signed division negates quotient if signs of operands are
opposite
Makes sign of nonzero remainder match the dividend
21
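The sign fix-ups in the takeaway can be expressed over Python's divmod (Python's own division floors, so the adjustments below restore the truncating, MIPS-style semantics):

```python
def signed_divide(dividend, divisor):
    """Divide using absolute values, then adjust: negate the quotient when
    operand signs differ, and give a nonzero remainder the sign of the
    dividend, so Dividend = Quotient * Divisor + Remainder always holds."""
    q, r = divmod(abs(dividend), abs(divisor))
    if (dividend < 0) != (divisor < 0):
        q = -q                      # operand signs differ -> negative quotient
    if dividend < 0:
        r = -r                      # remainder matches the dividend's sign
    return q, r
```

This gives −7 ÷ 2 = −3 remainder −1, the convention preferred on the previous slide.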
Floating Point
Floating Point
Questions:
How do you represent non-integer values?
How do you represent very small or very large numbers?
Consider scientific notation:
−2.34 × 10^56 – normalized
0.002 × 10^−4 – not normalized
987.02 × 10^9 – not normalized
22
Floating Point
Scientific notation in binary
±1.xxxxxxxx₂ × 2^yyyy
Types – float and double in C
Floating Point Standard – Defined by IEEE Std. 754-1985
Developed in response to divergence of representations
• Portability for scientific code
Almost universally adopted
Two representations – single precision (32 bits), double precision
(64 bits)
23
IEEE Floating-Point Format
Single: Exponent = 8 bits, Fraction = 23 bits
Double: Exponent = 11 bits, Fraction = 52 bits
x = (−1)^S × (1 + Fraction) × 2^(Exponent − Bias)
• S: sign bit (0 = non-negative, 1 = negative)
• Normalize significand: 1.0 ≤ |significand| < 2.0
• Always has leading pre-binary-point 1 bit, no need to represent
explicitly
• Significand is the Fraction with the 1. restored
• Exponent – Excess representation: actual exponent + Bias
• Ensures exponent is unsigned
• Single: Bias = 127; Double: Bias = 1023
24
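The field split can be checked with Python's struct module, which rounds a Python float to the nearest single-precision value before unpacking the raw bits:

```python
import struct

def float32_fields(x):
    """Split a value into its IEEE 754 single-precision sign, exponent, and
    fraction fields (1, 8, and 23 bits respectively)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF      # biased exponent field
    fraction = bits & 0x7FFFFF          # fraction without the hidden 1
    return sign, exponent, fraction
```

For example, 1.0 comes back as sign 0, exponent 127 (the bias itself, since the actual exponent is 0), fraction 0.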
Single Precision
Single: Exponent = 8 bits, Fraction = 23 bits
Double: Exponent = 11 bits, Fraction = 52 bits
x = (−1)^S × (1 + Fraction) × 2^(Exponent − Bias)
What is the range of a single precision float?
25
Single Precision Range
Single: Exponent = 8 bits, Fraction = 23 bits
Double: Exponent = 11 bits, Fraction = 52 bits
x = (−1)^S × (1 + Fraction) × 2^(Exponent − Bias)
Smallest Value:
• Exponent = 0b00000001 == 1 − 127 = −126
• Fraction = 0b00000000000000000000000 == significand = 1.0
• Result = ±1.0 × 2^−126 ≈ ±1.2 × 10^−38
26
Single Precision Range
Single: Exponent = 8 bits, Fraction = 23 bits
Double: Exponent = 11 bits, Fraction = 52 bits
x = (−1)^S × (1 + Fraction) × 2^(Exponent − Bias)
Largest Value:
• Exponent = 0b11111110 == 254 − 127 = +127
• Fraction = 0b11111111111111111111111 == significand ≈ 2.0
• Result = ±2.0 × 2^+127 ≈ ±3.4 × 10^38
27
Double Precision Range
Single: Exponent = 8 bits, Fraction = 23 bits
Double: Exponent = 11 bits, Fraction = 52 bits
x = (−1)^S × (1 + Fraction) × 2^(Exponent − Bias)
Smallest Value:
• Exponent = 0b00000000001 == 1 − 1023 = −1022
• Fraction = 0b00...0000 == significand = 1.0
• Result = ±1.0 × 2^−1022 ≈ ±2.2 × 10^−308
28
Double Precision Range
Single: Exponent = 8 bits, Fraction = 23 bits
Double: Exponent = 11 bits, Fraction = 52 bits
x = (−1)^S × (1 + Fraction) × 2^(Exponent − Bias)
Largest Value:
• Exponent = 0b11111111110 == 2046 − 1023 = +1023
• Fraction = 0b11...1111 == significand ≈ 2.0
• Result = ±2.0 × 2^+1023 ≈ ±1.8 × 10^308
29
Floating-Point Precision
Relative Precision
• All fraction bits are significant
• Single: approximately 2^−23
• Equivalent to 23 × log10(2) ≈ 23 × 0.3 ≈ 6 decimal digits of
precision
• Double: approximately 2^−52
• Equivalent to 52 × log10(2) ≈ 52 × 0.3 ≈ 16 decimal digits of
precision
30
Floating Point Example
Represent -0.75
• −0.75 = (−1)^1 × 1.1₂ × 2^−1
• S = 1
• Fraction = 1000...00₂
• Exponent = −1 + Bias
• Single: −1 + 127 = 126 = 01111110₂
• Double: −1 + 1023 = 1022 = 01111111110₂
Single: 1 01111110 1000000...000000
Double: 1 01111111110 100000...000
31
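The hand-derived single-precision pattern can be verified with struct (a quick check, not part of the slides):

```python
import struct

# Unpack the raw bits of -0.75 rounded to single precision.
bits = struct.unpack(">I", struct.pack(">f", -0.75))[0]
pattern = f"{bits:032b}"   # sign(1) | exponent(8) | fraction(23)
```

`pattern` comes out as sign 1, exponent 01111110, then fraction 1 followed by 22 zeros, matching the derivation above.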
Floating Point Example
Let’s go the other way:
What number is represented by the single-precision float:
1 10000001 01000000...000000
• S = 1
• Fraction = 01000...00₂
• Exponent = 10000001₂ = 129 == 129 − 127
x = (−1)^1 × (1 + 0.01₂) × 2^(129−127)
x = (−1) × (1.25) × 2^2
x = −5.0
32
IEEE 754 Encoding
Why are exponent 0 and exponent max special?
Special encodings:
Single Precision        Double Precision        Object Represented
Exponent  Fraction      Exponent  Fraction
0         0             0         0             0
0         Nonzero       0         Nonzero       ± denormalized number
1–254     Anything      1–2046    Anything      ± floating-point number
255       0             2047      0             ± infinity
255       Nonzero       2047      Nonzero       NaN (Not a Number)
33
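The table translates into a straightforward classifier on the raw fields; a sketch for single precision:

```python
def classify_float32(exponent, fraction):
    """Classify a single-precision value from its raw 8-bit exponent and
    23-bit fraction fields, per the IEEE 754 special encodings."""
    if exponent == 0:
        return "zero" if fraction == 0 else "denormalized"
    if exponent == 255:                 # all-ones exponent is reserved
        return "infinity" if fraction == 0 else "NaN"
    return "normalized"
```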
Denormal Numbers
Notice that if the exponent is zero, that produces a “denormalized
number”
This means that the “hidden bit” (i.e. the leading 1 in the
fraction), is instead a 0
x = (−1)^S × (0 + 0.Fraction) × 2^(1 − Bias)
Smaller than normal numbers
• Allows for gradual underflow with diminishing precision
34
Infinity & NaNs
Exponent = 111...1, Fraction = 000000
• ± infinity
• Can be used in subsequent calculations, avoiding need for
overflow check
Exponent = 111...1, Fraction ≠ 000...0
• Not-a-Number (NaN)
• Indicates illegal or undefined result (e.g. 0.0/0.0)
• Can be used in subsequent calculations
35
Floating Point Arithmetic
Floating-Point Addition (Decimal)
Consider a 4-digit decimal example1:
9.999 × 10^1 + 1.610 × 10^−1
How do we do this?
1. Align decimal points
• Shift number with smaller exponent
• 9.999 × 10^1 + 0.016 × 10^1
2. Add significands
• 9.999 × 10^1 + 0.016 × 10^1 = 10.015 × 10^1
3. Normalize result, check for over/underflow
• 1.0015 × 10^2
4. Round and renormalize if necessary
• 1.002 × 10^2
1Examples directly from book so you can follow along
36
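The four steps can be sketched for decimal significands; this toy version (the function name and rounding details are simplifications, not IEEE-exact) reproduces the worked example:

```python
def fp_add_decimal(s1, e1, s2, e2, digits=4):
    """Add two decimal floating-point numbers s1*10^e1 + s2*10^e2 keeping
    `digits` significant digits: align, add, normalize, round."""
    # 1. Align decimal points: shift the number with the smaller exponent.
    if e1 < e2:
        s1, e1, s2, e2 = s2, e2, s1, e1
    s2 /= 10 ** (e1 - e2)
    # 2. Add significands.
    s, e = s1 + s2, e1
    # 3. Normalize the result (1.0 <= |s| < 10).
    while abs(s) >= 10:
        s /= 10
        e += 1
    while s != 0 and abs(s) < 1:
        s *= 10
        e -= 1
    # 4. Round to the available digits and renormalize if needed.
    s = round(s, digits - 1)
    if abs(s) >= 10:
        s /= 10
        e += 1
    return s, e
```

Feeding in 9.999 × 10^1 and 1.610 × 10^−1 yields 1.002 × 10^2, matching the slide.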
Floating-Point Addition (Binary)
Consider a 4-digit binary example:
1.000₂ × 2^−1 + (−1.110₂ × 2^−2) (i.e. 0.5 + (−0.4375))
How do we do this?
1. Align binary points
• Shift number with smaller exponent
• 1.000₂ × 2^−1 + (−0.111₂ × 2^−1)
2. Add significands
• 1.000₂ × 2^−1 + (−0.111₂ × 2^−1) = 0.001₂ × 2^−1
3. Normalize result, check for over/underflow
• 1.000₂ × 2^−4, with no over/underflow
4. Round and renormalize if necessary
• 1.000₂ × 2^−4 (no change) = 0.0625
37
Floating Point Adder Hardware
Much more complex than integer
add/subtraction
Doing it in one clock cycle would take
too long
• Much longer than integer
operations
• Slower clock would penalize all
instructions
FP Adder usually takes several cycles
• Can be pipelined
38
Floating Point Adder Hardware
39
What about Multiplication?
FP multiplier is similar complexity to FP adder
• Uses a multiplier for significands instead of an adder
FP arithmetic hardware usually does
• Addition, subtraction, multiplication, division, reciprocal,
square-root
• FP ↔ integer conversion
40
FP Instructions in MIPS
FP Hardware is coprocessor 1
• Adjunct processor that extends the ISA
Separate FP registers
• 32 single-precision: $f0, $f1, ..., $f31
• Paired for double-precision: $f0/$f1, $f2/$f3, ...
• Release 2 of MIPS ISA supports 32x64-bit FP registers
FP instructions operate only on FP registers
• Programs generally don’t do integer operations on FP data or
vice versa
• More registers with minimal code-size impact
FP load and store instructions
• lwc1, ldc1, swc1, sdc1
41
FP Instructions in MIPS
Single-precision arithmetic
• add.s, sub.s, mul.s, div.s
• e.g. add.s $f0, $f1, $f6
Double-precision arithmetic
• add.d, sub.d, mul.d, div.d
Single and double-precision comparison
• c.xx.s, c.xx.d (xx is eq, lt, le, ...)
• Sets or clears FP condition-code bit
Branch on FP condition code true or false
• bc1t, bc1f – branch on true/false
42
FP Example: Fahrenheit to Celsius
C Code:
float f2c (float fahr){
return ((5.0/9.0)*(fahr-32.0));
}
fahr in $f12, result in $f0, literals in global memory space
f2c: lwc1 $f16, const5($gp)
lwc1 $f18, const9($gp)
div.s $f16, $f16, $f18
lwc1 $f18, const32($gp)
sub.s $f18, $f12, $f18
mul.s $f0, $f16, $f18
jr $ra
43
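A Python rendering of f2c that forces each intermediate to single precision (mimicking what the .s instructions compute; the rounding helper is an illustration, not MIPS code) shows the rounding the hardware would do:

```python
import struct

def f2c(fahr):
    """Fahrenheit to Celsius with every intermediate rounded to the nearest
    IEEE 754 single-precision value, as the .s instructions would."""
    def to_f32(x):
        # Round-trip through a 4-byte float to round to single precision.
        return struct.unpack(">f", struct.pack(">f", x))[0]
    const = to_f32(to_f32(5.0) / to_f32(9.0))   # div.s: 5.0/9.0 in float32
    diff = to_f32(fahr - 32.0)                   # sub.s: fahr - 32.0
    return to_f32(const * diff)                  # mul.s: final product
```

Because 5.0/9.0 is not exact in single precision, f2c(212.0) lands very close to, but not exactly on, 100.0.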
Accurate Arithmetic
IEEE Std 754 specifies additional rounding control
• Extra bits of precision (guard, round, sticky)
• Choice of rounding modes
• Allows programmer to fine-tune numerical behavior of
computation
Not all FP units implement all options
• Most programming languages and FP libraries just use defaults
Trade-off between hardware complexity, performance, and market
requirements
44
Who cares about FP Accuracy?
Important for scientific code!
What about everyday consumer use?
• Financial interactions on large scales (e.g. stock market
transfers)
• Video games (miss your target because of rounding issues)
Bottom line: The market expects accuracy from computers, and
generally doesn’t understand precision
45