Introduction to Computer Organization and Architecture
Lecture 10
By Juthawut Chantharamalee
http://dusithost.dusit.ac.th/~juthawut_cha/home.htm
Outline
  Adders
  Comparators
  Shifters
  Multipliers
  Dividers
  Floating Point Numbers
Binary Representations of Numbers
To find negative numbers:
  Sign and magnitude: msb = ‘1’
  1’s complement: complement each bit to change sign
  2’s complement: $2^n$ – positive number

b2 b1 b0 | Unsigned | Sign and Magnitude | 1’s Complement | 2’s Complement
 0  1  1 |    3     |         +3         |       +3       |       +3
 0  1  0 |    2     |         +2         |       +2       |       +2
 0  0  1 |    1     |         +1         |       +1       |       +1
 0  0  0 |    0     |         +0         |       +0       |       +0
 1  0  0 |    4     |         -0         |       -3       |       -4
 1  0  1 |    5     |         -1         |       -2       |       -3
 1  1  0 |    6     |         -2         |       -1       |       -2
 1  1  1 |    7     |         -3         |       -0       |       -1
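The table can be reproduced with a few lines of Python. This is only a sketch: the function name `decode` and the dictionary keys are illustrative choices, and Python has no negative zero for integers, so the table's -0 entries come out as 0.

```python
def decode(bits: str) -> dict:
    """Interpret a bit string under the four representations in the table."""
    n = len(bits)
    unsigned = int(bits, 2)
    msb = unsigned >> (n - 1)                    # sign bit
    magnitude = unsigned & ((1 << (n - 1)) - 1)  # remaining n-1 bits
    return {
        "unsigned": unsigned,
        "sign_magnitude": -magnitude if msb else magnitude,
        # 1's complement: negating flips every bit, so x -> x - (2^n - 1)
        "ones_complement": unsigned - ((1 << n) - 1) if msb else unsigned,
        # 2's complement: a negative value is stored as 2^n - |value|
        "twos_complement": unsigned - (1 << n) if msb else unsigned,
    }

for pattern in ("011", "100", "101", "111"):
    print(pattern, decode(pattern))
```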
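Single-Bit Addition

Half Adder
A B | Cout S
0 0 |  0   0
0 1 |  0   1
1 0 |  0   1
1 1 |  1   0

Full Adder
A B C | Cout S
0 0 0 |  0   0
0 0 1 |  0   1
0 1 0 |  0   1
0 1 1 |  1   0
1 0 0 |  0   1
1 0 1 |  1   0
1 1 0 |  1   0
1 1 1 |  1   1

[Figure: half adder block (inputs A, B; outputs S, Cout) and full adder block (inputs A, B, C; outputs S, Cout)]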
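The two truth tables follow from simple gate equations. A minimal sketch, assuming single-bit integer inputs (the function names are illustrative):

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (cout, s) for two input bits."""
    return a & b, a ^ b

def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """Return (cout, s) for three input bits."""
    s = a ^ b ^ c                         # sum is the XOR of the inputs
    cout = (a & b) | (a & c) | (b & c)    # carry is the majority of the inputs
    return cout, s

# Print the full adder truth table in the same column order as the slide.
for a in (0, 1):
    for b in (0, 1):
        for c in (0, 1):
            print(a, b, c, *full_adder(a, b, c))
```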
Carry-Ripple Adder
Simplest design: cascade full adders
  Critical path goes from Cin to Cout
  Design full adder to have fast carry delay
[Figure: four cascaded full adders with inputs A1..A4 and B1..B4, sums S1..S4, internal carries C1..C3, carry-in Cin and carry-out Cout]
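The cascade can be modelled directly in software. The sketch below assumes bit lists stored least significant bit first; `to_bits` is a hypothetical helper added only for the demonstration.

```python
def full_adder(a: int, b: int, c: int) -> tuple[int, int]:
    """Return (cout, s) for three input bits."""
    return (a & b) | (a & c) | (b & c), a ^ b ^ c

def ripple_carry_add(a_bits, b_bits, cin=0):
    """Add two equal-length bit lists (LSB first); return (sum_bits, cout)."""
    carry = cin
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        # Each stage must wait for the carry of the previous stage.
        carry, s = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry

def to_bits(value: int, width: int) -> list[int]:
    """Hypothetical helper: integer to LSB-first bit list."""
    return [(value >> i) & 1 for i in range(width)]

print(ripple_carry_add(to_bits(0b1111, 4), to_bits(0b0000, 4), cin=1))
```

The loop-carried `carry` variable is the software counterpart of the Cin-to-Cout critical path.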
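Carry Propagate Adders
N-bit adder called CPA
  Each sum bit depends on all previous carries
  How do we compute all these carries quickly?
[Figure: N-bit adder block with inputs AN...1, BN...1 and Cin, and outputs SN...1 and Cout]
Examples (4-bit, carries listed from Cout down to Cin):
  carries 11111: A4...1 = 1111, B4...1 = 0000, S4...1 = 0000
  carries 00000: A4...1 = 1111, B4...1 = 0000, S4...1 = 1111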
Propagate and Generate - Define
An n-bit adder is just a combinational circuit
  $s_i = a_i \oplus b_i \oplus c_i = a_i b_i' c_i' + a_i' b_i c_i' + a_i' b_i' c_i + a_i b_i c_i$ (the “sum of products” form)
  $c_{i+1} = \mathrm{MAJ}(a_i, b_i, c_i) = a_i b_i + a_i c_i + b_i c_i$
Want to write the carry as $c_{i+1} = g_i + p_i c_i$, where $g_i = a_i b_i$ and $p_i = a_i + b_i$
  if $g_i$ is true, then $c_{i+1}$ is true, thus a carry is generated
  if $p_i$ is true and $c_i$ is true, then $c_i$ is propagated
Note that the $p_i$ equation can also be written as $p_i = a_i \oplus b_i$, since $g_i = 1$ when $a_i = b_i = 1$ (i.e. generate occurs, so propagate is a “don’t care”)
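The recurrence $c_{i+1} = g_i + p_i c_i$ can be checked numerically, including the claim that OR and XOR propagate give the same carries. A small sketch, with illustrative names and LSB-first bit lists as an assumption:

```python
def carry_chain(a_bits, b_bits, c0=0, xor_propagate=False):
    """Return [c0, c1, ..., cn] using c[i+1] = g[i] | (p[i] & c[i])."""
    carries = [c0]
    for a, b in zip(a_bits, b_bits):
        g = a & b                               # generate
        p = (a ^ b) if xor_propagate else (a | b)  # propagate, either form
        carries.append(g | (p & carries[-1]))
    return carries

a_bits = [1, 1, 1, 1]   # A = 1111, LSB first
b_bits = [0, 0, 0, 0]   # B = 0000
print(carry_chain(a_bits, b_bits, c0=1))                      # every stage propagates
print(carry_chain(a_bits, b_bits, c0=1, xor_propagate=True))  # same carries
```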
Propagate and Generate - Blocks
In the general case, for any j with i < j and j+1 < k:
  $c_{k+1} = G_{ik} + P_{ik} c_i$
  $G_{ik} = G_{j+1,k} + P_{j+1,k} G_{ij}$
  $P_{ik} = P_{ij} P_{j+1,k}$
$G_{ik}$ equation in words: a carry is generated out of the block consisting of bits i through k inclusive if it is generated in the high-order part of the block (j+1, k), or if it is generated in the low-order part of the block (i, j) and then propagated through the high part
Base case: $G_{0:0} = C_{in}$, $P_{0:0} = 0$
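The block equations amount to one associative combine operation on (G, P) pairs. The sketch below uses illustrative names (`combine`, `block_pg`) and LSB-first bit lists; it folds the bitwise signals into a block signal and shows that splitting the block at a different point j gives the same result.

```python
def combine(low, high):
    """Merge (G, P) of a low-order block with (G, P) of the adjacent high-order block."""
    g_lo, p_lo = low
    g_hi, p_hi = high
    # Generated in the high part, or generated low and propagated through high.
    return g_hi | (p_hi & g_lo), p_lo & p_hi

def block_pg(a_bits, b_bits):
    """Return (G, P) for the whole block of bits (LSB first)."""
    block = (0, 1)   # identity: an empty block generates nothing and propagates
    for a, b in zip(a_bits, b_bits):
        block = combine(block, (a & b, a | b))
    return block

a_bits, b_bits = [1, 0, 1, 1], [1, 1, 0, 0]   # A = 1101, B = 0011, LSB first
G, P = block_pg(a_bits, b_bits)
cin = 0
print("carry out:", G | (P & cin))

# Same result when the block is split into two halves and merged.
halves = combine(block_pg(a_bits[:2], b_bits[:2]), block_pg(a_bits[2:], b_bits[2:]))
print(halves == (G, P))
```

Because the combine step is associative, hardware can evaluate it as a tree rather than a chain, which is what the lookahead schemes exploit.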
Propagate and Generate - Lookahead
Recursively apply to eliminate carry terms
  $c_{i+1} = g_i + p_i g_{i-1} + p_i p_{i-1} g_{i-2} + \dots + p_i p_{i-1} \cdots p_1 g_0 + p_i p_{i-1} \cdots p_1 p_0 c_0$
This is a carry-lookahead adder
  Note large fan-in of OR gate and last AND gate
  Too big! Build p’s and g’s in steps
  $c_1 = g_0 + p_0 c_0$
  $c_2 = G_{01} + P_{01} c_0$
  where $G_{01} = g_1 + p_1 g_0$ and $P_{01} = p_1 p_0$
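PG Logic
[Figure: 4-bit adder drawn in three stages. 1: Bitwise PG logic forms G1..G4 and P1..P4 from A1..A4 and B1..B4, with G0, P0 taken from Cin. 2: Block PG logic forms G00, G01, G02, G03, which give the carries C0..C3 and Cout = C4. 3: Sum logic forms S1..S4.]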
Carry-Ripple Revisited
[Figure: the same 4-bit PG diagram (bitwise PG logic, block signals G00..G03, carries C0..C3, Cout = C4, sums S1..S4), with the block PG logic arranged as a ripple chain]
  $G_{03} = G_3 + P_3 G_{02}$
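Carry-Skip Adder
Carry-ripple is slow through all N stages
Carry-skip allows carry to skip over groups of n bits
  Decision based on n-bit propagate signal
[Figure: 16-bit carry-skip adder built from four 4-bit adders (groups 4:1, 8:5, 12:9, 16:13) with inputs A, B and sums S per group. Each group's propagate signal (P4:1, P8:5, P12:9, P16:13) drives a 2-input multiplexer that selects the carry passed on as C4, C8, C12 and Cout.]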
Carry-Lookahead Adder
Carry-lookahead adder computes $G_{0i}$ for many bits in parallel.
Uses higher-valency cells with more than two inputs.
[Figure: 16-bit carry-lookahead adder built from four 4-bit groups (4:1, 8:5, 12:9, 16:13). Each group produces sums S and group signals G and P (G4:1, P4:1 through G16:13, P16:13), which form C4, C8, C12 and Cout from Cin.]
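Carry-Select Adder
Trick for critical paths dependent on late input X
  Precompute two possible outputs for X = 0, 1
  Select proper output when X arrives
Carry-select adder precomputes n-bit sums
  For both possible carries into n-bit group
[Figure: 16-bit carry-select adder in 4-bit groups. The first group adds A4:1 and B4:1 with Cin to give S4:1 and C4. Each later group (8:5, 12:9, 16:13) has two adders, one with carry-in 0 and one with carry-in 1, and multiplexers controlled by C4, C8 and C12 select the sum and the carry passed on, ending in Cout.]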
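A behavioural sketch of the carry-select idea follows. It is an illustration under simplifying assumptions: every group, including the first, precomputes both candidates, and the function names and the `width`/`group` parameters are hypothetical.

```python
def add_group(a: int, b: int, cin: int, n: int) -> tuple[int, int]:
    """n-bit adder: return (sum, carry_out)."""
    total = a + b + cin
    return total & ((1 << n) - 1), total >> n

def carry_select_add(a: int, b: int, cin: int = 0, width: int = 16, group: int = 4):
    """Add two width-bit numbers group by group; return (sum, cout)."""
    mask = (1 << group) - 1
    result, carry = 0, cin
    for shift in range(0, width, group):
        ga, gb = (a >> shift) & mask, (b >> shift) & mask
        # Both candidates are computed before the incoming carry is known.
        candidate0 = add_group(ga, gb, 0, group)
        candidate1 = add_group(ga, gb, 1, group)
        # The late-arriving carry only drives the selection (the multiplexer).
        s, carry = candidate1 if carry else candidate0
        result |= s << shift
    return result, carry

print(carry_select_add(0x1234, 0xABCD))
```

In hardware the two candidate adders of all groups run in parallel, so only the multiplexer delay per group lies on the carry path.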
Comparators
  0’s detector: A = 00…000
  1’s detector: A = 11…111
  Equality comparator: A = B
  Magnitude comparator: A < B
1’s & 0’s Detectors
  1’s detector: N-input AND gate
  0’s detector: NOTs + 1’s detector (N-input NOR)
[Figure: an 8-input allones detector on A0..A7, a 4-input allzeros detector on A0..A3, and a second 8-input allones detector on A0..A7 drawn with a different gate arrangement]
Equality Comparator
  Check if each bit is equal (XNOR, aka equality gate)
  1’s detect on bitwise equality
[Figure: 4-bit equality comparator: A[i] and B[i] for i = 0..3 feed XNOR gates whose outputs are combined into the signal A = B]
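The detectors and the equality comparator compose as the slides describe. A minimal sketch on bit lists, with illustrative function names:

```python
def all_ones(bits) -> int:
    """1's detector: N-input AND."""
    out = 1
    for bit in bits:
        out &= bit
    return out

def all_zeros(bits) -> int:
    """0's detector: invert every input, then reuse the 1's detector (a NOR)."""
    return all_ones([bit ^ 1 for bit in bits])

def equal(a_bits, b_bits) -> int:
    """Equality comparator: XNOR each bit pair, then 1's detect."""
    return all_ones([(a ^ b) ^ 1 for a, b in zip(a_bits, b_bits)])

print(all_ones([1, 1, 1, 1]), all_zeros([0, 0, 1, 0]), equal([1, 0, 1, 1], [1, 0, 1, 1]))
```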
Magnitude Comparator
  Compute B-A and look at sign
  B-A = B + ~A + 1
  For unsigned numbers, carry out is sign bit
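The identity B-A = B + ~A + 1 can be tried on n-bit unsigned values. In this sketch (function name and return shape are assumptions) the carry out of the n-bit addition is 1 exactly when B >= A, so a carry out of 0 signals A > B.

```python
def b_minus_a(a: int, b: int, n: int) -> tuple[int, int]:
    """Return (difference mod 2^n, carry_out) of B + ~A + 1 on n-bit values."""
    mask = (1 << n) - 1
    total = b + (~a & mask) + 1   # ~A restricted to n bits, plus 1
    return total & mask, total >> n

diff, carry = b_minus_a(a=0b0011, b=0b0101, n=4)
print(format(diff, "04b"), carry)   # 5 - 3, and B >= A
diff, carry = b_minus_a(a=0b0101, b=0b0011, n=4)
print(format(diff, "04b"), carry)   # 3 - 5 wraps around, and B < A
```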
Signed vs. Unsigned
For signed numbers, comparison is harder
  C: carry out
  Z: zero (all bits of A-B are 0)
  N: negative (MSB of result)
  V: overflow (inputs had different signs, and the sign of the result differs from the sign of the number being subtracted from)
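The four flags can be computed for A-B and then combined into comparisons. This is a sketch under stated assumptions: n-bit two's complement operands, and the standard derivations "unsigned A < B when C = 0" and "signed A < B when N XOR V = 1", which are not spelled out in the transcript.

```python
def flags_a_minus_b(a: int, b: int, n: int) -> dict:
    """Compute C, Z, N, V for A - B = A + ~B + 1 on n-bit patterns."""
    mask = (1 << n) - 1
    total = a + (~b & mask) + 1
    result = total & mask
    sign_a, sign_b, sign_r = a >> (n - 1), b >> (n - 1), result >> (n - 1)
    return {
        "C": total >> n,
        "Z": int(result == 0),
        "N": sign_r,
        # Overflow: operand signs differ and the result sign differs from A's.
        "V": int(sign_a != sign_b and sign_r != sign_a),
    }

def unsigned_less(a, b, n):
    return flags_a_minus_b(a, b, n)["C"] == 0

def signed_less(a, b, n):
    f = flags_a_minus_b(a, b, n)
    return (f["N"] ^ f["V"]) == 1

# 1000 is 8 unsigned but -8 signed, so the two comparisons disagree.
print(unsigned_less(0b1000, 0b0001, 4), signed_less(0b1000, 0b0001, 4))
```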
Shifters
Logical Shift: shifts number left or right and fills with 0’s
  1011 LSR1 = 0101    1011 LSL1 = 0110
Arithmetic Shift: shifts number left or right; right shift sign extends
  1011 ASR1 = 1101    1011 ASL1 = 0110
Rotate: shifts number left or right and fills with lost bits
  1011 ROR1 = 1101    1011 ROL1 = 0111
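The six shift types can be written out for an n-bit word to confirm the examples. A sketch with illustrative function names, assuming 0 < k < n:

```python
def lsr(x, k, n): return (x >> k) & ((1 << n) - 1)
def lsl(x, k, n): return (x << k) & ((1 << n) - 1)

def asr(x, k, n):
    """Right shift that copies the sign bit into the vacated positions."""
    mask = (1 << n) - 1
    fill = mask ^ (mask >> k) if x >> (n - 1) else 0   # top k bits set if negative
    return (x >> k) | fill

def asl(x, k, n): return lsl(x, k, n)   # same bit pattern as logical left shift

def ror(x, k, n): return ((x >> k) | (x << (n - k))) & ((1 << n) - 1)
def rol(x, k, n): return ((x << k) | (x >> (n - k))) & ((1 << n) - 1)

x = 0b1011
for name, fn in (("LSR", lsr), ("LSL", lsl), ("ASR", asr),
                 ("ASL", asl), ("ROR", ror), ("ROL", rol)):
    print(name, format(fn(x, 1, 4), "04b"))
```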
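Funnel Shifter
A funnel shifter can do all six types of shifts
Selects N-bit field Y from 2N-bit input
  Shift by k bits (0 ≤ k < N)
[Figure: 2N-bit input formed from B and C, with bit positions 0, N-1 and 2N-1 marked; the output Y is the N-bit field from bit offset to bit offset + N-1]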
Funnel Shifter Operation
Computing N-k requires an adder
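A software model makes the field selection concrete. The operand and offset choices below are assumptions taken from the usual textbook formulation, since the operation table itself did not survive in this transcript: B is treated as the high half of the 2N-bit input, right shifts use offset k, and left shifts use offset N-k (which is where the adder is needed).

```python
def funnel(b: int, c: int, offset: int, n: int) -> int:
    """Select the n-bit field starting at bit `offset` of the 2n-bit word {B, C}."""
    z = (b << n) | c                       # assumption: B is the high half
    return (z >> offset) & ((1 << n) - 1)

def shift(kind: str, a: int, k: int, n: int) -> int:
    """Map each shift type onto funnel inputs (assumed mapping, see lead-in)."""
    ones = (1 << n) - 1
    sign_fill = ones if a >> (n - 1) else 0
    table = {
        "LSR": (0, a, k),
        "ASR": (sign_fill, a, k),
        "ROR": (a, a, k),
        "LSL": (a, 0, n - k),
        "ASL": (a, 0, n - k),
        "ROL": (a, a, n - k),
    }
    b, c, offset = table[kind]
    return funnel(b, c, offset, n)

for kind in ("LSR", "LSL", "ASR", "ASL", "ROR", "ROL"):
    print(kind, format(shift(kind, 0b1011, 1, 4), "04b"))
```

The same multiplexer array serves all six operations; only the values fed to B and C and the offset change.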
Simplified Funnel Shifter
Optimize down to (2N-1)-bit input
Funnel Shifter Design 1
N N-input multiplexers
  Use 1-of-N hot select signals for shift amount
[Figure: 4-bit example with inputs Z0..Z6 and outputs Y0..Y3; the select lines s0..s3 are produced from k[1:0] and left by inverters and a decoder]
Funnel Shifter Design 2
Log N stages of 2-input muxes
  No select decoding needed
[Figure: 4-bit example with inputs Z0..Z6 passing through stages of 2-input muxes controlled by k1, k0 and left to produce Y0..Y3]
Multi-input Adders
Suppose we want to add k N-bit words
  Ex: 0001 + 0111 + 1101 + 0010 = 10111
Straightforward solution: k-1 N-input CPAs
  Large and slow
[Figure: chain of three CPAs adding 0001, 0111, 1101 and 0010 in turn, with intermediate result 10101 and final result 10111]
Carry Save Addition
A full adder sums 3 inputs and produces 2 outputs
  Carry output has twice weight of sum output
N full adders in parallel are called carry save adder
  Produce N sums and N carry outs
[Figure: four full adders side by side, bit i taking Xi, Yi, Zi and producing Si, Ci; drawn equivalently as one n-bit CSA block with inputs XN...1, YN...1, ZN...1 and outputs SN...1, CN...1]
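CSA Application
Use k-2 stages of CSAs
  Keep result in carry-save redundant form
Final CPA computes actual result
Example: 0001 + 0111 + 1101 + 0010
  4-bit CSA: X = 0001, Y = 0111, Z = 1101 gives S = 1011, C = 0101_
  5-bit CSA: X = 0101_, Y = 1011, Z = 0010 gives S = 00011, C = 01010_
  CPA: A = 01010_, B = 00011 gives S = 10111
(The trailing _ marks the empty low-order position of a carry word, which has twice the weight of the sum word.)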
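The worked example can be replayed in code. In this sketch integers stand in for bit vectors, the carry word is shifted left by one to reflect its doubled weight, and `add_many` is an illustrative name for the k-2 CSA stages followed by one carry-propagate addition.

```python
def csa(x: int, y: int, z: int) -> tuple[int, int]:
    """Carry-save adder: return (sum_word, carry_word) with x + y + z == s + c."""
    s = x ^ y ^ z                             # per-bit sum, no carry propagation
    c = ((x & y) | (x & z) | (y & z)) << 1    # per-bit carry, twice the weight
    return s, c

def add_many(words) -> int:
    """Reduce k words with k-2 CSA stages, then one carry-propagate add."""
    words = list(words)
    while len(words) > 2:
        s, c = csa(*words[:3])
        words = [c, s] + words[3:]
    return sum(words)                         # the final CPA

s1, c1 = csa(0b0001, 0b0111, 0b1101)
s2, c2 = csa(c1, s1, 0b0010)
print(format(s1, "04b"), format(c1, "05b"))   # first stage
print(format(s2, "05b"), format(c2, "06b"))   # second stage
print(format(s2 + c2, "05b"))                 # final CPA result
```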
Multiplication
Example:
      1100 : 12₁₀    multiplicand
      0101 : 5₁₀     multiplier
      ----
      1100           partial products
     0000
    1100
   0000
  --------
  00111100 : 60₁₀    product
M x N-bit multiplication
  Produce N M-bit partial products
  Sum these to produce M+N-bit product
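The partial products in the example can be generated and summed programmatically. A minimal sketch, with illustrative function names, that returns each partial product already shifted to its column:

```python
def partial_products(multiplicand: int, multiplier: int, n: int) -> list[int]:
    """One partial product per multiplier bit, shifted into place (LSB first)."""
    return [(multiplicand << i) if (multiplier >> i) & 1 else 0 for i in range(n)]

def multiply(multiplicand: int, multiplier: int, n: int) -> int:
    """Sum the n partial products to form the product."""
    return sum(partial_products(multiplicand, multiplier, n))

for pp in partial_products(0b1100, 0b0101, 4):
    print(format(pp, "08b"))
print(format(multiply(0b1100, 0b0101, 4), "08b"))
```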
General Form
Multiplicand: $Y = (y_{M-1}, y_{M-2}, \dots, y_1, y_0)$
Multiplier: $X = (x_{N-1}, x_{N-2}, \dots, x_1, x_0)$
Product:
  $P = \left( \sum_{j=0}^{M-1} y_j 2^j \right) \left( \sum_{i=0}^{N-1} x_i 2^i \right) = \sum_{i=0}^{N-1} \sum_{j=0}^{M-1} x_i y_j 2^{i+j}$

Example for M = N = 6:
                                    y5   y4   y3   y2   y1   y0    multiplicand
                                    x5   x4   x3   x2   x1   x0    multiplier
                                  x0y5 x0y4 x0y3 x0y2 x0y1 x0y0    partial products
                             x1y5 x1y4 x1y3 x1y2 x1y1 x1y0
                        x2y5 x2y4 x2y3 x2y2 x2y1 x2y0
                   x3y5 x3y4 x3y3 x3y2 x3y1 x3y0
              x4y5 x4y4 x4y3 x4y2 x4y1 x4y0
         x5y5 x5y4 x5y3 x5y2 x5y1 x5y0
    p11  p10   p9   p8   p7   p6   p5   p4   p3   p2   p1   p0    product
Dot Diagram
Each dot represents a bit
[Figure: dot diagram of the partial products, one row per multiplier bit x0..x15]
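Array Multiplier
[Figure: 4 x 4 array multiplier with multiplicand bits y0..y3, multiplier bits x0..x3 and product bits p0..p7. A CSA array is followed by a CPA, and the critical path is marked. Each array cell has inputs A, B, Sin, Cin and outputs Sout, Cout.]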
Rectangular Array
Squash array to fit rectangular floorplan
[Figure: the same 4 x 4 array (y0..y3, x0..x3) redrawn as a rectangle, with p0..p3 leaving along one side and p4..p7 along the bottom]
Fewer Partial Products
Array multiplier requires N partial products
If we looked at groups of r bits, we could form N/r partial products.
  Faster and smaller?
  Called radix-$2^r$ encoding
Ex: r = 2: look at pairs of bits
  Form partial products of 0, Y, 2Y, 3Y
  First three are easy, but 3Y requires adder
Booth Encoding
Instead of 3Y, try –Y, then increment next partial product to add 4Y
Similarly, for 2Y, try –2Y + 4Y in next partial product
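The recoding can be expressed as digits in {-2, -1, 0, 1, 2}, one per pair of multiplier bits. The sketch below uses the standard radix-4 Booth rule, digit = x(2i-1) + x(2i) - 2·x(2i+1) with x(-1) = 0, and assumes an unsigned multiplier padded with zeros above the top bit. The function names are illustrative.

```python
def booth_radix4_digits(x: int, n_bits: int) -> list[int]:
    """Recode an unsigned n-bit multiplier into radix-4 Booth digits (LSB first)."""
    digits = []
    prev = 0                              # x(-1) = 0
    for i in range(0, n_bits + 1, 2):     # one extra group absorbs the final +4Y
        low = (x >> i) & 1
        high = (x >> (i + 1)) & 1
        digits.append(prev + low - 2 * high)
        prev = high
    return digits

def booth_multiply(y: int, x: int, n_bits: int) -> int:
    """Sum the Booth-selected partial products d * Y * 4^i."""
    return sum(d * y * 4**i for i, d in enumerate(booth_radix4_digits(x, n_bits)))

print(booth_radix4_digits(0b0011, 4))     # the pair 11 (3Y) becomes -Y now, +4Y next
print(booth_multiply(12, 5, 4))
```

Every digit needs only 0, ±Y or ±2Y, so no 3Y adder is required.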
Booth Hardware
Booth encoder generates control lines for each PP
  Booth selectors choose PP bits
Sign Extension
Partial products can be negative
  Require sign extension, which is cumbersome
  High fanout on most significant bit
[Figure: dot diagram of partial products PP0..PP8 for a 16-bit multiplier x (x0..x15, with x-1 = 0 and x16 = x17 = 0); each partial product is extended to the left with copies of its sign bit s]
Simplified Sign Ext.
Sign bits are either all 0’s or all 1’s
  Note that all 0’s is all 1’s + 1 in proper column
  Use this to reduce loading on MSB
[Figure: partial products PP0..PP8 with each run of sign-extension bits replaced by a run of 1’s together with the sign bit s]
Even Simpler Sign Ext.
No need to add all the 1’s in hardware
  Precompute the answer!
[Figure: partial products PP0..PP8 with the runs of 1’s replaced by precomputed constants, leaving only a few s bits and a single 1 per row]
Division - Restoring
Do the following n times:
  Shift A and Q left one bit
  Subtract M from A, put answer in A
  If the sign of A is 1, set q0 to 0 and add M back to A
  If the sign of A is 0, set q0 to 1
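The loop translates almost line for line into code. In this sketch Python's signed integers stand in for the (n+1)-bit register A, so "sign of A is 1" becomes `a < 0`; the function name is illustrative.

```python
def restoring_divide(dividend: int, divisor: int, n: int) -> tuple[int, int]:
    """Divide n-bit unsigned numbers; return (quotient, remainder)."""
    a, q, m = 0, dividend, divisor
    mask = (1 << n) - 1
    for _ in range(n):
        # Shift A and Q left one bit; the MSB of Q moves into A.
        a = (a << 1) | ((q >> (n - 1)) & 1)
        q = (q << 1) & mask
        a -= m                 # subtract M from A
        if a < 0:
            a += m             # restore; q0 stays 0
        else:
            q |= 1             # set q0 to 1
    return q, a

print(restoring_divide(0b1000, 0b0011, 4))   # 8 / 3
```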
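[Figure: circuit arrangement for binary division. Register A (an ... a0) and the dividend register Q (qn-1 ... q0) shift left together; the divisor register M (mn-1 ... m0, extended with a leading 0) feeds an (n+1)-bit adder with Add/Subtract control; a control sequencer drives the quotient setting of q0.]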
Division – Restoring Example
[Figure: worked restoring division shown as register contents, starting from the initial values and running through the first, second, third and fourth cycles. Each cycle performs Shift, Subtract and Set q0, with a Restore step when the subtraction gives a negative result. The final register contents are labelled Quotient and Remainder.]
Division - Nonrestoring
Step 1: do the following n times:
  If the sign of A is 0, shift A and Q left and subtract M from A
  Else shift A and Q left and add M to A
  Now if the sign of A is 0, set q0 to 1
  Else set q0 to 0
Step 2: if the sign of A is 1, add M to A
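The same conventions give a short model of the nonrestoring loop. As before this is a sketch: Python's signed integers stand in for the (n+1)-bit register A, and the function name is illustrative.

```python
def nonrestoring_divide(dividend: int, divisor: int, n: int) -> tuple[int, int]:
    """Divide n-bit unsigned numbers; return (quotient, remainder)."""
    a, q, m = 0, dividend, divisor
    mask = (1 << n) - 1
    for _ in range(n):
        msb = (q >> (n - 1)) & 1
        shifted = (a << 1) | msb          # shift A and Q left one bit
        q = (q << 1) & mask
        # Subtract after a non-negative A, add after a negative A.
        a = shifted - m if a >= 0 else shifted + m
        if a >= 0:
            q |= 1                        # set q0 to 1, otherwise it stays 0
    if a < 0:
        a += m                            # single final correction of the remainder
    return q, a

print(nonrestoring_divide(0b1000, 0b0011, 4))   # 8 / 3
```

Compared with restoring division, each cycle performs exactly one add or subtract, and a correction happens at most once, at the end.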
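[Figure: circuit arrangement for binary division. Register A (an ... a0) and the dividend register Q (qn-1 ... q0) shift left together; the divisor register M (mn-1 ... m0, extended with a leading 0) feeds an (n+1)-bit adder with Add/Subtract control; a control sequencer drives the quotient setting of q0.]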
Division – Nonrestoring Example
[Figure: worked nonrestoring division shown as register contents, starting from the initial values and running through the first, second, third and fourth cycles. Each cycle performs Shift, then Subtract or Add, then Set q0. A final Add step restores the remainder. The final register contents are labelled Quotient and Remainder.]
Floating Point – Single Precision
IEEE-754, 854
Decimal point can “move” – hence it’s “floating”
Floating point is useful for scientific calculations
Can represent
  Very large integers and
  Very small fractions
  ~$10^{\pm 38}$
[Figure (a) Single precision: 32 bits made up of S (sign of number: 0 signifies +, 1 signifies -), E’ (8-bit signed exponent in excess-127 representation) and M (23-bit mantissa fraction). Value represented = $\pm 1.M \times 2^{E' - 127}$]
[Figure (b) Example of a single-precision number: S = 0, E’ = 00101000, M = 001010...0. Value represented = $+1.0010100 \times 2^{-87}$]
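The example word can be decoded by hand and cross-checked against the machine's own float32 interpretation. This sketch covers normalized numbers only (the exponent field values 0 and 255 are rejected), and the function name `decode_single` is illustrative.

```python
import struct

def decode_single(word: int) -> float:
    """Decode a normalized 32-bit word as (-1)^S * 1.M * 2^(E' - 127)."""
    s = (word >> 31) & 1
    e = (word >> 23) & 0xFF
    m = word & 0x7FFFFF
    if e in (0, 255):
        raise ValueError("special value, not a normalized number")
    value = (1 + m / 2**23) * 2.0 ** (e - 127)
    return -value if s else value

# S = 0, E' = 00101000, M = 001010 followed by zeros.
example = (0 << 31) | (0b00101000 << 23) | (0b001010 << 17)
print(decode_single(example))
# Cross-check with the standard library's float32 decoding.
print(struct.unpack(">f", example.to_bytes(4, "big"))[0])
```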
Floating Point – Double Precision
Double precision can represent ~$10^{\pm 308}$
[Figure: double-precision format, 64 bits made up of S (sign), E’ (11-bit excess-1023 exponent) and M (52-bit mantissa fraction). Value represented = $\pm 1.M \times 2^{E' - 1023}$]
Floating Point
The IEEE Standard requires these operations, at a minimum
  Add
  Subtract
  Multiply
  Divide
  Remainder
  Square Root
  Decimal/Binary Conversion
Special Values

  E’    | M    | Value
  0     | 0    | ±0
  255   | 0    | ±∞
  0     | ≠ 0  | $\pm 0.M \times 2^{-126}$
  255   | ≠ 0  | Not a Number (NaN)

Exceptions
  Underflow, Overflow, divide by 0, inexact, invalid
FP Arithmetic Operations
Add/Subtract
  Shift mantissa of the number with the smaller exponent right by the difference in exponents
  Set the exponent of the result = the larger exponent
  Add/Sub mantissas, get sign
  Normalize
Multiply/Divide
  Add/Sub exponents, Subtract/Add 127
  Multiply/Divide mantissas, determine sign
  Normalize
FP Guard Bits and Truncation
Guard bits
  Extra bits during intermediate steps to yield maximum accuracy in the final result
  They need to be removed when generating the final result
Chopping
  Simply remove guard bits
Von Neumann rounding
  If all guard bits are 0, chop; else set the LSB of the result to 1
Rounding
  Add 1 to LSB if guard MSB = 1
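The three truncation methods differ only in how they treat the removed bits. A small sketch, assuming the mantissa is held as an integer whose lowest `g` bits are the guard bits; the function names are illustrative.

```python
def chop(m: int, g: int) -> int:
    """Chopping: simply drop the g guard bits."""
    return m >> g

def von_neumann(m: int, g: int) -> int:
    """Von Neumann rounding: force the LSB to 1 if any guard bit is set."""
    kept = m >> g
    return kept | 1 if m & ((1 << g) - 1) else kept

def round_nearest(m: int, g: int) -> int:
    """Rounding: add 1 to the LSB if the most significant guard bit is 1."""
    return (m >> g) + ((m >> (g - 1)) & 1)

for m in (0b0110_000, 0b0110_011, 0b0110_101):
    print(format(m, "07b"),
          format(chop(m, 3), "04b"),
          format(von_neumann(m, 3), "04b"),
          format(round_nearest(m, 3), "04b"))
```

Rounding up can carry out of the kept bits (for example 1111 with guard bits 100), in which case the result must be renormalized.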
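FP Add-Subtract Unit
[Figure: block diagram of a floating-point add-subtract unit for 32-bit operands A: SA, EA, MA and B: SB, EB, MB, producing the 32-bit result R = A + B as SR, ER, MR. An 8-bit subtractor forms n = EA - EB and its sign. SWAP routes the M of the number with the smaller E to a SHIFTER, which shifts it n bits to the right, and the M of the number with the larger E directly to the mantissa adder/subtractor. A combinational CONTROL network takes SA, SB, the Add/Sub input and the sign to choose Add or Subtract and the result sign. The magnitude M passes through a leading zeros detector and the normalize and round stage. A MUX selects E from EA and EB, and a second 8-bit subtractor forms E - X for the result exponent.]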