CS2504, Spring 2007 © Dimitris Nikolopoulos
MIPS ALU
We already support ADD, SUB, AND, OR, and NOR. We still need support for SLT (set on less than).
SLT sets bit 0 of the result to 1 if a < b and to 0 otherwise; all other result bits are 0.
SLT can be implemented via subtraction (a - b) and a subsequent check of the sign of the result (set if < 0).
The subtraction can be done in the adder already present in the ALU.
MIPS ALU with SLT
If a - b < 0, then bit 31 of the result will be 1. We can use a feedback line from bit 31 to bit 0 to complete the SLT operation.
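As a rough illustration of the feedback idea, the sketch below models a 32-bit subtraction in Python and copies bit 31 of the difference into bit 0 of the result. The name `slt32` and the constant `MASK32` are illustrative assumptions, not from the slides, and the sketch ignores overflow in a - b, which a full design would have to handle.

```python
MASK32 = 0xFFFFFFFF  # keep values to 32 bits, like the ALU datapath


def slt32(a: int, b: int) -> int:
    """Return 1 if a < b, else 0, using only the sign bit of a - b.

    Hypothetical helper: overflow in a - b is ignored, so the answer is
    only reliable when the true difference fits in 32 signed bits.
    """
    diff = (a - b) & MASK32      # 32-bit two's-complement a - b
    sign = (diff >> 31) & 1      # bit 31 of the difference
    return sign                  # fed back to bit 0; bits 1..31 stay 0
```

For example, `slt32(3, 5)` takes the sign bit of -2, while `slt32(5, 3)` takes the sign bit of 2.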
MIPS ALU with Conditional Branches
beq can be checked by subtracting the operands and testing the result for zero: simply OR the bits of the ALU output; the result is zero exactly when that OR is 0.
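The zero test can be sketched in a few lines of Python. This is a software model under stated assumptions, not the hardware itself: `zero_flag` and `beq_taken` are hypothetical names, and the loop stands in for a single wide OR gate followed by an inverter.

```python
def zero_flag(result: int, width: int = 32) -> int:
    """OR all bits of the ALU output, then invert: 1 means the result is zero."""
    any_bit_set = 0
    for i in range(width):
        any_bit_set |= (result >> i) & 1
    return any_bit_set ^ 1


def beq_taken(a: int, b: int) -> bool:
    """Model beq: subtract, then test the 32-bit difference for zero."""
    diff = (a - b) & 0xFFFFFFFF
    return zero_flag(diff) == 1
```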
ALU Control Lines
ALU Control Signal    Function
0000                  AND
0001                  OR
0010                  ADD
0110                  SUB
0111                  SLT
1100                  NOR
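Reading each 4-bit code from left to right: the first bit is A invert (set only for NOR) and the second bit is B negate (set for SUB, SLT, and NOR).
The last two bits select the operation (00 AND, 01 OR, 10 ADD, 11 SLT), so NOR is computed as (NOT a) AND (NOT b).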
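The six codes can be exercised with a small software model. The sketch below is an illustration in Python, not the slide's hardware: it reads each 4-bit code as three fields (invert a, negate b, 2-bit operation select), a layout assumed here because it reproduces all six entries of the table. The name `alu` and the 32-bit mask are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def alu(control: int, a: int, b: int) -> int:
    """Model a 32-bit ALU driven by a 4-bit control code.

    Assumed field layout (it reproduces the six listed codes):
    bit 3 = invert a, bit 2 = negate b (invert b and set carry-in),
    bits 1..0 = operation (00 AND, 01 OR, 10 ADD, 11 SLT).
    """
    a_invert = (control >> 3) & 1
    b_negate = (control >> 2) & 1
    op = control & 0b11

    x = (~a & MASK32) if a_invert else (a & MASK32)
    y = (~b & MASK32) if b_negate else (b & MASK32)
    carry_in = b_negate  # a + ~b + 1 equals a - b in two's complement

    if op == 0b00:
        return x & y
    if op == 0b01:
        return x | y
    total = (x + y + carry_in) & MASK32
    if op == 0b10:
        return total
    return (total >> 31) & 1  # SLT: sign bit of a - b goes to bit 0
```

With this layout, SUB (0110) is just ADD with b negated, and NOR (1100) is AND with both inputs inverted.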
Faster adders
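Expensive addition: writing each carry directly in terms of the inputs (two-level logic) grows quickly. Writing $c_i$ for $\mathrm{CarryIn}_i$:
$c_1 = b_0 \cdot c_0 + a_0 \cdot c_0 + a_0 \cdot b_0$
$c_2 = b_1 \cdot c_1 + a_1 \cdot c_1 + a_1 \cdot b_1$
Substituting $c_1$ into $c_2$:
$c_2 = b_1 \cdot b_0 \cdot c_0 + b_1 \cdot a_0 \cdot c_0 + b_1 \cdot a_0 \cdot b_0 + a_1 \cdot b_0 \cdot c_0 + a_1 \cdot a_0 \cdot c_0 + a_1 \cdot a_0 \cdot b_0 + a_1 \cdot b_1$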
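The expansion of c2 can be checked by brute force over all 32 input combinations. The sketch below is a hedged sanity check in Python; the function names are illustrative and not from the slides.

```python
from itertools import product


def carry(a: int, b: int, c: int) -> int:
    """Carry out of one full adder: b*c + a*c + a*b."""
    return (b & c) | (a & c) | (a & b)


def c2_two_level(a0: int, b0: int, a1: int, b1: int, c0: int) -> int:
    """c2 written directly in terms of the inputs (seven product terms)."""
    return ((b1 & b0 & c0) | (b1 & a0 & c0) | (b1 & a0 & b0) |
            (a1 & b0 & c0) | (a1 & a0 & c0) | (a1 & a0 & b0) |
            (a1 & b1))


def expansion_matches() -> bool:
    """Compare the chained form with the expanded form on every input."""
    for a0, b0, a1, b1, c0 in product((0, 1), repeat=5):
        c1 = carry(a0, b0, c0)
        if carry(a1, b1, c1) != c2_two_level(a0, b0, a1, b1, c0):
            return False
    return True
```

The same substitution for c3, c4, and beyond adds more and wider product terms at each step, which is why this direct form becomes expensive.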
Addition with propagation and generation
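Originally:
$c_{i+1} = b_i \cdot c_i + a_i \cdot c_i + a_i \cdot b_i = (a_i + b_i) \cdot c_i + a_i \cdot b_i$
Propagate and generate terms create a recursive formula for the next carry out:
$g_i = a_i \cdot b_i$
$p_i = a_i + b_i$
$c_{i+1} = p_i \cdot c_i + g_i$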
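The recursion can be traced bit by bit. The following Python sketch (the helper name `ripple_carries` is an assumption for illustration) computes every carry from the generate and propagate terms, using OR for + and AND for the product.

```python
def ripple_carries(a: int, b: int, c0: int, width: int) -> list:
    """Return [c0, c1, ..., c_width] using c[i+1] = p[i]*c[i] + g[i]."""
    carries = [c0]
    for i in range(width):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        g = ai & bi      # generate: this bit creates a carry on its own
        p = ai | bi      # propagate: this bit passes an incoming carry on
        carries.append((p & carries[-1]) | g)
    return carries
```

The last entry of the returned list is the carry out of the whole addition, so it should agree with bit `width` of the ordinary integer sum a + b + c0.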
Addition with propagation and generation
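Cheaper than two-level logic (3 product terms versus 7 for $c_2$), but still quite expensive for higher-order carries:
$c_2 = g_1 + p_1 \cdot g_0 + p_1 \cdot p_0 \cdot c_0$
$c_4 = g_3 + p_3 \cdot g_2 + p_3 \cdot p_2 \cdot g_1 + p_3 \cdot p_2 \cdot p_1 \cdot g_0 + p_3 \cdot p_2 \cdot p_1 \cdot p_0 \cdot c_0$
Idea: unroll the addition in blocks (example: a 16-bit add using 4-bit carry-lookahead adders).
Super-propagate signals:
$P_0 = p_3 \cdot p_2 \cdot p_1 \cdot p_0$
$P_1 = p_7 \cdot p_6 \cdot p_5 \cdot p_4$
$P_2 = p_{11} \cdot p_{10} \cdot p_9 \cdot p_8$
$P_3 = p_{15} \cdot p_{14} \cdot p_{13} \cdot p_{12}$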
Addition with propagation and generation
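Super-generate signals:
$G_0 = g_3 + p_3 \cdot g_2 + p_3 \cdot p_2 \cdot g_1 + p_3 \cdot p_2 \cdot p_1 \cdot g_0$
$G_1 = g_7 + p_7 \cdot g_6 + p_7 \cdot p_6 \cdot g_5 + p_7 \cdot p_6 \cdot p_5 \cdot g_4$
$G_2 = g_{11} + p_{11} \cdot g_{10} + p_{11} \cdot p_{10} \cdot g_9 + p_{11} \cdot p_{10} \cdot p_9 \cdot g_8$
$G_3 = g_{15} + p_{15} \cdot g_{14} + p_{15} \cdot p_{14} \cdot g_{13} + p_{15} \cdot p_{14} \cdot p_{13} \cdot g_{12}$
Block carries:
$C_1 = G_0 + P_0 \cdot c_0$
$C_2 = G_1 + P_1 \cdot G_0 + P_1 \cdot P_0 \cdot c_0$
$C_3 = G_2 + P_2 \cdot G_1 + P_2 \cdot P_1 \cdot G_0 + P_2 \cdot P_1 \cdot P_0 \cdot c_0$
$C_4 = G_3 + P_3 \cdot G_2 + P_3 \cdot P_2 \cdot G_1 + P_3 \cdot P_2 \cdot P_1 \cdot G_0 + P_3 \cdot P_2 \cdot P_1 \cdot P_0 \cdot c_0$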
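To make the two-level structure concrete, the sketch below computes the four block carries of a 16-bit add from the super-propagate and super-generate signals. It is a Python model under stated assumptions: the helper names (`bits`, `block_signals`, `block_carries`) are invented for illustration, and the code evaluates the formulas sequentially rather than in parallel as hardware would.

```python
def bits(x: int, n: int) -> list:
    """Little-endian list of the low n bits of x."""
    return [(x >> i) & 1 for i in range(n)]


def block_signals(p: list, g: list) -> tuple:
    """Super-propagate P and super-generate G for one 4-bit block."""
    P = p[3] & p[2] & p[1] & p[0]
    G = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
         | (p[3] & p[2] & p[1] & g[0]))
    return P, G


def block_carries(a: int, b: int, c0: int) -> tuple:
    """Return (C1, C2, C3, C4): the carries out of bits 3, 7, 11, 15."""
    abits, bbits = bits(a, 16), bits(b, 16)
    p = [x | y for x, y in zip(abits, bbits)]   # per-bit propagate
    g = [x & y for x, y in zip(abits, bbits)]   # per-bit generate
    (P0, G0), (P1, G1), (P2, G2), (P3, G3) = (
        block_signals(p[k:k + 4], g[k:k + 4]) for k in range(0, 16, 4))
    C1 = G0 | (P0 & c0)
    C2 = G1 | (P1 & G0) | (P1 & P0 & c0)
    C3 = G2 | (P2 & G1) | (P2 & P1 & G0) | (P2 & P1 & P0 & c0)
    C4 = (G3 | (P3 & G2) | (P3 & P2 & G1) | (P3 & P2 & P1 & G0)
          | (P3 & P2 & P1 & P0 & c0))
    return C1, C2, C3, C4
```

Each block carry depends only on c0 and the block-level signals, never on a neighboring block's carry, which is what removes the ripple between blocks.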
Comparing hardware implementations
Assume each AND or OR gate takes the same time for a signal to pass through.
Count gates along the critical path through the logic.
Compare gate delays
Comparing hardware implementations
Ripple carry, 16-bit add: 2 gate delays per bit, so 16 × 2 = 32 gate delays.
Comparing hardware implementations
2 gate delays per C (sum of products)
2 gate delays per G (sum of products)
1 gate delay per P (a single AND with more inputs)
Total: 2 + 2 + 1 = 5 gate delays, about 6 times faster than the 32 gate delays of ripple carry.
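$C_4 = G_3 + P_3 \cdot G_2 + P_3 \cdot P_2 \cdot G_1 + P_3 \cdot P_2 \cdot P_1 \cdot G_0 + P_3 \cdot P_2 \cdot P_1 \cdot P_0 \cdot c_0$
$P_0 = p_3 \cdot p_2 \cdot p_1 \cdot p_0$
$P_1 = p_7 \cdot p_6 \cdot p_5 \cdot p_4$
$P_2 = p_{11} \cdot p_{10} \cdot p_9 \cdot p_8$
$P_3 = p_{15} \cdot p_{14} \cdot p_{13} \cdot p_{12}$
$G_0 = g_3 + p_3 \cdot g_2 + p_3 \cdot p_2 \cdot g_1 + p_3 \cdot p_2 \cdot p_1 \cdot g_0$
$G_1 = g_7 + p_7 \cdot g_6 + p_7 \cdot p_6 \cdot g_5 + p_7 \cdot p_6 \cdot p_5 \cdot g_4$
$G_2 = g_{11} + p_{11} \cdot g_{10} + p_{11} \cdot p_{10} \cdot g_9 + p_{11} \cdot p_{10} \cdot p_9 \cdot g_8$
$G_3 = g_{15} + p_{15} \cdot g_{14} + p_{15} \cdot p_{14} \cdot g_{13} + p_{15} \cdot p_{14} \cdot p_{13} \cdot g_{12}$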
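The delay comparison is simple arithmetic, sketched below in Python. It assumes a 16-bit add and that the three per-stage figures quoted above lie on a single critical path, so they add; the constant and function names are illustrative only.

```python
RIPPLE_DELAY_PER_BIT = 2                              # gate delays per bit
LOOKAHEAD_STAGE_DELAYS = {"C": 2, "G": 2, "P": 1}     # figures from the slide


def ripple_delay(width: int) -> int:
    """Gate delays for the carry to ripple through `width` bits."""
    return RIPPLE_DELAY_PER_BIT * width


def lookahead_delay() -> int:
    """Gate delays along the assumed critical path of the lookahead adder."""
    return sum(LOOKAHEAD_STAGE_DELAYS.values())


def speedup(width: int = 16) -> float:
    """Ratio of ripple-carry delay to carry-lookahead delay."""
    return ripple_delay(width) / lookahead_delay()
```

Under these assumptions `speedup()` evaluates 32 / 5, which the slide rounds to "6 times faster".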