vlsi arithmetic-lect-5
![Page 1: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/1.jpg)
VLSI Arithmetic
Lecture 5
Prof. Vojin G. Oklobdzija
University of California
http://www.ece.ucdavis.edu/acsel
![Page 2: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/2.jpg)
Review
Lecture 4
![Page 3: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/3.jpg)
Ling’s Adder
Huey Ling, “High-Speed Binary Adder”
IBM Journal of Research and Development, Vol. 25, No. 3, May 1981.
Used in: IBM 3033, IBM S370/168, Amdahl V6, HP etc.
![Page 4: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/4.jpg)
Oklobdzija 2004 Computer Arithmetic 4
Ling’s Derivations
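| $a_i$ | $b_i$ | $p_i$ | $g_i$ | $t_i$ |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| 0 | 1 | 1 | 0 | 1 |
| 1 | 0 | 1 | 0 | 1 |
| 1 | 1 | 0 | 1 | 1 |

$$H_{i+1} = C_{i+1} + C_i \qquad g_i = a_i b_i$$

[Figure: full-adder cell with inputs $a_i$, $b_i$, $c_i$ and outputs $s_i$, $c_{i+1}$]

$g_i$ implies $C_{i+1}$, which implies $H_{i+1}$, thus: $g_i = g_i H_{i+1}$

define: $C_{i+1} = g_i + p_i C_i$

$$\begin{aligned}
p_i H_{i+1} &= p_i C_{i+1} + p_i C_i \\
            &= p_i g_i + p_i p_i C_i + p_i C_i = p_i C_i
\end{aligned}$$

$$p_i H_{i+1} = p_i C_i$$

$$\begin{aligned}
C_{i+1} &= g_i + p_i C_i = g_i H_{i+1} + p_i C_i \\
        &= g_i H_{i+1} + p_i H_{i+1} = t_i H_{i+1}
\end{aligned}$$

$$C_{i+1} = t_i H_{i+1}$$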
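The identities on this slide involve only the signals of a single bit position, so they can be checked exhaustively. The following is a small sanity check in Python, not part of the lecture; the function name `ling_identities_hold` is made up for illustration, and `p`, `g`, `t` are taken as XOR, AND and OR of the input bits, as in the truth table.

```python
from itertools import product


def ling_identities_hold() -> bool:
    """Check the single-bit Ling identities for every (a_i, b_i, C_i)."""
    for a, b, c in product((0, 1), repeat=3):
        p, g, t = a ^ b, a & b, a | b
        c_next = g | (p & c)      # C_{i+1} = g_i + p_i C_i
        h_next = c_next | c       # H_{i+1} = C_{i+1} + C_i
        if g != (g & h_next):             # g_i = g_i H_{i+1}
            return False
        if (p & h_next) != (p & c):       # p_i H_{i+1} = p_i C_i
            return False
        if c_next != (t & h_next):        # C_{i+1} = t_i H_{i+1}
            return False
    return True
```

Calling `ling_identities_hold()` walks all eight input combinations and returns `True` only if every identity holds for each of them.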
![Page 5: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/5.jpg)
Ling’s Derivations
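From: $H_{i+1} = C_{i+1} + C_i$ and $C_{i+1} = g_i + p_i C_i$

$$H_{i+1} = C_{i+1} + C_i = g_i + p_i C_i + C_i = g_i + C_i$$

$$H_{i+1} = g_i + t_{i-1} H_i \quad \text{(fundamental expansion)}$$

because: $C_i = t_{i-1} H_i$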
Now we need to derive Sum equation
![Page 6: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/6.jpg)
Ling Adder
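Ling, IBM J. Res. Dev, 5/81

Variation of CLA:

$$C_{i+1} = g_i + p_i C_i$$
$$S_i = p_i \oplus C_i$$
$$p_i = a_i \oplus b_i$$
$$g_i = a_i b_i$$

Ling’s equations:

$$H_{i+1} = g_i + t_{i-1} H_i$$
$$S_i = t_i \oplus H_{i+1} + g_i t_{i-1} H_i$$
$$t_i = a_i + b_i$$
$$g_i = a_i b_i$$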
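To see Ling's recurrence and sum equation working together, here is a minimal bit-serial sketch in Python. It is an illustration only, not the circuit from the lecture: the function name `ling_add` is hypothetical, and the boundary condition ($t_{-1} = 1$, $H_0 = C_{in}$, so that $C_0 = t_{-1} H_0 = C_{in}$) is an assumption chosen to start the recurrence.

```python
def ling_add(a: int, b: int, n: int = 8, c_in: int = 0) -> int:
    """Add two n-bit integers using Ling's pseudo-carry H; returns n+1 bits."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]      # g_i = a_i b_i
    t = [((a >> i) | (b >> i)) & 1 for i in range(n)]    # t_i = a_i + b_i
    t_prev, h = 1, c_in        # assumed boundary: t_{-1} = 1, H_0 = C_in
    result = 0
    for i in range(n):
        c_i = t_prev & h                       # C_i = t_{i-1} H_i
        h_next = g[i] | c_i                    # H_{i+1} = g_i + t_{i-1} H_i
        s_i = (t[i] ^ h_next) | (g[i] & c_i)   # S_i = t_i xor H_{i+1} + g_i t_{i-1} H_i
        result |= s_i << i
        t_prev, h = t[i], h_next
    c_out = t_prev & h                         # C_n = t_{n-1} H_n
    return result | (c_out << n)
```

The loop is serial only for readability; a real Ling adder evaluates the $H$ terms with lookahead, as the following slides describe.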
![Page 7: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/7.jpg)
Ling Adder
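$$C_{i+1} = g_i + g_i C_i + p_i C_i = g_i + (g_i + p_i) C_i$$

Variation of CLA:

$$C_{i+1} = g_i + t_i C_i$$

Ling’s equation:

$$H_{i+1} = g_i + t_{i-1} H_i$$

see: Doran, IEEE Trans on Comp. Vol 37, No.9 Sept. 1988.
Ling uses a different transfer function. Four of those functions have desired properties (Ling’s is one of them).

[Figure: two adjacent bit cells, bit $i$ (inputs $a_i$, $b_i$, $c_i$; outputs $s_i$, $c_{i+1}$) and bit $i-1$ (inputs $a_{i-1}$, $b_{i-1}$, $c_{i-1}$; output $s_{i-1}$), with signals $g_i, t_i$ and $g_{i-1}, t_{i-1}$ and pseudo-carries $H_{i+1}$, $H_i$]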
![Page 8: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/8.jpg)
Ling Adder
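Conventional (fan-in of 5):

$$C_4 = g_3 + t_3 g_2 + t_3 t_2 g_1 + t_3 t_2 t_1 g_0 + t_3 t_2 t_1 t_0 C_{in}$$

Ling (fan-in of 4):

$$\begin{aligned}
H_4 &= g_3 + t_2 g_2 + t_2 t_1 g_1 + t_2 t_1 t_0 g_0 + t_2 t_1 t_0 t_{-1} C_{in} \\
H_4 &= g_3 + g_2 + t_2 g_1 + t_2 t_1 g_0 + t_2 t_1 t_0 C_{in}
\end{aligned}$$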
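The fan-in difference can be made concrete by writing both 4-bit expressions out and checking them against ordinary addition. The sketch below is an illustration in Python; the helper names are hypothetical, and $t_i = a_i + b_i$ is used as the transfer signal in both forms, which is an assumption consistent with Ling's definitions.

```python
from itertools import product


def bit_signals(a: int, b: int, n: int = 4):
    """Per-bit generate g_i = a_i b_i and transfer t_i = a_i + b_i."""
    g = [(a >> i) & (b >> i) & 1 for i in range(n)]
    t = [((a >> i) | (b >> i)) & 1 for i in range(n)]
    return g, t


def c4_conventional(g, t, c_in):
    # Widest product term has 5 literals: t3 t2 t1 t0 Cin.
    return (g[3] | t[3] & g[2] | t[3] & t[2] & g[1]
            | t[3] & t[2] & t[1] & g[0]
            | t[3] & t[2] & t[1] & t[0] & c_in)


def h4_ling(g, t, c_in):
    # Widest product term has 4 literals: t2 t1 t0 Cin.
    return (g[3] | g[2] | t[2] & g[1]
            | t[2] & t[1] & g[0]
            | t[2] & t[1] & t[0] & c_in)


def check_4bit() -> bool:
    """Compare both forms with the true carry out of a 4-bit addition."""
    for a, b, c_in in product(range(16), range(16), (0, 1)):
        g, t = bit_signals(a, b)
        true_c4 = (a + b + c_in) >> 4
        if c4_conventional(g, t, c_in) != true_c4:
            return False
        if (t[3] & h4_ling(g, t, c_in)) != true_c4:   # C_4 = t_3 H_4
            return False
    return True
```

The true carry is recovered from the pseudo-carry with one extra AND, $C_4 = t_3 H_4$, which is the price paid for the narrower gates.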
![Page 9: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/9.jpg)
Advantages of Ling’s Adder
• Uniform loading in fan-in and fan-out
• H16 contains 8 terms, compared with 15 in G16.
• H16 can be implemented with one level of logic (in ECL), while G16 cannot (with 8-way wire-OR).
(Ling’s adder takes full advantage of wired-OR, which is of special importance when ECL technology is used; his IBM limitation was a fan-in of 4 and a wire-OR of 8.)
![Page 10: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/10.jpg)
Ling: Weinberger Notes
![Page 11: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/11.jpg)
Ling: Weinberger Notes
![Page 12: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/12.jpg)
Ling: Weinberger Notes
![Page 13: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/13.jpg)
Advantage of Ling’s Adder
• 32-bit adder used in: IBM 3033, IBM S370/Model 168, Amdahl V6.
• Implements 32-bit addition in 3 levels of logic
• Implements 32-bit AGEN: B+Index+Disp in 4 levels of logic (rather than 6)
• 5 levels of logic for 64-bit adder used in HP processor
![Page 14: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/14.jpg)
Implementation of Ling’s Adder in CMOS
(S. Naffziger, “A Subnanosecond 64-b Adder”, ISSCC ’96)
![Page 15: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/15.jpg)
S. Naffziger, ISSCC’96
$$H_4 = g_3 + g_2 + t_2 g_1 + t_2 t_1 g_0$$

$$C_{i+1} = t_i H_{i+1}$$
![Page 16: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/16.jpg)
S. Naffziger, ISSCC’96
$$H_4 = g_3 + g_2 + t_2 g_1 + t_2 t_1 g_0$$
![Page 17: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/17.jpg)
S. Naffziger, ISSCC’96
$$H_4 = g_3 + g_2 + t_2 g_1 + t_2 t_1 g_0$$
![Page 18: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/18.jpg)
S. Naffziger, ISSCC’96
![Page 19: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/19.jpg)
S. Naffziger, ISSCC’96
![Page 20: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/20.jpg)
S. Naffziger, ISSCC’96
![Page 21: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/21.jpg)
S. Naffziger, ISSCC’96
![Page 22: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/22.jpg)
S. Naffziger, ISSCC’96
$$C_{16} = p_{15} H_{16} = p_{15}\,(g_{15} + g_{11} + t_{11} g_7 + t_{11} t_7 g_0)$$
![Page 23: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/23.jpg)
S. Naffziger, ISSCC’96
![Page 24: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/24.jpg)
S. Naffziger, ISSCC’96
![Page 25: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/25.jpg)
S. Naffziger, ISSCC’96
![Page 26: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/26.jpg)
Ling Adder Critical Path
![Page 27: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/27.jpg)
Ling Adder: Circuits
[Figure: clocked circuit schematics of the Ling adder. Legible signal labels: inputs A0-A3 and B0-B3; clock CK; G0-G4, G, P, P1, P2, P4, K; carry signals LC, LCH, LCL, C0H, C0L, C1H, C1L; outputs SumH, SumL]
![Page 28: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/28.jpg)
LCS4 – Critical G Path
[Figure: critical G path. Legible labels: in1, G3, P4 with (k,p) or (g,p), G4, C15, C31, C47, S48, S62, S63; bit spans 4b, 12b, 16b, 32b]
![Page 29: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/29.jpg)
LCS4 – Logical Effort Delay
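Prefix-4 Ling/Conditional-Sum (Dynamic - Long Carry Path)

| Stage | Branch | LE | Parasitic |
|---|---|---|---|
| dg3# (dg3) | 4.0 | 0.98 | 2.97 |
| g4 (NAND2) | 2.0 | 1.11 | 1.84 |
| C15# (GG4) | 1.0 | 1.01 | 1.80 |
| C15 (INV) | 1.0 | 1.00 | 1.00 |
| C47# (LC) | 3.0 | 1.03 | 3.32 |
| C47 (INV) | 1.0 | 1.00 | 1.00 |
| C47#b (INV) | 1.0 | 1.00 | 1.00 |
| C47b (INV) | 1.0 | 1.00 | 1.00 |
| S63# (SUM) | 16.0 | 0.86 | 1.36 |
| S63 (INV) | 1.0 | 1.00 | 1.00 |

| Total Branch | Total LE | Path Effort | fo, opt |
|---|---|---|---|
| 3.84E+02 | 9.73E-01 | 3.74E+02 | 1.81 |

Effort Delay (ps), Parasitic Delay (ps), Total Delay (ps), Total Delay (FO4): values not cleanly recoverable from the transcript (residue: 7.270, 13666)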
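The path totals follow from the per-stage columns, and a short Python sketch can reproduce them. This is a check on the arithmetic rather than the author's spreadsheet: the names `STAGES` and `path_summary` are made up, path effort is taken as Total Branch times Total LE (which matches the 3.74E+02 above), and the optimal stage effort uses the standard logical-effort result $f_{opt} = F^{1/N}$ for $N$ stages.

```python
import math

# (stage, branch, logical effort, parasitic) as listed in the table
STAGES = [
    ("dg3# (dg3)", 4.0, 0.98, 2.97),
    ("g4 (NAND2)", 2.0, 1.11, 1.84),
    ("C15# (GG4)", 1.0, 1.01, 1.80),
    ("C15 (INV)", 1.0, 1.00, 1.00),
    ("C47# (LC)", 3.0, 1.03, 3.32),
    ("C47 (INV)", 1.0, 1.00, 1.00),
    ("C47#b (INV)", 1.0, 1.00, 1.00),
    ("C47b (INV)", 1.0, 1.00, 1.00),
    ("S63# (SUM)", 16.0, 0.86, 1.36),
    ("S63 (INV)", 1.0, 1.00, 1.00),
]


def path_summary(stages):
    """Return total branch, total LE, path effort, per-stage effort, parasitic sum."""
    total_branch = math.prod(s[1] for s in stages)
    total_le = math.prod(s[2] for s in stages)
    path_effort = total_branch * total_le
    f_opt = path_effort ** (1.0 / len(stages))   # equal effort per stage
    parasitic_sum = sum(s[3] for s in stages)    # in units of the basic delay
    return total_branch, total_le, path_effort, f_opt, parasitic_sum
```

With the ten stages listed, the first four returned values round to the totals shown in the table (384, 0.973, 374 and 1.81).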
![Page 30: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/30.jpg)
Results:
• 0.5 µm technology
• Speed: 0.930 ns
• Nominal process, 80 °C, V = 3.3 V
See: S. Naffziger, “A Subnanosecond 64-b Adder”, ISSCC ’96
![Page 31: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/31.jpg)
Prefix Adders and Parallel Prefix Adders
![Page 32: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/32.jpg)
from: Ercegovac-Lang
![Page 33: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/33.jpg)
Prefix Adders
The following recurrence operation is defined:

$$(g, p) \circ (g', p') = (g + p g',\; p p')$$

such that:

$$(G_i, P_i) = \begin{cases} (g_0, p_0) & i = 0 \\ (g_i, p_i) \circ (G_{i-1}, P_{i-1}) & 1 \le i \le n \end{cases}$$

$c_{i+1} = G_i$ for $i = 0, 1, \ldots, n$

$c_1 = g_0 + p_0 c_{in}$, with $(g_{-1}, p_{-1}) = (c_{in}, c_{in})$

This operation is associative, but not commutative.
It can also span a range of bits (overlapping and adjacent).
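As an illustration of the recurrence, the following Python sketch applies the operator serially from bit 0 upward and uses the resulting carries to form the sum. The names `dot`, `prefix_carries` and `prefix_add` are hypothetical; the carry-in is folded in through $(g_{-1}, p_{-1}) = (c_{in}, c_{in})$ as on the slide, and $p_i$ is taken as XOR.

```python
def dot(hi, lo):
    """(g, p) o (g', p') = (g + p g', p p'); `hi` is the more significant operand."""
    g, p = hi
    g2, p2 = lo
    return (g | (p & g2), p & p2)


def prefix_carries(a: int, b: int, n: int, c_in: int = 0):
    """Return [c_0, c_1, ..., c_n] using the serial recurrence c_{i+1} = G_i."""
    acc = (c_in, c_in)                       # (g_{-1}, p_{-1}) = (c_in, c_in)
    carries = [c_in]
    for i in range(n):
        ai, bi = (a >> i) & 1, (b >> i) & 1
        acc = dot((ai & bi, ai ^ bi), acc)   # (G_i, P_i) = (g_i, p_i) o (G_{i-1}, P_{i-1})
        carries.append(acc[0])
    return carries


def prefix_add(a: int, b: int, n: int, c_in: int = 0) -> int:
    """Sum bits are s_i = p_i xor c_i; the final carry becomes bit n."""
    carries = prefix_carries(a, b, n, c_in)
    total = carries[n] << n
    for i in range(n):
        p_i = ((a >> i) ^ (b >> i)) & 1
        total |= (p_i ^ carries[i]) << i
    return total
```

The operand order matters: `dot((1, 0), (0, 0))` gives `(1, 0)` while `dot((0, 0), (1, 0))` gives `(0, 0)`, which is the non-commutativity noted above.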
![Page 34: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/34.jpg)
from: Ercegovac-Lang
![Page 35: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/35.jpg)
Parallel Prefix Adders: variety of possibilities
from: Ercegovac-Lang
![Page 36: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/36.jpg)
Pyramid Adder:
M. Lehman, “A Comparative Study of Propagation Speed-up Circuits in Binary Arithmetic Units”, IFIP Congress, Munich, Germany, 1962.
![Page 37: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/37.jpg)
Parallel Prefix Adders: variety of possibilities
from: Ercegovac-Lang
![Page 38: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/38.jpg)
Parallel Prefix Adders: variety of possibilities
from: Ercegovac-Lang
![Page 39: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/39.jpg)
Hybrid BK-KS Adder
![Page 40: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/40.jpg)
Parallel Prefix Adders: S. Knowles 1999
operation is associative: h>i≥j≥k
operation is idempotent: h>i≥j≥k
produces carry: cin=0
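The two properties named on this slide can be checked by brute force. The sketch below is an assumption-laden illustration in Python, not Knowles's formulation: idempotency is read as "combining two overlapping ranges h..j and i..k (with h > i ≥ j ≥ k) gives the same result as the full range h..k", and the helper names `dot`, `group` and `check_properties` are hypothetical.

```python
from itertools import product


def dot(hi, lo):
    """(g, p) o (g', p') = (g + p g', p p')."""
    g, p = hi
    g2, p2 = lo
    return (g | (p & g2), p & p2)


def group(gp, hi, lo):
    """Group (G, P) over bits hi..lo, built serially from the low end."""
    acc = gp[lo]
    for i in range(lo + 1, hi + 1):
        acc = dot(gp[i], acc)
    return acc


def check_properties(n: int = 5) -> bool:
    pairs = [(0, 0), (0, 1), (1, 0)]   # (g, p) with p = a xor b: never both 1
    # Associativity: (x o y) o z == x o (y o z)
    for x, y, z in product(pairs, repeat=3):
        if dot(dot(x, y), z) != dot(x, dot(y, z)):
            return False
    # Overlapping ranges: (h..j) o (i..k) == (h..k) for h > i >= j >= k
    for gp in product(pairs, repeat=n):
        for h in range(n):
            for i in range(h):
                for j in range(i + 1):
                    for k in range(j + 1):
                        if dot(group(gp, h, j), group(gp, i, k)) != group(gp, h, k):
                            return False
    return True
```

Associativity is what lets a prefix graph regroup the computation into a tree; the overlap property is what lets some graphs reuse a node that already covers extra bits instead of building an exactly adjacent one.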
![Page 41: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/41.jpg)
Parallel Prefix Adders: Ladner-Fischer
Exploits associativity, but not idempotency. Produces minimal logical depth
![Page 42: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/42.jpg)
Parallel Prefix Adders: Ladner-Fischer (16,8,4,2,1)
Two wires at each level. Uniform fan-in of two. Large fan-out (of 16; n/2). Large capacitive loading combined with the long wires (in the last stages).
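The fan-out pattern (16,8,4,2,1) can be reproduced with a small model of the minimal-depth graph. The Python sketch below is an assumption: it builds the structure in which, at level j, every node whose bit j is set takes its lateral input from the top node of the block below it (the form usually drawn for the minimal-depth Ladner-Fischer or Sklansky graph). The function name and the fan-out bookkeeping are made up for illustration, and the carry-in is taken as 0.

```python
def dot(hi, lo):
    """(g, p) o (g', p') = (g + p g', p p')."""
    g, p = hi
    g2, p2 = lo
    return (g | (p & g2), p & p2)


def ladner_fischer_carries(a: int, b: int, n: int = 32):
    """Return ([c_1..c_n], max lateral fan-out per level, lowest level first)."""
    gp = [((a >> i) & (b >> i) & 1, ((a >> i) ^ (b >> i)) & 1) for i in range(n)]
    max_fanout = []
    level = 0
    while (1 << level) < n:
        span = 1 << level
        new = list(gp)
        fanout = {}
        for i in range(n):
            if i & span:
                src = ((i >> level) << level) - 1   # top node of the block below
                new[i] = dot(gp[i], gp[src])
                fanout[src] = fanout.get(src, 0) + 1
        gp = new
        max_fanout.append(max(fanout.values()))
        level += 1
    return [g for g, _ in gp], max_fanout
```

For n = 32 the per-level maxima come out as 1, 2, 4, 8, 16 from the first level to the last, which is the slide's label read in reverse.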
![Page 43: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/43.jpg)
Parallel Prefix Adders: Kogge-Stone
Exploits idempotency to limit the fan-out to 1. Dramatic increase in wires. The wire span remains the same as in Ladner-Fischer.
Buffers needed in both cases: K-S, L-F
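For comparison, a Kogge-Stone graph can be modelled in a few lines. This Python sketch is illustrative only: the function name is hypothetical, the carry-in is taken as 0, and the cell count it returns is a rough stand-in for the wiring cost mentioned above.

```python
def dot(hi, lo):
    """(g, p) o (g', p') = (g + p g', p p')."""
    g, p = hi
    g2, p2 = lo
    return (g | (p & g2), p & p2)


def kogge_stone_carries(a: int, b: int, n: int = 32):
    """Return ([c_1..c_n], number of prefix cells used)."""
    gp = [((a >> i) & (b >> i) & 1, ((a >> i) ^ (b >> i)) & 1) for i in range(n)]
    cells = 0
    span = 1
    while span < n:
        new = list(gp)
        for i in range(span, n):
            # Each node i takes one lateral input from node i - span,
            # so every node drives exactly one lateral wire per level.
            new[i] = dot(gp[i], gp[i - span])
            cells += 1
        gp = new
        span *= 2
    return [g for g, _ in gp], cells
```

For n = 32 this model uses 129 prefix cells over five levels, each with its own lateral wire, which is where the wiring cost comes from.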
![Page 44: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/44.jpg)
Kogge-Stone Adder
![Page 45: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/45.jpg)
Parallel Prefix Adders: Brent-Kung
• Set the fan-out to one
• Avoids explosion of wires (as in K-S)
• Makes no sense in CMOS:
  – fan-out = 1 limit is arbitrary and extreme
  – much of the capacitive load is due to wire (anyway)
• It is more efficient to insert buffers in L-F than to use B-K scheme
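The depth cost of the Brent-Kung scheme shows up clearly in a small model. The Python sketch below uses the usual two-phase description (a tree that builds power-of-two groups, followed by a reverse tree that fills in the remaining positions); this structure, the function name, the power-of-two restriction on n and the zero carry-in are assumptions made for illustration.

```python
def dot(hi, lo):
    """(g, p) o (g', p') = (g + p g', p p')."""
    g, p = hi
    g2, p2 = lo
    return (g | (p & g2), p & p2)


def brent_kung_carries(a: int, b: int, n: int = 32):
    """Return ([c_1..c_n], prefix cells, logic levels); n must be a power of two."""
    if n < 1 or (n & (n - 1)) != 0:
        raise ValueError("n must be a power of two")
    gp = [((a >> i) & (b >> i) & 1, ((a >> i) ^ (b >> i)) & 1) for i in range(n)]
    cells = levels = 0
    d = 1
    while d < n:                       # forward tree: groups of 2, 4, 8, ...
        for i in range(2 * d - 1, n, 2 * d):
            gp[i] = dot(gp[i], gp[i - d])
            cells += 1
        levels += 1
        d *= 2
    d = n // 4
    while d >= 1:                      # reverse tree: fill in the other positions
        for i in range(3 * d - 1, n, 2 * d):
            gp[i] = dot(gp[i], gp[i - d])
            cells += 1
        levels += 1
        d //= 2
    return [g for g, _ in gp], cells, levels
```

For n = 32 this model needs 9 levels and 57 cells, against 5 levels for the minimal-depth graphs, which is the trade the bullets above are criticizing.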
![Page 46: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/46.jpg)
Brent-Kung Adder
![Page 47: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/47.jpg)
Parallel Prefix Adders: Han-Carlson
• Is a hybrid synthesis of L-F and K-S
• Trades increase in logic depth for a reduction in fan-out:
  – effectively a higher-radix variant of K-S.
  – others do it similarly by serializing the prefix computation at the higher fan-out nodes.
• Others similarly trade the logical depth for reduction of fan-out and wire.
![Page 48: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/48.jpg)
Parallel Prefix Adders: variety of possibilities
from: Knowles
bounded by L-F and K-S at ends
![Page 49: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/49.jpg)
Parallel Prefix Adders: variety of possibilities
Knowles 1999
The following rules are used:
• Lateral wires at the jth level span $2^j$ bits
• Lateral fan-out at the jth level is a power of 2 up to $2^j$
• Lateral fan-out at the jth level cannot exceed that at the (j+1)th level.
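The two fan-out rules can be turned into a small validity check on labels such as [16,8,4,2,1] or [4,4,2,2,1]. The Python sketch below is an illustration, not code from Knowles: it assumes a label lists the lateral fan-out from the highest level down to level 0, and the function names are hypothetical. The wire-span rule concerns the layout and is not visible in the label, so it is not checked.

```python
from itertools import product


def is_valid_knowles_label(label) -> bool:
    """Check the fan-out rules for a label given highest level first."""
    low_to_high = list(reversed(label))
    for j, f in enumerate(low_to_high):
        if f < 1 or (f & (f - 1)) != 0:     # must be a power of 2
            return False
        if f > (1 << j):                    # at most 2^j at level j
            return False
        if j + 1 < len(low_to_high) and f > low_to_high[j + 1]:
            return False                    # must not exceed the level above
    return True


def valid_labels(levels: int):
    """Enumerate every label with `levels` levels that passes the rules."""
    choices = [[1 << e for e in range(j + 1)] for j in range(levels)]
    found = []
    for combo in product(*choices):         # combo is ordered level 0 first
        label = list(reversed(combo))
        if is_valid_knowles_label(label):
            found.append(label)
    return found
```

With two levels (a 4-bit adder) only [1,1] and [2,1] pass, i.e. the K-S and L-F extremes; with five levels, `valid_labels(5)` lists the labels available to a 32-bit adder under these rules.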
![Page 50: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/50.jpg)
Parallel Prefix Adders: variety of possibilities
Knowles 1999
• The number of minimal depth graphs of this type is given in:
• at 4-bits there is only K-S and L-F, afterwards there are several new possibilities.
![Page 51: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/51.jpg)
Parallel Prefix Adders: variety of possibilities
example of a new 32-bit adder [4,4,2,2,1]
Knowles 1999
![Page 52: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/52.jpg)
Parallel Prefix Adders: variety of possibilities
Example of a new 32-bit adder [4,4,2,2,1]
Knowles 1999
![Page 53: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/53.jpg)
Parallel Prefix Adders: variety of possibilities
Knowles 1999
• Delay is given in terms of FO4 inverter delay: w.c. (nominal case is 40-50% faster)
• K-S is the fastest
• K-S adders are wire limited (requiring 80% more area)
• The difference is less than 15% between examined schemes
![Page 54: VLSI Arithmetic-Lect-5](https://reader036.vdocument.in/reader036/viewer/2022062318/551df09e49795964198b5153/html5/thumbnails/54.jpg)
Parallel Prefix Adders: variety of possibilities
Knowles 1999
Conclusion
• Irregular, hybrid schemes are possible
• The speed-up of 15% is achieved at the cost of large wiring, hence area and power
• Circuits close in speed to K-S are available at significantly lower wiring cost