vlsi arithmetic-lect-5

Post on 02-Apr-2015

96 Views

Category:

Documents

2 Downloads

Preview:

Click to see full reader

TRANSCRIPT

VLSI ArithmeticLecture 5

Prof. Vojin G. Oklobdzija

University of California

http://www.ece.ucdavis.edu/acsel

Review

Lecture 4

Ling’s Adder

Huey Ling, “High-Speed Binary Adder”

IBM Journal of Research and Development, Vol.5, No.3, 1981.

Used in: IBM 3033, IBM S370/168, Amdahl V6, HP etc.

Oklobdzija 2004 Computer Arithmetic 4

Ling’s Derivations

ai bi pi gi ti

0 0 0 0 0

0 1 1 0 1

1 0 1 0 1

1 1 0 1 1

iii CCH 11

iii bag

ai bi

ci

si

ci+1

gi implies Ci+1 which implies Hi+1 , thus: gi= gi Hi+1

iiii CpgC 1define:

11

iiiiii

iiiiiiiii

HpCpCp

CppgpCpCp

1 iiii HpCp

111

11

iiiiii

iiiiiiii

HtHpHg

CpHgCpgC

11 iii HtC

Oklobdzija 2004 Computer Arithmetic 5

Ling’s Derivations

iii CCH 11 iiii CpgC 1From: and

iiiiiiiii CgCCpgCCH 11

iiii HtgH 11 11 iii HtCbecause:

fundamental expansion

Now we need to derive Sum equation

Oklobdzija 2004 Computer Arithmetic 6

Ling Adder

Variation of CLA:

Ling, IBM J. Res. Dev, 5/81

iiii CpgC 1

iii CpS

iii bap

iii bag

iiii HtgH 11

iiiiii HtgHtS 11

iii bat

iii bag

Ling’s equations:

Oklobdzija 2004 Computer Arithmetic 7

Ling Adder

iiii

iiiiii

Cpgg

CpCggC

1

iiii CtgC 1 iiii HtgH 11

Ling’s equation:

see: Doran, IEEE Trans on Comp. Vol 37, No.9 Sept. 1988.

Ling uses different transfer function.Four of those functions have desiredproperties (Ling’s is one of them)

Variation of CLA:

ai bi

ci

si

ci+1

ai-1 bi-1

ci-1

si-1

gi, ti gi-1, ti-1

Hi+1 Hi

Oklobdzija 2004 Computer Arithmetic 8

Ling Adder

inCttttgtttgttgtgC 012301231232334

in

in

CtttgttgtggH

CttttgtttgttgtgH

01201212234

101200121122234

Conventional:

Ling:

Fan-in of 5

Fan-in of 4

Oklobdzija 2004 Computer Arithmetic 9

Advantages of Ling’s Adder• Uniform loading in fan-in and fan-out

• H16 contains 8 terms as compared to G16 that contains 15.

• H16 can be implemented with one level of logic (in ECL), while G16 can not (with 8-way wire-OR).

(Ling’s adder takes full advantage of wired-OR, of special importance when ECL technology is used - his IBM limitation was fan-in of 4 and wire-OR of 8)

Oklobdzija 2004 Computer Arithmetic 10

Ling: Weinberger Notes

Oklobdzija 2004 Computer Arithmetic 11

Ling: Weinberger Notes

Oklobdzija 2004 Computer Arithmetic 12

Ling: Weinberger Notes

Oklobdzija 2004 Computer Arithmetic 13

Advantage of Ling’s Adder

• 32-bit adder used in: IBM 3033, IBM S370/ Model168, Amdahl V6.

• Implements 32-bit addition in 3 levels of logic

• Implements 32-bit AGEN: B+Index+Disp in 4 levels of logic (rather than 6)

• 5 levels of logic for 64-bit adder used in HP processor

Oklobdzija 2004 Computer Arithmetic 14

Implementation of Ling’s Adder in CMOS

(S. Naffziger, “A Subnanosecond 64-b Adder”, ISSCC ‘ 96)

Oklobdzija 2004 Computer Arithmetic 15

S. Naffziger, ISSCC’96

01212234 gttgtggH

11 iii HtC

Oklobdzija 2004 Computer Arithmetic 16

S. Naffziger, ISSCC’96

01212234 gttgtggH

Oklobdzija 2004 Computer Arithmetic 17

S. Naffziger, ISSCC’96

01212234 gttgtggH

Oklobdzija 2004 Computer Arithmetic 18

S. Naffziger, ISSCC’96

Oklobdzija 2004 Computer Arithmetic 19

S. Naffziger, ISSCC’96

Oklobdzija 2004 Computer Arithmetic 20

S. Naffziger, ISSCC’96

Oklobdzija 2004 Computer Arithmetic 21

S. Naffziger, ISSCC’96

Oklobdzija 2004 Computer Arithmetic 22

S. Naffziger, ISSCC’96

)( 0711711111515161516 gttgtggpHpC

Oklobdzija 2004 Computer Arithmetic 23

S. Naffziger, ISSCC’96

Oklobdzija 2004 Computer Arithmetic 24

S. Naffziger, ISSCC’96

Oklobdzija 2004 Computer Arithmetic 25

S. Naffziger, ISSCC’96

Oklobdzija 2004 Computer Arithmetic 26

Ling Adder Critical Path

Oklobdzija 2004 Computer Arithmetic 27

Ling Adder: Circuits

A0

B0

A1 B1A1

B1

A2

B2

A2 B2

CKG3

G4

CK

A3

B3P4

A2 B2

B3A3B1

A0 B0

A1

CK

CK

P

LCH LCL

C1H C0LC1L C0H

SumH

CK

K

G

SumL LCH LCL

C1H C0LC1L C0H

CK

P2

P1

G0

CKLC

G2G1

Oklobdzija 2004 Computer Arithmetic 28

LCS4 – Critical G Path

4b

in1

G3

12b

P4(k,p) or (g,p) G4

C15

32b

C47 C15C31

S63 S48S62

16b

Oklobdzija 2004 Computer Arithmetic 29

LCS4 – Logical Effort Delay

Prefix-4 Ling/Conditional-Sum (Dynamic - Long Carry Path)

Stages Branch LE ParasiticTotal

Branch Total LEPath Effort fo, opt

Effort Delay

(ps)

Parasitic Delay

(ps)

Total Delay

(ps)

Total Delay (FO4)

dg3# (dg3) 4.0 0.98 2.97g4 (NAND2) 2.0 1.11 1.84C15# (GG4) 1.0 1.01 1.80C15 (INV) 1.0 1.00 1.00C47# (LC) 3.0 1.03 3.32C47 (INV) 1.0 1.00 1.00C47#b (INV) 1.0 1.00 1.00C47b (INV) 1.0 1.00 1.00S63# (SUM) 16.0 0.86 1.36S63 (INV) 1.0 1.00 1.00

3.74E+023.84E+02 9.73E-01 7.2701.81 13666

Oklobdzija 2004 Computer Arithmetic 30

Results:

• 0.5u Technology

• Speed: 0.930 nS

• Nominal process, 80C, V=3.3V

See: S. Naffziger, “A Subnanosecond 64-b Adder”, ISSCC ‘ 96

Prefix Addersand

Parallel Prefix Adders

Oklobdzija 2004 Computer Arithmetic 32

from: Ercegovac-Lang

Oklobdzija 2004 Computer Arithmetic 33

Prefix Adders

(g0, p0)

Following recurrence operation is defined:

(g, p)o(g’,p’)=(g+pg’, pp’)

such that:

Gi, Pi =

(gi, pi)o(Gi-1, Pi-1 )

i=0

1 ≤ i ≤ n

ci+1 = Gifor i=0, 1, ….. n

c1 = g0+ p0 cin (g-1, p-1)=(cin,cin)

This operation is associative, but not commutativeIt can also span a range of bits (overlapping and adjacent)

Oklobdzija 2004 Computer Arithmetic 34

from: Ercegovac-Lang

Oklobdzija 2004 Computer Arithmetic 35

Parallel Prefix Adders: variety of possibilitiesfrom: Ercegovac-Lang

Oklobdzija 2004 Computer Arithmetic 36

Pyramid Adder:M. Lehman, “A Comparative Study of Propagation Speed-up Circuits in Binary Arithmetic

Units”, IFIP Congress, Munich, Germany, 1962.

Oklobdzija 2004 Computer Arithmetic 37

Parallel Prefix Adders: variety of possibilitiesfrom: Ercegovac-Lang

Oklobdzija 2004 Computer Arithmetic 38

Parallel Prefix Adders: variety of possibilitiesfrom: Ercegovac-Lang

Oklobdzija 2004 Computer Arithmetic 39

Hybrid BK-KS Adder

Oklobdzija 2004 Computer Arithmetic 40

Parallel Prefix Adders: S. Knowles 1999

operation is associative: h>i≥j≥k

operation is idempotent: h>i≥j≥k

produces carry: cin=0

Oklobdzija 2004 Computer Arithmetic 41

Parallel Prefix Adders: Ladner-Fisher

Exploits associativity, but not idempotency. Produces minimal logical depth

Oklobdzija 2004 Computer Arithmetic 42

Two wires at each level. Uniform, fan-in of two.Large fan-out (of 16; n/2); Large capacitive loading combined with the long wires (in the last stages)

Parallel Prefix Adders: Ladner-Fisher(16,8,4,2,1)

Oklobdzija 2004 Computer Arithmetic 43

Parallel Prefix Adders: Kogge-Stone

Exploits idempotency to limit the fan-out to 1. Dramatic increase in wires. The wire span remains the same as in Ladner-Fisher.

Buffers needed in both cases: K-S, L-F

Oklobdzija 2004 Computer Arithmetic 44

Kogge-Stone Adder

Oklobdzija 2004 Computer Arithmetic 45

Parallel Prefix Adders: Brent-Kung

• Set the fan-out to one

• Avoids explosion of wires (as in K-S)

• Makes no sense in CMOS:– fan-out = 1 limit is arbitrary and extreme– much of the capacitive load is due to wire

(anyway)

• It is more efficient to insert buffers in L-F than to use B-K scheme

Oklobdzija 2004 Computer Arithmetic 46

Brent-Kung Adder

Oklobdzija 2004 Computer Arithmetic 47

Parallel Prefix Adders: Han-Carlson

• Is a hybrid synthesis of L-F and K-S

• Trades increase in logic depth for a reduction in fan-out:– effectively a higher-radix variant of K-S.– others do it similarly by serializing the prefix

computation at the higher fan-out nodes.

• Others, similarly trade the logical depth for reduction of fan-out and wire.

Oklobdzija 2004 Computer Arithmetic 48

Parallel Prefix Adders: variety of possibilitiesfrom: Knowles

bounded by L-F and K-S at ends

Oklobdzija 2004 Computer Arithmetic 49

Parallel Prefix Adders: variety of possibilitiesKnowles 1999

Following rules are used:

• Lateral wires at the jth level span 2j bits

• Lateral fan-out at jth level is power of 2 up to 2j

• Lateral fan-out at the jth level cannot exceed that a the (j+1)th level.

Oklobdzija 2004 Computer Arithmetic 50

Parallel Prefix Adders: variety of possibilitiesKnowles 1999

• The number of minimal depth graphs of this type is given in:

• at 4-bits there is only K-S and L-F, afterwards there are several new possibilities.

Oklobdzija 2004 Computer Arithmetic 51

Parallel Prefix Adders: variety of possibilities

example of a new 32-bit adder [4,4,2,2,1]

Knowles 1999

Oklobdzija 2004 Computer Arithmetic 52

Parallel Prefix Adders: variety of possibilities

Example of a new 32-bit adder [4,4,2,2,1]

Knowles 1999

Oklobdzija 2004 Computer Arithmetic 53

Parallel Prefix Adders: variety of possibilitiesKnowles 1999

• Delay is given in terms of FO4 inverter delay: w.c.(nominal case is 40-50% faster)

• K-S is the fastest• K-S adders are wire limited (requiring 80% more area)• The difference is less than 15% between examined schemes

Oklobdzija 2004 Computer Arithmetic 54

Parallel Prefix Adders: variety of possibilitiesKnowles 1999

Conclusion• Irregular, hybrid schmes

are possible• The speed-up of 15% is

achieved at the cost of large wiring, hence area and power

• Circuits close in speed to K-S are available at significantly lower wiring cost

top related