design and implementation of 32 bit unsigned …

111

Int. J. Elec&Electr.Eng&Telecoms. 2014 Veersh B Jalihal and Naseeruddin, 2014

DESIGN AND IMPLEMENTATION OF 32 BITUNSIGNED MULTIPLIER USING CLAA, CSLA, ETA

Veersh B Jalihal1* and Naseeruddin1

*Corresponding Author: Veersh B Jalihal,[email protected]

This project deals with the comparison of the VLSI design of the Carry Look-Ahead Adder (CLAA)based 32-bit unsigned integer multiplier and the VLSI design of the carry select adder (CSLA)based 32-bit unsigned integer multiplier. Both the VLSI design of multiplier mUltiplies two 32-bitunsigned integer values and gives a product term of 64-bit values. The CLAA based multiplieruses the delay time of 99 ns for performing mUltiplication operation where as in CSLA basedmultiplier also uses nearly the same delay time for mUltiplication operation. But the area neededfor CLAA multiplier is reduced to 31% by the CSLA based multiplier to complete the multiplicationoperation. These multipliers are implemented using Altera Quartus II and timing diagrams areviewed through avan waves. ETA is able to ease the strict restriction on accuracy, and at thesame time achieve tremendous improvements in both the power consumption and speedperformance.

Keywords: CLAA, CSLA, Delay, Area, Array multiplier, VHDL modeling, Simulation

INTRODUCTIONDigital computer arithmetic is an aspect oflogic design with the objective of developingappropriate algorithms in order to achieve anefficient utilization of the available hardware.The basic operations are addition, subtraction,multiplication and division. In this, we are goingto deal with the operation of additionsimplemented to the operation of multiplication.The repeated form of the addition operationsand shifting results in the multiplicationoperations.

ISSN 2319 – 2518 www.ijeetc.comVol. 3, No. 3, July 2014

© 2014 IJEETC. All Rights Reserved

Int. J. Elec&Electr.Eng&Telecoms. 2014

1 Department of ECE, BITM Bellary, Karnataka, India.

Given that the hardware can only perform arelatively simple and primitive set of Booleanoperations, arithmetic operations are basedon a hierarchy of operations that are built uponthe simple ones. In VLSI designs, speed,power and chip area are the most often usedmeasures for determining the performanceand efficiency of the VLSI architecture.

Multiplications and additions are mostwidely and more often used arithmeticcomputations performed in all digital signalprocessing applications. Addition is a

Research Paper

112


fundamental operation for any digitalmultiplication. A fast, area efficient andaccurate operation of a digital system is greatlyinfluenced by the performance of the residentadders.

Adders are also very important componentin digital systems because of their extensiveuse in these systems.

In this project we are going to compare theperformance of different adders implementedto the multipliers based on area and timeneeded for calculation. On comparison with theCarry Look-Ahead Adder (CLAA) basedmultiplier the area of calculation of the carryselect adder (CSLA) based multiplier issmaller and better with nearly same delay time.Here we are dealing with the comparison inthe bit range of n*n (32*32) as input and 2 n(64) bit output.

Hence, to design a better architecture thebasic adder blocks must have reduced delaytime consumption and area efficientarchitectures. The demand is of DSP stylesystems for both less delay time and less arearequirement for designing the systems. Ourinterest is in the basic building blocks ofarithmetic circuits that dominate in DSPapplications, VLSI architectures, computerapplications and where ever reduced areacomputation is needed.

CARRY LOOK-AHEAD ADDERCarry look-ahead adder can produce carriesfaster due to parallel generation of the carrybits by using additional circuitry. This techniqueuses calculation of carry signals in advance,based on input signals. The result is reducedcarry propagation time. For example, rippleadders are slower but use the least energy.

Let Gi is the carry generate function and Pibe the carry propagate function. Then we canrewrite the carry function as follows:

Gi = Ai·Bi ...(1)

Pi = (Ai xor Bi) ...(2)

Si = Pi xor Ci ...(3)

Ci + l = Gi + Pi·Ci ...(4)

Thus, for 4-bit adder, we can compute thecarry for all the stages as shown below:

C1 = GO + PO·CO ...(5)

C2 = G1 + P1·C1 = G1 + P1·GO +P1·PO·CO ...(6)

C3 = G2 + P2·C2 = G2 + P2·G1 +P2·P1·GO + P2·P1·PO·CO ...(7)

C4 = G3 + P3·C3 = G3 + P3·G2 + P3·P2·G1 + P3·P2·P1·GO + P3·P2·P1·PO·CO

...(8)

In general, we can write:

The sum function:

SUMi – Ai xor Bi xor Ci – Pi xor Ci ...(9)

The carry function:

Ci + l = Gi + Pi.Ci ...(10)

In general, we can write the algorithm as:

If Carry in = 1, then the sum and carry outare given by,

Figure 1: Carry Look-Ahead Adder

113


Sum(i) = a(i) xor b(i) xor ‘1’ ...(11)

Carry(i + 1) = (a(i) and b(i)) or (b(i) or a(i))...(12)

If Carry in = 0, then the sum and carry outare given by,

Sum(i) = a(i) xor b(i) ...(13)

Carry(i + 1) = (a(i) and b(i)) ...(14)

The sum function:

Si = CiSi0 – CiSi

1 ...(15)

The carry function:

Ci-1 = CiCi+10 + CiCi-1

1 ...(16)

MULTIPLTER FOR UNSIGNEDDATAMultiplication involves the generation of partialproducts, one for each digit in the multiplier,as in Figure 3. These partial products are thensummed to produce the final product. Themultiplication of two n-bit binary integersresults in a product of up to 2 n bits in length(Stallings, xxxx).

CARRY SELECTADDERThe concept of CSLA is to compute alternativeresults in parallel and subsequently selectingthe correct result with single or multiple stagehierarchical techniques. In CSLA both sumand carry bits are calculated for two alternativesCin = O and 1. Once Cin is delivered, thecorrect computation is chosen using a mux toproduce the desired output. Instead of waitingfor Cin to calculate the sum, the sum is correctlyoutput as soon as Cin gets there. The timetaken to compute the sum is then avoidedwhich results in good improvement in speed.

Figure 2: A Partial Schematic of theMultiplier

Figure 3: Figure Carry Select Adder

We used the following algorithm toimplement the multiplication operation forunsigned data.

ERROR TOLERANT ADDERThe commonly used terminologies in ErrorTolerant addition are overall error andaccuracy. They are defined by the equationsdiscussed below. Overall Error (OE):

OE = |Rc – Re| ...(17)

where Re is the result obtained by the Errortolerant addition technique, and Rc is thecorrect result (all the results are representedas decimal numbers).

114


Accuracy (ACC): In the case of the errortolerant design, the accuracy of an additionprocess is used to indicate how “correct” theoutput of an adder is for a particular input. Itsvalue ranges from 0-100%.

ACC% = (1 – (OE/Rc)) x 100 ...(18)

In the conventional adder circuit, the delayis mainly due to the carry propagation from theLeast Significant Bit (LSB) to the MostSignificant Bit (MSB). Glitches in the carrypropagation also cause significant powerdissipation.

Therefore, if the carry propagation can beeliminated or curtailed, a great improvementin speed performance and power consumption[8] can be achieved. This new additionarithmetic can be illustrated via an exampleshown below.

In error tolerant addition technique, we firstsplit the input operands into two parts: anaccurate part that includes higher order bitsand the inaccurate part that consists of lowerorder bits. The length of each part need notnecessary be equal. The addition processstarts from the middle, i.e., starting point inFigure 4 towards the two opposite directionsat the same time.

In the example of Figure 4, the two 8-bitinput operands, A = “10110111” (183) and B =“01101101” (109),are divided equally into 4 bitseach for the accurate andinaccurate parts. Theaddition of the higher order bits (accurate part)of the input operands is carried from right toleft(LSB to MSB) and normal addition method isapplied. This is to preserve its correctnesssince the higher order bitsplay a moreimportant role than the lower order bits. Thelower order bits of the input operands (in

accurate part) require a special additionmechanism. No carry signal will beconsideredat any bit position to eliminate the carrypropagation path. To minimize the overall errordue to theelimination of the carry chain, aspecial strategy is adapted, and as follows: 1)check every bit position from left to right (MSBto LSB); 2) if both input bits are “0” or different,normal one-bit addition is performed and theoperation proceeds to next bit position; 3) ifboth input bits are “1”, the checking processstopped and from this bit onward, all sumbitsto the right are set to “1”. The addition

Figure 4: Arithmetic Procedure of ErrorTolerant Adder

Figure 5: Block Diagram of Error TolerantAdder

115


mechanism described can be easilyunderstood from the example. For theadditionof the MSB part in modified boothmultiplication we have adopted this technique.

MULTIPLICATIONALGORITHMLet the product register size be 64 bits. Letthe multiplicand registers size be 32 bits. Storethe multiplier in the least significant half of theproduct register. Clear the most significant halfof the product register.

Repeat the following steps for 32 times:

• If the least significant bit of the productregister is “1” then add the multiplicand tothe most significant half of the productregister.

• Shift the content of the product register onebit to the right (ignore the shifted-out bit).

• Shift-in the carry bit into the most significantbit of the product register. Figure 6 showsa block diagram for such a multiplier.

VHDL SIMULATIONSThe VHDL simulation of the two multipliers ispresented in this section. In this, waveforms,timing diagrams and the design summary forboth the CLAA and CSLA based multipliersare shown in the figures. The VHDL code forboth multipliers, using CLAA and CSLA, aregenerated. The VHDL model has beendeveloped using Altera Quartus II and timing

Figure 6: Multiplier of Two n-bit Values

Figure 7: Waveform for a CLAA BasedMultiplier

Figure 8: Waveform for CSLA BasedMultiplier

Figure 9: Timing Analysis for CLAA BasedMultiplier

Figure 10: Timing Analysis for CSAABased Multiplier

116


diagrams are viewed through avan waves. Themultipliers use two 32-bit values.

Delay AnalysisThe performance analysis for the delay timeof CLAA and CSLA based multipliers arerepresented in the form of the diagram shownin Figure 14.

Figure 11: Design Summary of CLAAMultiplier

Figure 12: Figure Design Summaryof CSLA Multiplier

Under the worst case, the multiplier with acarry look-ahead adder uses time = 98.5 ns,while the multiplier with the carry select adderuses time = 99.5 ns.

PERFORMANCE ANALYSTSArea AnalysisThe performance analysis for the area of CLAAand CSLA based multipliers are representedin the form of the diagram shown in Figure 13.

Figure 13: Figure Area Analysis Chart

Figure 14: Figure Delay Analysis Chart

Area Delay Product AnalysisThe performance analysis for the area delayproduct of CLAA and CSLA based multipliersare represented in the form of the diagramshown in Figure 15.

117


The area needed and delay for both theCLAA and CSLA implemented to the multiplierwas analyzed and the comparison was shownin the figure in the form of a table.

Analysis TableTn this analysis table shown in Table 1, thedelay time is nearly same, the area and thearea delay product of CSLA based multiplieris reduced to 31% when compared to CLAAbased multiplier.

Speed Integrated Circuit HardwareDescription Language, was used to model andsimulate our multiplier. Using CSLA improvesthe overall performance of the multiplier.

Thus a 31% area delay product reductionis possible with the use of the CSLA based32-bit unsigned parallel multiplier than CLAAbased 32-bit unsigned parallel multiplier. Theapplication for ETA are in those areas wherethere is no strict restriction on accuracy or whenhigh speed performance is more importantcompared to accuracy.

FUTURE WORKThis 32-bit multiplier can be further extendedto 64-bit multiplier and 128-bit multiplier usingthe proposed method for multiplicationoperation can be done as future work.

REFERENCES1. Armstrong J R and Gray F G (2000),

VHDL Design Representation andSynthesis, 2nd Edition, Prentice Hall, USA,ISBN: 0-13-021670-4.

2. Asadi P and Navi K (2007), “A NovelHighs-Speed 54-54 Bit Multiplier”, Am. J.Applied Sci., Vol. 4, No. 9, pp. 666-672.

3. Brown S and Vranesic Z (2005),Fundamentals of Digital Logic withVHDL Design, 2nd Edition, McGraw-HillHigher Education, USA, ISBN:0072499389.

4. Hasan Krad and Aws Yousi (2010),“Design and Implementation of a FastUnsigned 32-bit Multiplier Using VHDL”.

5. Meier P C H, Rutenbar R A and Carley LR (1996), “Exploring Multiplier

Figure 15: Figure Area Delay ProductAnalysis Chart

CLAA basedmultiplier 98.5 2957 logic cells 291264.5

CSLA basedmultiplier 99.5 2039 logic cells 202880.5

ETA 95 2021

Table 1: Analysis Table

MultiplierType

Delay(ns) Area Delay Area

Product

CONCLUSIONA design and implementation of a VHDL-based 32-bit unsigned multiplier with CLAAand CSLA was presented. VHDL, a Very High

118


Architecture and Layout for Low Power”,CIC’96.

6. Mohanty P S (2009), “Design andImplementation of Faster and LowPower Multipliers”, Bachelor Thesis,National Insti tute of Technology,Rourkela.

7. Navabi Z (2007), VHDL Modular Designand Synthesis of Cores and Systems,3rd Edition, McGraw-Hill Professional,USA, ISBN: 9780071508926.

8. Sertbas A and Ozbey R S (2004), “APerformance Analysis of Classified BinaryAdder Architectures and the VHDL

Simulations”, J. Elect. Electron. Eng.,Vol. 4, pp. 1025-1030, Istanbul, Turkey.

9. Software Simulation Package: DirectVHDL, Version 1.2 (2007), GreenMounting Computing Systems Inc.,Essex,VT, UK.

10. Stallings W (2006), “ComputerOrganization and Architecture Designingfor Peljormance”, 71h ed., Prentice Hall,Pearson Education International, USA,ISBN: 0-13-185644-8.

11. Wakerly F (2006), Digital Design-Principles and Practices, 4th Edition,Pearson Prentice Hall,USA, ISBN:0131733494.

design and implementation of 32 bit unsigned …

Documents