VLSI Implementation of a High Speed Single Precision Floating Point Unit Using Verilog

Ushasree G, ECE-VLSI, VIT University, Vellore-632014, Tamil Nadu, India, [email protected]
R Dhanabal, ECE-VLSI, VIT University, Vellore-632014, Tamil Nadu, India, [email protected]
Dr Sarat Kumar Sahoo, ECE-VLSI, VIT University, Vellore-632014, Tamil Nadu, India, [email protected]

Abstract- To represent very large or small values, a large range is required, as the integer representation is no longer appropriate. Such values can be represented using the IEEE-754 standard based floating point representation. This paper presents a high speed ASIC implementation of a floating point arithmetic unit which can perform addition, subtraction, multiplication, and division on 32-bit operands that use the IEEE 754-2008 standard. The pre-normalization and post-normalization units are also discussed, along with exception handling. All the functions are built with feasible, efficient algorithms, with several changes incorporated that improve the overall latency and, if pipelined, the throughput. The algorithms are modeled in Verilog HDL, and the RTL code for the adder, subtractor, multiplier, divider, and square root is synthesized using Cadence RTL Compiler, with the design targeted at 180nm TSMC technology under proper constraints.

Keywords: floating point number, normalization, exceptions, latency, overflow, underflow

I. INTRODUCTION

An arithmetic circuit which performs digital arithmetic operations has many applications in digital coprocessors, application specific circuits, etc. Because of the advancements in VLSI technology, many complex algorithms that once appeared impractical to put into practice have become easily realizable today with the desired performance parameters, so that new designs can be incorporated [2]. Standardized methods to represent floating point numbers have been instituted by the IEEE 754 standard, through which floating point operations can be carried out efficiently with modest storage requirements. The three basic components of an IEEE 754 floating point number are the sign, the exponent, and the mantissa [3]. The sign field is 1 bit, where 0 denotes a positive number and 1 a negative number [3]. The mantissa, also called the significand, is 23 bits and consists of the fraction together with an implied leading digit, representing the precision bits of the number [3][2]. The 8-bit exponent represents both positive and negative exponents; a bias of 127 is added to the exponent to obtain the stored exponent [2]. Table 1 shows the bit ranges for single (32-bit) and double (64-bit) precision floating-point values [2], and the single precision layout is shown in Table 2. The value of a binary floating point representation is as follows, where S is the sign bit, F is the fraction field, and E is the exponent field:

Value of a floating point number = (-1)^S x val(F) x 2^val(E)

TABLE 1: BIT RANGES FOR SINGLE (32-BIT) AND DOUBLE (64-BIT) PRECISION FLOATING-POINT VALUES

                     Sign      Exponent      Fraction      Bias
Single precision     1 [31]    8 [30-23]     23 [22-00]    127
Double precision     1 [63]    11 [62-52]    52 [51-00]    1023

TABLE 2: FLOATING POINT NUMBER REPRESENTATION (32 BITS)

  sign      exponent     mantissa
  1 bit     8 bits       23 bits
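The field layout above maps directly onto wire slicing in Verilog. The following sketch is not taken from the paper; module and signal names are illustrative only. It unpacks a single precision operand into its sign, biased exponent, and significand, restoring the hidden leading bit that the pre-normalization stage operates on:

```verilog
// Illustrative sketch, not the paper's code: unpacking an IEEE-754
// single-precision operand into its three fields.
module fp_unpack (
    input  wire [31:0] a,     // IEEE-754 single-precision operand
    output wire        sign,  // bit 31
    output wire [7:0]  exp,   // biased exponent, bits 30..23
    output wire [23:0] mant   // 23-bit fraction with the hidden bit restored
);
    assign sign = a[31];
    assign exp  = a[30:23];
    // Normalized numbers carry an implicit leading 1; denormals (exp == 0)
    // have a leading 0 instead.
    assign mant = (exp == 8'd0) ? {1'b0, a[22:0]} : {1'b1, a[22:0]};
endmodule
```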
There are four types of exceptions that can arise during floating point operations. The Overflow exception is raised whenever the result cannot be represented as a finite value in the precision format of the destination [13]. The Underflow exception occurs when an intermediate result is too small to be calculated accurately, or when the operation's result, rounded to the destination precision, is too small to be normalized [13]. The Division by zero exception arises when a finite nonzero number is divided by zero [13]. The Invalid operation exception is raised if the given operands are invalid for the operation to be performed [13].

In this paper, an ASIC implementation of a high speed FPU has been carried out using efficient addition, subtraction, multiplication, and division algorithms. Section II depicts the architecture of the floating point unit and the methodology used to carry out the arithmetic operations. Section III presents the arithmetic operations, which use efficient algorithms with some modifications to improve latency. Section IV presents the simulation results, obtained with Cadence RTL Compiler using a 180nm process. Section V presents the conclusion.

II. ARCHITECTURE AND METHODOLOGY

The FPU of a single precision floating point unit that performs the add, subtract, multiply, and divide functions is shown in Figure 1 [1]. Two pre-normalization units are provided, one for addition/subtraction and one for multiplication/division [1]. A post-normalization unit that normalizes the mantissa part is also provided [2]; the final result is obtained after post-normalization. To carry out an arithmetic operation, the two IEEE-754 format single precision operands are pre-normalized, the selected operation is performed, and the output is post-normalized. Finally, any exceptions that occurred are detected and handled by the exception handling unit. The operation executed depends on a three-bit control signal (z), as shown in Table 3.

Figure 1. Block diagram of the floating point arithmetic unit [1]

TABLE 3: FLOATING POINT UNIT OPERATIONS

z (control signal)    Operation
3'b000                Addition
3'b001                Subtraction
3'b010                Multiplication
3'b011                Division
3'b100                Square root

III. 32-BIT FLOATING POINT ARITHMETIC UNIT

A. Addition Unit

Addition is one of the most complex operations in a floating-point unit compared to the other functions; it contributes a major share of the delay and considerable area. Many algorithms have been developed that focus on reducing the overall latency in order to improve performance. The floating point addition operation is carried out by first checking for zeros, then aligning the significands, and then adding the two significands using an efficient architecture. The obtained result is normalized and checked for exceptions. To add the mantissas, a high speed carry look-ahead adder has been used. A traditional carry look-ahead adder is constructed using AND, XOR and NOT gates; the implemented modified carry look-ahead adder uses only NAND and NOT gates, which decreases the cost of the carry look-ahead adder and also enhances its speed [4]. The 16-bit modified carry look-ahead adder is shown in Figure 2 and the metamorphosis of the partial full adder in Figure 3; from these building blocks, a 24-bit carry look-ahead adder has been constructed to perform the mantissa addition.

Figure 2. 16-bit modified carry look-ahead adder [4]
Figure 3. Metamorphosis of the partial full adder [4]
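The modified NAND/NOT-only structure of [4] is not reproduced here. As a point of reference only, the sketch below shows a conventional 4-bit carry look-ahead slice built from generate/propagate terms, the recurrence that slices of a 24-bit carry look-ahead mantissa adder realize; module and signal names are illustrative, not the paper's:

```verilog
// Illustrative sketch of a conventional 4-bit carry look-ahead slice.
// The paper's modified NAND/NOT-only realization [4] is not shown; this is
// only the baseline generate/propagate recurrence.
module cla4 (
    input  wire [3:0] a, b,
    input  wire       cin,
    output wire [3:0] sum,
    output wire       cout
);
    wire [3:0] g = a & b;   // carry generate
    wire [3:0] p = a ^ b;   // carry propagate
    wire [4:0] c;

    assign c[0] = cin;
    assign c[1] = g[0] | (p[0] & c[0]);
    assign c[2] = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c[0]);
    assign c[3] = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
                       | (p[2] & p[1] & p[0] & c[0]);
    assign c[4] = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
                       | (p[3] & p[2] & p[1] & g[0])
                       | (p[3] & p[2] & p[1] & p[0] & c[0]);

    assign sum  = p ^ c[3:0];
    assign cout = c[4];
endmodule
```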
B. Subtraction Unit

The subtraction operation is implemented by taking the 2's complement of the second operand. Like addition, subtraction consists of the major tasks of pre-normalization, addition of the mantissas, post-normalization, and exception handling. The addition of the mantissas is carried out using the 24-bit MCLA built from the blocks shown in Figures 2 and 3.

C. Multiplication Algorithm

Constructing an efficient multiplication module is an iterative process, and a 2n-digit product is obtained from the product of two n-digit operands. In IEEE 754 floating-point multiplication, the two mantissas are multiplied and the two exponents are added. Here, the exponents are first added and the exponent bias (127) is subtracted from the sum. The mantissas are then multiplied using a feasible algorithm, and the output sign bit is determined by XORing the two input sign bits. The obtained result is normalized and checked for exceptions.

To multiply the mantissas, the Bit-Pair Recoding (or Modified Booth Encoding) algorithm has been used, which reduces the number of partial products by about a factor of two, with no pre-addition required to produce the partial products. It recodes the multiplier by considering three bits at a time, and this pairing increases the efficiency of the multiplication. To further increase the efficiency of the algorithm and decrease the time complexity, the Karatsuba algorithm can be paired with Bit-Pair Recoding. Karatsuba is one of the fastest multiplication algorithms: it reduces the multiplication of two n-digit numbers to at most 3n^(log2 3) ≈ 3n^1.585 single-digit multiplications and is therefore faster than the classical algorithm, which requires n^2 single-digit products [11]. It computes the product of two large numbers x and y using three multiplications of smaller numbers, each with about half as many digits as x or y, together with some additions and digit shifts, instead of four multiplications [11]. The steps are carried out as follows. Let x and y be represented as n-digit numbers with base B, and let m be a positive integer less than n. Then x and y can be split as x = x1·B^m + x0 and y = y1·B^m + y0, so that

x·y = x1·y1·B^(2m) + (x1·y0 + x0·y1)·B^m + x0·y0,

where the middle term is obtained as (x1 + x0)·(y1 + y0) - x1·y1 - x0·y0, leaving only three smaller multiplications.
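As a rough illustration of the multiply datapath described above (and not the paper's implementation), the sketch below wires together the sign, exponent, and significand paths. The behavioural '*' operator stands in for the Bit-Pair-Recoding/Karatsuba multiplier, and rounding, post-normalization, and exception handling are omitted; all names are illustrative:

```verilog
// Illustrative sketch of a floating-point multiply core: sign is the XOR of
// the input signs, the biased exponents are added with one bias (127)
// removed, and the 24-bit significands are multiplied. The '*' operator is
// only a placeholder for the Bit-Pair-Recoding/Karatsuba multiplier;
// rounding, post-normalization and exception handling are not shown.
module fp_mul_core (
    input  wire        sign_a, sign_b,
    input  wire [7:0]  exp_a,  exp_b,   // biased exponents
    input  wire [23:0] mant_a, mant_b,  // significands with hidden bit
    output wire        sign_p,
    output wire [8:0]  exp_p,           // biased product exponent, pre-normalization
    output wire [47:0] prod             // raw 48-bit significand product
);
    assign sign_p = sign_a ^ sign_b;
    assign exp_p  = {1'b0, exp_a} + {1'b0, exp_b} - 9'd127;
    assign prod   = mant_a * mant_b;    // placeholder for the recoded multiplier
endmodule
```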