1. introduction 1.1 detail descriptionusers.encs.concordia.ca/~asim/coen...


Upload: phamdan

Post on 26-Mar-2018


Page 1: 1. INTRODUCTION 1.1 Detail Descriptionusers.encs.concordia.ca/~asim/COEN 451/Projects/Project_Jacque.pdf1.1 Detail Description ... combinatorial and sequential logic building blocks

1. INTRODUCTION

This project describes the ground-up design and implementation of a simple 4-bit up/down counter in CMOSIS5 technology.

1.1 Detail Description

An up/down counter was chosen in order to study the issues usually associated with clock distribution and power dissipation in static CMOS circuits. An up/down counter has a good balance between combinational and sequential logic building blocks. A 4-bit counter is a fairly small circuit by VLSI standards. Because the circuit is a counter, different levels of optimization are possible, both at the logic/algorithmic level and at the architecture level. Different schemes for generating the next count will be simulated. Only the most efficient circuit design will be chosen for layout, however, as time does not permit trying all the available schemes. Possible areas of investigation:

• Performance
• Power / Frequency relationship
• Area / Power relationship
• Circuit implementation / Power / Speed relationship
• Clock distribution
• Relative block placement
• Transistor sizing / Speed relationship
• Voltage variation / Power / Speed relationship

While there are many parameters and data to be gathered from such an experiment, we will focus on the main aspects of good design and implementation methodology in general. The project therefore aims at a system with minimum power dissipation, maximum operating frequency and a good clocking strategy.


1.2 System Requirement

The circuit should behave as a generic 4-bit up/down counter. Such a counter usually has three inputs: an asynchronous clear, a clock and an up/down control signal. The result is available on a 4-bit output that comes from a register. On each rising clock edge, the circuit counts up if and only if the up/down signal is at logic high, and counts down otherwise. The count rolls over from "1111" to "0000" when incrementing, and from "0000" to "1111" when decrementing. The circuit should be implemented in 0.6 um technology. Transistors should be sized to give symmetric rise and fall times. Test benches are to be designed and used to cover all probable situations. The block diagram, state transition table and state diagram for the design are represented below.

Fig. 1 Graphical system description
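The required behavior can be sketched as a small behavioral model. This is only an illustration of the specification above, not the circuit itself; the function name `next_count` and its arguments are hypothetical, and the clear input is modeled simply as "asserted or not" without committing to a polarity.

```python
def next_count(count: int, up: bool, clear: bool = False) -> int:
    """Return the 4-bit count that follows `count` after one rising clock edge.

    `clear` models the asynchronous clear being asserted (polarity abstracted).
    """
    if clear:
        return 0                     # clear forces the register to 0000
    step = 1 if up else -1           # up/down control selects the direction
    return (count + step) & 0xF      # keep 4 bits, so 1111 -> 0000 and 0000 -> 1111


if __name__ == "__main__":
    count = 14
    for _ in range(4):
        count = next_count(count, up=True)
        print(format(count, "04b"))
```

A model like this can serve as the reference ("expected behavior") against which simulated waveforms are compared.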


1.3 Design Analysis

The state transition table gives the next-state equations, from which we can write the logic functions driving the register. All minterms from the four columns under "next state" are entered into Karnaugh maps so as to eliminate any redundant terms.

Fig. 2 System Karnaugh-Map

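Finally, the compacted next-state equations are obtained. The resulting equations describe the next states for the system.

$$D_0 = \overline{D}$$

$$D_1 = (\overline{C}\,\overline{D} + C D)\cdot\overline{UD} + (\overline{C} D + C \overline{D})\cdot UD$$

$$D_2 = (B C + B D + \overline{B}\,\overline{C}\,\overline{D})\cdot\overline{UD} + (B \overline{C} + B \overline{D} + \overline{B} C D)\cdot UD$$

$$D_3 = (A B + A C + A D + \overline{A}\,\overline{B}\,\overline{C}\,\overline{D})\cdot\overline{UD} + (A \overline{B} + A \overline{C} + A \overline{D} + \overline{A} B C D)\cdot UD$$

The A, B, C and D terms are now replaced by the Q3, Q2, Q1 and Q0 terms, respectively, which represent the current state. After some further minimization, which involves grouping the UD terms and collapsing terms into XOR functions, we get the final solution for the next state.

$$D_0 = \overline{Q_0}$$

$$D_1 = Q_1 \oplus Q_0 \oplus \overline{UD}$$

$$D_2 = Q_2 \oplus \overline{UD} \oplus (\overline{UD}\cdot Q_1 + \overline{UD}\cdot Q_0 + Q_0\cdot Q_1)$$

$$D_3 = Q_3 \oplus \overline{UD} \oplus \big(\overline{UD}\cdot Q_2 + \overline{UD}\cdot(\overline{UD}\cdot Q_1 + \overline{UD}\cdot Q_0 + Q_0\cdot Q_1) + Q_2\cdot(\overline{UD}\cdot Q_1 + \overline{UD}\cdot Q_0 + Q_0\cdot Q_1)\big)$$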

The equations above are the main equations that will be implemented as CMOS combinational logic. They also contain two recursive terms. If we ever wanted to increase the depth of the counter, we can see clearly which terms would need to be added or changed.
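As a sanity check, the minimized next-state equations can be evaluated exhaustively and compared with modulo-16 arithmetic. The sketch below assumes the reading in which the UD input appears complemented in the XOR terms and each recursive term is a three-term sum of products (a.b + a.c + b.c); the helper names `maj`, `next_state` and `check_all` are hypothetical.

```python
def maj(a: int, b: int, c: int) -> int:
    """Three-term sum of products a.b + a.c + b.c."""
    return (a & b) | (a & c) | (b & c)


def next_state(q: int, ud: int) -> int:
    """Evaluate D0..D3 from the minimized equations (ud = 1 counts up)."""
    q0, q1, q2, q3 = [(q >> i) & 1 for i in range(4)]
    nud = ud ^ 1                  # complemented up/down control (assumed form)
    c2 = maj(nud, q1, q0)         # first recursive term
    c3 = maj(nud, q2, c2)         # second recursive term, reuses the first
    d0 = q0 ^ 1
    d1 = q1 ^ q0 ^ nud
    d2 = q2 ^ nud ^ c2
    d3 = q3 ^ nud ^ c3
    return d0 | (d1 << 1) | (d2 << 2) | (d3 << 3)


def check_all() -> bool:
    """Compare the equations with +1 / -1 modulo 16 for all 16 states."""
    for q in range(16):
        if next_state(q, 1) != (q + 1) % 16:
            return False
        if next_state(q, 0) != (q - 1) % 16:
            return False
    return True


if __name__ == "__main__":
    print("equations match a 4-bit up/down counter:", check_all())
```

Extending the counter by one bit would, under the same assumptions, add one more `maj` term fed by the previous one.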


The gates with the largest fan-in are the XOR and the OR, with a maximum of three inputs. The maximum fan-out for the design is three loads.

TABLE OF PRIMITIVE COMPONENTS (CMOS IMPLEMENTATION)

GATE           QUANTITY   COMPLEXITY (TRANSISTORS)   TOTAL (TRANSISTORS)
3-INPUT XOR    3          24                         72
3-INPUT OR     2          24                         48
2-INPUT AND    6          6                          36
D FLIP-FLOP    4          32                         128
INVERTER       1          2                          2
TOTAL                                                286

As shown above, the complexity of the circuit, excluding input/output buffers, is 286 transistors, placing the design in the Small Scale Integration (SSI) category.
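The total can be reproduced from the per-gate figures in the table of primitive components. The snippet below is a minimal sketch; the dictionary layout and the name `total_transistors` are illustrative only.

```python
# (quantity, transistors per gate), copied from the table of primitive components
COMPONENTS = {
    "3-input XOR": (3, 24),
    "3-input OR": (2, 24),
    "2-input AND": (6, 6),
    "D flip-flop": (4, 32),
    "Inverter": (1, 2),
}


def total_transistors(components: dict) -> int:
    """Sum quantity x complexity over every gate type."""
    return sum(qty * size for qty, size in components.values())


if __name__ == "__main__":
    print(total_transistors(COMPONENTS))
```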


2. DIGITAL LOGIC SYNTHESIS

The circuit below maps the logic functions obtained earlier onto discrete CMOS gates. It shows that the circuit consists of five types of elements: NAND, OR, XOR, NOT and a 4-bit register. There is a three-level combinational delay from the output of the register back to its input. This, along with the interconnect delay, determines the maximum frequency of operation.

Fig. 3 Proposed up-down counter circuit

Power supply connections are not shown in the schematic. For our process technology, a 3.3 V power supply is used.


2.1 NAND Gate Transformation

Where possible, static CMOS logic should be implemented with NAND gates, as this keeps transistor count, power dissipation and delay low. Before implementing the design, we check whether any parts of the circuit can be implemented using NAND gates only. A quick look at the circuit reveals that the structures performing the sum-of-products (SOP) operation can be efficiently implemented using NAND gates. The following transformation applies:

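$$f = (a \cdot b) + (a \cdot c) + (b \cdot c)$$

$$f = \overline{\overline{(a \cdot b) + (a \cdot c) + (b \cdot c)}}$$

$$f = \overline{\overline{(a \cdot b)} \cdot \overline{(a \cdot c)} \cdot \overline{(b \cdot c)}}$$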

The resulting logic is implemented using two-input and three-input NAND gates, with a maximum delay of two logic levels. The proposed circuit has the same logical behavior as a conventional SOP implemented using OR and AND gates.

Fig. 4 NAND gate reduction
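The equivalence between the AND-OR form and the NAND-NAND form can be checked exhaustively over the eight input combinations. This is a minimal sketch; the function names are hypothetical and the gates are modeled as ideal Boolean functions.

```python
from itertools import product


def nand(*inputs: int) -> int:
    """Ideal NAND gate with any number of inputs."""
    return 0 if all(inputs) else 1


def sop(a: int, b: int, c: int) -> int:
    """Conventional AND-OR sum of products: a.b + a.c + b.c."""
    return (a & b) | (a & c) | (b & c)


def nand_nand(a: int, b: int, c: int) -> int:
    """Same function built from three NAND2 gates feeding one NAND3."""
    return nand(nand(a, b), nand(a, c), nand(b, c))


def equivalent() -> bool:
    """True if both forms agree on all 8 input combinations."""
    return all(sop(a, b, c) == nand_nand(a, b, c)
               for a, b, c in product((0, 1), repeat=3))


if __name__ == "__main__":
    print("NAND-NAND matches AND-OR:", equivalent())
```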


2.2 Basic Elements

The system consists of five main logic elements. Each one is built in CMOS technology and, as such, implements its logic function with a pull-up and a pull-down structure, with the exception of the XOR, which propagates the correct logic level through pass gates.

• Inverter Gate (NOT)

The logic function for a NOT gate is $f = \overline{x}$, where x is the input. The function will be implemented using CMOS technology.

Fig. 5 CMOS Inverter

• 2-inputs NAND (NAND2)

The logic function for a NAND2 is $f = \overline{ab}$, where a and b are the inputs. The logic function will be implemented using CMOS technology.

Fig. 6 NAND2 gate

• 3-inputs NAND (NAND3)

The logic function for a NAND3 is $f = \overline{abc}$, where a, b and c are the inputs. The logic function will be implemented using CMOS technology.

Fig. 7 NAND3 gate


2.3 Complex Elements

2.3.1 3-inputs Exclusive-Or (EXOR3)

The logic function for an XOR3 is $f = \overline{a}\,\overline{b}\,c + \overline{a}\,b\,\overline{c} + a\,\overline{b}\,\overline{c} + a b c$, where a, b and c are the inputs. The function will be implemented using CMOS transmission gates.

Fig. 8 Exclusive-Or 3-inputs PTL gate

2.3.2 Data Flip-Flop (DFF)

Implementing a register requires a different approach from generic combinational logic. Registers can hold a logic state indefinitely and capture their inputs only on the rising or falling edge of a clock, as required by the design (the rising edge for our purpose).

Fig. 9 Data Flip-Flop gate level implementation

The scheme used for the register is based on three SR latches. The first latch memorizes the information at its input whenever the clock input is low. On the next transition to a high level, the content of the first latch is transferred to the upper latch. The outputs of both latches then drive the set and reset inputs of a third latch. This flip-flop has the advantage over the conventional transmission-gate flip-flop of being built from basic NAND gates, hence saving on area. The design is also more modular and compact. The availability of both true and complementary outputs facilitates system integration and reduces the number of logic gates by one.

2.3.3 4-bit Register (REG4)

A register consists of a set of D flip-flops that share a common clock and reset. Together, they store a word of data. The behavior of the register therefore depends directly on that of its primitive element. Because the clock is shared among all flip-flops, it will need to be buffered at some stage.

Fig. 10 4-bit register block


3. BASIC INVERTER MODEL

The static and dynamic characteristics of the gates used in the design are based on a two-transistor inverter model. It is designed to have approximately equal rise and fall times at the output. The load on the output of the inverter is assumed to be a capacitance of 25N fF, where N is the fan-out. The PMOS transistor is first sized four times wider than the NMOS transistor in order to compensate for its smaller K' parameter. Then, both transistors are scaled once more by a common multiplying factor in order to supply the extra current needed to drive large loads.

3.1 Inverter Static Characteristics:

VDD          3.3V      VSS          0V
VIL (max)    0.5V      VIH (min)    2.4V
VOL (max)    0.3V      VOH (min)    3.0V
NML          0.2V      NMH          0.6V

We begin with a minimum-sized inverter. For CMOSIS5 technology, the minimum length is 0.6 um and the minimum width is 0.9 um. We then size the PMOS and NMOS transistors so that they have the same transconductance value.

$$\frac{\mu_p}{\mu_n} = 0.248$$, based on the CMOSIS5 V2.1 model

The ratio of transistors for CMOSIS5 technology is:

$$\frac{1}{0.248} \approx 4$$, rounded to the nearest whole number

The NMOS transistor is drawn with the minimum length of 0.6 um and a width of 2.4 um. Correspondingly, the PMOS transistor has a drawn length of 0.6 um and a width of 9.6 um.


3.2 Maximum Transistor Current

The maximum saturation PMOS current is obtained for VIL and VOH:

$$I_{D(MAX)} = \frac{16\,K_P}{2}(3.3 - 0.9213)^2, \quad K_P = 48.74\ \mu A/V^2$$

$$I_{D(MAX)} = 2.2\ \text{mA}$$

Similarly, the maximum saturation NMOS current is obtained for VIH and VOL:

$$I_{D(MAX)} = \frac{4\,K_N}{2}(3.3 - 0.6566)^2, \quad K_N = 196.47\ \mu A/V^2$$

$$I_{D(MAX)} = 2.7\ \text{mA}$$
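The two figures follow from the square-law saturation expression with W/L = 16 for the PMOS and W/L = 4 for the NMOS. The sketch below simply re-evaluates that expression with the values quoted above; the function name `saturation_current` is an illustrative choice.

```python
def saturation_current(k_prime: float, w_over_l: float,
                       v_gs: float, v_t: float) -> float:
    """Square-law saturation current I_D = (K'/2)(W/L)(V_GS - V_t)^2, in amperes."""
    return 0.5 * k_prime * w_over_l * (v_gs - v_t) ** 2


# Values quoted in section 3.2 (K' in A/V^2, voltages in V)
i_pmos = saturation_current(48.74e-6, 16, 3.3, 0.9213)
i_nmos = saturation_current(196.47e-6, 4, 3.3, 0.6566)

if __name__ == "__main__":
    print(f"PMOS: {i_pmos * 1e3:.2f} mA, NMOS: {i_nmos * 1e3:.2f} mA")
```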

3.3 Inverter Gate Capacitance

In order to get an approximate propagation delay for the inverter, we first evaluate Cgp and Cgn. We take the gate area of each transistor and multiply it by Cox to get the total gate capacitance. This value indicates how much capacitance the input of an inverter places on a node.

Area of NMOS transistor = 0.6 x 2.4 = 1.44 μm²
Area of PMOS transistor = 0.6 x 2.4 x 4 = 5.76 μm²

Assuming Cox to be 3.59 fF/μm²:

Gate capacitance of NMOS = 1.44 x 3.59 fF = 5.04 fF
Gate capacitance of PMOS = 5.76 x 3.59 fF = 20.68 fF
Total gate capacitance = 25.7 fF (rounded to 25 fF)

3.4 Inverter Dynamic Characteristic:

The performance of the system depends on the switching frequency allowed by each component, and hence on the propagation delay of the simple inverter model. The delay also depends on the capacitive load being driven. For our calculations, we first assume an output loaded with a fan-out of one (25 fF).


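The gate delay is obtained from:

$$t_{pd} = \frac{t_{phl} + t_{plh}}{2}$$

where $t_{plh}$ and $t_{phl}$ represent the low-to-high and high-to-low propagation delays at the output of the inverter.

The low-to-high propagation delay ($t_{plh}$) of a signal at the output of an inverter driving a single load (fan-out of 1) is obtained from the equation:

$$t_{plh} = \frac{2 C_L}{\beta_p V_{dd} (1 - p)} \left[ \frac{(p - 0.1)}{(1 - p)} + 0.5 \ln(19 - 20p) \right], \quad p = \frac{V_{tp}}{V_{DD}}$$

$$C_L = 25.7\ \text{fF}, \quad p = 0.279, \quad \frac{W}{L} = 16, \quad K'_p = 135.5 \times 10^{-6}$$

$$t_{plh} = \frac{2 (25.7 \times 10^{-15}\ \text{F})}{16 (135.5 \times 10^{-6}) (3.3) (1 - 0.279)} \left[ \frac{(0.279 - 0.1)}{(1 - 0.279)} + 0.5 \ln(19 - 20(0.279)) \right]$$

$$t_{plh} = 15.4\ \text{ps}$$

Similarly, the high-to-low propagation delay ($t_{phl}$) at the output of an inverter driving a single load is obtained:

$$t_{phl} = \frac{2 C_L}{\beta_n V_{dd} (1 - n)} \left[ \frac{(n - 0.1)}{(1 - n)} + 0.5 \ln(19 - 20n) \right], \quad n = \frac{V_{tn}}{V_{DD}}$$

$$C_L = 25.7\ \text{fF}, \quad n = 0.199, \quad \frac{W}{L} = 4, \quad K'_N = 546.2 \times 10^{-6}$$

$$t_{phl} = \frac{2 (25.7 \times 10^{-15}\ \text{F})}{4 (546.2 \times 10^{-6}) (3.3) (1 - 0.199)} \left[ \frac{(0.199 - 0.1)}{(1 - 0.199)} + 0.5 \ln(19 - 20(0.199)) \right]$$

$$t_{phl} = 13.2\ \text{ps}$$

$$t_{pd1} = \frac{15.4 + 13.2}{2} = 14.3\ \text{ps}$$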
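The delay arithmetic can be re-evaluated numerically. The sketch below assumes the delay expression has the form 2·C_L / (β·V_DD·(1 − r)) · [(r − 0.1)/(1 − r) + 0.5·ln(19 − 20r)], with r the threshold-to-supply ratio (p or n) and β = K'·(W/L); the function name `propagation_delay` is hypothetical.

```python
import math


def propagation_delay(c_load: float, k_prime: float, w_over_l: float,
                      v_dd: float, ratio: float) -> float:
    """Delay in seconds; `ratio` is V_t / V_DD (p for PMOS, n for NMOS)."""
    beta = k_prime * w_over_l
    bracket = (ratio - 0.1) / (1 - ratio) + 0.5 * math.log(19 - 20 * ratio)
    return 2 * c_load / (beta * v_dd * (1 - ratio)) * bracket


# Values quoted in section 3.4
t_plh = propagation_delay(25.7e-15, 135.5e-6, 16, 3.3, 0.279)
t_phl = propagation_delay(25.7e-15, 546.2e-6, 4, 3.3, 0.199)
t_pd1 = (t_plh + t_phl) / 2

if __name__ == "__main__":
    for name, value in (("t_plh", t_plh), ("t_phl", t_phl), ("t_pd1", t_pd1)):
        print(f"{name} = {value * 1e12:.1f} ps")
```

Keeping the formula in one function makes it easy to see how the delay scales with load capacitance or transistor width.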


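Similarly, the rise time and fall time of the inverter are obtained from the following equations:

$$t_r = \frac{K\,C_L}{\beta_p \cdot V_{DD}}, \quad t_f = \frac{K\,C_L}{\beta_n \cdot V_{DD}}, \quad \text{where } K \approx 3.5$$

$$t_r = \frac{3.5 \times 25\ \text{fF}}{779.8 \times 10^{-6} \times 3.3} = 34\ \text{ps}, \quad t_f = \frac{3.5 \times 25\ \text{fF}}{785.9 \times 10^{-6} \times 3.3} = 33.7\ \text{ps}$$

$$t_{pd2} = \frac{34 + 33.7}{4} = 16.9\ \text{ps}$$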
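The rise-time and fall-time estimates can be checked in the same way. This is a small sketch that assumes the form t = K·C_L / (β·V_DD) with K = 3.5 and the β values quoted above; `edge_time` is an illustrative name.

```python
def edge_time(c_load: float, beta: float, v_dd: float, k: float = 3.5) -> float:
    """Rise or fall time in seconds from t = K * C_L / (beta * V_DD)."""
    return k * c_load / (beta * v_dd)


t_r = edge_time(25e-15, 779.8e-6, 3.3)   # PMOS pull-up
t_f = edge_time(25e-15, 785.9e-6, 3.3)   # NMOS pull-down
t_pd2 = (t_r + t_f) / 4                  # delay estimate from the edge times

if __name__ == "__main__":
    print(f"t_r = {t_r * 1e12:.1f} ps, t_f = {t_f * 1e12:.1f} ps, "
          f"t_pd2 = {t_pd2 * 1e12:.1f} ps")
```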

3.5 Comparative Delay

         Calculated (ps)   Simulated (ps)
tplh     15.4              15.7
tphl     13.2              13.8
tpd      14.3              14.7
tr       34                32
tf       33                31


3.6 Power Consumption

An important aspect of ASIC design is the reduction of power consumption to a minimum. The energy used in a circuit is either useful or wasted as heat and as leakage current through the substrate. The wasted part can be reduced by employing some good design practices:

• The algorithm is reduced so as to have a minimum number of Boolean functions
• Transistors are drawn to the minimum sizes (some special cases exist)
• Interconnect is reduced to a minimum
• The number of metal layers used is limited to two or fewer

Knowledge of the system power consumption is critical to designing the system properly. For the interconnect, we need to know beforehand approximately how much supply current will be distributed to the logic gates. This helps in sizing the power rails properly. For a single inverter sized to the smallest length and width, we can calculate, for our process, the amount of energy used in the switching process.

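The power is usually distributed among three components:

$$P_{total} = P_{leakage} + P_{dynamic} + P_{short-circuit}, \quad \text{where } P_{total} = I_{total} \cdot V_{DD}$$

$$I_{leakage} = I_S\left(e^{qV/kT} - 1\right), \quad I_{leakage} = I_s \cdot Area_{drain} \cdot \left(e^{0.7\,V / 25\,mV} - 1\right) \approx 0.5\ \text{nA}$$

$$I_{short-circuit} = \frac{\beta}{12}\left(V_{dd} - 2V_t\right)^3 \frac{t_{r,f}}{t_p}$$

$$p_{short-circuit} = \frac{779 \times 10^{-6}}{12}\left(3.3 - 2(0.91)\right)^3 \left(\frac{12}{14.3}\right) \approx 162\ \mu W \text{ or } 49\ \mu A$$

$$I_{dynamic} = C_L V^2 f, \quad I_{dynamic} = 25 \times 10^{-15} \times 3.3^2 \times 125 \times 10^{6} = 34\ \mu A$$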

Estimated total power per inverter with 25fF load= 84uA* 3.3V = 0.28mW at 125MHz.

Basic gate (50MHz)   Equivalent inverter number   Approximate power (mW)
NAND2                2                            0.5
NAND3                3                            0.75
DFF                  16                           4
EXOR3                18                           4.5
SYSTEM               286                          30.5


3.7 NAND2 Gate Design

Static analysis: The DC performance of the NAND2 gate is based on the basic inverter model, with similar operating voltages.

Dynamic analysis: For the dynamic analysis, equal tr and tf are needed, and the transistors are therefore sized according to the inverter model. Transistors in series are treated differently: for a 2-input gate, the series NMOS transistors are made twice as wide as in the basic inverter model. The transistors are otherwise implemented using the minimum width and length for the NMOS, with the proper multiplication factor for the PMOS. The output load capacitance used to model the gate delay represents a unit fan-out (25 fF). This includes the parasitic capacitances of the driver transistors as well as the gate capacitances of the next-stage transistors.

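$$t_r = 2.2\,R_P\,C_L, \quad t_f = 2.2\,R_N\,C_L, \quad C_L = C_{gp} + C_{gn} + C_{dp} + C_{dn} + C_w, \quad C_L = 25\ \text{fF}$$

$$R_P = \frac{1}{K_P \left(\frac{W}{L}\right)(V_{SG} - V_{tp})} \approx 539\ \Omega, \quad R_N = \frac{1}{K_N \left(\frac{W}{L}\right)(V_{GS} - V_{tn})} \approx 481\ \Omega$$

$$t_r = 2.2 \times 539 \times 25\ \text{fF} = 29.6\ \text{ps}, \quad t_f = 2.2 \times 481 \times 25\ \text{fF} = 26.5\ \text{ps}$$

$$t_p = \frac{(26.9 + 26.5)}{4} = 13.35\ \text{ps}$$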
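The on-resistances and edge times can be reproduced numerically. The sketch below assumes the K, W/L and threshold values quoted in section 3.2 (K_P = 48.74 uA/V² with W/L = 16, K_N = 196.47 uA/V² with W/L = 4) and a full 3.3 V gate drive; the function names are hypothetical.

```python
def on_resistance(k_prime: float, w_over_l: float,
                  v_drive: float, v_t: float) -> float:
    """Equivalent on-resistance R = 1 / (K (W/L) (V_drive - V_t)), in ohms."""
    return 1.0 / (k_prime * w_over_l * (v_drive - v_t))


def rc_edge_time(resistance: float, c_load: float) -> float:
    """10%-90% edge time of a single RC stage: t = 2.2 R C."""
    return 2.2 * resistance * c_load


C_LOAD = 25e-15                                   # unit fan-out load, in farads
r_p = on_resistance(48.74e-6, 16, 3.3, 0.9213)    # assumed PMOS parameters
r_n = on_resistance(196.47e-6, 4, 3.3, 0.6566)    # assumed NMOS parameters
t_rise = rc_edge_time(r_p, C_LOAD)
t_fall = rc_edge_time(r_n, C_LOAD)

if __name__ == "__main__":
    print(f"R_P = {r_p:.0f} ohm, R_N = {r_n:.0f} ohm")
    print(f"t_r = {t_rise * 1e12:.1f} ps, t_f = {t_fall * 1e12:.1f} ps")
```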

The model can be extended to cover the dynamic analysis of the NAND3: the three series NMOS transistors are sized with three times the width used in the basic inverter model. With this sizing, the rise time, fall time and propagation delay remain the same.


3.8 EXOR3 Gate Design

The EXOR3 cell is based on CMOS pass-transistor logic. Device sizes are a less important factor here than in pure CMOS logic. For this reason, we set the transistors to the minimum length and width in order to reduce delay and keep power dissipation low. The static operation of the circuit gives ideal VOL and VOH, since CMOS pass gates are used. The dynamic behavior is more elaborate and requires an analysis.

Fig. 11 EXOR3 parasitic capacitances

The worst-case propagation delay of the circuit occurs when a logic '0' or '1' applied to an input discharges or charges the resulting capacitive network through two transmission gates in series. As such, tp is:

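$$t_p = 0.69\,R_{eq}\,C_{eq}\,\frac{n(n+1)}{2}, \quad n = 2, \quad t_p = 2.07\,R_{eq}\,C_{eq}$$

$$R_{eq} = R_P \parallel R_N, \quad R_{eq} = \frac{540 \times 480}{540 + 480} = 254\ \Omega$$

$$C_{eq} = 2C_{DP} + 2C_{DN}, \quad C_D = C_J \cdot AD + C_{JSW} \cdot PD$$

$$C_{DP} = [1.2\ \mu m \times 9.6\ \mu m \times 9.35 \times 10^{-4}] + [2(10.8\ \mu m) \times 2.89 \times 10^{-10}] = 17\ \text{fF}$$

$$C_{DN} = [1.2\ \mu m \times 2.4\ \mu m \times 5.62 \times 10^{-4}] + [2(3.6\ \mu m) \times 5.00 \times 10^{-11}] = 2\ \text{fF}$$

$$t_p = 2.07 \times 254 \times 2(17\ \text{fF} + 2\ \text{fF}) \approx 20\ \text{ps}$$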
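The EXOR3 delay estimate can be re-derived step by step. The sketch below assumes the junction capacitance coefficients are in F/m² (area) and F/m (sidewall), that the second junction capacitance is the NMOS drain, and that the delay follows the RC-chain form 0.69·R·C·n(n+1)/2; all function names are illustrative.

```python
UM = 1e-6  # one micrometre, in metres


def junction_capacitance(area_m2: float, perimeter_m: float,
                         c_j: float, c_jsw: float) -> float:
    """Drain capacitance C_D = C_J * AD + C_JSW * PD, in farads."""
    return c_j * area_m2 + c_jsw * perimeter_m


def chain_delay(r_eq: float, c_eq: float, n: int) -> float:
    """Delay of n identical RC sections in series: 0.69 R C n(n+1)/2."""
    return 0.69 * r_eq * c_eq * n * (n + 1) / 2


c_dp = junction_capacitance(1.2 * UM * 9.6 * UM, 2 * 10.8 * UM, 9.35e-4, 2.89e-10)
c_dn = junction_capacitance(1.2 * UM * 2.4 * UM, 2 * 3.6 * UM, 5.62e-4, 5.00e-11)
r_eq = 540 * 480 / (540 + 480)     # PMOS and NMOS on-resistances in parallel
c_eq = 2 * c_dp + 2 * c_dn
t_p_exor3 = chain_delay(r_eq, c_eq, 2)

if __name__ == "__main__":
    print(f"C_DP = {c_dp * 1e15:.1f} fF, C_DN = {c_dn * 1e15:.1f} fF")
    print(f"R_eq = {r_eq:.0f} ohm, t_p = {t_p_exor3 * 1e12:.1f} ps")
```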


3.9 Data Flip-Flop Gate Design

The flip-flop will have static characteristics similar to those of a NAND2. Its dynamic characteristics depend on the propagation delay of the internal components of the flip-flop.

Fig. 12 D Flip-Flop NAND gate implementation

For a flip-flop, the dynamic parameters needed are the setup and hold times as well as the propagation delay at the output. The dynamic characteristics are obtained as follows:

$$t_{setup} = t_{p-u3} + t_{p-u0} = 2\,t_{p-nand}, \quad t_{setup} = 26.7\ \text{ps}$$

$$t_{hold} = t_{p-u1} = t_{p-nand}, \quad t_{hold} = 13.4\ \text{ps}$$

$$t_{cq} = t_{p-u3} + t_{p-u2} + t_{p-u4} + t_{p-u5} = 4\,t_{p-nand}, \quad t_{cq} = 53.4\ \text{ps}$$

$$t_p = t_{setup} + t_{cq} = 6\,t_{p-nand}, \quad t_p = 80.1\ \text{ps}$$
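Each flip-flop parameter is a whole number of NAND gate delays, so the figures can be generated from the single value of 13.35 ps obtained for the NAND2. The following is a minimal sketch with hypothetical names.

```python
T_P_NAND = 13.35e-12  # NAND2 propagation delay from section 3.7, in seconds


def dff_timing(t_nand: float) -> dict:
    """Flip-flop timing expressed as multiples of one NAND gate delay."""
    setup = 2 * t_nand        # two gate delays before the clock edge
    hold = 1 * t_nand         # one gate delay after the clock edge
    clk_to_q = 4 * t_nand     # four gate delays from clock to output
    return {
        "setup": setup,
        "hold": hold,
        "clk_to_q": clk_to_q,
        "total": setup + clk_to_q,
    }


timing = dff_timing(T_P_NAND)

if __name__ == "__main__":
    for name, value in timing.items():
        print(f"{name}: {value * 1e12:.1f} ps")
```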

3.10 Four Bit Data Register

The register consists of four data flip-flops connected in parallel, which latch data on the rising edge of the common clock. The performance of the register is identical to that of its flip-flop subcomponent. As such, the static as well as the dynamic characteristics remain unchanged. Some clock skew will be introduced by the clock distribution network; this is addressed in the layout section of the project.

$$t_{cycle} + t_{skew} \geq t_{setup} + t_{cq} + t_{combinational}$$


4. SYSTEM LEVEL DESIGN

The counter is implemented at the transistor level, but a gate-level representation helps to reveal performance and power issues that are usually hidden at the transistor level. A block diagram therefore simplifies the representation of such a system. For example, skew will be introduced in the clock distribution network, but because the register is presented as a black box, no skew is present from the system-level perspective. By locating the critical path in the design, we can derive an expression for the maximum frequency of operation of the counter. As the performance of the system depends on the critical path, optimizing this path by removing delay components from it will increase the maximum frequency of operation of the system.

Fig. 13 System level delay model

An inspection of the circuit reveals that the critical path consists of the output of REG4, the cascaded AND-OR of the two SOP3TERMS cells and the EXOR3. We also have to take into account the setup time of REG4.

$$t_{p-min} = t_{CQ-REG4} + 2\,t_{p-SOP3TERMS} + t_{p-EXOR3} + t_{setup-REG4}$$

To get an indication of the minimum propagation delay of the circuit, we can now substitute the values obtained theoretically before:

$$t_{p-min} = 53.4\ \text{ps} + 53.4\ \text{ps} + 20\ \text{ps} + 26.7\ \text{ps} = 153.5\ \text{ps}$$
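The critical-path sum and the clock frequency it implies can be sketched as follows. The per-cell SOP3TERMS delay of 26.7 ps is an assumption derived from the 53.4 ps quoted for the two cascaded cells, and the resulting frequency is only an idealized bound that ignores interconnect, skew and pad loading; the names `min_cycle`, `t_min` and `f_max` are hypothetical.

```python
def min_cycle(t_cq: float, t_sop: float, t_xor: float,
              t_setup: float, t_skew: float = 0.0) -> float:
    """Minimum clock period along the critical path, in seconds."""
    return t_cq + 2 * t_sop + t_xor + t_setup + t_skew


t_min = min_cycle(t_cq=53.4e-12, t_sop=26.7e-12, t_xor=20e-12, t_setup=26.7e-12)
f_max = 1.0 / t_min   # idealized upper bound on the clock frequency

if __name__ == "__main__":
    print(f"t_p-min = {t_min * 1e12:.1f} ps, ideal f_max = {f_max / 1e9:.2f} GHz")
```

Passing a non-zero `t_skew` shows directly how clock skew eats into the available cycle time.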


4.1 Input Output Buffer Design

The output pads are driven by buffers sized to give optimum rise and fall times for the output signals. Sizing the buffer begins by finding the ratio of output to input capacitance. An even number of inverters (here 2) is then selected so that true outputs are obtained at the end of the chain. The output driving pad and protection circuit are assumed to have a total capacitance of 625 fF. The input (gate) capacitance of a unit-sized inverter is 25 fF, so the ratio is 25. Using the following formula:

$$a^n = Y$$, where Y is 25 and n is 2

$$a^2 = 25, \quad a = 5$$
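The stage ratio follows from taking the n-th root of the capacitance ratio. A minimal sketch, with the hypothetical helper `stage_ratio`:

```python
def stage_ratio(c_load: float, c_in: float, n_stages: int) -> float:
    """Per-stage sizing factor a such that a ** n_stages = c_load / c_in."""
    return (c_load / c_in) ** (1.0 / n_stages)


a = stage_ratio(c_load=625e-15, c_in=25e-15, n_stages=2)

if __name__ == "__main__":
    print(f"each inverter is about {a:.1f} times larger than the previous one")
```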

There will be a total of four output buffers in the design to drive the output pads corresponding to Q0 to Q3.

Fig. 14 Input/output buffer driver circuit

The input pads also connect to buffers, which in turn drive highly capacitive networks. These loads are the clock, the reset and the up-down select networks.


4.2 Schematic Entry

The proposed schematic was first verified with pencil and paper, by checking it against the next-state equations. The circuit was then broken into sub-modules that could be integrated and tested easily. Integration was simplified because some of the modules, such as the D flip-flop and the exclusive-or gates, were duplicated blocks.

Fig. 15 System block diagram

The tests were also straightforward, as the most complicated blocks (SOP3TERMS, DFF, EXOR3) had at most three input signals and hence required a test vector of 8 combinations to fully describe the output behavior.

SOP3TERMS
A  B  C  F
0  0  0  0
0  0  1  0
0  1  0  0
0  1  1  1
1  0  0  0
1  0  1  1
1  1  0  1
1  1  1  1

EXOR3
A  B  C  F
0  0  0  0
0  0  1  1
0  1  0  1
0  1  1  0
1  0  0  1
1  0  1  0
1  1  0  0
1  1  1  1

D FLIP-FLOP
CLR  CLK  D  Q   Q'
1    0    0  Q   Q'
1    0    1  Q   Q'
1    1    0  Q   Q'
1    1    1  Q   Q'
1    ↑    Q  Q+  Q'+
1    ↓    Q  Q   Q'
0    X    X  0   1
1    X    X  Q   Q'
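The 8-combination test vectors for the two combinational blocks can be generated programmatically. The sketch below models SOP3TERMS as a.b + a.c + b.c and EXOR3 as a three-input XOR, which matches the truth tables above; the function names are hypothetical.

```python
from itertools import product


def sop3terms(a: int, b: int, c: int) -> int:
    """Three-term sum of products: a.b + a.c + b.c."""
    return (a & b) | (a & c) | (b & c)


def exor3(a: int, b: int, c: int) -> int:
    """Three-input exclusive-or."""
    return a ^ b ^ c


def truth_table(func) -> list:
    """Outputs of `func` for inputs 000, 001, ..., 111 (A is the MSB)."""
    return [func(a, b, c) for a, b, c in product((0, 1), repeat=3)]


if __name__ == "__main__":
    print("SOP3TERMS:", truth_table(sop3terms))
    print("EXOR3:    ", truth_table(exor3))
```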


4.3 Schematic Test Results

The following drawings give an overview of the results obtained for the final design of the counter in schematic form. Note that the schematic lacks the parasitic elements that the final layout provides, and hence gives a rather optimistic picture of the timing characteristics.

From the above simulation result, we can verify that the circuit works properly by looking at the frequency-dividing property of counters. Q0 toggles twice as fast as Q1, Q1 twice as fast as Q2, and so on. This is the kind of pattern to look for when verifying the proper functionality of a counter.


4.4 Circuit Optimization

One of the aims of the project is to improve the timing performance and power consumption of the circuit through a set of design rules. The following guidelines were observed while designing the counter.

Design level

• Algorithmic level optimization through Boolean reduction
  The system has a maximum of 3 levels of combinational logic
• Gate compaction through NAND gate conversion
  All sum-of-products are transformed into NAND-only gates
• Exclusive-or gates built using pass transistors for minimum power
  The ex-or gates are built using CMOS PTL only
• Logic gate re-use
  The only inverter used is absorbed into the D Flip-Flop

Implementation level

• Use of minimum number of metal layers
  Metal-1 and Metal-2 are the only layers used in the project
• Proper transistor sizing using smallest feature size
  Minimum transistor size with $\mu_r = 4$
• Optimization of critical path of circuit
  Minimum length interconnect with no via
• Reduction of clock skew
  Symmetric interconnect layout to clock tree of D Flip-Flops


5. CIRCUIT LAYOUT

The layout methodology had two parts. First, some pre-layout work was done with drawing software to get an idea of the relative locations of the blocks and the amount of interconnect between them. The use of layers within the software made it possible to create routes more or less similar to those that would later be drawn in the CAD tool. Once completed, the system was checked for errors by inspection, and the next step, the actual layout, was started.

Fig. 16 Pre-layout block diagram

The circuits shown here were all drawn with the help of the Adobe Photoshop software. It allowed an evaluation of the amount of routing required for the actual layout, between the main building blocks. It also helped in organizing the layout in horizontal and vertical routing connections. The above-left drawing shows the complete system without output drivers. The top-right shows a single D Flip-Flop and finally, the bottom-right shows the sum of product logic.


5.1 Clock Skew

The timing performance of the system depends, to some extent, on the location of the flip-flops relative to each other. This influences the arrival time of the clock edge, as distant electrical nodes change later than nodes closer to the clock buffer. The net effect is called clock skew and should be reduced to a minimum. Layout symmetry is an important factor in reducing this effect. It is achieved by using clock trees that have a constant propagation delay throughout the tree network. The design employs this concept of equal interconnect length to reduce the amount of skew.

Fig. 17 Clock tree

The clock skew was measured by simulation, taking timing results at the clock entry points of the four D flip-flops in the register. Note that the skew is added to the total delay, as it goes against the flow of data in the circuit, and that it is a maximum, worst-case value.


5.2 Cells Layout

The layout was done for all basic elements, which were eventually integrated within the same project. The first components laid out were the NAND2 and NAND3 cells, which are straightforward in concept and form the basis of the other components. The sum-of-products logic block, SOP3TERMS, was drawn next and made extensive use of the NAND2 and NAND3 cells. Finally, the exclusive-or, the D flip-flop and the 4-bit register were drawn.

Fig. 18 Pads layout

5.3 System Testing and Simulation Results

The complete test fixture includes three signal generators that provide the clock, reset and up-down control. They are programmed to give specific up and down combinations and hence check the proper functionality of the circuit. The result is compared with an expected behavior that is known before the test starts. The following test scenarios were performed on the final circuit, with a 20 fF load connected at the output nodes:

• Test for reset
• Test for up counting
• Test for down counting
• Test for maximum frequency of operation


The results obtained included values for tr, tf, tp and Imax.

Fig. 19 Test fixture

Fig. 20 Counter result


Two cells were analyzed for their DC and AC characteristics, as required for the project.

Parameter    SOP3TERMS   NAND3
VIL (max)    2.3V        1.5V
VIH (min)    2.5V        2.3V
VOL (max)    0.14V       0.18V
VOH (min)    3.2V        2.9V
NML          2.16V       1.32V
NMH          0.7V        0.6V
tr (ps)      216         251
tf (ps)      316         340
tp (ps)      133         148
Imax (uA)    792         161
Pmax (mW)    2.61        0.53
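The derived rows of the table (noise margins and maximum power) follow from the measured voltages and currents. The sketch below assumes the usual definitions NML = VIL − VOL and NMH = VOH − VIH, and Pmax = Imax × VDD with a 3.3 V supply; the helper names are hypothetical.

```python
def noise_margins(v_il: float, v_ih: float, v_ol: float, v_oh: float) -> tuple:
    """Return (NML, NMH) in volts: NML = VIL - VOL, NMH = VOH - VIH."""
    return v_il - v_ol, v_oh - v_ih


def max_power_mw(i_max_ua: float, v_dd: float = 3.3) -> float:
    """Maximum power in mW from a peak current given in uA."""
    return i_max_ua * v_dd / 1000.0


sop_margins = noise_margins(v_il=2.3, v_ih=2.5, v_ol=0.14, v_oh=3.2)
nand3_margins = noise_margins(v_il=1.5, v_ih=2.3, v_ol=0.18, v_oh=2.9)

if __name__ == "__main__":
    print("SOP3TERMS NML/NMH:", [round(m, 2) for m in sop_margins])
    print("NAND3 NML/NMH:    ", [round(m, 2) for m in nand3_margins])
    print("Pmax (mW):", round(max_power_mw(792), 2), round(max_power_mw(161), 2))
```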

Fig. 21 NAND3 cell DC characteristics

Fig. 22 SOP3TERMS cell DC characteristics


6. PERFORMANCE EVALUATION

Since the first aim of the project was to develop a fully functional up-down counter, the primary performance criterion is correct behavior: does the counter count up and down when required? The simulation results show that it counts in both directions according to the specifications. The next objective was to develop a reliable, high-performance counter, that is, one with a higher maximum clock frequency than comparable counters while maintaining lower power dissipation. To check this property, we needed a model of a 4-bit counter to compare against.

6.1 System Specification

• Maximum Clock: 250MHz
• Power supply: 3.3V
• Power consumption: 19.5mW
• Area: 22160um2
• VIL (max): 0.61V
• VIH (min): 2.2V
• VOL (max): 0.062V
• VOH (min): 3.15V
• Noise margin low: 0.55V
• Noise margin high: 2.54V
• tr: 242.4ps
• tf: 255.3ps


7. CONCLUSION

Throughout the project, emphasis has been placed on performance and on a balance between the three main design criteria: power, speed and size. The design minimizes delay through careful adjustment of the metal interconnect on the critical path. Power is also kept low as a result of using minimum-sized transistors. The only drawback is that the design takes up a lot of silicon area. Reducing the routing area would require further layout iterations, and time did not permit such a luxury. Overall, the system performs very well and meets all the basic requirements. A speed of up to 250 MHz was obtained in simulation with loads of 20 fF on the outputs. The power dissipation at that frequency was a mere 20 mW, which more or less sums up what the project is all about.


APPENDIX A: SCHEMATIC SIMULATIONS TEST RESULTS


APPENDIX B: CELLS LAYOUT

