Post on 11-Feb-2018
7/23/2019 Arithmetic and Logic Circuits Using Sub-Threshold Pass-Transistor Logic For Ultra-Low Energy Applications
University of Southampton
Faculty of Physics and Applied Sciences
School of Electronics and Computer Science
Arithmetic and Logic Circuits Using Sub-Threshold Pass-Transistor Logic For Ultra-Low Energy Applications
By
Choudhury Md Salim Ul Haque Salmee
21st September, 2012
A dissertation submitted in partial fulfillment of the degree of
MSc Microelectronics Systems Design
By examination and dissertation
Project Supervisor: Dr Tom J. Kazmierski
Second Examiner: Dr Koushik Maharatna
C.M.S Ul Haque Salmee MSc in Microelectronics Systems Design September 2012
University of Southampton
ABSTRACT
This dissertation summarises the research and design work carried out during an MSc
project which aimed to develop practical arithmetic and logic circuits and to integrate them into an
Arithmetic Logic Unit (ALU) for energy-constrained applications. The method adopted for ultra-low power
design was sub-threshold pass-transistor logic (PTL). The project started with a wide-ranging
literature review, including research publications, focused on the performance of various
pass-transistor logic styles in terms of speed, power dissipation and area. The circuits of this project were
developed in both CMOS and PTL styles in order to provide a power comparison between the two
styles. Some of the PTL circuits designed in this project were modified in terms of transistor size
and design style in order to ensure smooth, power-efficient operation in the sub-threshold
region. Comprehensive simulations were carried out to characterise the circuits in terms of
propagation delay and power consumption. Simulations were conducted for supply voltages below
and around the threshold with different ambient temperatures and fan-outs. The results show that
implementing a complex hierarchical structure such as an ALU with sub-threshold PTL circuits is
feasible. Furthermore, comparative analysis and assessment of the results suggest that, for
sub-threshold design, PTL is more power efficient than its CMOS counterpart for large-scale circuits
such as an ALU. Measurements of the 8-bit ALU structure show that, under worst-case simulation
conditions such as a high sub-threshold supply and extreme temperature, the PTL version consumed
153.15 pW of dynamic power whereas the CMOS version consumed 314.21 pW, about twice as
much. The maximum power consumption of such a design is restricted to a few
hundred picowatts, which ensures the ultra-low power design of a system. However, the
power efficiency of PTL is gained at the cost of circuit performance. Despite that, such a system is
beneficial for numerous applications for which power is a scarce resource and performance is not
the primary concern.
ACKNOWLEDGEMENTS
First of all, I would like to take the opportunity to express my sincerest gratitude to my
project supervisor, Dr Tom J. Kazmierski, for allowing me to do a project which is related to one of
the leading research topics in the field of digital design and also to my professional interest,
low power design. His astute supervision and guidance helped to make this project
possible. Furthermore, his knowledge of the topic and continuous support during the project
encouraged me to conduct in-depth research.
I am grateful to Dr Koushik Maharatna, the project second examiner, who pointed out some
facts about the project that guided me to revise my work with more accuracy. I am also grateful to
all the lecturers of the modules that I took during my MSc study. In particular, I would like to mention Dr
Koushik Maharatna and Iain McNally, whose lectures and laboratory sessions were very helpful in
conducting my research properly. I would also like to thank Mr Robert Rudolf, a full-time research
graduate from the Electronics and Electrical group, for his assistance with Cadence simulation.
I would like to acknowledge the library facilities provided by the University of Southampton.
I am also thankful to the ECS School for providing computer access with state-of-the-art EDA tools and
scientific publications.
LIST OF CONTENTS
ABSTRACT
ACKNOWLEDGEMENTS
CHAPTER 1 INTRODUCTION
1.1. Motivation
1.2. The Project
1.3. Results and Benefits
CHAPTER 2 BACKGROUND AND PREVIOUS WORK
2.1. Energy Constraint Applications
2.1.1. Micro-sensor Network and Nodes
2.1.2. Radio Frequency Identification
2.1.3. Low Power Digital Signal Processor and Microcontroller Unit
2.1.4. Energy Harvester
2.2. Sub-Threshold Operations of MOSFET and CMOS Logic Gates
2.2.1. Strong Inversion
2.2.2. Weak Inversion
2.2.3. Static CMOS Inverter in Sub-Threshold Operation
2.2.4. Application, Advantages and Demerits of Sub-Threshold Logic
2.3. Pass Transistor Logic (PTL)
2.3.1. Basic Operations Principle
2.3.2. Complementary Pass-Transistor Logic (CPL)
2.3.3. Dual Pass-Transistor Logic (DPL)
2.3.4. LEAP and Other PTL Styles
2.3.5. Merits and Demerits of PTL
2.4. Sub-Threshold Pass-Transistor Logic
2.5. Basic Circuits
2.5.1. PTL Logic Circuits
2.5.2. CMOS Logic Circuits
2.6. Arithmetic Logic Unit (ALU)
2.6.1. ALU Design
2.6.1.1. Tree Structure
2.6.1.2. Chain Structure
CHAPTER 3 BASIC CIRCUITS DESIGN AND CHARACTERISATION
3.1. Design
3.1.1. PTL Circuit
3.1.2. CMOS Circuits
3.2. Characterisation
3.2.1. Propagation Delay Measurement
3.2.2. Power Consumption Measurement
3.3. Presentation of Results
3.3.1. Propagation Delay
3.3.2. Power Consumption
3.4. Result Analysis
CHAPTER 4 ARITHMETIC LOGIC UNIT DESIGN, POWER MEASUREMENTS AND RESULTS ANALYSIS
4.1. ALU Design
4.1.1. 1-Bit PTL Design
4.1.2. 1-Bit CMOS Design
4.1.3. 8-Bit PTL Design
4.1.4. 8-Bit CMOS Design
4.2. Power Consumption Measurements and Results
4.2.1. Simulation Setup
4.2.2. Results
4.3. Result Analysis
CHAPTER 5 CONCLUSION AND FUTURE WORK
APPENDICES
Appendix A - Project Gantt Chart
Appendix B - Design Files
Appendix C - Detailed Simulation Data
REFERENCES
CHAPTER 1 INTRODUCTION
Power consumption is a major concern for integrated electronic circuits and devices. It
influences the design and fabrication of such circuits and systems in two respects. Firstly, power
dissipates in the form of heat, which affects the performance of a chip. It also requires special cooling
and packaging, which are expensive. Secondly, the increasing number of mobile systems and
energy-constrained applications, such as energy harvesters, micro-sensor nodes and self-powered radio
frequency identification (RFID), requires low power consumption to maximise battery life.
Therefore, there has been on-going research at multiple levels of system design, such as the behavioural,
architecture, logic and technology levels.
1.1. Motivation
The previous projects [1] and [2] on sub-threshold pass-transistor logic provided a solid
assessment, based on basic logic circuits and adder circuits, that sub-threshold PTL circuits are more
energy efficient than their CMOS counterparts, with circuit propagation delay being traded off against low
power consumption. This research project concentrates on developing more complex and practical
arithmetic and logic circuits based on sub-threshold PTL with a view to minimizing the energy
consumption of a digital circuit system (processor) for ultra-low energy applications.
For energy-constrained applications, standard practice is to use a conventional
microcontroller. These microcontrollers offer modern, multipurpose functionality
and can operate at clock frequencies of tens to hundreds of megahertz. Along with multiple
general purpose input-output terminals, they also have very precise, high-speed
ADCs. These features and their flexibility of use explain the popularity of such
microcontrollers in a wide range of applications. The energy consumption of these microcontrollers is
not a serious issue for typical household, industrial or automotive applications. However, for
energy-constrained applications where power is a scarce resource, this power consumption is a significant
factor.
The project seeks more energy-efficient, cohesive circuits for designing the
building blocks of a conventional processor at the transistor level. There were three aspects of
the research. Firstly, the research focused on PTL circuits rather than CMOS logic circuits, since many
publications [3], [4], [5], [6] and [7] concluded that PTL has lower leakage and requires
fewer transistors than CMOS logic. The second aspect is the use of transistors in the
sub-threshold region as a method for low power consumption [8], [9], [10] and [11]. A transistor operating
in the sub-threshold region consumes a very small amount of energy, but at the cost of circuit
performance in terms of speed [11]. However, for the aforementioned energy-constrained
applications, performance is of secondary importance and the primary concern is power consumption. Chapter 2
includes the details of the sub-threshold operation of CMOS logic and of prominent
energy-constrained applications. Lastly, the study includes energy-efficient structural methods for complex
circuits [29], [30] and [31].
The research is motivated by the previous project [2], which shows that PTL
circuits are more energy efficient than CMOS logic and that PTL can operate at sub-threshold voltages.
Moreover, other studies [4], [6] and [7] proposed that PTL can be operated at sub-threshold
voltages. However, the project [2] validates sub-threshold PTL only for a limited number of basic logic
circuits and relatively small hierarchical structures. The positive outcome of [2] could lead
towards building larger circuit blocks and hierarchical structures and ultimately to the development
of an ultra-low power digital system (processor). If successful, this can be advantageous for
energy-constrained applications in two ways. Firstly, it will make the design simpler, with a smaller number of
circuits and devices. Secondly, energy consumption will be more efficient, which can ensure ultra-low
power consumption of the system. To the knowledge of the author, apart from the previous projects
[1] and [2], there have been very few studies and publications on
sub-threshold PTL.
1.2. The Project
The whole design project was conducted with the Cadence AMS 0.35 µm process design kit (PDK). The
MOSFET transistors used in this project are taken from the PDK's built-in library, where the
transistors are fully characterised for all three regions of operation, including sub-threshold.
Therefore, the simulation results are expected to be valid and accurate.
The project started with developing a comprehensive collection of PTL and CMOS basic
circuits for large-scale design structures. A total of 9 basic logic circuits were added to the
existing set of PTL and CMOS circuits from [2]. Circuits were chosen and designed carefully in
order to develop efficient, hierarchical structures for the 1-bit and 8-bit ALUs. The circuits were
thoroughly characterised in terms of propagation delay and power consumption for different
fan-outs, ambient temperatures and sub-threshold supply voltages. The characterisations were carried
out for all the PTL circuits but for only two CMOS circuits, because the project goal was to
develop more advanced and larger PTL circuits and to avoid repeating the previous
project's work on CMOS circuits. Based on the basic circuits, 4 versions of a 1-bit PTL ALU with different
styles and functionality were developed and characterised for power consumption. Development of
the latest version was encouraged by the successful implementation of the earlier ones. The design of the
8-bit PTL ALU was based on the latest version of the 1-bit ALU, as explained in chapter 4. The 8-bit
ALU was designed in both PTL and CMOS logic, and the two structures were compared for power
consumption at different temperatures and supply voltages. A total of 7 PTL hierarchical circuits and
4 CMOS hierarchical logic circuits were created during the ALU design process. An additional 53 PTL test
circuits and 25 CMOS test circuits were designed for simulation purposes. In total, more than
2500 simulations were executed for design and characterization during the course of the project.
This dissertation describes all the research and project work carried out
during the course of this project. The project started with a wide-ranging literature review and a
study of the previous project [2], which are included in chapter 2. The literature review covers the
sub-threshold operation of the MOSFET and the CMOS inverter, and the applications, benefits and disadvantages of
sub-threshold operation. It also summarises the major contemporary research findings on
sub-threshold design. The review continues with different PTL design styles and their advantages and
disadvantages. A brief section in this chapter covers sub-threshold PTL operation. It also
contains a review of the basic PTL and CMOS circuits from [2]. Different design methods for the ALU
were also part of the literature review.
Chapter 3 includes the design work on the extended set of PTL and CMOS basic circuits, with
brief descriptions of functionality and features. All the PTL circuits and two CMOS circuits were
characterised under different simulation conditions, comprising different supply voltages and
temperatures for different fan-out circuits. The results of the characterisation, namely the propagation delay and
power consumption (static and dynamic) of the PTL circuits, are presented with explanations.
The paper continues with practical design work for 1-bit and 8-bit ALU in chapter 4. It
provides design details of 1-bit ALU and power comparison between different versions of 1-bit ALU
with explanations for the best possible version, selected for 8-bit hierarchical ALU design. Along with
the detailed design architecture, the chapter presents power comparison of the 8-bit ALU in PTL and
CMOS structure and concludes with result analysis.
The paper finishes with project outcome and suggestions on prospective future work which
is in chapter 5. A Gantt chart with detailed timing of project progress and development is included in
Appendix A. Appendix B lists all the design files and includes the Cadence design files.
Appendix C provides detailed simulation data. Both Appendix B and Appendix C are available in the
submitted zip file.
1.3. Results and Benefits
The results show the successful implementation of sub-threshold PTL in a complex,
hierarchical design such as an ALU. As mentioned earlier, the previous projects [1] and [2] validated this
method on basic logic circuits only, and no other research has provided a solid assessment of the
practical feasibility of using PTL in the sub-threshold region. Moreover, the achievements of this project, along with
[2], directly contradict the suggestion of other research [12] that sub-threshold PTL is unfeasible in
principle.
The ALU developed in this project is one of the major building blocks of a processor.
Developing a complete processor requires a lot of research work and in-depth analysis, which was beyond the scope of this
project due to its specific goal and the time constraint of the MSc degree. The successful implementation of
this method will be an essential development in terms of power consumption for ultra-low energy
applications. The challenging part is the effective implementation of sub-threshold PTL for the other major
building blocks in order to develop an ultra-low power digital system (processor), which
demands a significant amount of research and design work.
CHAPTER 2 BACKGROUND AND PREVIOUS WORK
2.1. Energy Constraint Applications
The following section briefly describes prominent contemporary
applications that can benefit from ultra-low power design.
2.1.1. Micro-sensor Network and Nodes
A micro-sensor node is a node in a micro-sensor network with sensing, computing and
communication functionality. Typically, tens of thousands of spatially distributed micro-sensor nodes
constitute a wireless micro-sensor network for sensing, processing and relaying information to
the end user [11]. There has been much on-going research on the practical implementation of
such networks, and the main proposed applications are health monitoring, automotive sensing, and
habitat and structural monitoring [11]. The performance requirements for these applications are very
low; for example, in health monitoring the rate of change of the measured data is on the order of a few
seconds to a minute [11]. The battery lifetime required for a micro-sensor network is very long, since it
is impractical to change the batteries of such nodes frequently. Therefore, low performance and
long battery life requirements make the micro-sensor network a perfect candidate for ultra-low energy
technology.
2.1.2. Radio Frequency Identification
A radio frequency identification (RFID) system is used to track and identify an object by means
of an RFID tag attached to the object [11]. RFID tags use radio frequency to communicate with the
end user. These tags have been in use for many years, and their flexibility has spawned many
applications such as medical implants, tracking of automobiles, pharmaceutical goods, livestock and
pets, smart credit cards and smart keys for automobiles. An RFID tag usually has an antenna and other
communication circuits [11]. The functionality of an RFID tag requires very simple logic processing
[11]. An active RFID tag transmits signals to the reader using energy from its battery. Energy saved
from the battery could allow extended processing. Moreover, a low-power design leaves more energy
for communication, and hence the communication distance could be longer. On the other hand, a
passive tag operates on, and is most often energized by, the electromagnetic signal it receives from
the reader. As a result, a passive tag is smaller in size and does not depend on a battery.
Minimizing the digital processing power would reduce the transmission power required from the reader
and make the communication distance longer.
2.1.3. Low Power Digital Signal Processor and Microcontroller Unit
Portable applications have successfully used the Texas Instruments (TI) C5xx family of digital
signal processors and the TI MSP430 microcontroller unit for metering, measurement and
instrumentation purposes [11]. Modern portable devices, such as mobile phones and PDAs,
require a wide dynamic range of power consumption and performance. Such applications require a
high-performance digital signal processor or microcontroller unit during active mode. In standby
mode, they need only limited processing and low power consumption in order to extend battery
life. In a variety of applications, however, devices must be optimized for power consumption
in both active and standby modes.
2.1.4. Energy Harvester
Energy harvesting is a source of energy for small, autonomous wireless electronic devices
such as wireless sensor networks [13]. In this process, energy from external sources such as
thermal, solar, wind and kinetic energy is converted into electrical energy for circuits. A wide range of low power
applications can benefit from the energy harvesting process, provided that there is an abundant
energy source and a sufficient amount of energy can be derived from the source for the required
operations [13]. Figure 1 shows a block diagram of a typical self-powered wireless sensor node using a
piezoelectric vibrating energy harvester [13]. The system includes a microcontroller unit (MCU) with an
integrated antenna for transmission and sensors for collecting information from the environment.
The supply voltage required for the MCU is 3.3 V.
Figure 1 a) Block Diagram of a Self-Powered Smart Sensor Node with Energy Harvesting Method b)
Different Node Voltages with Time (Adapted from [13] and reprinted (b) from [2])
The energy derived from the harvester is rectified and fed to a supercapacitor with a nominal
capacitance of millifarads to tens of farads [13]. It takes hours to charge the capacitor to 1-1.2 V
(figure 1b), which allows the voltage regulator to start. To reach a fully functional energy level for the
system, it takes more than 26 hours of energy harvesting. Moreover, the voltage regulator requires a
cold start circuit [13] for successful operation of the system. All these factors are disadvantageous,
since the system consumes time and energy and also requires additional components, which
implies higher cost.
Therefore, the energy harvesting process is a prime candidate for ultra-low power design
which can ensure low power consumption with relatively faster operation time and also make the
design simpler hence cost effective.
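As a rough illustration of why charging the storage capacitor takes hours, the stored energy E = ½CV² can be divided by the harvested power. The sketch below assumes a constant harvested power and no conversion losses. The function names, the 1 F capacitance and the 50 µW harvested power are hypothetical values chosen for illustration and are not taken from [13].

```python
def capacitor_energy(capacitance_f: float, voltage_v: float) -> float:
    """Energy in joules stored in a capacitor charged to voltage_v: E = 0.5 * C * V^2."""
    return 0.5 * capacitance_f * voltage_v ** 2


def charge_time_hours(capacitance_f: float, target_v: float, harvest_power_w: float) -> float:
    """Idealised charging time, assuming constant harvested power and no losses."""
    return capacitor_energy(capacitance_f, target_v) / harvest_power_w / 3600.0


if __name__ == "__main__":
    # Hypothetical example values (assumptions, not measurements from [13]).
    c_farads = 1.0
    v_start = 1.2          # upper end of the 1-1.2 V regulator start-up range
    p_harvest = 50e-6      # 50 microwatts of harvested power
    print(f"Stored energy: {capacitor_energy(c_farads, v_start):.2f} J")
    print(f"Charge time:   {charge_time_hours(c_farads, v_start, p_harvest):.1f} h")
```

Under these assumptions, lowering the energy the digital logic needs shortens the harvesting time in direct proportion, which is the argument made above for ultra-low power design.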
2.2. Sub-Threshold Operations of MOSFET and CMOS Logic Gates
2.2.1. Strong Inversion
The requirement for the normal operation of a MOSFET is that the gate voltage is
greater than the device threshold voltage [14]. This region of operation is referred to as
strong inversion [14].
VGS > VT, strong inversion requirement (1)
There are two regions of operation in strong inversion: the triode region and the saturation region. Both
regions of operation are controlled by the bias voltages of the device. For an nMOS transistor,
expressions (2) and (3) give the conditions for triode and saturation region operation respectively.
$V_{DS} < V_{GS} - V_T$ (2)    $V_{DS} \geq V_{GS} - V_T$ (3)
[Figure 2 plot: IDS against VDS from 0 to 1.8 V, with the triode and saturation regions marked and four curves labelled 1.8 V, 1.5 V, 1.2 V and 0.9 V.]
Figure 2 Current Voltage Characteristics of an Ideal NMOS transistor [14]
In triode (linear) region, the device behaves like a linear resistor whose value is controlled by
VGS [14]. In saturation, the device current reaches a maximum value and the device is said to be
pinched off [14].
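The conditions in expressions (1) to (3) can be summarised in a short sketch. The function name and the choice to treat VGS equal to VT as below threshold are assumptions made for illustration; the 0.57 V threshold used in the example is the approximate nMOS value quoted in section 2.2.2.

```python
def mosfet_region(vgs: float, vds: float, vt: float) -> str:
    """Classify the operating region of an ideal nMOS transistor.

    vgs, vds and vt are in volts. A gate voltage at or below the threshold
    is reported as "below threshold" (an assumption for the equality case).
    """
    if vgs <= vt:
        return "below threshold"   # expression (1) is not satisfied
    if vds < vgs - vt:
        return "triode"            # expression (2)
    return "saturation"            # expression (3)


if __name__ == "__main__":
    vt = 0.57  # approximate threshold voltage quoted in the text
    for vgs, vds in [(1.8, 0.3), (1.8, 1.5), (0.3, 0.3)]:
        print(f"VGS={vgs} V, VDS={vds} V -> {mosfet_region(vgs, vds, vt)}")
```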
2.2.2. Weak Inversion
A MOSFET is said to be in the cut-off region for gate voltages less than the device threshold
voltage. In theory, there is no current flow. In practice, however, a weak inversion layer exists,
which allows carriers to diffuse through the channel [11]. The resulting device current IDS
exhibits an exponential dependence on VGS [15]. This region of operation is called the sub-threshold
regime.
VGS < VT, weak inversion requirement (4)
The sub-threshold current is mainly contributed by diffusion current [11]. Expression (5) represents
the basic equation for sub-threshold current.
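$I_{DS} = I_o \exp\left(\frac{V_{GS} - V_T}{n V_{th}}\right)$ (5) [11]
$I_o = \mu_o C_{ox} \frac{W}{L} (n - 1) V_{th}^2$, drain current at $V_{GS} = V_T$ [11]
$n = 1 + \frac{C_d}{C_{ox}}$ (6) [11]
Expression (5) shows that the sub-threshold current depends strongly on the thermal
voltage $V_{th} = kT/q$. It also depends exponentially on VGS. Expression (6) shows the sub-threshold
slope factor n, which depends on the device capacitances.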
Figure 3 shows the drain current IDS of an nMOS transistor for different gate voltages VGS below
the threshold voltage (approximately 0.57 V). It
indicates that the nMOS transistor can operate in the sub-threshold region [2].
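The exponential dependence described above can be explored numerically. The following is a minimal sketch that assumes the commonly used sub-threshold model in which IDS = Io·exp((VGS − VT)/(n·Vth)) with thermal voltage Vth = kT/q. Only the 0.57 V threshold comes from the text; the slope factor n = 1.5, the current Io = 100 nA at VGS = VT and all function names are hypothetical illustration values, not parameters of the AMS process.

```python
import math

BOLTZMANN = 1.380649e-23           # J/K
ELECTRON_CHARGE = 1.602176634e-19  # C


def thermal_voltage(temp_kelvin: float = 300.0) -> float:
    """Thermal voltage kT/q in volts."""
    return BOLTZMANN * temp_kelvin / ELECTRON_CHARGE


def subthreshold_current(vgs: float, vt: float = 0.57, n: float = 1.5,
                         i0: float = 1e-7, temp_kelvin: float = 300.0) -> float:
    """Drain current in amperes from the assumed exponential model.

    n and i0 are hypothetical values; i0 is the current at vgs == vt.
    """
    return i0 * math.exp((vgs - vt) / (n * thermal_voltage(temp_kelvin)))


def subthreshold_swing(n: float = 1.5, temp_kelvin: float = 300.0) -> float:
    """Gate-voltage change in volts that changes the current by one decade."""
    return n * thermal_voltage(temp_kelvin) * math.log(10)


if __name__ == "__main__":
    swing = subthreshold_swing()
    print(f"Swing: {swing * 1e3:.1f} mV per decade")
    for vgs in (0.5, 0.4, 0.3):
        print(f"VGS = {vgs:.1f} V -> IDS = {subthreshold_current(vgs):.3e} A")
```

With these assumed values the swing is roughly 89 mV per decade at 300 K, so in this model a drop in gate voltage of less than 100 mV reduces the current tenfold.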
Figure 3: VGS versus IDS for nMOS Transistor at VDS = 0.5 V in 0.35 µm AMS Technology
(Adapted from [2])
2.2.3. Static CMOS Inverter in Sub-Threshold Operation
An inverter in sub-threshold mode requires the supply voltage VDD to be less than the threshold
voltage VT. This ensures weak inversion operation for both the NMOS and the PMOS transistor of the
inverter, and keeps the input logic 1 level below VT [9], [11]. These conditions allow the
CMOS inverter to be implemented successfully in the sub-threshold region.
[Figure 4 schematic: CMOS inverter with PMOS and NMOS transistors, input Vin, output Vout, supplied from VDD < VT.]
Figure 4: CMOS Inverter in Sub-Threshold Operation [11]
Although the sub-threshold inverter implementation is feasible, many researchers have expressed
concern about the delay of such logic gates [16], [17], [18] and [19]. The propagation delay of a
symmetric inverter for VDD < VT is stated in expression (7), from which it can be seen that the delay
depends strongly on VDD and rises as VDD is reduced [11]. In contrast, the dependence on VDD
of the speed (tpd) of a normal inverter (8) is comparatively insignificant. Figure 5 shows the
normalised speed of an inverter for different supply voltages. In the sub-threshold region, the speed
decreases by a factor of 6 per 100 mV [11].
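The quoted rate of 6 times per 100 mV can be turned into a small worked calculation. The sketch below assumes that the factor applies uniformly across the sub-threshold region and that speed scales with the exponential sub-threshold current; the function names are hypothetical and the implied slope factor is an illustration, not a measured device parameter.

```python
import math


def relative_speed(delta_v_mv, factor_per_100mv=6.0):
    """Speed relative to the starting point after lowering VDD by delta_v_mv (in mV)."""
    return factor_per_100mv ** (-delta_v_mv / 100.0)


def implied_slope_factor(factor_per_100mv=6.0, temp_k=300.0):
    """Slope factor n for which exp(0.1 V / (n * Vth)) equals the quoted factor."""
    v_th = 1.380649e-23 * temp_k / 1.602176634e-19  # thermal voltage kT/q
    return 0.1 / (v_th * math.log(factor_per_100mv))


if __name__ == "__main__":
    for drop in (50, 100, 200):
        print(f"VDD lowered by {drop} mV -> relative speed {relative_speed(drop):.4f}")
    print(f"implied n at 300 K: {implied_slope_factor():.2f}")
```

Under these assumptions a 200 mV reduction in supply voltage costs a factor of 36 in speed, which illustrates why delay is treated as the main penalty of sub-threshold operation.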
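t_{pd,sub} = \frac{K C_L V_{DD}}{I_o \exp\left(\frac{V_{DD} - V_T}{n V_{th}}\right)}    (7)
t_{pd} = \frac{K C_L V_{DD}}{(V_{DD} - V_T)^{\alpha}}    (8)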
Figure 5 Relative Normalized Speed versus Voltage of a CMOS Inverter [11]
The voltage transfer characteristics (VTC, shown in figure 6) of a static CMOS inverter are
similar for both normal and sub-threshold operation [11]. This is a key fact that makes the
sub-threshold implementation of logic cells possible without any large-scale adjustment in design.
Figure 6 Voltage Transfer Characteristics of an AMS 0.35 µm CMOS inverter for VDD = 1.8 V and
0.3 V [2]
2.2.4. Application, Advantages and Demerits of Sub-Threshold Logic
The most important feature of sub-threshold design is that it can offer minimal energy
consumption in electronic circuits. Figure 3 shows that a small drop in supply voltage reduces the
current drawn by a decade [2]. However, this energy efficiency comes at the expense of
performance, in the form of a large propagation delay. Figure 7 gives a rough idea of how speed
is affected by low-power operation. Conventionally, a design is optimised at the Minimum-Delay
Operation Point (MDP). When the emphasis is on power consumption, it can only achieve the
Minimum Energy Point (MEP). In [8], Markovic states that a 10 times reduction in energy consumption
comes with a 1000 times increase in propagation delay.
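[Figure 7 plot: normalised energy versus normalised delay. The Minimum-Delay Point (MDP, Dmin) lies in the traditional operation region and the Minimum-Energy Point (MEP, Emin) in the ultralow-energy region; between them energy changes by about x10 and delay by about x1,000. Suboptimal and infeasible regions are also marked.]
Figure 7: Energy Delay Trade-off for Minimum Delay Point (MDP) and Minimum-Energy Point (MEP) [8]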
The dependence of the threshold voltage on temperature and on process is another major
concern for sub-threshold design [18]. Even a small change in temperature changes the exponentially
dependent current (5) significantly. Therefore, sub-threshold design has to accept restrictions on
a primary design parameter such as speed.
On the other hand, sub-threshold design does not require an immense amount of design effort
and is hence easier to implement. Calhoun and Wang showed in their research that, with slight
modification, a standard cell library in 0.18 µm technology can operate smoothly at sub-threshold
voltage [9] and [11]. They analysed the process corners TT, SF and FS in order to find the
lowest working voltage for each. The results show that all of these can operate at sub-threshold
voltage. However, certain cells in the FT process show unstable operation in sub-threshold.
According to the authors, this is because those cells are designed with a longer series of logic
gates and a large number of parallel transistors [8] and [25]. In [25], Calhoun and Wang suggested
resizing the transistors of the unstable cells to achieve stable sub-threshold operation.
The positive results of [9], [10] and [11] regarding stable operation of a standard
CMOS library at sub-threshold voltage are very beneficial for the design process, since modern
digital design depends on cell-library synthesis and HDL entry. It could therefore be
possible to design a VLSI integrated circuit with only minor modification of the standard design
process.
In spite of all the concerns regarding speed, temperature and process dependency, a number
of applications use the sub-threshold technique since it offers low power consumption and an easier
design process. As mentioned earlier in this chapter, portable applications such as mobile phones
and PDAs require a dynamic range of power and processing performance. Ultra-Dynamic Voltage Scaling
(UDVS) is used in such devices to keep power consumption low and extend battery life [25]. For
performance-critical operations, it allows the device to run at high voltage or at high frequency.
In sleep mode, the device runs at sub-threshold voltage to minimise power consumption.
Energy-constrained applications are another major area in which the sub-threshold technique is
explored. These applications typically do not require high performance and strive for low
power consumption. An earlier section of this chapter (section 2.1) illustrates how these
applications can benefit from low power consumption, which is the primary goal of this project.
2.3. Pass Transistor Logic (PTL)
In standard CMOS logic circuits, all input signals are applied to the gates of both the nMOS and
pMOS transistors. In static mode, the complementary transistors are either in cut-off mode
(high impedance) or in saturation mode (conducting), depending on the state of the input signals.
In pass transistor logic (PTL), however, the input signals are also connected to the drain and
source terminals of a transistor [20].
2.3.1. Basic Operations Principle
PTL is a popular alternative to conventional CMOS logic. It requires comparatively fewer
transistors than CMOS and is easier to implement. Figure 8a shows an nMOS transistor
used as a PTL AND gate. The source voltage of the transistor is VDD - VT [27] and [20]. In
practice, the supply voltage is much larger than the voltage drop caused by VT, so the output
voltage is still treated as logic 1. However, the arrangement of figure 8a is inadequate for the
AND operation because the circuit goes to a high-impedance state when the gate is at logic 0.
Therefore, another nMOS transistor is added to the design (figure 8b) [27] and [20]. The addition
of nMOS2 is essential for a static design since it ensures a low-impedance path to the supply rail
(the input rail for PTL) under all input conditions [27].
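[Figure 8 schematic: a) a single nMOS pass transistor with VA = VDD on the drain and VB = VDD on the gate, giving VY = VDD - VT at the source (Y = A.B); b) two nMOS pass transistors sharing the output Y = A.B: nMOS1 with gate B passing A (A.B) and nMOS2 with gate nB passing B (B.nB).]
Figure 8 a) Pass Transistor Operation Using Single nMOS b) AND Operation Using Two nMOS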
PTL makes the design much easier, with fewer transistors and a variety of logic operations.
Compared with 6 transistors in the CMOS implementation, it uses only 2 transistors for the AND
operation. Other logic operations are also achievable with an appropriate change of wiring.
Expression (9) shows the logic function of a PTL AND gate.
VY = VG1·VD1 + VG2·VD2 (9)
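The role of the second transistor can be illustrated with a switch-level sketch of expression (9). It assumes ideal switches (the VT drop is ignored) and represents a floating, high-impedance node as None; the function names are hypothetical.

```python
from typing import Optional


def nmos_pass(gate: int, drain: int) -> Optional[int]:
    """Ideal nMOS switch: passes the drain value when gate is 1, else floats (None)."""
    return drain if gate == 1 else None


def resolve(*drivers: Optional[int]) -> Optional[int]:
    """Resolve a shared output node; None means that no transistor drives it."""
    driven = {d for d in drivers if d is not None}
    if len(driven) > 1:
        raise ValueError("contention: two transistors drive different values")
    return driven.pop() if driven else None


def ptl_and_single(a: int, b: int) -> Optional[int]:
    """Single pass transistor (figure 8a): A on the drain, B on the gate."""
    return nmos_pass(gate=b, drain=a)


def ptl_and(a: int, b: int) -> Optional[int]:
    """Two pass transistors (figure 8b): Y = B.A + nB.B."""
    nb = 1 - b
    return resolve(nmos_pass(gate=b, drain=a), nmos_pass(gate=nb, drain=b))


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(f"A={a} B={b} single={ptl_and_single(a, b)} two={ptl_and(a, b)}")
```

In this model the single-transistor version leaves the output floating whenever B = 0, while the two-transistor version always drives a defined value and reproduces the AND truth table.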
A major concern for PTL design is the lower output voltage due to the VT drop, as mentioned
earlier. The output of a PTL gate should not be connected to the input of another gate [27]
because of the VT drop at its output (figure 9a). The degraded output ultimately becomes
insufficient to drive the next gate. When transistors are connected in series, the input signal is
degraded by the VT drops throughout the chain (figure 9b). Therefore, very long chains are not
practical.
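The voltage loss can be sketched with a first-order model in which an nMOS pass transistor cannot raise its output above its gate voltage minus VT. This is a simplification that ignores the body effect and sub-threshold conduction; VDD = 3.3 V is an assumed nominal supply, VT = 0.57 V follows the value quoted earlier, and the function names are hypothetical.

```python
def nmos_pass_voltage(v_in, v_gate, v_t):
    """First-order nMOS pass model: the output is clamped at v_gate - v_t."""
    return max(0.0, min(v_in, v_gate - v_t))


def series_chain(v_in, v_dd, v_t, stages):
    """Pass transistors in series with every gate tied to VDD."""
    v = v_in
    for _ in range(stages):
        v = nmos_pass_voltage(v, v_dd, v_t)
    return v


def gate_driven_chain(v_dd, v_t, stages):
    """Each stage passes VDD, but its gate is driven by the previous stage's output."""
    v_gate = v_dd
    for _ in range(stages):
        v_gate = nmos_pass_voltage(v_dd, v_gate, v_t)
    return v_gate


if __name__ == "__main__":
    print(f"series, gates at VDD: {series_chain(3.3, 3.3, 0.57, 3):.2f} V")
    print(f"output drives next gate: {gate_driven_chain(3.3, 0.57, 3):.2f} V")
```

In this simplified model, the cumulative loss VIN - VT1 - VT2 - VT3 appears when each degraded output drives the gate of the next transistor, whereas with all gates held at VDD the model predicts a single VT drop. Either way the logic 1 level falls below VDD, which is why level restoration is needed.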
[Figure 9 schematic: a) the output of nMOS1 (VIN - VT1) driving another gate; b) a chain of nMOS1, nMOS2 and nMOS3 with node voltages VIN - VT1, VIN - VT1 - VT2 and VIN - VT1 - VT2 - VT3.]
Figure 9 a) Pass Transistor Output Driving Another Gate b) Degradation of Voltage in Pass-Transistor Chain [27]
However, this signal degradation can be recovered by using a level-restoring buffer (figure
10). Conventionally, a CMOS inverter is used at the end of the chain to restore the signal to the
logic values 1 = VDD and 0 = 0 V. This added inverter, however, leads to static dissipation.
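[Figure 10 schematic: the two-transistor PTL AND network (nMOS1 passing A.B, nMOS2 passing B.nB) followed by a CMOS inverter supplied from VDD, giving Y = n(A.B).]
Figure 10 Level Restoration Using CMOS Inverter [27]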
An important feature of PTL that needs to be addressed is that it uses complementary signals at
its inputs. Accordingly, a number of design methods have been introduced, such as
CPL, LEAP and Dual PTL.
2.3.2. Complementary Pass-Transistor Logic (CPL)
Complementary Pass Transistor Logic (CPL) is based on true and complementary signals at
both the input and the output. The operation is based on the PTL AND gate discussed above
(figure 8b). Because of its complementary outputs, the logic is also known as differential pass
transistor logic. Figure 11 shows the AND/NAND and OR/NOR gates. They follow the same topology,
with the input signal combinations defining the type of logic operation [20]. Furthermore, an
XOR/XNOR gate can also be derived from the same topology.
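The idea that one topology yields several gates can be sketched at the logic level. The model below treats each pass-transistor pair as a two-way selector controlled by B and nB, and omits the output inverters, so each output is labelled by the function of the pass network itself. The input wiring shown is one consistent assignment and the function names are hypothetical.

```python
def cpl_pair(b, pass_when_b, pass_when_nb):
    """Two nMOS pass transistors sharing one output, gated by B and nB."""
    return pass_when_b if b == 1 else pass_when_nb


def cpl_and_nand(a, b):
    y = cpl_pair(b, a, b)            # B.A + nB.B   = A.B
    ny = cpl_pair(b, 1 - a, 1 - b)   # B.nA + nB.nB = n(A.B)
    return y, ny


def cpl_or_nor(a, b):
    y = cpl_pair(b, b, a)            # B.B + nB.A   = A + B
    ny = cpl_pair(b, 1 - b, 1 - a)   # B.nB + nB.nA = n(A + B)
    return y, ny


def cpl_xor_xnor(a, b):
    y = cpl_pair(b, 1 - a, a)        # B.nA + nB.A  = A xor B
    ny = cpl_pair(b, a, 1 - a)       # B.A + nB.nA  = n(A xor B)
    return y, ny


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, cpl_and_nand(a, b), cpl_or_nor(a, b), cpl_xor_xnor(a, b))
```

Only the signals wired to the pass inputs change between the three functions, which mirrors the statement that the input combination defines the logic operation.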
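[Figure 11 schematic: each gate uses two pass-transistor pairs controlled by B and nB, each followed by an inverter supplied from VDD. a) AND/NAND: pass inputs A and B give nY; pass inputs nA and nB give Y. b) OR/NOR: pass inputs B and A give nY; pass inputs nB and nA give Y.]
Figure 11 Pass Transistor Logic Circuits a) AND/NAND b) OR/NOR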
The main feature of CPL is that it offers a simple Full-Adder implementation. The simple design
of the XOR/XNOR gate makes it very easy to design a 2-input Full-Adder. This Full-Adder is used in
this project and a detailed discussion is included in section 2.5.1.
The first major publication on a CPL implementation appeared in 1989 [7]. The researchers from
Hitachi Research Laboratory proposed a 3.8 ns CPL multiplier (16x16) in 0.5 µm technology. It was
reportedly the fastest multiplier at the time of publication. The research concluded that,
owing to low static power dissipation and smaller circuit capacitances, CPL is more efficient in
terms of power consumption and speed. When compared with transmission-gate logic (TG), research [21]
shows a similar result in terms of speed efficiency for CPL. However, the study is based on 2-input
basic logic cells only.
2.3.3. Dual Pass-Transistor Logic (DPL)
Dual Pass-Transistor Logic (DPL) overcomes the threshold voltage drop that CPL suffers when passing
logic 1. Unlike CPL, which uses a CMOS inverter to overcome the voltage drop, DPL uses pMOS
logic in parallel with nMOS. Figure 12 shows a DPL AND/NAND gate. In this approach, the pMOS
transistor passes logic 1 without any threshold loss, while logic 0 is passed by the nMOS
transistor [20].
[Figure 12 schematic: DPL AND/NAND gate with inputs A, nA, B and nB and outputs A.B and n(A.B).]
Figure 12 AND/NAND Logic gate in DPL
As with CPL, DPL offers a very efficient Full-Adder design. Other logic gates such as
OR/NOR and XOR/XNOR can also be designed effectively. Furthermore, the circuit capacitance in
DPL is equally distributed across the outputs as well as the inputs [6] and [20]. The researchers
in [6] successfully designed a 32-bit ALU in 0.25 µm technology and reported that the
ALU is 30% faster than the CMOS version. The research also proposed a carry propagation circuit to
resolve the signal propagation issue, which is a major concern for PTL design.
2.3.4. LEAP and Other PTL Styles
Lean Integration with Pass Transistor (LEAP) was introduced in 1996 in [3]. The researchers
successfully developed a small PTL-based cell library (7 cells) together with a synthesis tool
called cell inventor. The main objective of the research was to optimise area, speed and power
in digital design. The outcome of [3] indicates that LEAP meets all of these primary
objectives. Furthermore, LEAP was more cost effective than CMOS. Along with 4 different
inverters used to meet the drive requirements, the cell library consists of 3 logic cells, Y1, Y2
and Y3 (figure 13). These 3 cells can implement basic logic functions with different numbers of
input signals as necessary. The Y3 cell is used in this project for the 4-input MUX, which is
discussed further in chapter 3. A further study [22] on the LEAP cell library focused on the
synthesis algorithm.
[Figure 13 schematic: LEAP basic cells Y1, Y2 and Y3, with signal labels A, B, C, nC and Y.]
Figure 13 Basic Cells for Logic Operation in LEAP [3]
Further research on PTL technology similarly emphasised synthesis algorithms for basic
cells [5], [23], [24] and [25]. In these projects, a complete cell library was designed using MUX
gates only. The MUX cells adopted the same circuit topology as the Y3 cell of LEAP (figure 13).
All the MUX gates were paired with different drive inverters.
2.3.5. Merits and Demerits of PTL
As mentioned earlier, the key benefit of the PTL design style is that it requires fewer
transistors than CMOS design [3], [6] and [7] and is hence easier to design. Furthermore, PTL
is comparatively power efficient in terms of both static and dynamic consumption.
Ideally, PTL designs have no direct path from the power rail to the ground rail, provided that no
inverters are used. Therefore, the current that is the main contributor to static power
dissipation is not induced. This also leads to faster operation of PTL [3] and [7]. Furthermore,
the lower number of transistors leads to reduced dynamic power dissipation. Expression (10) gives
the equation for dynamic power dissipation [14]. PTL designs have fewer switching nodes and
consequently lower node capacitance, which is why PTL has low dynamic power consumption. As
the PTL devices do not define the drive of the gates, transistor sizes are kept to a minimum, which
also leads to lower circuit capacitance and hence lower dynamic dissipation. Moreover, owing to its
reduced voltage swing, PTL requires low switching energy [27].
P_{dyn} = \alpha C_L V_{DD}^2 f, where \alpha is the switching activity factor    (10) [14]
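The effect of supply voltage on dynamic power can be shown with a short calculation. The sketch assumes that expression (10) takes the standard form P = alpha * C * VDD^2 * f; the activity factor, load capacitance and frequency below are hypothetical values used only to form a ratio.

```python
def dynamic_power(alpha, c_load, v_dd, freq):
    """Dynamic power in watts, assuming P = alpha * C * VDD^2 * f."""
    return alpha * c_load * v_dd ** 2 * freq


if __name__ == "__main__":
    # Illustrative values only: 10% activity, 10 fF load, 1 MHz clock.
    p_nominal = dynamic_power(alpha=0.1, c_load=10e-15, v_dd=1.8, freq=1e6)
    p_subvt = dynamic_power(alpha=0.1, c_load=10e-15, v_dd=0.3, freq=1e6)
    print(f"P at 1.8 V: {p_nominal:.3e} W")
    print(f"P at 0.3 V: {p_subvt:.3e} W")
    print(f"reduction: {p_nominal / p_subvt:.0f}x")
```

At the same clock frequency, moving from 1.8 V to 0.3 V reduces dynamic power by the square of the voltage ratio. In practice the achievable frequency also falls in sub-threshold, so this ratio isolates the voltage term only.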
However, PTL design styles require major modification of the process technology, which increases
the cost of fabrication, since most of the aforementioned research uses specific low
threshold voltage MOS devices [21]. Zimmermann [26] observed that previous
work on PTL focused on developing Full-Adders only, which are relatively easy to design in CPL or
DPL compared with the least efficient CMOS approach. Furthermore, the design topology of PTL
requires immense design effort, and the layout of such designs is complicated as well. However, the
conclusions of [26] are based on a variety of digital applications in CMOS and do not completely
cancel out the merits of PTL design.
2.4. Sub-Threshold Pass-Transistor Logic
A considerable amount of research has been conducted on sub-threshold voltage implementation and
pass-transistor logic separately, optimising parameters such as speed, power
consumption and area. However, only a limited amount of research discusses
combining the two techniques. Most of it concentrates on circuit performance in terms
of speed for different design techniques. In [16], Moalemi and Afzali-Kusha examined the
dependence of propagation delay on temperature for different sub-threshold PTL designs. Speed is a
major concern for such sub-threshold designs. However, the result of [16] is not comprehensive
since it investigated only XOR gates. Moreover, the research ignored the resistive component of
input capacitance for a series chain of pass-transistors and carried out the tests with ideal load
capacitors only.
Other research has focused on sub-threshold PTL from the perspective of reducing power
consumption. In [19], the researchers analysed Dynamic Threshold MOS (DTMOS) devices. The gate
terminal of such a device is shorted to the body (figure 14). This connection allows the threshold
voltage to change depending on the gate voltage. In this method, however, the threshold voltage
changes along with the supply voltage, and hence the approach cannot be categorised as
sub-threshold design. Furthermore, each DTMOS device requires its body to be isolated, which adds
design complexity.
[Figure 14 schematic: DTnMOS and DTpMOS devices.]
Figure 14 DTnMOS and DTpMOS Circuit in DTMOS Mode
As mentioned earlier, many researchers have found different types of PTL design to be
more energy efficient than CMOS design. Moreover, sub-threshold implementation is capable of
optimising a design for minimal power consumption. Combining these two techniques
points to a substantially more power-efficient design at the cost of speed. Therefore, sub-threshold
PTL design could be greatly beneficial for self-powered, energy-constrained applications where
power is a scarce resource and performance is not the main concern.
2.5. Basic Circuits
The previous project [2] developed a hierarchical Accumulator-Adder and compared the
power consumption of the PTL and CMOS designs. For this purpose, it created a total of 6 basic PTL
circuits and another 5 CMOS circuits. The following sections describe the design details and
features of each of the basic circuits from [2].
2.5.1. PTL Logic Circuits
AND/NAND, OR/NOR and XOR/XNOR
The design of these basic circuits is based on the CPL method discussed earlier in section 2.3.2.
All the circuits use the same circuit topology (figure 15). It is the input combinations that
determine the function of each circuit. Because of the differential design, the circuits have
complementary inputs and outputs. This eliminates the need for additional inverters, which are often
required in static CMOS design. Moreover, the design of the XOR and XNOR gate has only 4
transistors, which makes it very simple compared with its CMOS counterpart. Each design has a
level-restoring inverter for recovering the voltage level of logic 1 to VDD. The transistor sizes
of the inverter are selected such that they provide balanced minimum delay while still providing
sufficient drive [14]. The sizes of the pass transistors are kept to a minimum since they do not
define the drive of the gate. This also minimises the circuit capacitance, which in turn reduces
dynamic power consumption [14].
[Figure 15 schematic: CPL-style circuits for a) AND and NAND, b) OR and NOR, c) XOR and XNOR. All pass transistors are W = 0.4 µm, L = 0.35 µm; each level-restoring inverter uses transistors of W = 3.3 µm and W = 0.85 µm with L = 0.35 µm.]
Figure 15 PTL Basic circuits a) AND, NAND b) OR and NOR and c) XOR and XNOR
D-Type Flip Flop
The design of the D-type flip flop in [2] is based on the version proposed by Hsiao in [4], with a
slight modification to the circuit components. Figure 16 shows the flip flop designed in [2].
[Figure 16 schematic: resettable D-type flip flop with inputs D, Clock, nClock and nReset and outputs Q and nQ. Transistor widths shown are 0.4 µm, 1 µm, 3.3 µm and 0.85 µm, all with L = 0.35 µm.]
Figure 16 Resettable D-Type Flip Flop Based on PTL
The designer in [2] modified the original design so that the flip flop could be used at
sub-threshold voltage. The feedback pMOS transistor, used in [4] to improve inverter performance and
increase speed, was removed. According to the author of [2], the pMOS transistor caused the inverter
to be in permanent pull-up mode in sub-threshold, so the circuit was not operational. Moreover,
the circuit in [4] has pMOS and nMOS clock transistors. The author of [2] observed that the pMOS
transistor caused significant delay in the circuit, resulting in incorrect, non-synchronous
operation. Therefore, the pMOS clock transistor was replaced by an nMOS transistor, which enables
the edge triggering of the flip flop. Moreover, the author states that the whole project was
inspired by the nMOS pass-transistor [2]. The transistor sizes in the flip flop are the same as in
the other basic logic circuits.
2-Input Multiplexer
The 2-input multiplexer is a very simple circuit consisting of 2 nMOS transistors. This is the
most commonly used multiplexer in PTL, especially in CPL and LEAP. The inputs of the circuit
are controlled by the complementary control signals Load and nLoad. This multiplexer is used with
the D-type flip flop to design a load register (figure 17). According to the author, no
level-restoring inverter is used with the multiplexer because it is loaded only by the small
capacitance of the D-type flip flop.
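[Figure 17 schematic: MUX2 built from two nMOS pass transistors (W = 0.4 µm, L = 0.35 µm), controlled by Load and nLoad and selecting between D and Q, feeding the D input of the DTYPE flip flop (Clock, nClock, nReset; outputs Q and nQ).]
Figure 17 PTL Load Register using 2-Input MUX and D-Type Flip Flop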
Load Register
As mentioned earlier, the register is designed by connecting the 2-input multiplexer
to the D-type flip flop, as shown in figure 17. When the Load signal is enabled (logic 1), the
register is updated with the value of the input signal D; otherwise it retains its previous value.
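The load/hold behaviour described above can be captured in a small behavioural model. This is a sketch of the logical function only, not of the transistor-level circuit; it assumes an active-low reset that is evaluated at the clock edge, which is a simplification, and the class and method names are hypothetical.

```python
class LoadRegister:
    """Behavioural model: a 2-input MUX selecting D or Q, followed by a D-type flip flop."""

    def __init__(self):
        self.q = 0

    @property
    def nq(self):
        """Complementary output."""
        return 1 - self.q

    def clock_edge(self, d, load, n_reset=1):
        """Apply one active clock edge and return the new Q."""
        if n_reset == 0:
            self.q = 0      # assumed: active-low reset clears the register
        elif load == 1:
            self.q = d      # MUX passes the new input D
        # otherwise the MUX feeds Q back, so the stored value is held
        return self.q


if __name__ == "__main__":
    reg = LoadRegister()
    print(reg.clock_edge(d=1, load=1))  # load a 1
    print(reg.clock_edge(d=0, load=0))  # hold the previous value
    print(reg.clock_edge(d=0, load=1))  # load a 0
```

The hold case corresponds to the multiplexer routing Q back to the flip flop input, which is why the register needs no separate enable on the clock.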
Full-Adder
The PTL Full-Adder is one of the major benefits of PTL-based design because it is easy to design
and functions effectively. Figure 18 shows a classic Full-Adder circuit based on the PTL
AND/NAND, OR/NOR and XOR/XNOR circuits. It has appeared in a number of publications [26], [20] and
[7] and was analysed successfully. Moreover, the publications concluded that this PTL version is
faster and more energy efficient than any CMOS version. With all input signals
being differential, the Full-Adder can provide complementary outputs for the sum signal S and
carry-out signal Cout.
[Figure 18 schematic: pass-transistor networks with inputs A, nA, B, nB, Cin and nCin (all pass transistors W = 0.4 µm, L = 0.35 µm) driving four output inverters (Wn = 3.3 µm, Wp = 1.85 µm) that produce S, nS, Cout and nCout.]
Figure 18 PTL Based Full-Adder
2.5.2. CMOS Logic Circuits
AND Gate
Figure 19 shows the classic CMOS logic circuit for a 2-input AND gate. The AND gate includes
a classic CMOS inverter. The transistor sizes of the inverter are kept the same as those used in
the PTL design. This allows the CMOS structures to be compared with their PTL counterparts under
realistic conditions. In fact, all the CMOS circuits except the D-type flip flop have the same
nMOS and pMOS transistor sizes as the inverter. This is because a pMOS to nMOS ratio from 1.4 to 2
is proven to provide minimum delay and sufficient drive [14].
[Figure 19 schematic: 2-input gate stage with inputs A and B (transistor widths 3.3 µm and 1.85 µm, L = 0.35 µm) followed by an inverter (widths 3.3 µm and 0.85 µm, L = 0.35 µm) producing Y.]
Figure 19 Classic CMOS 2-Input Logic Circuit for AND Gate
D-Type Flip Flop
The CMOS version of the D-type flip flop is shown in figure 20. The circuit has a reset input
signal (nReset) and is triggered at the rising edge of the clock, like its PTL counterpart. The
circuit consists of six NAND gates: three 2-input gates, two 3-input gates and one 4-input gate.
The design of the flip flop is an optimised form of a typical Master-Slave circuit [28]. Although
the input signal nD is inverting, the output signal is differential. In a typical design approach,
the pMOS transistor is larger than the nMOS transistor. This sizing, however, was not operational
in sub-threshold, since the circuit did not respond to the positive edge of the Clock signal, as
reported in [2]. The researchers in [9] also reported a similar problem at sub-threshold voltage and
suggested resizing the flip flop with nMOS transistors larger than the pMOS ones.
Therefore, the transistors were resized as shown in figure 20 (Wp = 1.85 µm and Wn = 3.3 µm) and
the flip flop was observed to operate at the positive edge of the Clock signal [2].
[Figure 20 schematic: six-NAND D-type flip flop with inputs nD, nReset and Clock and outputs Q and nQ; every gate uses Wp = 1.85 µm and Wn = 3.3 µm.]
Figure 20 CMOS D-type Flip Flop with Master-Slave Configuration [28]
2-Input Inverting Multiplexer
The two-input multiplexer circuit in CMOS design is shown in figure 21a. The input signal
nLoad and the output signal nD are inverting, which compensates for the inverting input of the
D-type flip flop. This inverting output, however, removes the need for the additional inverter
that would otherwise be required at the output for circuit operation.
[Figure 21 schematic: a) inverting 2-input multiplexer with data inputs D and Q, control inputs Load and nLoad and output nD; transistor widths 1.85 µm and 3.3 µm, plus an inverter with widths 3.3 µm and 0.85 µm, all with L = 0.35 µm. b) Circuit symbol with inputs Q, D and Load and output nD.]
Figure 21 a) 2-Input CMOS Multiplexer Circuit with Inverting Output b) Circuit Symbol [28]
Load Register
The design of the CMOS load register is similar to the PTL version, with a slight modification.
Figure 22 shows that the load register uses the inverting multiplexer in order to compensate for
the inverting input of the modified D-type flip flop, as mentioned earlier. The operation of the
register is similar, with the input signal D being stored at the positive edge of the Clock signal.
[Figure 22 block diagram: MUX2 (inputs D, Q and Load; output nD) feeding the DTYPE flip flop (inputs nD, Clock and nReset; outputs Q and nQ).]
Figure 22 CMOS Load Register
Full-Adder
The Full-Adder circuit shown in figure 23 is a classic CMOS design. Although it
requires a total of 28 transistors, it is the most optimised version in terms of performance and
the number of transistors required [14], [20] and [26]. Transistor size ratios are kept the same as
in the basic logic circuits, namely 3.3 µm/0.35 µm for pMOS and 1.8 µm/0.35 µm for nMOS.
[Figure 23 schematic: CMOS Full-Adder with inputs A, B and Cin and outputs S and Cout. Transistor widths shown are 1.85 µm and 3.3 µm in the logic networks and 3.3 µm and 0.85 µm in the output inverters, all with L = 0.35 µm.]
Figure 23 CMOS Full-Adder Circuit with Transistor Sizes [26]
2.6. Arithmetic Logic Unit (ALU)
The Arithmetic Logic Unit (ALU) is one of the fundamental building blocks of a typical
microprocessor. The ALU performs both arithmetic and logic functions. It therefore consists of
basic functional components such as an Adder and AND, OR and XOR gates. Each functional component
offers one type of operation. For example, the adder in an ALU performs the add operation.
However, a few operations require a combination of units. Subtraction, for example, requires
both the XOR gate and the Adder to carry out the calculation.
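The way subtraction reuses the XOR gate and the Adder can be sketched with a ripple-carry model. The sketch assumes two's-complement arithmetic, where a subtract control signal inverts B through XOR gates and also supplies the initial carry-in; the bit width and the function names are illustrative, not taken from the project.

```python
def full_adder(a, b, cin):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def add_sub(a_bits, b_bits, subtract):
    """Ripple-carry add (subtract=0) or subtract (subtract=1) on little-endian bit lists."""
    carry = subtract                      # carry-in of 1 completes the two's complement
    out = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b ^ subtract, carry)  # XOR conditionally inverts B
        out.append(s)
    return out, carry


def to_bits(value, width):
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))


if __name__ == "__main__":
    a, b = to_bits(9, 4), to_bits(5, 4)
    print(from_bits(add_sub(a, b, 0)[0]))  # 9 + 5
    print(from_bits(add_sub(a, b, 1)[0]))  # 9 - 5
```

The same adder hardware therefore serves both operations, with the XOR stage acting as a programmable inverter on one operand.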
2.6.1. ALU Design
The goal of this project is to develop an ultra-low power ALU. Therefore, the design of the ALU is
driven by low-power implementation. There are, however, many approaches to reducing the
power consumption of an ALU or, more generally, of digital circuits. At the lowest design level,
transistor sizing is used to minimise circuit capacitance. Technology mapping is another process at the logic
gate level. Different algorithms have been developed for different ALU architectures targeted at
power reduction. At the system level and register transfer level (RTL), power gating and clock
gating are two popular techniques. Among the other popular techniques, Dynamic Voltage
Scaling (DVS), which is discussed in chapter 2, is widely used in portable devices.
Another possible approach is structural-level customisation. A number of customisations
have been proposed and implemented to enhance the performance of digital designs. However,
most of these projects, such as the University of Illinois Illiac 2 project, the IBM Stretch
project and [29], emphasised performance. On the other hand, a few studies [30] and [31] have
proposed structural-level power minimisation techniques.
There are two basic methods for the structural design of an ALU: the chain method and the tree
method. The following sections briefly describe the two techniques.
2.6.1.1. Tree Structure
In a tree structure, the functional components are connected in parallel to a multiplexer.
Figure 24 shows an ALU with Adder, AND, OR and NOR gates connected in parallel through a 4-input
multiplexer (MUX). The value of the MUX control signal determines which of the results of the
functional components appears at the ALU output.
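[Figure 24 block diagram: ADDER, OR, AND and XOR blocks, each with inputs A and B, driving inputs D1 to D4 of a 4-input multiplexer (MUX4) with select lines S0 and S1 and output Q.]
Figure 24 Tree Structure Design [30]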
This structure requires more area. Furthermore, the routing of signals is complicated, which
makes the layout difficult. However, circuit operation is faster.
2.6.1.2. Chain Structure
In a chain structure, the larger multiplexer is replaced by a chain of smaller multiplexers,
typically 2-input MUXes (figure 25). The first stage of the chain starts with two arbitrary
functional components whose outputs are connected to the first MUX. The MUX output is then
connected to one of the two inputs of the next-stage MUX. The other input is driven by the output
of another functional component (figure 25). Because of this concatenation, some of the component
outputs have to travel along a longer transmission path.
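[Figure 25 block diagram: the ADDER and AND outputs feed the first MUX2 (select S0); its output and the OR output feed the second MUX2 (select S1); that output and the XOR output feed the third MUX2 (select S2), which drives the output Q.]
Figure 25 Chain Structure Design [30]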
The chain structure requires a smaller design area. Moreover, chain structures offer
a variety of ways to place components. This in turn can be used to reduce power by placing
frequently used functional components closer to the output. However, circuit operation is
relatively slower than in the tree structure because of the chain.
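The two organisations can be compared with a functional sketch. It assumes 4-bit operands, the four functional units ADD, AND, OR and XOR, and an arbitrary mapping of select codes to units; the names and the select encoding are hypothetical and serve only to show how many multiplexer stages each result passes through.

```python
MASK = 0xF  # assumed 4-bit datapath


def units(a, b):
    """Results of all functional components for the same operands."""
    return {"ADD": (a + b) & MASK, "AND": a & b, "OR": a | b, "XOR": a ^ b}


def mux2(sel, d0, d1):
    return d1 if sel else d0


def mux4(s1, s0, d0, d1, d2, d3):
    return (d0, d1, d2, d3)[(s1 << 1) | s0]


def alu_tree(a, b, s1, s0):
    """Tree structure: every result passes through exactly one MUX4."""
    u = units(a, b)
    return mux4(s1, s0, u["ADD"], u["OR"], u["AND"], u["XOR"])


def alu_chain(a, b, s0, s1, s2):
    """Chain structure: three cascaded MUX2 stages."""
    u = units(a, b)
    stage1 = mux2(s0, u["ADD"], u["AND"])
    stage2 = mux2(s1, stage1, u["OR"])
    return mux2(s2, stage2, u["XOR"])


# Number of MUX2 stages each component output passes through in the chain.
CHAIN_DEPTH = {"ADD": 3, "AND": 3, "OR": 2, "XOR": 1}

if __name__ == "__main__":
    print(alu_tree(5, 3, 0, 0), alu_chain(5, 3, 0, 0, 0))  # ADD via both structures
    print(CHAIN_DEPTH)
```

In this arrangement the component wired to the last multiplexer passes through a single stage, so placing the most frequently used operation there keeps its signal path, and the switching it causes, as short as possible.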
CHAPTER 3 BASIC CIRCUITS DESIGN AND CHARACTERISATION
In order to achieve the project goal, it was essential to develop a sub-threshold ALU in both
PTL and CMOS logic and to compare the two designs in terms of power consumption. However, the
basic circuits available from the previous project [2] were inadequate for designing a large
hierarchical circuit block such as an ALU. Therefore, an additional 8 basic CMOS logic circuits and
1 PTL circuit were designed. This chapter covers the design details, functionality and
characterisation of the additional circuits.
3.1. Design
All the design work in this project was carried out in the Cadence AMS 0.35 µm process. This
technology was chosen for two reasons. Firstly, the Spectre simulator included with this
process provides detailed and precise simulation of analogue circuits with a user-friendly
interface. Most importantly, it can characterise the MOS devices from its own library for
sub-threshold operation. Secondly, this technology is well known and has been widely used for
years in custom processor design, while providing a cost-effective solution for such complex designs.
3.1.1. PTL Circuit
4-Input Multiplexor
A 4-input multiplexor is an essential part of digital circuit blocks. Figure 26 shows a PTL
4-input multiplexor (MUX4). The transistor sizes, which are also shown in figure 26, are kept the
same as in the other PTL circuits. This circuit is adapted from the Y3 circuit of LEAP (Lean
Integration with Pass-Transistor) technology [3], which was discussed previously in section 2.3.4.
The Y3 circuit is a generic PTL logic circuit which can be utilised for multiple logic operations
with different input signal combinations. The proper combination of complementary input control
signals (Load1, nLoad1, Load2 and nLoad2) enables the circuit to operate as a MUX4 for the data
input signals (D1, D2, D3 and D4). Since the output of the Y3 circuit is inverted, an additional
inverter is added to the output to generate the non-inverted output signal. Moreover, analysis
showed that, without the additional inverter, the output of the Y3 circuit is degraded for a
sub-threshold supply. The transistor sizes of the inverters are explained in the following section
of this chapter.
[Figure 26: schematic. Data inputs D1 to D4 pass through six pass transistors (W=0.4u, L=0.35u each) controlled by Load1, nLoad1, Load2 and nLoad2, followed by two inverters (Wp=3.3u, Wn=1.85u each) driving output Y.]
Figure 26 A 4-Input Multiplexor with Transistor Sizes in LEAP Technology [3]
3.1.2. CMOS Circuits
Inverter
In the previous project, the inverter was used as an integral part of logic circuits such as AND
and Full-Adder circuits. However, no separate circuit was designed and characterised for
inverter operation. Moreover, inverters are used extensively in larger design blocks. Therefore, an
inverter circuit was designed and characterised (figure 27). The transistor sizes of this inverter
are the same as those used in the other basic circuits; as explained earlier, this set of transistor
sizes provides balanced minimum delay and yet sufficient drive [14]. In addition, the project [2]
used an inverter with the same transistor ratio in sub-threshold operation without any flaws being
reported. Both the PTL and CMOS circuits use this same inverter.
[Figure 27: schematic of a CMOS inverter with input A, output Y and supply VDD. PMOS: W=3.3u, L=0.35u. NMOS: W=1.85u, L=0.35u.]
Figure 27 CMOS Inverter with Transistor Size
AND, OR and NOR
The circuits of figure 28 show the classic 2-input CMOS designs for AND, OR and NOR gates
with the transistor sizes used in this project and the previous one [2]. The AND gate is derived
from the previously designed NAND gate [2] by adding an inverter to it. The transistor sizes in
these circuits are kept the same as in the inverter used in the PTL circuits so that the CMOS
designs can be compared with their PTL counterparts.
[Figure 28: transistor-level schematics of the three gates, each with inputs A and B, output Y and supply VDD. All PMOS devices: W=3.3u, L=0.35u. All NMOS devices: W=1.85u, L=0.35u. Panels: a) AND, b) OR, c) NOR.]
Figure 28 Classic CMOS Logic Circuits with Transistor Sizes a) AND Gate b) OR Gate and c) NOR Gate
XOR and XNOR
The design of these two logic circuits is based on NAND gates, which is a classic method in
CMOS processes. Figure 29 shows the symbol diagrams of the two logic circuits. As in the other CMOS
circuits, the same transistor sizes are maintained.
[Figure 29: gate-level symbol diagrams with inputs A and B and output Q. Every gate is labelled Wp=3.3u, Wn=1.85u. Panels: a) XOR, b) XNOR.]
Figure 29 Symbol Diagram of Classic CMOS Logic Circuits a) XOR Gate b) XNOR Gate
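The dissertation does not list the gate-level netlist, so the sketch below uses the textbook four-NAND construction of XOR as an assumption, with XNOR obtained by inverting the XOR output through a NAND with tied inputs. The function names are illustrative only.

```python
def nand(a, b):
    """2-input NAND on single bits (0 or 1)."""
    return 1 - (a & b)


def xor_from_nand(a, b):
    """XOR built from four NAND gates (textbook construction)."""
    n1 = nand(a, b)
    return nand(nand(a, n1), nand(b, n1))


def xnor_from_nand(a, b):
    """XNOR: the XOR output inverted by a NAND with both inputs tied together."""
    x = xor_from_nand(a, b)
    return nand(x, x)
```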
4-Input Multiplexor
The 4-input multiplexor is designed from three cascaded 2-input multiplexors,
as shown in figure 30. The 2-input MUXs are adopted from the project [2]. As mentioned earlier, the
2-input MUX has an inverting output. However, in the circuit connection shown in figure 30, the
signal travels from input to output through 2 inverting MUXs, and hence the signal obtained at the
output is non-inverting. This eliminates any requirement for an additional inverter
to obtain a non-inverting output signal.
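[Figure 30: block diagram. One MUX2 selects between D1 and D2 and another between D3 and D4, both controlled by S0; a third MUX2 controlled by S1 selects between their outputs and drives Q. Each MUX2 is labelled Wp=3.3u, Wn=1.85u.]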
Figure 30 A 4-Input CMOS Multiplexor Designed from 2-Input CMOS Multiplexors with
Transistor Sizes
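The cancellation of the two inversions in the cascaded MUX4 can be checked with a small bit-level model. This is a sketch under assumptions: the select polarity and the mapping of (S1, S0) to D1..D4 are hypothetical, and `mux2_inv` simply models an inverting 2-input multiplexer.

```python
def mux2_inv(d0, d1, s):
    """Inverting 2-input multiplexer: outputs the complement of the selected bit."""
    selected = d1 if s else d0
    return 1 - selected


def mux4_cascade(d1, d2, d3, d4, s1, s0):
    """MUX4 from three inverting MUX2s; the two inversions on each path cancel."""
    first = mux2_inv(d1, d2, s0)   # inverted choice between D1 and D2
    second = mux2_inv(d3, d4, s0)  # inverted choice between D3 and D4
    return mux2_inv(first, second, s1)  # second inversion restores polarity
```

Every input-to-output path crosses exactly two inverting stages, so the output always equals the selected data bit.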
Tri-State Buffer
The circuit of a tri-state buffer with transistor sizes is shown in figure 31 [35]. The transistor
sizes are kept the same as in the other CMOS logic circuits. When the enable signal EN is at logic
1, the output follows the input signal (logic 0 or logic 1). On the other hand, when EN is set to
logic 0, the output is in the high-impedance state.
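[Figure 31: schematic of the tri-state buffer with input IN, enable EN and output OUT. Output stage: two NMOS (W=1.85u, L=0.35u) and two PMOS (W=3.3u, L=0.35u) devices; inverter: Wp=3.3u, Wn=1.85u.]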
Figure 31 Tri-State Buffer with Transistor Sizes [35]
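The enable behaviour described for the tri-state buffer can be modelled in a few lines. Representing the high-impedance state with `None` is a modelling assumption of this sketch, as are the names used.

```python
HIGH_Z = None  # stand-in for the high-impedance (undriven) output state


def tri_state_buffer(data_in, enable):
    """Pass data_in to the output when enable is 1, otherwise release the output."""
    return data_in if enable else HIGH_Z
```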
3.2. Characterisation
The basic logic circuits designed in this project and the other basic circuits from the previous
project [2] were characterised in terms of propagation delay and power consumption (static and
dynamic) under different simulation conditions such as different ambient temperatures, supply
voltages and different fan-outs. However, the simulations were mainly carried out on the PTL basic
circuits, and only a few CMOS circuits were characterised. This is because the project was intended
to design complex and practical circuits based on the PTL method only. Moreover, the CMOS basic
circuits were already characterised in the project [2], and this project was planned to avoid
repeating that previous work on CMOS circuits [2] and to progress further towards the ultimate goal
of ultra-low power custom processor design. However, ALU modules (both 1-bit and 8-bit versions)
were designed and simulated in PTL and CMOS versions to determine the difference in power
consumption, which is discussed in chapter 4. All the supply voltages used for simulation were in
the sub-threshold region except for 0.6V. Figure 32 shows the simulation tree for the
characterisation of the PTL basic circuits under different simulation conditions.
[Figure 32: simulation tree. All branches are simulated at supply voltages of 0.3V, 0.4V, 0.5V and 0.6V.
Delay Test: temperatures -20 °C, 27 °C and 85 °C, each with FO = 0, FO = 2 and FO = 4. PTL cells: AND/NAND, OR/NOR, XOR/XNOR, Load Register. CMOS cells: Inverter.
Power Consumption Test, Static: temperatures -20 °C, 27 °C and 85 °C, each with FO = 1 and FO = 4. Dynamic: temperatures 27 °C and 85 °C, each with FO = 1 and FO = 4. PTL cells: AND/NAND, OR/NOR, XOR/XNOR, MUX4, Full Adder, Load Register. CMOS cells: Inverter, Tri-State Buffer.]
Figure 32 Simulation Tree Diagram for Characterisation of Basic PTL Circuits (Adapted from [2])
3.2.1. Propagation Delay Measurement
Propagation delay is an important design characteristic of a logic circuit. For design and
validation purposes, this parameter must be available to the design engineer. In this project,
propagation delay was measured for the PTL logic circuits AND/NAND, OR/NOR, XOR/XNOR and
load register; the only CMOS logic circuit simulated was the inverter. These 5 circuits were
used frequently in the design of other circuits.
The propagation delay of a circuit is defined as the average (11) of the delays at the rising
edge and the falling edge of the output signal [14], as illustrated in figure 33b. Figure 33a shows
a generalised test circuit for delay measurement with different fan-outs. Measurement of the
fan-out 0 circuit provides the parasitic delay of a logic circuit. For comprehensive
characterisation, tests were carried
out for fan-out 1 and 4. Fan-out 4 circuits were specifically used because digital logic circuits
show realistic characteristics with a minimum of 4 fan-out connections. The supply voltages used
for the simulations were 0.3V, 0.4V, 0.5V and 0.6V. Except for 0.6V, all the supply voltages are
sub-threshold voltages; supply voltage is the major simulation variable in this project. Another
important variable is temperature, which influences the performance of circuits. It has already
been discussed in chapter 2 that sub-threshold circuits have a strong dependence on temperature
[18]. All the test circuits were run at three different temperatures: -20 °C, 27 °C and 85 °C. This
temperature range (-20 °C to 85 °C) does not necessarily cover all operating temperatures but is
wide enough to examine the temperature effect in sub-threshold operation. Figure 32 shows the
simulation tree diagram used for the basic circuits.
$t_{pd} = \frac{t_{d(rise)} + t_{d(fall)}}{2}$    (11)
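[Figure 33: a) test circuit in which a pulse stimulus drives Vin of the cell, with Vout loaded by FO = 0, 2 and 4. b) Vin and Vout waveforms against time; the delay at rise and the delay at fall are each measured between the 50% Vdd points of Vin and Vout.]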
Figure 33 a) Generalized Simulation Setup for Propagation Delay Measurement b) Definition of
Propagation Delay (Adapted from [2])
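The delay definition above (the average of the rise and fall delays, each taken between the 50% Vdd crossings of input and output) can be applied to sampled waveforms roughly as follows. This is a post-processing sketch under assumptions, not the Spectre measurement flow used in the project; the function names and the linear interpolation between samples are hypothetical choices.

```python
def crossing_time(times, values, level, rising):
    """Time of the first crossing of `level`, linearly interpolated between samples."""
    for i in range(1, len(times)):
        v0, v1 = values[i - 1], values[i]
        crossed = (v0 < level <= v1) if rising else (v0 > level >= v1)
        if crossed:
            frac = (level - v0) / (v1 - v0)
            return times[i - 1] + frac * (times[i] - times[i - 1])
    raise ValueError("waveform never crosses the requested level")


def propagation_delay(delay_at_rise, delay_at_fall):
    """Average of the rising-edge and falling-edge delays, as in expression (11)."""
    return (delay_at_rise + delay_at_fall) / 2.0
```

Each edge delay is the difference between the output and input crossing times at 50% of Vdd; the two edge delays are then passed to `propagation_delay`.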
3.2.2. Power Consumption Measurement
Power consumption measurement is the most important simulation procedure in this
project, since the project objective is the design of an ultra-low power system. Static and dynamic
consumption were measured for both PTL and CMOS logic circuits. As with the delay
measurement, this procedure was focused on PTL circuits such as AND/NAND, OR/NOR,
XOR/XNOR, Full-Adder and Load Register. Only two CMOS logic circuits, the inverter and the
tri-state buffer, were characterised. The simulations were carried out for different fan-outs and
temperatures with different supply voltages, as shown in figure 32. A generalised test circuit is
shown in figure 34. The circuit under test was powered by an external independent source, and the
current drawn from this source was measured.
Static Power Consumption
A logic circuit is said to be in static mode when the input signal does not change its state.
The main sources of static power dissipation are gate leakage (tunnelling of electrons through the
gate oxide), reverse-biased junction leakage (diode leakage between diffusion regions) and
sub-threshold conduction (due to carrier diffusion for supply voltages smaller than the threshold
voltage) [14]. Since the project is designed for a sub-threshold supply, static power dissipation
would be a significant source of power consumption. Moreover, it will be more prominent because the
PTL circuits are slow and require a longer processing time. It has already been mentioned in
chapter 2 that the level-restoring inverter used in PTL causes static power dissipation due to
short-circuit conduction. Most of the PTL circuits in this project use this inverter, and therefore
the short-circuit dissipation is
also a major source of static dissipation in this project.
Static power dissipation is modelled with (12) for both PTL and CMOS circuits. Here, the
measured current IDD is multiplied by the supply voltage VDD to derive the static power
consumption. The current is measured directly from the simulation result. Figure 32 shows the
different simulation conditions for a circuit under test.
$P_{static} = I_{DD} \times V_{DD}$    (12)
Dynamic Power Consumption
Dynamic power dissipation is produced by the energy drawn from the supply for charging
and discharging of a logic circuit output node capacitance. Therefore, energy consumption depends
on the rate of change of state for output signals.
$P_{dynamic} = V_{DD} \cdot \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} i_{DD}(t)\,dt$    (13)
Measurement of dynamic consumption can be modelled with expression (13). As with the
static measurement, the average value of the current can be measured directly from the simulation
result. The Spectre simulator provides the average current over a chosen simulation interval, which
is then multiplied by the circuit Vdd to obtain the dynamic dissipation. However, simulations for
the dynamic dissipation were carried out for only two temperature values (27 °C and 85 °C), as
shown in figure 32.
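[Figure 34: test circuit. A pulse stimulus drives Vin of the cell under test, whose output is Vout; the cell is powered from a separate test supply (Vdd Test), on which the average current is measured.]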
Figure 34 General form of Simulation Circuit used for Current Measurements to Extract Static and
Dynamic Power Dissipation (Adapted from [2])
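Both power figures described in this section reduce to a supply current multiplied by the supply voltage: the steady current for static power, and the time-averaged current over the simulation interval for dynamic power. The sketch below assumes sampled current data and uses trapezoidal averaging; the function names and the averaging method are assumptions and do not represent the simulator's own calculation.

```python
def static_power(i_dd, v_dd):
    """Static power: measured supply current times supply voltage."""
    return i_dd * v_dd


def average_current(times, currents):
    """Trapezoidal time average of the sampled current over the whole interval."""
    area = 0.0
    for i in range(1, len(times)):
        area += 0.5 * (currents[i - 1] + currents[i]) * (times[i] - times[i - 1])
    return area / (times[-1] - times[0])


def dynamic_power(times, currents, v_dd):
    """Dynamic power: supply voltage times the average supply current."""
    return v_dd * average_current(times, currents)
```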
3.3. Presentation of Results
3.3.1. Propagation Delay
The results of the propagation delay measurements for the PTL logic circuits AND/NAND, OR/NOR,
XOR/XNOR and Load Register and for the CMOS Inverter are presented in table 1. As shown in the
simulation tree diagram of figure 32, the simulations were carried out for three different fan-out
circuits (0, 2 and 4). Each fan-out circuit was tested at 3 different temperatures: -20 °C,
27 °C and 85 °C. The supply voltages used for the simulations were 0.3V, 0.4V, 0.5V and 0.6V.
After a thorough observation of the data in table 1, the propagation delay characteristics can be
summarised in three aspects. Firstly, the delay is strongly dependent on temperature.
At the very low temperature of -20 °C, all the circuits have thousands of microseconds of delay.
However, as the temperature increases, the delay improves by a factor of 10 to more than 1000.
For example, for 0.4V Vdd and FO = 4, the AND/NAND gate has a delay of 524.91 µs at -20 °C,
whereas at 85 °C the delay reduces significantly to 2.53 µs. The second aspect of delay is the
influence of supply voltage. In the deep sub-threshold region all the logic circuits have a very low
speed. As Vdd increases gradually towards the circuit threshold, the speed of the circuits
increases immensely. The other aspect of delay is the fan-out of circuits. For larger fan-out, the