azizi mazreah-5t

4
A Novel Zero-Aware Read-Static-Noise-Margin-Free SRAM Cell for High Density and High Speed Cache Application Arash Azizi Mazreaht, Mohammad Taghi Manzuri Shalmani 2, Reza Noormande, Ali Mehrparvar 4 Islamic Azad University, Sirjan Branch 1 ,3,4, SharifUniversity ofTechnology2 Email: [email protected] 1 [email protected] 2 [email protected] 3 [email protected] 4 Abstract To help overcome limits to the density and speed of conventional SRAMs, we have developed a five-transistor SRAM cell. The newly developed CMOS five-transistor SRAM cell uses one word-line and one bit-line during read/write operation. This cell retains its data with leakage current and positive feedback without refresh cycle. The new cell size is 18% smaller than a conventional six-transistor SRAM cell using same design rules. Simulation result in standard 0.25J.1m CMOS technology shows purposed cell has correct operation during read/write and idle mode. The average delay of new cell is 20% smaller than a six-transistor SRAM cell. 1. Introduction Advances in process technologies are leading to steady increases in the speeds of microprocessors and system LSIs [I]. The on-chip caches can effectively reduce the speed gap between the processor and main memory, almost modem microprocessors employ them to boost system performance. These on-chip caches are usually implemented using arrays of densely packed SRAM cells for high performance [2]. A six-transistor SRAM cell (6T SRAM cell) is conventionally used as the memory cell [3]. However, the 6T SRAM cell produces a cell size an order of magnitude larger than that of a DRAM cell, which results in a low memory density [3]. Meanwhile, the 6T SRAM cell stability is further degraded by supply voltage scaling and the read operations at low-VDO levels result in storage data destruction in SRAM cells [I]. Therefore, conventional SRAMs that use the 6T SRAM cell have difficulty meeting the growing demand for larger memory capacity and high speed access in cache applications [3]. In response to this requirement, our objective is to develop an SRAM cell with five transistors to reduce the cell area size with performance improvement. In this paper, we describe a novel zero-aware read-static-noise-margin-free five transistor SRAM cell (5T SRAM cell). In designing of this new cell we exploit the strong bias towards zero at the bit level exhibited by the memory value stream of ordinary programs. Novel cell retains its data with leakage current and positive feedback without refresh cycle. The new cell size is 18% smaller than a conventional 6T cell using same design rules and average delay access of a cache with new 5T SRAM cell is 20% smaller than a cache with 6T SRAM cell. 2. Read static noise margin and SRAM cell current in conventional SRAM cells The SRAM cell current and read static noise margin (SNM) are two important parameter of SRAM cell. The read SNM of cell shows the stability of cell during read operation and SRAM cell current determine the delay time of SRAM cell. Fig.1 shows the SRAM cell current of the conventional SRAM cell. Although SRAM cell current degradation simply increases bit-line (BL) delay time, Read SNM degradation results in data destruction during Read operations [I]. Both Read SNM and SRAM cell current values are highly dependent on the driving capability of the access NMOS transistor: Read SNM decreases with increases in driving capability, while SRAM cell current increases. That is, the dependence of the two is in an inverse correlation [I]. Thus in conventional SRAM cell the read SNM of cell and cell current cannot adjust separately. Word-Une - . Figure I. SRAM cell current in 6T SRAM cell One strategy for solving the problem of inverse correlation between SRAM cell current and read SNM is separation of data retention element and data output element means that there will be no correlation between Read SNM and SRAM cell current. Base on this strategy [4] presents a dual-port SRAM cell. But this cell is composed of eight transistors and has 30% greater area than that of a conventional 6T SRAM cell [I]. Another strategy is loop-cutting during read operation. Base on this strategy in [I] a read-static-noise-margin-free SRAM cell for low-V oo and high speed application presented. This cell is composed of seven transistors and 978-1-4244-2186-2/08/$25.00 ©2008 IEEE

Upload: tej1985

Post on 04-Oct-2014

31 views

Category:

Documents


1 download

TRANSCRIPT

Page 1: azizi mazreah-5T

A Novel Zero-Aware Read-Static-Noise-Margin-Free SRAM Cell forHigh Density and High Speed Cache Application

Arash Azizi Mazreaht, Mohammad Taghi Manzuri Shalmani 2, Reza Noormande, Ali Mehrparvar4

Islamic Azad University, Sirjan Branch1,3,4, Sharif University ofTechnology2

Email: [email protected]@[email protected]@gmail.com4

Abstract

To help overcome limits to the density and speed ofconventional SRAMs, we have developed afive-transistor SRAM cell. The newly developed CMOSfive-transistor SRAM cell uses one word-line and onebit-line during read/write operation. This cell retains itsdata with leakage current and positive feedback withoutrefresh cycle. The new cell size is 18% smaller than aconventional six-transistor SRAM cell using samedesign rules. Simulation result in standard 0.25J.1mCMOS technology shows purposed cell has correctoperation during read/write and idle mode. The averagedelay of new cell is 20% smaller than a six-transistorSRAM cell.

1. IntroductionAdvances in process technologies are leading to steadyincreases in the speeds of microprocessors and systemLSIs [I]. The on-chip caches can effectively reduce thespeed gap between the processor and main memory,almost modem microprocessors employ them to boostsystem performance. These on-chip caches are usuallyimplemented using arrays of densely packed SRAMcells for high performance [2]. A six-transistor SRAMcell (6T SRAM cell) is conventionally used as thememory cell [3]. However, the 6T SRAM cell producesa cell size an order of magnitude larger than that of aDRAM cell, which results in a low memory density [3].Meanwhile, the 6T SRAM cell stability is furtherdegraded by supply voltage scaling and the readoperations at low-VDO levels result in storage datadestruction in SRAM cells [I].Therefore, conventional SRAMs that use the 6T SRAMcell have difficulty meeting the growing demand forlarger memory capacity and high speed access in cacheapplications [3]. In response to this requirement, ourobjective is to develop an SRAM cell with fivetransistors to reduce the cell area size with performanceimprovement.In this paper, we describe a novel zero-awareread-static-noise-margin-free five transistor SRAM cell(5T SRAM cell). In designing of this new cell we exploitthe strong bias towards zero at the bit level exhibited bythe memory value stream of ordinary programs. Novelcell retains its data with leakage current and positivefeedback without refresh cycle. The new cell size is 18%

smaller than a conventional 6T cell using same designrules and average delay access of a cache with new 5TSRAM cell is 20% smaller than a cache with 6T SRAMcell.

2. Read static noise margin and SRAM cell current inconventional SRAM cellsThe SRAM cell current and read static noise margin(SNM) are two important parameter of SRAM cell. Theread SNM of cell shows the stability of cell during readoperation and SRAM cell current determine the delaytime of SRAM cell. Fig.1 shows the SRAM cell currentof the conventional SRAM cell. Although SRAM cellcurrent degradation simply increases bit-line (BL) delaytime, Read SNM degradation results in data destructionduring Read operations [I]. Both Read SNM and SRAMcell current values are highly dependent on the drivingcapability of the access NMOS transistor: Read SNMdecreases with increases in driving capability, whileSRAM cell current increases. That is, the dependence ofthe two is in an inverse correlation [I]. Thus inconventional SRAM cell the read SNM of cell and cellcurrent cannot adjust separately.

Word-Une

­.Figure I. SRAM cell current in 6T SRAM cell

One strategy for solving the problem of inversecorrelation between SRAM cell current and read SNM isseparation of data retention element and data outputelement means that there will be no correlation betweenRead SNM and SRAM cell current. Base on this strategy[4] presents a dual-port SRAM cell. But this cell iscomposed of eight transistors and has 30% greater areathan that of a conventional 6T SRAM cell [I]. Anotherstrategy is loop-cutting during read operation. Base onthis strategy in [I] a read-static-noise-margin-freeSRAM cell for low-Voo and high speed applicationpresented. This cell is composed of seven transistors and

978-1-4244-2186-2/08/$25.00 ©2008 IEEE

Page 2: azizi mazreah-5T

makes it possible to reduce the area overhead from 300/0to 13% [1]. Thus this cell is with area overhead too.To avoid inverse correlation between SRAM cell currentand read SNM we proposed new five transistor SRAMcell. Our proposed cell is base on loop-cutting strategyand this observation that in ordinary programs most ofthe bits in caches are zeroes for both the data andinstruction streams. This new cell making it possible toachieves both low-VDD and high-speed operations withno area overhead.

3. Cell Deign ConceptFig. 2 shows a circuit equivalent to a developed 5TSRAM cell using a supply voltage of 2.5V in standard0.25Jlm CMOS technology.

(W/L)Ml=1.5~m/O.25~m (W/L)M2=1.25~m/O.25~m

(W/L)M3=O.75~m/O.25~m (W/L)M4=O.5~m/O.25~m

(W/L)M5=O.5~m/O.25~m

Figure 2. New 5T SRAM cell in 0.25Jlm

During idle mode of cell (when read and write operationdon't perform on cell) the feedback-cutting transistor(M5) is ON and N node pulled to VDD• When '1' stored incell, M3 and M2 are ON and there is positive feedbackbetween ST node and STB node, therefore ST nodepulled to VDO by M2 and STB node pulled to GND byM3. When '0' stored in cell M4 is ON and since N nodemaintained at VDO by M5 the STS pulled to VDO, alsoM2 and M3 are OFF and for data retention withoutrefresh operation following condition must be satisfied.

1DS-Ml + 1gate-M3 > ISD-M2 + I gate-M 4

For satisfying above condition when '0' stored in cell, weuse leakage current of access transistors (M1), especiallysub-threshold current of access transistors (MI). For thispurpose during idle mode of cell, bit-line maintained atGND and word-line maintained at V1d1e• Fig. 3 showsleakage current of cell during idle mode for dataretention when ·0' stored in cell. Most of leakage currentof access transistor (M 1) is sub-threshold current, sincethis transistor maintained in sub-threshold condition.Simulation result in standard 0.25Jlm CMOS technologyshows if during idle mode of cell, bit-line maintained atGND and V1d1e maintained at O.5V then ·O'data stored incell without refresh cycle and thus in idle mode abovecondition satisfied.

I DS-Af1-- I SD~AI 2

I gate - )\/ 4 - - I gate-AI 3 - · - ·Figure 3. Leakage current in idle mode when ·0' stored

in cell4. Read and Write Operation

During write operation feedback-cutting transistor isON and N node pulled to VDD by this transistor thus inwrite operation read-line maintained at GND and when awrite operation is issued the memory cell will go throughthe following steps. I)-Sit-line driving: For a write, dataplaced on bit-line, and then word-line asserted to VOD.

2)-Cell flipping: this step includes two states as follows.a)-Data is zero: in this state, ST node pulled down toGND by NMOS access transistor (M 1), and therefore theLoad transistor (M4) will be ON, and STS node will bepulled up to VDD. b)-Data is one: in this state, ST nodepulled up to VOD-VTN by NMOS access transistor (M1),and therefore the drive transistor (M3) will be ON ,and STS node will be pulled down to GND, thus loadtransistor (M2) will be ON and positive feedback createdby M2 and M3. 3)-Idle mode: At the end of writeoperation, cell will go to idle mode and word-line andbit-line asserted to V1d1e and GND, respectively.

When a read operation is issued the memory cell willgo through the following steps. I)-Sit-line discharging:For a read, bit-line discharged to GND, and then floated.2)-feedback-cutting: In this step feedback-cuttingtransistor is OFF and thus read-line maintained at VOD

during read operation. 3)-Word-line activation: in thisstep word-line asserted to VDD, and two states can beconsidered: a)-Voltage of ST node is high: when voltageof ST node is high, the voltage of bit-line pulled up tohigh voltage by NMOS access transistor. We refer to thisvoltage of bit-line as VBL-High. b)-Voltage of ST node islow: when voltage of ST node is low, the voltage ofbit-line and ST node equalized. 3)-Sensing: Afterword-line deactivate to V1d1e then sense amplifier isturned on to read data on bit-line. Fig. 4 shows circuitschematic of sense amplifier that used for reading datafrom new cell. 4)-Idle mode: At the end of readoperation, cell will go to idle mode and read-line andbit-line asserted to VOD and GND, respectively.

5. Cell areaFig. 5 shows possible layout of 5T SRAM cell in

Page 3: azizi mazreah-5T

Also in [6] from the execution traces of the SPEC2000benchmarks on average, almost 75% and 64% of bitvalues are zero in the data and instruction caches,respectively. Thus most of bit values resident in the dataand instruction caches are zero. Based on theseobservations we simulated average leakage current inidle mode of 5T SRAM cell and conventional 6T SRAMcell in standard 0.25Jlm CMOS technology. The averageleakage current of new cell and 6T cell is 156nA and145nA, respectively. Thus the average leakage current ofnew cell is 8% grater than conventional 6T SRAM cell.Since the cache based on new cell contains othercomponent except cell array, the effect of leakage currentof cells on total leakage current of cache is less than 8%.

7. Dynamic power consumptionIn a cache, the major dynamic power consumingcomponents are bit-lines, word-lines, sense amplifiers,decoders and output drivers [2]. In general, the bit-linesare the most power consuming component [2]. Thereforedynamic power consumption during cache accessconsumed due to the transitions occur during read/writeoperation. The power consumption of each cache accessdepends on type of cache access (read or write).In 6T SRAM cell because one of two bit-lines must bedischarged to low regardless of written value, thepower consumption in both writing '0' and 'I' are thesame [2]. Also in read operation one of the twobit-lines must be discharged regardless of the storeddata value. Therefore the power consumption in bothwriting '0' and 'I' are the same [7].From our HSPICE simulation in standard 0.25Jlmtechnology with 2.5V for power supply we obtained thepower consumption of reading and writing' I' in new cellapproximately is equal with power consumption ofreading and writing' I' in 6T cell . Also both the powerconsumption of reading and writing '0' in new cell aresmaller than power consumption of reading and writing'0' in 6T cell.Cache accesses include read and write operations. Cachereads occur more frequently than writes, especially forthe instruction cache [2]. In [2] from the execution tracesof the SPEC2000 benchmarks around 85% of theinstruction write bits are '0' and over 90% of the datawrite bits are '0'. Also over 70% of the bits that are readfrom the cache are zeros [7]. Based on theseobservations we simulated average dynamic powerconsumption in a cache access using HSPICE instandard 0.25Jlm technology. The powers are measuredat 100 MHz with VDD=2.5V. The average dynamic powerconsumption of new cell and 6T cell is 5.3mW and7.1 mW, respectively. Thus the average dynamic powerconsumption of new cell is 25% smaller than 6T cell. Forthis reduction there are two obvious reasons as follows.First, in writing and reading zero of new cell there is not

3.7 IJm

standard O.25Jlm CMOS technology design rules. Alsofor comparison, Fig. 5 shows layout of 6T SRAM celland 5T SRAM cell in standard 0.25Jlm CMOStechnology design rules. The 6T SRAM cell has theconventional layout topology and is as compact aspossible. The 6T SRAM cell requires 23.94Jlm2 area in0.25Jlm technology, whereas 5T SRAM cell requires19.61Jlm2 area in 0.25Jlm technology. These numbers donot take into account the potential area reductionobtained by sharing with neighboring cells. Thereforethe new cell size is 18% smaller than a conventionalsix-transistor cell using same design rules.

VDD P~f()Swith

/"'I"rrl Lo.. !15enseAmPllfierEnablel

DO (Data Out)

BDC-1

\Bit-Line De-charge Control

DBO (i5iiiO'Ui).......I---<)C

Figure 4. Circuit schematic of sense amplifiero P Ile' 1 Ii Wp 1' II Me t i1 • 1

• M0t.J~2 mJEEl Pu:}'

• Active

3.8 pmArea of De'" ceU=19.61,...ml Area of 6T ce11=23.94pm1

Figure 5. Layout comparison of 5T cell and 6T Cell6. Leakage current

In one state, novel 5T SRAM cell must retains its datausing the leakage current of the access transistor (whenzero stored) and in the other state the 5T SRAM cellmust retains its data using positive feedback (when onestored). Thus in idle mode when' l' stored in cell, there ispositive feedback and M2, M3 and feedback cutting (M5)transistors are ON and access transistor maintained insub-threshold condition. In this state there is path fromsupply voltage to ground and power dissipated. Fig. 6shows this path when 'I' stored in cell.

In ordinary programs most of the bits in caches arezeroes for both the data and instruction streams. It hasbeen shown that this behavior persists for a variety ofprograms under different assumptions about cache sizes,organization and instruction set architectures [5] [6].

Page 4: azizi mazreah-5T

any change on bit-line since bit-line maintained at GNDin idle mode. second, reading '0' or writing '0' occursmore frequently than reading' l' or writing' 1'.

2

''''rite '1'._. m. ~

2

1

o j~~~~~$'~~''''''-~'''~~~-T~-'T-'~T--'''-~T~'''''~'''-~~~~~~~30 40 50

Time (lin) (TIME)

Figure 8. Simulated waveform for writing' 1' and read it

21

o !~::::;;J~I:m:::,:::.r:::;:::;'::;:::';:::::r':*~~~~~~~*~~~~~

2 I I1 • • Read-Uneo.~~~~'~~~!:!t~~!l!!!!!_~~~~~~__=~

010

References

[1] K. Takeda et aI., "A read-static-noise-margin-freeSRAM cell for low-VDD and high-speedapplications," IEEE Journal of Solid-State Circuits,vol. 41, no. 1, p. 133 (2006).

[2] G Y. J. Chang, F. Lai, and C. L. Yang, "Zero-AwareAsymmetric SRAM Cell for Reducing Cache Powerin Writing Zero," IEEE Transactions on Very LargeScale Integration Systems, vol. 12, no. 8, p. 827(2004).

[3] J. A. Kotabe, K. Osada, N. Kitai, M. Fujioka, S.Kamohara, M. Moniwa, S. Morita, and Y. Saitoh, "ALow-Power Four-Transistor SRAM Cell With aStacked Vertical Poly-Silicon PMOS and aDual-Word-Voltage Scheme/' IEEE JournalSolid-State Circuits, vol. 40, no. 4, p. 870 (2005).

[4] L. Chang et aI., "Stable SRAM cell design for the 32nm node and beyond,'· in Symposium VLSITechnology Dig., p. 128 (2005).

[5] N. Azizi, F. Najm, and A. Moshovos, '~Low-Ieakage

asymmetric-cell SRAM," IEEE Transactions on VeryLarge Scale Integration Systems, vol. 11, no. 4, p.701 (2003).

[6] A. Moshovos, B. Falsafi, F. N. Najm, and N. Azizi,"A Case for Asymmetric-Cell Cache Memories,"IEEE Transactions on Very Large Scale IntegrationSystems, vol. 13, no. 7, p. 877 (2005).

[7] L. Villa, M. Zhang, and K. Asanovic, '~Dynamic zerocompression for cache energy reduction," inProceedings 33rd Annual IEEE/ACM InternationalSymposium Microarchitecture, p. 214 (2000).

VDD

Read-Une

Ml .....--...__--......IP-I

r---------'ST=High

_" \ Path from supply- voltage to ground

when 111 stored in cell. _Figure 6. Paths from supply voltage to ground when '1'

stored in cell

8. Experimental resultsTo verify correct operation of new 5T SRAM cell and

comparison with 6T SRAM cell, we simulate a 4TSRAM cell and 6T SRAM cell using HSPICE instandard 0.25J.lm CMOS technology with 2.5V forsupply voltage and 0.5V for threshold of PMOS andNMOS transistors. Based on layouts shown in Fig. 5, allparasitic capacitances and resistances of bit-lines,word-lines are included in the circuit simulation. Fortesting the correctness of a read and write operation ofnew 4T SRAM cell, following scenario applied to new4T SRAM cell: a)-Writing '0' in to new cell and thenread it. b)-Writing' 1' in to new cell and then read it.

Fig. 7 and Fig. 8 show simulated waveform withapplying above scenario. The average cache access delayis defined as the average elapsed time for performingread or writes operation. Average cache access delay of5T SRAM cell is 742ps and Average cache access delayof 6T SRAM cell is 930ps. Therefore delay of new cell is200/0 smaller than 6T cell.