1
Semiconductor MemoriesSemiconductor Memories
2
Semiconductor Memory ClassificationSemiconductor Memory Classification
Read-Write MemoryNon-VolatileRead-Write
Memory
Read-Only Memory
EPROM
E2PROM
FLASH
RandomAccess
Non-RandomAccess
SRAM
DRAM
Mask-Programmed
Programmable (PROM)
FIFO
Shift Register
CAM
LIFO
3
Memory Timing: DefinitionsMemory Timing: Definitions
Write cycleRead access Read access
Read cycle
Write access
Data written
Data valid
DATA
WRITE
READ
4
Memory Architecture: DecodersMemory Architecture: Decoders
Word 0
Word 1
Word 2
WordN22
WordN21
Storagecell
M bits M bits
S0
S1
S2
SN22
A0
A1
AK 21
K 5 log2N
SN21
Word 0
Word 1
Word 2
WordN22
WordN21
Storagecell
S0
Input-Output(M bits)
Intuitive architecture for N x M memoryToo many select signals:
N words == N select signals K = log2NDecoder reduces the number of select signals
Input-Output(M bits)
5
2D Memory Architecture2D Memory Architecture
A0
Row
Dec
oder
A1
Aj-1Sense Amplifiers
bit line
word line
storage (RAM) cell
Row
Add
ress
Col
umn
Add
ress
Aj
Aj+1
Ak-1
Read/Write Circuits
Column Decoder
2k-j
m2j
Input/Output (m bits)
amplifies bit line swing
selects appropriate word from memory row
6
Hierarchical Memory ArchitectureHierarchical Memory Architecture
Advantages:Advantages:1. Shorter wires within blocks1. Shorter wires within blocks2. Block address activates only 1 block => power savings2. Block address activates only 1 block => power savings
Globalamplifier/driver
Controlcircuitry
Global data bus
Block selector
Block 0
Rowaddress
Columnaddress
Blockaddress
Blocki BlockP 2 1
I/O
7
Memory Timing: ApproachesMemory Timing: Approaches
DRAM TimingMultiplexed Adressing
SRAM TimingSelf-timed
Addressbus
RAS
RAS-CAS timing
Row Address
AddressBus
Address transitioninitiates memory operation
Address
Column Address
CAS
8
Read-Write Memories (RAMs)Read-Write Memories (RAMs)
Static – SRAM data is stored as long as supply is applied large cells (6 fets/cell) – so fewer bits/chip fast – so used where speed is important (e.g., caches) differential outputs (output BL and !BL) use sense amps for performance compatible with CMOS technology
Dynamic – DRAM periodic refresh required small cells (1 to 3 fets/cell) – so more bits/chip slower – so used for main memories single ended output (output BL only) need sense amps for correct operation not typically compatible with CMOS technology
9
Read-Only Memory CellsRead-Only Memory Cells
WL
BL
WL
BL
1WL
BL
WL
BL
WL
BL
0
VDD
WL
BL
GND
Diode ROM MOS ROM 1default output = 0
MOS ROM 2default output = 1
BL is pulled down to GNDwhen on activation of WL
10
MOS OR ROMMOS OR ROM
WL[0]
VDD
BL[0]
WL[1]
WL[2]
WL[3]
Vbias
BL[1]
Pull-down loads
BL[2] BL[3]
VDD
11
MOS NOR ROMMOS NOR ROM
WL
[0]
GND
BL [0]
WL [1]
WL [2]
WL [3]
VDD
BL [1]
Pull-up devices
BL [2] BL [3]
GND
12
MOS NOR ROM LayoutMOS NOR ROM Layout
Programming using theActive Layer Only, Activemask in fabrication process
Polysilicon
Metal1
Diffusion
Metal1 on Diffusion
Cell (9.5 x 7)
WL
[0]
GND
WL [1]
WL [2]
GND
WL [3]
using diffusion for GND
13
MOS NOR ROM LayoutMOS NOR ROM Layout
Polysilicon
Metal1
Diffusion
Metal1 on Diffusion
Cell (11 x 7)
Programmming usingthe Contact Layer Onlyselective addition ofmetal-to-diffusion contacts
WL
[0]
GND
WL [1]
WL [2]
GND
WL [3]
14
MOS NAND ROMMOS NAND ROM
All word lines high by default with exception of selected row
WL [0]
WL [1]
WL [2]
WL [3]
VDD
Pull-up devices
BL[3]BL [2]BL [1]BL [0]
15
Equivalent Transient Model for MOS NOR ROMEquivalent Transient Model for MOS NOR ROM
Word line parasitics Wire capacitance and gate capacitance Wire resistance (polysilicon)
Bit line parasitics Resistance not dominant (metal) Drain and Gate-Drain capacitance
Model for NOR ROM VDD
Cbit
rword
cword
WL
BL
16
Equivalent Transient Model for MOS NAND ROMEquivalent Transient Model for MOS NAND ROM
Word line parasitics Similar to NOR ROM
Bit line parasitics Resistance of cascaded transistors dominates Drain/Source and complete gate capacitance
Model for NAND ROMVDD
CL
rword
cword
cbit
rbit
WL
BL
17
Decreasing Word Line DelayDecreasing Word Line Delay
Metal bypass
Polysilicon word lineK cells
Polysilicon word lineWL
Driver
(b) Using a metal bypass
(a) Driving the word line from both sides
Metal word line
WL
(c) Use silicides
driving from both sides reduces the worst case delay of the word line by 4 (like buffer insertion to reduce RC delay)
18
Precharged MOS NOR ROMPrecharged MOS NOR ROM
PMOS precharge device can be made as large as necessary,but clock driver becomes harder to design.
WL [0]
GND
BL [0]
WL [1]
WL [2]
WL [3]
VDD
BL [1]
Precharge devices
BL [2] BL [3]
GND
pref
19
Non-Volatile MemoriesNon-Volatile MemoriesThe Floating-gate transistor (FAMOS)The Floating-gate transistor (FAMOS)
Floating gate
Source
Substrate
Gate
Drain
n+ n+_p
tox
tox
Device cross-section Schematic symbol
G
S
D
20
Floating-Gate Transistor ProgrammingFloating-Gate Transistor Programming
0 V
-5V0 V
DS
Removing programming voltage leaves charge trapped
5 V
-2.5 V 5 V
DS
Programming results in higher VT.
20 V
10 V 5 V 20 V
DS
Avalanche injection
21
A “Programmable-Threshold” TransistorA “Programmable-Threshold” Transistor
“0”-state “1”-state
DVT
VWL VGS
“ON”
“OFF”
Applying VWL results in either current to flow (“0” state) or not (“1” state)
22
FLOTOX EEPROMFLOTOX EEPROM
Floating gate
Source
Substratep
Gate
Drain
n1 n1
FLOTOX (floating-gate tunnelingoxide) transistor
Fowler-Nordheim I-V characteristic
20–30 nm
10 nm
-10 V
10 V
I
VGD
23
EEPROM CellEEPROM Cell
WL
BL
VDD
2 transistor cell
EEPROM cell as configured during readoperation. When programmed, the threshold voltage of the FLOTOX ishigher than VDD, disabling it. If not, it actsas a closed switch.
24
Flash EEPROMFlash EEPROM
Control gate
erasure
p-substrate
Floating gate
Thin tunneling oxide
n1source n1drainprogramming
Many other options …
25
Basic Operations in a NOR Flash Memory―Basic Operations in a NOR Flash Memory―EraseErase
S D
12 VG
cell arrayBL0 BL1
open open
WL0
WL1
0 V
0 V
Electrons at the floating gate are ejected to the source by tunneling. Gate is at GND, Source is at 12 V
26
Basic Operations in a NOR Flash Memory―WriteBasic Operations in a NOR Flash Memory―Write
S D
12 V
6 VG
BL0 BL1
6 V 0 V
WL0
WL1
12 V
0 V
For the write operation, a high voltage 12v is applied at gate and 6v applied at drain, i.e., BL to write a “1”. Hot electrons are injected into the floating gate, raising the threshold, effectively turning off the device.
Floating gate has no electrons for “0” stste
Floating gate has electrons for ‘1” state
27
Basic Operations in a NOR Flash Memory―Basic Operations in a NOR Flash Memory―ReadRead
5 V
1 VG
S D
BL0 BL1
1 V 0 V
WL0
WL1
5 V
0 V
For a read operation, the selected word line is raised to 5V, turning on the transistor if “0” state stored, the transistor remains off if “1” state stored.
28
6-transistor CMOS SRAM Cell 6-transistor CMOS SRAM Cell WL
BL
VDD
M5M6
M4
M1
M2
M3
BL
Consumes power only when switching - no standby power (other than leakage) is consumed
The major job of the pullups is to replenish loss due to leakageSizing of the transistors is critical
29
CMOS SRAM Analysis (Read)CMOS SRAM Analysis (Read)WL
BL
VDD
M 5
M 6
M 4
M1VDDVDD VDD
BL
Q = 1Q = 0
Cbit Cbit
First precharge both bit lines – BL and !BL – to 1
Then discharge !BL through M5 and M1
CR = (W/L)1 / (W/L)5
M5 with bigger L isdesired
30
SRAM Cell Analysis (Read)SRAM Cell Analysis (Read)
!BL=1 BL=1
WL=1
M1
M4
M5M6
Q=1!Q=0
CbitCbit
Read-disturb (read-upset): must carefully limit the allowed voltage rise on !Q to a value that prevents the read-upset condition from occurring while simultaneously maintaining acceptable circuit speed and area constraints
the resistance of M5 must be larger than M1 (M5 with bigger L, so that M1 is faster than M5)
31
CMOS SRAM Analysis (Read)CMOS SRAM Analysis (Read)
0
0
0.2
0.4
0.6
0.8
1
1.2
0.5 1 1.2 1.5 2Cell Ratio (CR)
2.5 3
Vo
ltage
Ris
e (
V)
The voltage does not rise above the threshold for CR > 1.2
Vdd = 2.5VVTn = 0.5V
32
CMOS SRAM Analysis (Write) CMOS SRAM Analysis (Write)
BL = 1 BL = 0
Q = 0
Q = 1
M1
M4
M5
M6
VDD
VDD
WL
1 is stored, trying to write a 0
In order to write the cell, the pass gate M6 must be more conductive than the M4 to allow node Q to be pulled to a value low enough for the inverter pair (M2/M1) to begin amplifying the new data.
Pullup Ratio (PR) = (WM4/LM4)/(WM6/LM6)
33
CMOS SRAM Analysis (Write)CMOS SRAM Analysis (Write)
Pullup Ratio (PR) = (WM4/LM4)/(WM6/LM6)
Vdd = 2.5V|VTp| = 0.5V
p/n = 0.5
34
Cell SizingCell Sizing
Keeping cell size minimized is critical for large caches Minimum sized pull down fets (M1 and M3)
Requires minimum width and longer than minimum channel length pass transistors (M5 and M6) to ensure proper CR
But sizing of the pass transistors increases capacitive load on the word lines and limits the current discharged on the bit lines both of which can adversely affect the speed of the read cycle
Minimum width and length pass transistors Boost the width of the pull downs (M1 and M3) Reduces the loading on the word lines and increases the
storage capacitance in the cell – both are good! – but cell size may be slightly larger
35
6T-SRAM — Layout 6T-SRAM — Layout
VDD
GND
WL
BLBL
M1 M3
M4M2
M5 M6
36
Resistance-load SRAM CellResistance-load SRAM Cell
Static power dissipation -- Want RL largeBit lines precharged to VDD
M3
RL RL
VDD
WL
Q Q
M1 M2
M4
BL BL
also known as the 4-transistor SRAM cell, reducing
the SRAM size by one-third.
37
SRAM CharacteristicsSRAM Characteristics
38
Multiple Read/Write Port CellMultiple Read/Write Port CellWL2
BL2!BL2
M7 M8
!BL1 BL1
WL1
M1
M2
M3
M4
M5 M6Q!Q
For the case of reads, with more than one pass gate open, the voltage rise in the cell will be larger and thus the size of the pulldown will have to be increased to maintain an acceptably low level (by a factor equal to the number of simultaneous open read ports).
39
3-Transistor DRAM Cell3-Transistor DRAM Cell
No constraints on device ratios
Reads are non-destructive
Value stored at node X when writing a “1” = VWWL-VTn
WWL
BL1
M1 X
M3
M2
CS
BL2
RWL
VDD
VDD - VT
VVDD- VTBL2
BL1
X
RWL
WWL
Core of first popular MOS memories (e.g., first 1Kbit memory from Intel). Cs is data storage (internal capacitance of wiring, M2 gate, and M1 diffusion capacitances
Write: uses WWL and BL1
40
Read: uses RWL and BL2. Assume BL2 precharged to Vdd (or Vdd-Vt). If cell is holding 1, then BL2 goes low – so reads are inverting.
WWL
BL1
M1 X
M3
M2
CS
BL2
RWL
VDD
VDD -VT
VVDD - VTBL 2
BL 1
X
RWL
WWL
Write: uses WWL and BL1
Refresh: read stored data, put its inverse on BL1 and assert WWL (need to do this every 1 to 4 msec)
41
3T-DRAM — Layout3T-DRAM — Layout
BL2 BL1 GND
RWL
WWL
M3
M2
M1
42
1-Transistor DRAM Cell1-Transistor DRAM Cell
M1 X
BL
WL
X Vdd-Vt
WLwrite“1”
BL Vdd
Write: Cs is charged (or discharged) by asserting WL and BLRead: Charge redistribution occurs between CBL and Cs
Cs
read“1”
Vdd/2 sensing
Read is destructive, so must refresh after read
CBL
43
1-T DRAM Cell1-T DRAM Cell
Uses Polysilicon-Diffusion Capacitance
Expensive in Area
M1 wordline
Diffusedbit line
Polysilicongate
Polysiliconplate
Capacitor
Cross-section Layout
Metal word line
Poly
SiO2
Field Oxiden+ n+
Inversion layerinduced byplate bias
Poly
44
DRAM Cell ObservationsDRAM Cell Observations1. 1T DRAM requires a sense amplifier for each bit line,
due to charge redistribution read-out.
2. DRAM memory cells are single ended in contrast to SRAM cells.
3. The read-out of the 1T DRAM cell is destructive; read and refresh operations are necessary for correct operation.
4. Unlike 3T cell, 1T cell requires presence of an extra capacitance that must be explicitly included in the design.
5. When writing a “1” into a DRAM cell, a threshold voltage is lost. This charge loss can be circumvented by bootstrapping the word lines to a higher value than VDD
45
Sense Amp OperationSense Amp Operation
DV(1)
V(1)
V(0)
t
VPRE
VBL
Sense amp activatedWord line activated
VPRE : precharged voltage
46
Static CAM Memory CellStatic CAM Memory Cell
CAM
Bit
Word
Bit
••• CAM
Bit Bit
CAM
Word
Wired-NOR Match Line
Match M1
M2
M7M6
M4 M5M8 M9
M3int
SWord
••• CAM
Bit Bit
S
If Bit=VDD and S = 0, then int charges up to through M3, Match discharge to “0”
Match precharged to VDD
47
CAM in Cache MemoryCAM in Cache Memory
CAM
ARRAY
Input Drivers
Tag HitAddress
SRAM
ARRAY
Sense Amps / Input Drivers
DataR/W
48
Periphery Memory Circuit Periphery Memory Circuit
Decoders Sense Amplifiers Input/Output Buffers Control / Timing Circuitry
49
Row DecodersRow DecodersCollection of 2M complex logic gatesOrganized in regular and dense fashion
(N)AND Decoder
NOR Decoder
_ _ _ _ _ _ _ _ _ _
_
50
Hierarchical DecodersHierarchical Decoders
• • •
• • •
A2A2
A 2A3
WL 0
A2A3A2A 3A2A3
A3 A3A 0A0
A0A 1A 0A1A0A1A0A1
A 1 A1
WL 1
Multi-stage implementation improves performance
NAND decoder usingNAND decoder using2-input pre-decoders2-input pre-decoders
51
Dynamic DecodersDynamic Decoders
Precharge devices
VDD
GND
WL3
WL2
WL1
WL0
A0A0
GND
A1A1
WL3
A0A0 A1A1
WL 2
WL 1
WL 0
VDD
VDD
VDD
VDD
2-input NOR decoder 2-input NAND decoder
All address signals are low during precharge
precharge all outputs high, then GND inactive outputs - active “high” output signals
52
4-input pass-transistor based column 4-input pass-transistor based column decoderdecoder
Advantages: speed (tpd does not add to overall memory access time) Only one extra transistor in signal pathDisadvantage: Large transistor count
A0S0
BL 0 BL 1 BL 2 BL 3
A1
S1
S2
S3
D
2-input
NOR
decoder
53
Pass Transistor Based Column DecoderPass Transistor Based Column DecoderBL3 BL2 BL1 BL0
Data
2 in
put
NO
R d
ecod
erA1
A0
S3
S2
S1
S0
Advantage: speed since there is only one extra transistor in the signal path
Disadvantage: large transistor count
!BL3 !BL2 !BL1 !BL0
!Data
54
4-to-1 tree based column decoder4-to-1 tree based column decoder
Number of devices drastically reducedDelay increases quadratically with # of sections; prohibitive for large decoders
buffersprogressive sizingcombination of tree and pass transistor approaches
Solutions:
BL 0 BL 1 BL 2 BL 3
D
A 0
A 0
A1
A 1
55
Decoder for circular shift-registerDecoder for circular shift-register
VDD
VDD
R
WL0
VDD
f
ff
f
VDD
R
WL1
VDD
f
ff
f
VDD
R
WL2
VDD
f
ff
f• • •
R signal resets the pointer to the first position
56
Sense AmplifiersSense Amplifiers
tpC V
Iav----------------=
make V as smallas possible
smalllarge
Idea: Use Sense Amplifer
outputinput
s.a.smalltransition
57
Differential Sense AmplifierDifferential Sense Amplifier
Directly applicable toSRAMs
M4
M1
M5
M3
M2
VDD
bitbit
SE
Outy
58
Differential Sensing ― SRAMDifferential Sensing ― SRAMVDD
VDD
VDD
VDD
BL
EQ
Diff.SenseAmp
(a) SRAM sensing scheme (b) two stage differential amplifier
SRAM cell i
WL i
2xx
VDD
Output
BL
PC
M3
M1
M5
M2
M4
x
SE
SE
SE
Output
SE
x2x 2x
Precharge Bit lines to Vdd by pulling pc low
59
Latch-Based Sense Amplifier (DRAM)Latch-Based Sense Amplifier (DRAM)
Initialized in its meta-stable point with EQOnce adequate voltage gap created, sense amp enabled with SEPositive feedback quickly forces output to a stable operating point.
EQ
VDD
BL BL
SE
SE
60
Charge-Redistribution AmplifierCharge-Redistribution Amplifier
0.5
1.0
1.5
2.0
2.5
0.0
0.0 1.00 2.00time (nsec)
V
Vin
Vref5 3V
VL
VS
(b) Transient response
3.00
Concept
M2 M3
M1VL VS
Vref
CsmallClarge
Transient Response
61
Charge-Redistribution Amplifier―Charge-Redistribution Amplifier―EPROMEPROM
SE
VDD
WLC
Load
Cascodedevice
Columndecoder
EPROMarray
BL
WL
Vcasc
Out
Cout
Ccol
CBLM1
M2
M3
M4
62
Single-to-Differential ConversionSingle-to-Differential Conversion
How to make a good Vref?
Diff.S.A.Cell
2xx
Output
WL
Vref
BL
12
63
Open bitline architecture with Open bitline architecture with dummy cellsdummy cells
CS CS CS CS
BLL
L L1 L0 R0
CS
R1
CS
L
… …
BLR
VDD
SE
SE
EQ
Dummy cell Dummy cell
64
DRAM Read Process with Dummy CellDRAM Read Process with Dummy Cell3
2
1
00 1 2 3
BL
BL
t (ns)
reading 03
2
1
00 1 2 3
SE
EQ WL
t (ns)
control signals
3
2
1
00 1 2 3
BL
BL
t (ns)
reading 1
65
Voltage RegulatorVoltage Regulator
-
+
VDD
VREF
Vbias
Mdrive
Mdrive
VDL
VDL
VREF
Equivalent Model
66
Charge PumpCharge Pump
CLK
VDD
A BM1
M2Vload
Cload
Cpump
2VDD 2 VT
VDD 2 VT
0 V
VB
Vload
0 V
67
DRAM TimingDRAM Timing
68
RDRAM ArchitectureRDRAM Architecture
memoryarray
Databus
Clocks
Column
Rowdemux packet dec.
packet dec.
Bus
k k3 l
demux
69
Address Transition DetectionAddress Transition Detection
DELAYtdA0
DELAYtdA1
DELAYtdAN2 1
VDD
ATD ATD
…
70
Reliability and YieldReliability and Yield
71
Sensing Parameters in DRAMSensing Parameters in DRAM
From [Itoh01]
4K
10
100
1000
64K 1M 16M256M 4G 64GMemory Capacity (bits/chip)
CD(1F)
CS(1F)
QS(1C)
Vsmax(mv)
VDD(V)
QS 5 CS VDD/2Vsmax 5 QS/(CS 1 CD)
72
Noise Sources in 1T DRamNoise Sources in 1T DRam
Ccross
electrode
a-particles
leakage CS
WL
BL substrate Adjacent BL
CWBL
73
Open Bit-line Architecture —Cross CouplingOpen Bit-line Architecture —Cross Coupling
SenseAmplifierC
WL1
BL
CBL
CWBL CWBL
CC
WL0
CCBL
C C
WLD WLD WL0 WL1
BL
EQ
74
Folded-Bitline ArchitectureFolded-Bitline Architecture
SenseAmplifier
C
WL1
CWBL
CWBL
C
WL0 WL0 WLD
CC
WL1
CC
WLD
BL CBL
BL CBL
EQ
x
x
y
75
Transposed-Bitline ArchitectureTransposed-Bitline Architecture
SA
Ccross
(a) Straightforward bit-line routing
(b) Transposed bit-line architecture
BL9
BL
BL
BL99
SA
Ccross
BL9
BL
BL
BL99
76
Alpha-particles (or Neutrons)Alpha-particles (or Neutrons)
1 Particle ~ 1 Million Carriers
WL
BL
VDD
n1
a-particle
SiO21
111
11
22
22
22
77
YieldYield
Yield curves at different stages of process maturity(from [Veendrick92])
78
RedundancyRedundancy
MemoryArray
Column Decoder
Redundantrows
Redundantcolumns
RowAddress
ColumnAddress
FuseBank:
79
Error-Correcting CodesError-Correcting Codes
Example: Hamming Codes
with
e.g. B3 Wrong
1
1
0
= 3
80
Redundancy and Error CorrectionRedundancy and Error Correction
81
Sources of Power Dissipation in Sources of Power Dissipation in MemoriesMemories
PERIPHERY
ROWDEC
selected
non-selected
CHIP
COLUMN DEC
nCDEV INTf
mCDEV INTf
CPTV INTf
IDCP
ARRAY
m
n
m(n21)ihld
miact
VDD
VSS
IDD 5 SCiDV if1S IDCP
From [Itoh00]
82
Data Retention in SRAMData Retention in SRAM1.30u
1.10u
900n
700n
500n
300n
100n
0.00 .600 1.20 1.80
Factor 7
0.13 m CMOSm
0.18 m CMOSm
VDD
I lea
kag
e
SRAM leakage increases with technology scaling
83
Suppressing Leakage in SRAMSuppressing Leakage in SRAM
SRAMcell
SRAMcell
SRAMcell
VDD,int
VDD
VDD VDDL
VSS,int
sleep
sleep
SRAMcell
SRAMcell
SRAMcell
VDD,int
sleep
low-threshold transistor
Reducing the supply voltageReducing the supply voltageInserting Extra ResistanceInserting Extra Resistance
84
Data Retention in DRAMData Retention in DRAM101
100
102 1
102 2
102 3
102 4
102 5
102 6
15M 64M 255M 1G 4G 15G 64G
Capacity (bit)
Curr
ent
(A)
3.3 2.5 2.0 1.5 1.2 1.0 0.8
Operating voltage (V)
0.53 0.40 0.32 0.24 0.19 0.16 0.13
Extrapolated threshold voltage at 25 C (V)
IACT
IAC
IDC
Cycle time : 150 nsT 5 75 C,S
From [Itoh00]
85
Case StudiesCase Studies
Programmable Logic Array SRAM Flash Memory
86
PLA versus ROMPLA versus ROM Programmable Logic Array
structured approach to random logic“two level logic implementation”
NOR-NOR (product of sums)NAND-NAND (sum of products)
IDENTICAL TO ROM!
Main differenceROM: fully populatedPLA: one element per minterm
Note: Importance of PLA’s has drastically reduced1. slow2. better software techniques (mutli-level logic
synthesis)But …
87
Programmable Logic ArrayProgrammable Logic Array
GND GND GND GND
GND
GND
GND
VDD
VDD
X0X0 X1 f0 f1X1 X2X2
AND-plane OR-plane
Pseudo-NMOS PLA
88
Dynamic PLADynamic PLA
GND
GNDVDD
VDD
X0X0 X1 f0 f1X1 X2X2
ANDf
ANDf
ORf
ORf
AND-plane OR-plane
89
Clock Signal Generation Clock Signal Generation for self-timed dynamic PLAfor self-timed dynamic PLA
f
tpre teval
f AND
f
f AND
f AND
fOR
fOR
(a) Clock signals (b) Timing generation circuitry
Dummy AND row
Dummy AND row
90
PLA LayoutPLA LayoutVDD GNDAnd-Plane Or-Plane
f0 f1x0 x0 x1 x1 x2 x2
Pull-up devices Pull-up devices
91
4 Mbit SRAM4 Mbit SRAMHierarchical Word-line ArchitectureHierarchical Word-line Architecture
Global word line
Sub-global word line
Block groupselect
Blockselect
Blockselect
Memory cell
Localword line
Block 0
•••
Localword line
Block 1
•••
Block 2...
•••
92
Bit-line CircuitryBit-line Circuitry
Bit-lineload
Blockselect ATD
BEQ
Local WL
Memory cell
I/O lineI/O
B/T
CD
Sense amplifier
CD CD
I/O
B/T
93
Sense Amplifier (and Waveforms)Sense Amplifier (and Waveforms)
BS
I /O I /O
DATA
Blockselect ATD
BSSA SA
BS
SEQ
SEQ
SEQ
SEQSEQ
Dei
I/O Lines
Address
Data-cut
ATD
BEQ
SEQ
DATA
Vdd
GND
SA, SA
Vdd
GND
94
1 Gbit Flash Memory1 Gbit Flash Memory
Sense Latches(10241 32)3 8
Data Caches(10241 32)3 8
Sense Latches(10241 32)3 8
Data Caches(10241 32)3 8
Wo
rd L
ine
Dri
ver
Wo
rd L
ine
Dri
ver
Wo
rd L
ine
Dri
ver
Wo
rd L
ine
Dri
ver
512Mb Memory Array 512Mb Memory Array
BL0 BL1 ····· BL16895 BL16996 BL16897··· BL33791
SGDWL31
WL0SGS
Block0
BLT0
Block1023
Block0
Block1023
Bit Line Control CircuitBLT1
I/O
From [Nakamura02]
95
Writing Flash MemoryWriting Flash MemoryN
um
be
r of
me
mo
ry c
ell
s
0V 1V 2V
Vt of memory cells
Verify level5 0.8 V Word-line level5 4.5 V
(a)
3V 4V
Result of 4 timesprogram
100
0V 1V 2V
Vt of memory cells
3V 4V
102
104
106
108
Evolution of thresholds Final Distribution
From [Nakamura02]
96
125125mmmm22 1Gbit NAND Flash Memory 1Gbit NAND Flash Memory
10.7
mm
11.7mm
2kB
Pa
ge b
uffe
r &
ca
che
Ch
arg
e p
ump
16896 bit lines
32 word lines x 1024 blocks
From [Nakamura02]
97
125125mmmm22 1Gbit NAND Flash Memory 1Gbit NAND Flash Memory
Technology 0.13m p-sub CMOS triple-well 1poly, 1polycide, 1W, 2Al Cell size 0.077m2 Chip size 125.2mm2 Organization 2112 x 8b x 64 page x 1k block Power supply 2.7V-3.6V Cycle time 50ns Read time 25s Program time 200s / page Erase time 2ms / block
Technology 0.13m p-sub CMOS triple-well 1poly, 1polycide, 1W, 2Al Cell size 0.077m2 Chip size 125.2mm2 Organization 2112 x 8b x 64 page x 1k block Power supply 2.7V-3.6V Cycle time 50ns Read time 25s Program time 200s / page Erase time 2ms / block
From [Nakamura02]
98
Semiconductor Memory TrendsSemiconductor Memory Trends(up to the 90’s)(up to the 90’s)
Memory Size as a function of time: x 4 every three years
99
Semiconductor Memory TrendsSemiconductor Memory Trends(updated)(updated)
From [Itoh01]
100
Trends in Memory Cell AreaTrends in Memory Cell Area
From [Itoh01]
101
Semiconductor Memory TrendsSemiconductor Memory Trends
Technology feature size for different SRAM generations