Radiation Effects Challenges in 90nm Commercial-Density SRAMs:
A Comprehensive SEE and TID Study
Jeff Draper, Y. Boulghassoul, M. Bajura, R. Naseer, J. Sondeen and S. Stansberry
University of Southern California Information Sciences Institute
This work was supported by the Defense Advanced Research Projects Agency (DARPA) Microsystems Technology Office under award No. N66001-04-1-8914
Any opinions, findings, and conclusions or recommendations expressed in this presentation are those of the authors and do not necessarily reflect the views of DARPA/MTO or the U.S. Government
1st Workshop on Fault-Tolerant Spaceborne Computing Employing New Technologies
CSRI, Sandia National Labs, Albuquerque, NM
May 28-30, 2008
Motivations
• RHBD approach shown to be effective for 90nm designs, within acceptable “1 process generation” penalty
• Use of RHBD for SRAMs poses bigger challenges
  – SRAM density achieved through aggressive design rule waivers
  – Cell-level radiation hardening using typical RHBD techniques compounds area/speed/power penalties
• Traditional circuit-based RHBD approach
  – Hardens control structures and individual memory cells
  – SRAM BER largely determined by the raw BER of the memory cell
• Objective: Investigate best rad-hard SRAM performance achievable through hybrid hardening approach
  – Harden control structures but leave commercial SRAM cell density and technological scaling of individual memory cells intact
  – Mitigate SEUs (SBU/MBU) with device-centric ECCs
  – Leverage intrinsic process hardness for improved reliability
Outline
• SRAM test chips overview
• SEU response
  – Heavy ions
  – Protons
• Latchup response
• TID and temperature annealing
  – 24C, 100C and 150C
• Summary and conclusions
Overview SRAM Test Chips
IBM PROCESS 9LP
Design attribute     Baseline (commercial “as is”)   Hardened (modified for space operations)
Size (total bits)    65,536                          90,112
Voltage              Core: 1.2 V; I/O: 2.5 V         Core: 1.2 V; I/O: 2.5 V
I/O pads             Radiation-hardened              Radiation-hardened
Design libraries     Phase-1 (RHBD)                  Phase-1 (RHBD)
Error correction     None                            1 bit (Hamming; 22, 16, 4)
Area overhead        1X                              1.37X

IBM PROCESS 9SF
Design attribute     Baseline (commercial “as is”)   Hardened (modified for space operations)
Size (total bits)    57,344                          122,880
Voltage              Core: 1.0 V; I/O: 2.5 V         Core: 1.0 V; I/O: 2.5 V
I/O pads             Commercial (Artisan)            Commercial (Artisan)
Design libraries     Commercial (Artisan)            Commercial (Artisan)
Error correction     None                            2 bits (BCH; 15, 7, 5)
Area overhead        1X                              2.14X

• Fabricated 4 SRAMs in 9LP and 9SF processes (1 baseline, 1 hardened in each)
• Key design objectives:
  – Use commercial core memory cells (FP118 and E123)
  – Harden peripheral circuitry using TMR, annular gates, interleaving
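The hardened array sizes and area overheads in the table are consistent with plain codeword arithmetic: every k data bits are stored as one n-bit codeword. The Python sketch below checks this. It assumes, which the table does not state explicitly, that each hardened part carries the same payload as its baseline counterpart; the helper name `stored_bits` is illustrative only.

```python
def stored_bits(data_bits: int, n: int, k: int) -> int:
    """Total array bits needed to hold `data_bits` of payload as (n, k) codewords."""
    if data_bits % k:
        raise ValueError("payload must be a whole number of k-bit data words")
    return data_bits // k * n

# (n, k) pairs taken from the table: Hamming (22, 16) for 9LP, BCH (15, 7) for 9SF.
# Assumption: the hardened payload equals the baseline size.
lp_total = stored_bits(65_536, n=22, k=16)
sf_total = stored_bits(57_344, n=15, k=7)

print(lp_total, round(22 / 16, 3))
print(sf_total, round(15 / 7, 3))
```

Under that assumption the totals come out to 90,112 and 122,880 bits, and the n/k ratios (1.375 and about 2.143) line up with the listed 1.37X and 2.14X area overheads.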
RHBD SRAM Approach/Design
[Figure: diagram of the hardening approach, with three labeled parts.
  Block SET hardening: TMR, with three copies of the decoders, ECC and timing logic (#1, #2, #3) feeding a voter.
  Cell TID & SEL hardening: annular gates (TID) and guard rings (SEL).
  Array SEU hardening (SEC/DED): bit interleaving (MBU mitigation), shown as a repeating 3-2-1-0 column pattern.]
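Two of the techniques named on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the design used on the test chips: the 4-way interleaving factor is read off the repeating 3-2-1-0 column pattern in the figure, the column-to-word mapping is an assumption, and the function names are hypothetical.

```python
def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority vote, the function a TMR voter computes."""
    return (a & b) | (a & c) | (b & c)

def word_of_column(col: int, ways: int = 4) -> int:
    """Logical word owning physical column `col` under `ways`-way interleaving (assumed mapping)."""
    return col % ways

def upsets_per_word(first_col: int, width: int, ways: int = 4) -> dict[int, int]:
    """Count flipped bits per logical word for an MBU spanning `width` adjacent columns."""
    hits: dict[int, int] = {}
    for col in range(first_col, first_col + width):
        word = word_of_column(col, ways)
        hits[word] = hits.get(word, 0) + 1
    return hits

# One corrupted copy out of three is outvoted by the other two.
print(bin(majority(0b1011, 0b1001, 0b1101)))
# A 3-cell-wide upset touches three different words, one bit each.
print(upsets_per_word(first_col=5, width=3))
# An upset wider than the interleaving factor puts two bits in one word.
print(upsets_per_word(first_col=5, width=5))
```

The last call shows why the interleaving factor matters: once a multi-bit upset is wider than the number of interleaved words, a single-bit-correcting code can no longer repair every affected word.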
- Single Event Effects - SEU
SEU Raw Cross Sections: HI Test Results
[Figure: raw SEU cross-section plots for baseline and hardened devices, LP and SF.]
Data collected at LBNL 88” Cyclotron, 10 MeV cocktail, core voltage 10% below nominal, 100 MHz tester. Fluence range 1e7-1e5. More than 256 errors for each point.
Memory patterns = {00, 11, 10}; static/dynamic = {s, d}. LP dynamic access rate 2.2 kHz per bit; SF dynamic access rate 718 Hz.
SEU Cross Section Calculations: Pre-ECC and Scrubbing

Device    Weibull cross-section parameters (per bit)   Raw upsets / bit-day before ECC and scrubbing
config.   a        x0     w     s                      Geo. orbit,   Geo. orbit,   Geo. orbit, solar     Eq. orbit, 3,000 km
                                                       max GCR       min GCR       flares worst week     (rad. belts)
sfb       3.3e-8   0.30   23    1.3                    2.3e-7        7.2e-8        1.4e-4                2.4e-4
sfh       3.3e-8   0.43   28    1.2                    2.0e-7        6.2e-8        8.6e-5                1.1e-4
lpb       1.8e-8   0.73   36    0.91                   1.2e-7        3.8e-8        2.4e-5                2.0e-6
lph       1.8e-8   0.40   34    1.1                    1.1e-7        3.3e-8        3.7e-5                3.8e-5

Calculations using CREME96, 100 mil shielding. AP8 model for equatorial orbit.
• $\mathrm{Weibull}(x) = a \left[ 1 - e^{-\left( (x - x_0)/w \right)^{s}} \right]$
• SF cross-section ~2-3X higher than LP, likely due to lower Vdd
• No cross-section dependence on static vs. dynamic testing
• Minor differences between baseline and hardened suggest little impact of TMR control circuitry
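To make the Weibull fit concrete, the sketch below evaluates the per-bit cross-section for the four device configurations in the table. It assumes the standard four-parameter Weibull form, a * (1 - exp(-((x - x0)/w)^s)) for x above the onset x0 and zero below it, with x taken to be the effective LET; the slide's own formula is partly garbled, so treat the exact form as an assumption. The function and dictionary names are illustrative.

```python
import math

def weibull_cross_section(let: float, a: float, x0: float, w: float, s: float) -> float:
    """Per-bit cross-section at effective LET `let` for Weibull parameters (a, x0, w, s)."""
    if let <= x0:
        return 0.0  # below the onset LET no upsets are predicted
    return a * (1.0 - math.exp(-(((let - x0) / w) ** s)))

# Parameters (a, x0, w, s) copied from the table above.
PARAMS = {
    "sfb": (3.3e-8, 0.30, 23, 1.3),
    "sfh": (3.3e-8, 0.43, 28, 1.2),
    "lpb": (1.8e-8, 0.73, 36, 0.91),
    "lph": (1.8e-8, 0.40, 34, 1.1),
}

for name, params in PARAMS.items():
    # Evaluate at a few LET values, including the 31 and 117 quoted later in the talk.
    values = [weibull_cross_section(let, *params) for let in (1, 10, 31, 117)]
    print(name, ["%.2e" % v for v in values])
```

In this form `a` is the saturation value the curve approaches at high LET, `x0` is the onset, and `w` and `s` set how quickly the curve rises between the two.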
SEU Model: ECC and Scrubbing *
1. P(error) depends on the ratio of the device’s memory array scrub rate to its raw BER, and on the ECC applied.
2. The overall reduction in error rate is relative to the starting physical raw BER. Example: with scrub rate = 100 and physical BER = 1e-6, single-bit ECC improves the BER by a factor of 1e-4, giving a new effective BER of 1e-10.
3. Goal: assuming a scrub rate of once per 10 seconds and a target BER of 1e-10, the device can tolerate up to 1e-5 errors/bit-day with single-bit ECC and 1e-2 errors/bit-day with double-bit ECC.
[Figures: P(error) per scrub vs. ECC and scrub rate; BER reduction vs. ECC and scrub rate; constant 1E-10 BER curves vs. ECC and scrub rate.]
* Figures assuming 22-bit code from “Models and Algorithmic Limits for an ECC-Based Approach to Hardening Sub-100nm SRAMs”, IEEE TNS Vol. 54, pp. 935-945, Aug. 2007.
Comparison of ECC Model with SEU Experimental Data

                                                   LPH STATIC   SFH STATIC   LPH DYNAMIC      SFH DYNAMIC
Scrub rate / bit                                   1 / run      1 / run      2.22 kHz         0.718 kHz
Raw BER                                            (a)          (a)          (b)              (b)
Effective BER                                      Equation A   Equation B   Equation A       Equation B
Total SBUs (all SEU runs)                          13,004       12,117       9,901*           9,372
Total bit errors observed (not corrected by ECC)   7,305        195          No errors        No errors
Total bit errors predicted by the model for a
  given scrub rate, raw BER and ECC                7,633        187          6.7 (< 1 word)   0.0002
(a) Errors / run / memory size. (b) Errors / run / time (secs) / memory size.

Approximate effective BER equations for a single-bit-correcting 22-bit word and a double-bit-correcting 15-bit word:
Equation A (1-bit ECC) ~ $\dfrac{\text{Raw BER}}{1 + \dfrac{\text{Scrub Rate}/\text{Raw BER}}{300}}$
Equation B (2-bit ECC) ~ $\dfrac{\text{Raw BER}}{1 + \left( \dfrac{\text{Scrub Rate}/\text{Raw BER}}{15} \right)^{2}}$
• Distribution of observed errors from measurements matches the ECC model very well
A proper combination of error correction code (ECC) and modest scrubbing rate ensures a BER better than 1e-10 errors/bit-day in all orbital scenarios
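The two approximate equations on this slide can be tried numerically. The sketch below reads Equation A as raw / (1 + (S/R)/300) and Equation B as raw / (1 + ((S/R)/15)^2), where S is the scrub rate and R the raw BER. The placement of the exponent in Equation B is an interpretation of the slide layout, and expressing both S and R per bit-day is an assumption; the function names are illustrative.

```python
def effective_ber_single(raw_ber: float, scrub_rate: float) -> float:
    """Equation A as read here: single-bit-correcting 22-bit word."""
    ratio = scrub_rate / raw_ber
    return raw_ber / (1.0 + ratio / 300.0)

def effective_ber_double(raw_ber: float, scrub_rate: float) -> float:
    """Equation B as read here: double-bit-correcting 15-bit word."""
    ratio = scrub_rate / raw_ber
    return raw_ber / (1.0 + (ratio / 15.0) ** 2)

# Assumption: one scrub every 10 seconds, expressed in scrubs per day
# so that it shares a time base with raw BER in errors/bit-day.
SCRUBS_PER_DAY = 86_400 / 10

# Tolerable raw BERs quoted in the model's stated goal.
print(effective_ber_single(1e-5, SCRUBS_PER_DAY))
print(effective_ber_double(1e-2, SCRUBS_PER_DAY))
```

With these readings, both cases land below the 1e-10 errors/bit-day target, which is at least consistent with the tolerable raw BERs quoted for single-bit and double-bit ECC.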
SBU and MBU analysis vs. Effective LET
[Figure: single- and multi-bit upset distributions vs. effective LET for LP and SF SRAMs; one panel for 9LP, one for 9SF.]
• Large differences in the SBU/MBU distribution between LP and SF SRAMs for similar LET values
  – Particularly noticeable for LET > ~10 MeV-cm2/mg
• Saturating cross-sections have LET-dependent error distributions
  – LETs of 31 and 117 have comparable cross-sections but different distributions
SEU Proton Testing
SRAM type       Bit configurations    Average cross-section per bit (cm2)
9LP Baseline    All 0, All 1, 10      2.99E-14
9LP Hardened    All 0, All 1, 10      2.84E-14
9SF Baseline    All 0, All 1          5.92E-14
9SF Hardened    All 0, All 1          7.24E-14
Saturating cross-section of 9SF and 9LP SRAMs for a 200 MeV proton exposure.
• IBM 90nm commercial density SRAM cells have a very low upset threshold
  – From 3D TCAD simulations, worst-case Qcrit ~1.1 fC
• With an SRAM cell threshold LET < 0.5 MeV.cm2/mg, protons could potentially become capable of inducing SEUs through direct ionization
  – Possible drastic increase in raw memory cell BER
  – Could flood 1-bit and possibly even 2-bit ECC schemes
• SRAM saturating cross-section still well behaved for worst-case 200 MeV proton exposure (~1e-14 cm2/bit)
  – Proton upsets come from nuclear interactions; no direct ionization yet @ 90nm
Data collected at Indiana University cyclotron facility, 200 MeV line. Proton flux ~1e10 particles/cm2.s. Max TID for each tested part ~20 krad.
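A per-bit cross-section relates upset counts to particle fluence through the usual definition, cross-section = errors / (fluence x bits). The sketch below applies it to the 9LP baseline figure from the table. The slide does not describe how its averages were computed, so this is only the textbook relation; the one-second exposure and the function names are illustrative assumptions.

```python
import math

def cross_section_per_bit(n_errors: float, fluence: float, n_bits: int) -> float:
    """Per-bit cross-section (cm2/bit) from an error count and a fluence (particles/cm2)."""
    return n_errors / (fluence * n_bits)

def expected_upsets(sigma_per_bit: float, fluence: float, n_bits: int) -> float:
    """Expected number of upsets for a given per-bit cross-section and fluence."""
    return sigma_per_bit * fluence * n_bits

SIGMA_9LP_BASELINE = 2.99e-14   # cm2/bit, from the table
N_BITS_9LP_BASELINE = 65_536    # baseline 9LP array size
FLUX = 1e10                     # protons/cm2/s, approximate value quoted on the slide

# Assumption: a one-second exposure at the quoted flux.
upsets_per_second = expected_upsets(SIGMA_9LP_BASELINE, FLUX * 1.0, N_BITS_9LP_BASELINE)
print(round(upsets_per_second, 1))
```

At the quoted flux this works out to roughly 20 raw upsets per second across the baseline 9LP array, which gives a feel for how quickly statistics accumulate during a proton run.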
- Single Event Effects - Latchup
Latchup in 90nm SRAMs
Experimental condition           Parameter                    LP (baseline, hardened)    SF (baseline, hardened)
High V, high heat (worst case)   Vcore = 110%                 1.32 V                     1.1 V
                                 Temp = high                  125 C                      125 C
                                 LET threshold (MeV-cm2/mg)   Between 0.87 and 2.22      > 117
High V, no heat                  Vcore = 110%                 1.32 V                     Not applicable
                                 Temp = room                  ~24 C
                                 LET threshold (MeV-cm2/mg)   > 117
Latchup onset and release        Vcore                        Between 1.10 and 1.14 V
voltages                         Temp = high                  125 C
                                 LET (MeV-cm2/mg)             > 117

• SF appears to be SEL immune
• LP appears to be SEL immune if, and only if:
  – At room temperature over voltages up to 110% Vcore, OR
  – At lowered voltage over temperatures up to 125 C
• All SELs observed in LP were non-destructive
• LP latchup appeared as a single step-function of ~50 mA
Data collected at LBNL 88” Cyclotron, 10 MeV cocktail. High T applied through RTD strapped to PGA package and PID control.
- Total Ionizing Dose -
9LP/9SF TID and Room-Temperature Annealing Responses
All devices irradiated @ 200 rads/sec, max temp < 30 C, removed ~15 minutes for measurement.
LP irradiated under 10 pattern and measured under 01; SF irradiated under 00 pattern and measured under 11.
• TID-induced core leakage currents of baseline and hardened SRAMs were identical for a given process
  – TID response of SRAM core is dominated by memory array leakage
[Figure: 9LP and 9SF SRAM core leakage current dynamics as a function of TID and 24C anneal. Two panels of leakage current per bit (A), 1E-10 to 1E-06: one vs. cumulated dose (0 to 2000 krads), one vs. annealing time @ 24C (1 to 1000 days). Series: 9LP core @ 1.2V, 9SF core @ 1V.]
9LP SRAM:
  – ~1000X increase in core leakage current @ 2Mrad
  – 4/4 devices functional failure between 1000 and 1300 krads
  – 4/4 devices fully functional after 7 days annealing
  – Leakage ~30X after 140 days
9SF SRAM:
  – ~50X increase in core leakage current @ 2Mrad
  – Pre-rad leakage 20X that of LP, but same level as LP @ 2Mrad
  – 4/4 devices functional failure between 600 and 1000 krads
  – 2/4 devices fully functional after 7 days annealing
  – Leakage ~8X after 140 days
9LP/9SF TID and Room-Temperature Annealing Responses (cont.)
All devices irradiated @ 200 rads/sec, max temp < 30 C, removed ~15 minutes for measurement.
LP irradiated under 10 pattern and measured under 01; SF irradiated under 00 pattern and measured under 11.
• TID-induced IO leakage currents showed drastic differences between hardened and unhardened IO pads
  – Hardened pads should be used whenever possible
  – Major impact on reliability at negligible performance penalties
[Figure: 9LP and 9SF SRAM IO leakage current dynamics as a function of TID and 24C anneal. Two panels of leakage current (A), 1E-06 to 1E-01: one vs. cumulated dose (0 to 2000 krads), one vs. annealing time @ 24C (1 to 1000 days). Series: 9LP IO @ 2.5V, 9SF IO @ 2.5V.]
9LP SRAM:
  – Hardened IO pads (MRC design)
  – ~1 µA IO leakage up to 2Mrad
9SF SRAM:
  – Unhardened IO pads (Artisan cells)
  – ~1e4 X increase in IO leakage @ 2Mrad
  – Leakage ~2000X after 140 days
9LP/9SF TID Responses for 100C and 150C anneals
[Figure: 9LP and 9SF SRAM core leakage current variation as a function of annealing temperature. Two panels of leakage current per bit (A), 1E-10 to 1E-06, vs. annealing time (hours): 9LP core @ 100C and @ 150C (0 to 150 hours); 9SF core @ 100C and @ 150C (0 to 100 hours).]
• All 9LP and 9SF SRAMs respond very well to temperature annealing
  – For 100C, core leakage currents are within 3X of pre-rad in < 100 hours
  – For 150C, pre-rad core leakage levels are reached within 5 hours
• The unhardened IO did not respond well
  – 100C anneal is ineffective: still ~1000X above pre-rad after 200 hours
  – 150C anneal is slightly better, with 10X above pre-rad after 60 hours
TID Radiation Hysteresis
[Figure: core leakage current per bit (A), 1E-10 to 1E-06, vs. cumulated dose (0 to 2000 krads). One panel for 9LP (core @ 1.2V) and one for 9SF (core @ 1V); each shows the first exposure, the core after 58h @ 180C, and the 2nd TID exposure.]
• Successive TID exposure and annealing cycles induced a shift in the SRAM leakage characteristics:
  – Lateral shift: the SRAMs start degrading sooner than in their first irradiation
  – Vertical shift: the current “saturation” level is lowered
  – But the true mystery improvement to the SRAM reliability is…
All SRAMs exposed a second time NEVER exhibited functional failure up to 2Mrad
Summary and Conclusions
• Single Event Upsets (SEU) and Bit-Error-Rate (BER)
  – A combination of proper ECC strength, bit interleaving and a modest scrubbing rate ensures an SRAM BER better than 1e-10 errors/bit-day in all space environments investigated.
• Single Event Latch-up (SEL)
  – 9SF commercial memory cells are latch-up immune.
  – 9LP commercial memory cells are latch-up free only under high temperature or high voltage, but not both.
  – Voltage scaling is likely to mitigate SEL concerns for core voltages < 1.1V.
• Total Ionizing Dose (TID)
  – 90nm SRAMs proved to be intrinsically resilient up to 300krad, but a substantial static leakage increase (10X) happens past 500krad.
  – TID-hardened IO pads should be used whenever possible.
  – 9LP/9SF SRAMs are very responsive to temperature treatments:
      – All SRAMs regained pre-rad nominal currents within 5 hours of 150°C annealing after 2Mrad TID exposure.
      – All ICs recovered from catastrophic loss of functionality.
  – Successive exposure and annealing cycles induced hysteresis in the SRAM leakage characteristics:
      – The current degradation starts earlier.
      – However, the maximum leakage at 2Mrad is lower than for the first irradiation.
  – All ICs that underwent complete thermal anneal NEVER exhibited functional failure up to 2Mrad when re-exposed.