![Page 1: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/1.jpg)
Post-Manufacturing ECC Customization Based on Orthogonal Latin Square
Codes and Its Application to Ultra-Low Power Caches
Rudrajit Datta and Nur A. Touba
Computer Engineering Research Center
Dept. of Electrical and Computer Engineering
University of Texas at Austin
![Page 2: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/2.jpg)
Motivation
For memories with high defect rates
• Reduce check-bit overhead
• Increase reliability
Applicable to low-voltage caches
![Page 3: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/3.jpg)
Agenda
• Introduction
• Proposed Approach
• Application
• Related Work
• Orthogonal Latin Square (OLS) Codes
• Customization
• Results
• Conclusion
![Page 4: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/4.jpg)
Introduction
Tolerate high defect rates for memories
• Occur in memories operating at ultra-low voltages
• Expected in future nanoscale technologies
– E.g., nanoscale crossbar architectures
Conventional method
• ECC selected based on
– Expected maximum number of defects per word
![Page 5: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/5.jpg)
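Introduction
[Figure: conventional ECC data path. Data enters a Check Bit Generator; the Memory stores the information bits together with c_full check bits; the Decoder reads both and outputs the Corrected Data.]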
![Page 6: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/6.jpg)
Observations
A priori information is available on the location of defects
• Through post-manufacturing memory tests
– Obtain a defect map
• Use information to customize code
– Reduce check-bit storage in memory/caches
![Page 7: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/7.jpg)
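Proposed Approach
[Figure: proposed ECC data path. Data enters a Check Bit Generator that produces c_full check bits; a Switch Network, driven by the Config. Bits, reduces them to c_used check bits; the Memory stores the information bits together with the c_used check bits; a second Switch Network maps the c_used check bits back onto the c_full inputs of the Decoder, which outputs the Corrected Data.]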
![Page 8: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/8.jpg)
Proposed Approach
Customize code by disabling rows of the H-matrix
• Possible if modular code used for ECC
• Current work looks at OLS codes
Configuration Bits: [1 0 1 0]
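A minimal sketch of how configuration bits could select which check bits are stored, using the [1 0 1 0] pattern from the slide. The polarity (1 = row kept, 0 = row disabled), the function name `select_check_bits`, and the sample check-bit values are assumptions for illustration; the slides do not specify them.

```python
def select_check_bits(c_full, config_bits):
    """Keep only the check bits whose H-matrix row is enabled.

    Assumption: a configuration bit of 1 means the row is kept.
    """
    return [c for c, keep in zip(c_full, config_bits) if keep]


c_full = [1, 1, 0, 1]        # hypothetical check bits from the full code
config_bits = [1, 0, 1, 0]   # pattern shown on the slide
c_used = select_check_bits(c_full, config_bits)
print(c_used)                # only len(c_used) check bits need storage
```

Under this reading, the number of stored check bits equals the number of 1s in the configuration word.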
![Page 9: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/9.jpg)
Application - Low-voltage Caches
Microprocessor voltage lowered while idle
• Reduces power
Caches and memories susceptible at lower voltages
• Unreliable below Vccmin
Enable reliable cache operation at lower voltages
• At lower voltages, use part of the cache to store extra check bits
![Page 10: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/10.jpg)
Related Work
Word-disable and Bit-fix [Wilkerson 08]
• Defect map
– Identify vulnerable bits
• Mitigates only persistent errors
• Uses up half of the cache to store extra check bits
Two-dimensional ECC [Kim 07]
• Slow
• Complicated decoding
Multi-bit segmented ECC [Chishti 09]
• Orthogonal Latin Square (OLS) code
– Single step decodable
• High redundancy
![Page 11: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/11.jpg)
Key Takeaways
Have full ECC on chip
• Can handle all defect maps
Generate defect map
• Disable part of the original code
• Reduces check-bit redundancy
• Retain capability of original code w.r.t. the defect map
![Page 12: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/12.jpg)
One Step Majority Decoding
t-error correctable: each information bit is available as 2t+1 independent copies
• One copy: the bit itself
• Rest: 2t independent parity equations
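[Figure: one-step majority decoder for bit di. Each check bit (cp, cq, cs) is XORed with the other data bits of its parity equation (dp, dq, ds); these estimates and di itself feed a Majority Voter that outputs the corrected di.]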
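The voting step described on this slide can be sketched as follows for t = 1 (three votes). The helper names `parity_estimate` and `majority_decode`, and the two small parity equations, are hypothetical and chosen only to illustrate the idea.

```python
from functools import reduce
from operator import xor


def parity_estimate(check_bit, other_data_bits):
    """Rebuild d_i from one parity equation: check bit XOR all other data bits in it."""
    return reduce(xor, other_data_bits, check_bit)


def majority_decode(d_i, estimates):
    """Vote over the stored bit and its 2t parity estimates (2t+1 votes in total)."""
    votes = [d_i] + list(estimates)
    return 1 if 2 * sum(votes) > len(votes) else 0


# t = 1: d_i was written as 1 but is read back as 0.
# Two hypothetical parity equations, each covering d_i and two other data bits.
est_p = parity_estimate(check_bit=0, other_data_bits=[1, 0])  # written with 1^1^0 = 0
est_q = parity_estimate(check_bit=1, other_data_bits=[0, 0])  # written with 1^0^0 = 1
corrected = majority_decode(0, [est_p, est_q])
print(corrected)
```

Because the parity equations are independent, a single faulty bit can corrupt at most one vote, so the other two still agree on the written value.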
![Page 13: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/13.jpg)
Orthogonal Latin Square Codes
Latin Square
• m x m array
• Each row and each column is a permutation of the digits 0, 1, ..., m-1
Orthogonal Latin Squares
• Ordered pair of elements (r, c, s) appears only once
m^2 data bits, 2tm check bits, t-error correctable [Hsiao 70]
Single step decodable
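As a rough illustration of the 2tm check-bit count, the sketch below builds the parity rows of an OLS code for the special case of prime m, where the squares L_k[r][c] = (k*r + c) mod m are mutually orthogonal. The prime-m restriction and the function name `ols_parity_rows` are assumptions made for this example; the word sizes in the results (256 = 16^2 and 484 = 22^2) would need a different square construction.

```python
def ols_parity_rows(m, t):
    """Return the 2*t*m parity rows over m*m data bits (bit index = r*m + c).

    Assumes m is prime and 2*t - 2 <= m - 1, so the squares
    L_k[r][c] = (k*r + c) % m for k = 1..2t-2 are mutually orthogonal.
    """
    # Two groupings come from the grid rows and columns, the rest from Latin squares.
    groupings = [lambda r, c: r, lambda r, c: c]
    groupings += [lambda r, c, k=k: (k * r + c) % m for k in range(1, 2 * t - 1)]
    rows = []
    for group_of in groupings:
        for symbol in range(m):
            rows.append([1 if group_of(r, c) == symbol else 0
                         for r in range(m) for c in range(m)])
    return rows


H = ols_parity_rows(m=5, t=2)
print(len(H))                    # 2*t*m parity rows
print(sum(row[0] for row in H))  # parity equations covering data bit 0
```

Each data bit appears in exactly 2t rows, and any two data bits share at most one row, which is the property that makes one-step majority decoding possible.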
![Page 14: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/14.jpg)
Proposed Scheme
Implement full OLS code on chip
Run memory tests
• Generate defect map
– At manufacturing time or at boot time
• Identify vulnerable bits
Disable rows in OLS H-matrix
• On chip-by-chip basis, based on defect map
• Correct all erasures PLUS ‘e’ random errors in each cache line
• Reduce redundancy while providing same reliability
![Page 15: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/15.jpg)
Definitions
“good row” – for information bit di
• Row of OLS H-matrix
• No ‘1’ in any erasure position other than bit di
– Holds true for all lines in cache
“bad row” – for information bit di
• Row of OLS H-matrix
• ‘1’ in one or more erasure positions apart from bit di
– Holds for at least one line of cache
![Page 16: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/16.jpg)
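“Good Rows” & “Bad Rows”
E = erasure, G = good row for that bit, B = bad row for that bit

| | d0 | d1 | d2 | d3 | d4 | d5 | d6 | d7 |
|---|---|---|---|---|---|---|---|---|
| line1 | - | E | - | - | - | E | - | - |
| line2 | - | - | - | E | - | - | - | - |
| H-row1 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 1 |
| H-row2 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 1 |
| H-row3 | 1 | 0 | 0 | 1 | 0 | 1 | 1 | 0 |
| H-row1 (G/B) | G | - | - | - | G | - | G | G |
| H-row2 (G/B) | - | G | B | - | B | - | - | B |
| H-row3 (G/B) | B | - | - | B | - | B | B | - |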
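The classification in this example can be reproduced with a short sketch that applies the good-row and bad-row definitions to the three H-rows and two cache lines shown. The function name `classify_rows` and the data layout (erasures as sets of bit indices per line) are assumptions for illustration.

```python
def classify_rows(h_rows, erasures_per_line):
    """Mark each bit covered by a row as 'G' (good) or 'B' (bad); '-' if not covered.

    A row is bad for bit i if, in at least one cache line, it has a 1 in an
    erasure position other than i; otherwise it is good for bit i.
    """
    result = []
    for row in h_rows:
        marks = []
        for i, bit in enumerate(row):
            if not bit:
                marks.append("-")
                continue
            bad = any(row[j] for line in erasures_per_line for j in line if j != i)
            marks.append("B" if bad else "G")
        result.append(" ".join(marks))
    return result


H_ROWS = [
    [1, 0, 0, 0, 1, 0, 1, 1],  # H-row1
    [0, 1, 1, 0, 1, 0, 0, 1],  # H-row2
    [1, 0, 0, 1, 0, 1, 1, 0],  # H-row3
]
ERASURES = [{1, 5}, {3}]       # line1 has d1 and d5 erased, line2 has d3 erased

for name, marks in zip(["H-row1", "H-row2", "H-row3"], classify_rows(H_ROWS, ERASURES)):
    print(name, marks)
```

The printed G/B patterns match the last three rows of the slide's example.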
![Page 17: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/17.jpg)
Necessary and Sufficient Conditions
Tolerate ‘e’ random errors
• (number of “good rows”) - (number of “bad rows”) ≥ 2(e + 1)
Original code – t-error correcting
• (Max vulnerable bits in any line) + e ≤ t
![Page 18: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/18.jpg)
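Row Selection
Covering problem
• Select enough good rows for each information bit di
• Until constraint is satisfied
• NP-complete problem
• Apply heuristics

| | d0 | d1 | d2 | d3 | d4 | d5 | d6 | d7 |
|---|---|---|---|---|---|---|---|---|
| H-row1 | G | - | - | - | G | - | G | G |
| H-row2 | - | G | B | - | B | - | - | B |
| H-row3 | B | - | - | B | - | B | B | - |
| “good rows” - “bad rows” (H-row1 and H-row2 selected) | 1 | 1 | -1 | 0 | 0 | 0 | 1 | 0 |
| “good rows” - “bad rows” (H-row2 and H-row3 selected) | -1 | 1 | -1 | -1 | -1 | -1 | -1 | -1 |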
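The two lines of “good rows” minus “bad rows” scores on this slide appear to correspond to two candidate selections: H-row1 with H-row2, and H-row2 with H-row3. The sketch below recomputes both from the G/B marks; the selection labels are inferred from the arithmetic rather than stated on the slide, and `good_minus_bad` is a hypothetical helper name.

```python
# G/B marks per H-row for bits d0..d7, as shown on the slide.
MARKS = {
    "H-row1": "G - - - G - G G".split(),
    "H-row2": "- G B - B - - B".split(),
    "H-row3": "B - - B - B B -".split(),
}
SCORE = {"G": 1, "B": -1, "-": 0}


def good_minus_bad(selected_rows):
    """Per-bit count of good rows minus bad rows among the selected H-rows."""
    return [sum(SCORE[MARKS[name][i]] for name in selected_rows) for i in range(8)]


print(good_minus_bad(["H-row1", "H-row2"]))
print(good_minus_bad(["H-row2", "H-row3"]))
```

A row-selection heuristic would compare such per-bit scores against the constraint for every information bit and keep adding rows until it is met.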
![Page 19: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/19.jpg)
Covering Problem
• Solve for the cache line with maximum erasures first
• Apply solution to all other cache lines
• If unsatisfactory, add erasures from one of the unsolved lines
• Repeat until solution fits entire cache
![Page 20: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/20.jpg)
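Implementation
[Figure: decoder for bit di with row disabling. Each parity estimate (check bit cp, cq, cs XORed with the data bits dp, dq, ds of its equation) is ANDed with a control bit (ctlp, ctlq, ctls); the gated estimates and di feed an Adjustable Threshold Voter, set by ctl, which takes the place of the Majority Voter and outputs the corrected di.]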
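One plausible reading of the adjustable threshold voter is sketched below: each parity estimate is ANDed with its control bit, and the voting threshold is set to a majority of the inputs that remain enabled. The threshold rule and the name `threshold_vote` are assumptions; the slide shows only the gates and the voter block.

```python
def threshold_vote(d_i, estimates, ctl):
    """Vote over d_i and the enabled parity estimates.

    Assumption: ctl[j] = 1 keeps estimate j, ctl[j] = 0 forces it to 0,
    and the threshold is a strict majority of the enabled inputs (d_i included).
    """
    gated = [e & c for e, c in zip(estimates, ctl)]
    enabled_inputs = sum(ctl) + 1          # enabled estimates plus d_i itself
    threshold = enabled_inputs // 2 + 1
    return 1 if d_i + sum(gated) >= threshold else 0


# d_i was written as 1 but reads 0; two of the four parity rows are disabled.
estimates = [1, 1, 1, 1]
ctl = [1, 1, 0, 0]
print(threshold_vote(0, estimates, ctl))
```

With a fixed five-input majority (threshold 3), the two disabled estimates would count as 0 votes and the wrong value would win, which is one reason a voter with a configurable threshold is useful here.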
![Page 21: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/21.jpg)
Experimental Results
Results for Word Size of 256 Bits and Bit-Error Rate of 10^-3

| Cache Size (Bytes) | Conventional OLS check bits (Avg) | Conventional OLS check bits (Max) | Customized OLS check bits (Avg) | Customized OLS check bits (Max) | Percentage reduction in Max. check bits |
|---|---|---|---|---|---|
| 16 KB | 155 | 224 | 117 | 145 | 35.27 |
| 32 KB | 166 | 256 | 125 | 148 | 42.19 |
| 64 KB | 175 | 256 | 134 | 156 | 39.06 |
| 128 KB | 208 | 256 | 163 | 177 | 30.86 |
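The percentage column can be checked directly from the two Max columns as (conventional - customized) / conventional. The short script below does this for the four cache sizes; the dictionary layout and the name `pct_reduction` are just for this example.

```python
# (conventional max, customized max, reported % reduction) per cache size
RESULTS = {
    "16 KB": (224, 145, 35.27),
    "32 KB": (256, 148, 42.19),
    "64 KB": (256, 156, 39.06),
    "128 KB": (256, 177, 30.86),
}


def pct_reduction(conventional, customized):
    """Relative saving in check bits, in percent."""
    return 100.0 * (conventional - customized) / conventional


for size, (conv, cust, reported) in RESULTS.items():
    computed = pct_reduction(conv, cust)
    print(f"{size}: computed {computed:.4f}, reported {reported}")
```

All four computed values agree with the reported figures to within rounding.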
![Page 22: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/22.jpg)
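Experimental Results
Results for Constant Cache Size of 64 KB

| Word Size (Bits) | Bit-error Rate | Conventional OLS check bits (Avg) | Conventional OLS check bits (Max) | Customized OLS check bits (Avg) | Customized OLS check bits (Max) |
|---|---|---|---|---|---|
| 256 | 10^-3 | 175 | 256 | 138 | 156 |
| 256 | 10^-4 | 98 | 128 | 84 | 107 |
| 256 | 10^-5 | 66 | 102 | 64 | 68 |
| 484 | 10^-3 | 295 | 396 | 198 | 230 |
| 484 | 10^-4 | 143 | 176 | 117 | 139 |
| 484 | 10^-5 | 92 | 132 | 89 | 115 |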
![Page 23: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/23.jpg)
Experimental Results
64 KB cache, 484-bit word, 10^-3 bit-error rate
![Page 24: Rudrajit Datta and Nur A. Touba Computer Engineering Research Center](https://reader036.vdocument.in/reader036/viewer/2022062310/56815dd8550346895dcc0290/html5/thumbnails/24.jpg)
Conclusion
Post-manufacturing customization
• Reduces large check-bit overhead
• Provides requisite reliability
• Applicable to systems with high defect rate