detecting errors using multi-cycle invariance information nuno alves, jennifer dworak, and r. iris...
TRANSCRIPT
Detecting Errors Using Multi-Cycle Detecting Errors Using Multi-Cycle Invariance InformationInvariance Information
Nuno Alves, Jennifer Dworak, and R. Iris Bahar
Division of Engineering Brown University
Providence, RI 02912
Kundan Nepal
Electrical Engineering Dept.Bucknell University
Lewisburg, PA 17837
Design, Automation, and Test in Europe, April 20-24, 2009
MotivationMotivation
Errors in ICs are increasingErrors in ICs are increasing– Particle strikes, temperature, power, noise, Particle strikes, temperature, power, noise,
process variations, test escapes, etc.process variations, test escapes, etc.
Previously, we have proposed using Previously, we have proposed using logic logic implicationsimplications for online error detection during a for online error detection during a single clock cyclesingle clock cycle
What happens if we consider implications across time cycles?
OutlineOutline
Introduction & BackgroundIntroduction & Background
Logic Implications for Error DetectionLogic Implications for Error Detection
Multi-Cycle ImplicationsMulti-Cycle Implications
Experimental ResultsExperimental Results
ConclusionsConclusions
OutlineOutline
Introduction & BackgroundIntroduction & Background
Logic Implications for Error DetectionLogic Implications for Error Detection
Multi-Cycle ImplicationsMulti-Cycle Implications
Experimental ResultsExperimental Results
ConclusionsConclusions
Other WorkOther Work
Triple Modular RedundancyTriple Modular RedundancyLogic DuplicationLogic DuplicationRe-Execution in Multiple ThreadsRe-Execution in Multiple ThreadsCodes (Parity, Berger, Bose Lin, etc.)Codes (Parity, Berger, Bose Lin, etc.)High Level Fault AssertionsHigh Level Fault AssertionsFault MaskingFault MaskingChecking the Outputs Against a Subset of Checking the Outputs Against a Subset of the Truth Tablethe Truth Table
Our ApproachOur Approach
Find natural expected relationships and Find natural expected relationships and check for their violation.check for their violation.
Water should be blue….
Not brown…
In circuits, expected relationships at the gate level consist of logic implications.
OutlineOutline
Introduction & BackgroundIntroduction & Background
Logic Implications for Error DetectionLogic Implications for Error Detection
Multi-Cycle ImplicationsMulti-Cycle Implications
Experimental ResultsExperimental Results
ConclusionsConclusions
Implications Naturally Occur in CircuitsImplications Naturally Occur in Circuits
n1
n2n3
n4n5
n6n7
n80
1
00
n5 = 1 → n8 = 0
Implication Violations Can Be Used Implication Violations Can Be Used to Detect Errorsto Detect Errors
ERROR
n1
n2n3
n4n5
n6n7
n8
n5=1 n8=0
Appropriate checker logic can detect multiple errors with a single implication.
Implication Violations Can Be Used Implication Violations Can Be Used to Detect Errorsto Detect Errors
ERROR
n1
n2n3
n4n5
n6n7
n8
n5=1 n8=0
Appropriate checker logic can detect multiple errors with a single implication.
sa1
sa1
sa1sa1sa1
sa1
sa1
sa1
sa1
sa1
sa1sa1
sa1
sa1sa1
sa1
sa1sa1sa1
sa1
sa1
Total Number of Implications With Distance 2 or Greater
1
10
100
1000
10000
100000
Circuit
Nu
mb
er
of
Imp
lica
tion
s
So….what’s the problem?So….what’s the problem?
We have too many implications!We have too many implications!
How do we efficiently find them and which ones should we use?
Implication AlgorithmImplication AlgorithmGate-level implications can be found automatically
…without functional knowledge of the circuit.
StartIdentify Potential
Implications w/ Simulation
Verify Implications
Eliminate Subsumed Implications
Determine Coverage of Remaining Implications
Select Best Subset for Target Error Detection
and OverheadEnd
What determines which faults an What determines which faults an implication may cover?implication may cover?
Potential Spatial Fault CoveragePotential Spatial Fault Coverage
Each implication can only cover a limited area of the Each implication can only cover a limited area of the circuit….circuit….
PQ
Direct Path
P
Q
P=0 → Q=0
Faults along the path may be detected
P Q
P Q
P=1 → Q=1
Faults along reconverging paths may be detected
Reconvergent Fanout
P
Q
Divergent Fanout
P
Q
Q=0 → P=0
Faults along paths to common ancestors may
be detected
Implications cannot cover any Implications cannot cover any sites sites downstreamdownstream of both of both
implication points!implication points!
Limitations of Single-Cycle Limitations of Single-Cycle ImplicationsImplications
Implications may not exist to cover faults Implications may not exist to cover faults far downstream—e.g. close to:far downstream—e.g. close to:– Flip-flopsFlip-flops– Primary OutputsPrimary Outputs
It is possible for no useful implications to It is possible for no useful implications to exist in a single cycleexist in a single cycle
Optimal timing of capture is difficultOptimal timing of capture is difficult
Many of these issues are alleviated if we consider multi-cycle implications
OutlineOutline
Introduction & BackgroundIntroduction & Background
Logic Implications for Error DetectionLogic Implications for Error Detection
Multi-Cycle ImplicationsMulti-Cycle Implications
Experimental ResultsExperimental Results
ConclusionsConclusions
Multi-Cycle ImplicationsMulti-Cycle Implications
A
B
X
Y
F
Sequential Circuit Containing No Non-Trivial
Implications in Combinational Logic
Time Frame Expansion B1
F1
X0Y0
A1X1
Y1
B2
X1
Y1
A2X2
Y2
F2
X2
Y2
Cycle t1 Cycle t2
Logic Value in First Clock Cycle Implies a Value at a Different Site
in the Second Clock Cycle
B1 = 0 → F2 = 0
Multi-Cycle Checker HardwareMulti-Cycle Checker Hardware
B1
F1
X0Y0
A1X1
Y1
B2
X1
Y1
A2X2
Y2
F2
X2
Y2
Cycle t1 Cycle t2
B1 = 0 → F2 = 0
A
B
X
Y
F
violation
Checker hardware requires state to be held between first and second cycle….
Spatial Coverage of Multi-Cycle Spatial Coverage of Multi-Cycle ImplicationsImplications
Good spatial coverage can be achieved near flip-flopsGood spatial coverage can be achieved near flip-flops
Logical distance may increase between implication sitesLogical distance may increase between implication sites
Delays captured at flip-flops in cycle t can be detected Delays captured at flip-flops in cycle t can be detected without complex timingwithout complex timing
P Q
Cycle t Cycle t + 1
Advantages:
OutlineOutline
Introduction & BackgroundIntroduction & Background
Logic Implications for Error DetectionLogic Implications for Error Detection
Multi-Cycle ImplicationsMulti-Cycle Implications
Experimental ResultsExperimental Results
ConclusionsConclusions
Experimental SetupExperimental Setup
ISCAS ’89 benchmark circuitsISCAS ’89 benchmark circuitsZchaff SAT solver to validate implicationsZchaff SAT solver to validate implicationsThree sets of implications per circuitThree sets of implications per circuit– First cycle First cycle
Both implication sites in cycle 1Both implication sites in cycle 1Obtained with single cycle analysis & unrestricted inputsObtained with single cycle analysis & unrestricted inputs
– Second cycle Second cycle Both implication sites in cycle 2Both implication sites in cycle 2Obtained with time frame expansionObtained with time frame expansion
– Cross cycle Cross cycle One site per cycleOne site per cycleObtained with time frame expansionObtained with time frame expansion
So, how many implications exist?So, how many implications exist?
Number of Implications in Each Class
0
5000
10000
15000
20000
25000
30000
Circuit
Num
ber
of I
mpl
icat
ions
1st cyclecross-cycle2nd cycle only
What is the distance between What is the distance between implication sites?implication sites?
Average Implication Distance for Single and Between Cycle Implications
0
2
4
6
8
10
12
14
Circuit
Ave
rag
e D
ista
nce
Average single cycledistanceAverage cross-cycledistance
How do the different How do the different implication classes compare implication classes compare
for error detection (if we use for error detection (if we use allall possiblepossible implications)? implications)?
Contribution of Different Implication Classes to Error Detection
0
10
20
30
40
50
60
70
80
90
100
s298 s420 s444 s510 s713 s953 s1196 s1488Circuit
Err
or C
over
age 1st cycle
1st and 2nd cycle cross cycleall
Developing a Compressed Developing a Compressed Implication SetImplication Set
Start
Choose next fault in fault list
Find implication with best coverage of this fault
Add best implication to compressed list
Any more faults?Return
implication listEnd
Yes No
Number of Compressed Implications
0
50
100
150
200
250
300
350
400
450
500
s298 s420 s444 s510 s713 s953 s1196 s1488
Circuit
Num
ber 1st cycle
cross-cycle2nd cycle only
What if we further tradeoff What if we further tradeoff error coverage for reduced error coverage for reduced
area overhead?area overhead?
Average Error Coverage Acheived for Different Area Thresholds
0
10
20
30
40
50
60
70
80
90
100
s298 s420 s444 s510 s713 s953 s1196 s1488Circuit
Ave
rag
e E
rror
Co
vera
ge
10%20%30%40%50%CompressedAll
Percentage of Cross-Cycle Implications Chosen for Different Area Overheads
0.00
10.00
20.00
30.00
40.00
50.00
60.00
70.00
80.00
90.00
100.00
s298 s420 s444 s510 s713 s953 s1196 s1488
Circuit
% o
f Cho
sen
Impl
icat
ions
that
are
C
ross
-Cyc
le
10%50%
OutlineOutline
Introduction & BackgroundIntroduction & Background
Logic Implications for Error DetectionLogic Implications for Error Detection
Multi-Cycle ImplicationsMulti-Cycle Implications
Experimental ResultsExperimental Results
ConclusionsConclusions
ConclusionsConclusions
Implications can be used to effectively detect Implications can be used to effectively detect many errors at runtimemany errors at runtime– Without requiring functional knowledge of the circuitWithout requiring functional knowledge of the circuit– Allowing tradeoffs to be made between error Allowing tradeoffs to be made between error
coverage and overheadcoverage and overhead
Cross-cycle implications cover faults that cannot Cross-cycle implications cover faults that cannot be covered by single cycle implicationsbe covered by single cycle implications
Even though they have larger overhead, cross Even though they have larger overhead, cross cycle implications are often an “optimal” choicecycle implications are often an “optimal” choice
When optimizing for low area overhead, more than 85% of the implications may be cross cycle
For Inquiring Minds
Implication AlgorithmImplication AlgorithmGate-level implications can be found automatically
…without functional knowledge of the circuit.
StartIdentify Potential
Implications w/ Simulation
Verify Implications
Eliminate Subsumed Implications
Determine Coverage of Remaining Implications
Select Best Subset for Target Error Detection
and OverheadEnd
Run Good Circuit Simulation with Random Vectors and
Monitor Site Values…
0000 0101 1010 1111
A,BA,B
A,CA,C
A,DA,D
A=0 → C = 0
Implication AlgorithmImplication AlgorithmGate-level implications can be found automatically
…without functional knowledge of the circuit.
StartIdentify Potential
Implications w/ Simulation
Verify Implications
Eliminate Subsumed Implications
Determine Coverage of Remaining Implications
Select Best Subset for Target Error Detection
and OverheadEnd
Using a SAT solver
(such as Zchaff)
Implication AlgorithmImplication AlgorithmGate-level implications can be found automatically
…without functional knowledge of the circuit.
StartIdentify Potential
Implications w/ Simulation
Verify Implications
Eliminate Subsumed Implications
Determine Coverage of Remaining Implications
Select Best Subset for Target Error Detection
and OverheadEnd
n1
n2n3
n4n5
n6n7
n8n9
n10
n11
n12
n13
n10 = 0 → n13 = 0
n4 = 1 → n8 = 0
Implication AlgorithmImplication AlgorithmGate-level implications can be found automatically
…without functional knowledge of the circuit.
StartIdentify Potential
Implications w/ Simulation
Verify Implications
Eliminate Subsumed Implications
Determine Coverage of Remaining Implications
Select Best Subset for Target Error Detection
and OverheadEnd
Of all the patterns that will allow a fault to produce an
error at an output, how many will each implication
detect?
Implication AlgorithmImplication AlgorithmGate-level implications can be found automatically
…without functional knowledge of the circuit.
StartIdentify Potential
Implications w/ Simulation
Verify Implications
Eliminate Subsumed Implications
Determine Coverage of Remaining Implications
Select Best Subset for Target Error Detection
and OverheadEnd