7/25/2019 RTFT15 unit 2.pptx
UNIT 2
Fault Tolerance: Fault-Error-Failure,
Redundancy,
Error Detection,
Damage Confinement,
Error Recovery,
Fault Treatment,
Fault Prevention,
anticipated and
unanticipated Faults.
Roll No" #$
Error models
Error detection techniques
Watchdog
Error control coding.
Fault
It is some physical defect that can cause a
component to malfunction.
Fault can be:
- Hardware Fault, e.g. logical wire
- Software Fault, e.g. bug
Real Time and Fault Tolerance
Definition
Fault:
Fault is a defect within the system
Examples:
- Software bug
- Random hardware fault
- Memory bit "stuck"
Example
Error
Error is a deviation from the required operation of
a system or subsystem
- A fault may lead to an error, i.e., an error is the
mechanism by which the fault becomes apparent
- A fault may stay dormant for a long time before it
manifests itself as an error
Definition
Error:
A memory bit got stuck, but the CPU does not access this
data
A "software bug" in a subroutine is not "visible" while
the subroutine is not called
Example
Failure:
A system failure occurs when the system fails to perform its required function
Presence of an error might cause a whole system to
deviate from its required operation
One of the goals of safety-critical systems is that
an error should not result in system failure
Definition
Need of fault tolerance
Complex system
Critical applications
Harsh Environment
Redundancy
All fault-tolerant techniques rely on extra elements
introduced into the system to detect & recover from faults
Components are redundant as they are not required
in a perfect system
Often called protective redundancy
Aim: minimise redundancy while maximising
reliability, subject to the cost and size constraints
of the system
Warning: the added components inevitably
increase the complexity of the overall system
This itself can lead to less reliable systems
E.g.: first launch of the space shuttle
It is advisable to separate out the fault-tolerant
components from the rest of the system
Two types: static (or masking) and dynamic
redundancy
Static: redundant components are used inside a system to hide the effects of faults; e.g. Triple Modular
Redundancy (TMR): 3 identical subcomponents and majority voting
circuits; the outputs are compared and if one differs from the other two that output is masked out
Dynamic: redundancy supplied inside a component which indicates that the output is in error; provides an error detection facility; recovery must be provided by another component
E.g. communications checksums and memory parity bits
Hardware Redundancy
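The majority voting used by TMR can be illustrated in software. This is a minimal sketch rather than a hardware design; the function name `tmr_vote` and the sample values are assumptions introduced here for illustration.

```python
from collections import Counter

def tmr_vote(a, b, c):
    """Return the majority value among three replica outputs.

    Raises ValueError when all three disagree, because no majority exists.
    """
    value, count = Counter([a, b, c]).most_common(1)[0]
    if count < 2:
        raise ValueError("no majority: all three replicas disagree")
    return value

# The second replica is faulty; its output is masked out by the vote.
print(tmr_vote(42, 17, 42))
```

The voter masks a single faulty replica but cannot mask two replicas that fail in the same way, and the voter itself remains a single point of failure in this sketch.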
System is provided with different software versions of a task
Written independently by different teams of
programmers
If one version of a task fails under a certain input,
another version can be used
Static:
N version
Recovery block
Software Redundancy
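The two schemes named above can be sketched as follows. This is an illustrative outline under stated assumptions: the integer square root task, the three versions (one deliberately faulty), and the acceptance test are all hypothetical, and the versions are pure functions, so the state restoration that a full recovery block performs before each alternate is not needed here.

```python
from collections import Counter

def n_version(versions, x):
    """Run every independently written version and return the majority result."""
    results = []
    for version in versions:
        try:
            results.append(version(x))
        except Exception:
            pass  # a crashed version casts no vote
    if not results:
        raise RuntimeError("all versions failed")
    value, count = Counter(results).most_common(1)[0]
    if count <= len(versions) // 2:
        raise RuntimeError("no majority among versions")
    return value

def recovery_block(alternates, acceptance_test, x):
    """Try alternates in order; return the first result that passes the test."""
    for alternate in alternates:
        try:
            result = alternate(x)
        except Exception:
            continue
        if acceptance_test(x, result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")

# Hypothetical versions of an integer square root task.
def isqrt_v1(n):
    return int(n ** 0.5)

def isqrt_v2(n):
    r = 0
    while (r + 1) * (r + 1) <= n:
        r += 1
    return r

def isqrt_buggy(n):
    return n // 2  # deliberately faulty version

def accept(n, r):
    # Acceptance test: r is the floor of the square root of n.
    return r * r <= n < (r + 1) * (r + 1)

print(n_version([isqrt_v1, isqrt_v2, isqrt_buggy], 16))
print(recovery_block([isqrt_buggy, isqrt_v2], accept, 16))
```

N-version programming pays for running every version on every call, while the recovery block runs an alternate only when the acceptance test rejects the previous result.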
Four phases
Error detection: no fault tolerance scheme can be utilised until the associated error is detected
Damage confinement and assessment: to what extent has the system been corrupted? The delay
between a fault occurring and the detection of the error means erroneous information could have spread throughout the system
Error recovery: techniques should aim to transform the corrupted system into a state from which it can continue its normal operation
Fault treatment and continued service: an error is a
symptom of a fault; although the damage is repaired, the fault may still exist
Software Redundancy: dynamic type
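The detection and recovery phases can be sketched with a checkpoint that is restored when an error is detected (backward error recovery). This is a minimal illustration, not a complete scheme: the dictionary state, the `faulty_withdraw` operation, and the validity check are assumptions introduced here, and fault treatment (removing the bug itself) is outside the sketch.

```python
import copy

def run_with_recovery(state, operation, is_valid):
    """Apply operation to state (a dict); roll back if an error is detected."""
    checkpoint = copy.deepcopy(state)      # recovery point taken before the risky step
    try:
        operation(state)
        if not is_valid(state):            # error detection
            raise ValueError("state failed validity check")
        return True
    except Exception:
        state.clear()                      # error recovery: restore the checkpoint
        state.update(checkpoint)
        return False

def faulty_withdraw(state):
    state["balance"] -= 500                # hypothetical fault: no funds check

account = {"balance": 100}
ok = run_with_recovery(account, faulty_withdraw, lambda s: s["balance"] >= 0)
print(ok, account)
```

Because the fault in `faulty_withdraw` is still present after the rollback, the same error would recur on the next call, which is why fault treatment is listed as a separate phase.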
The data are coded in such a way that a certain number of bit errors can be detected
or corrected
Information Redundancy
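One well-known instance of such a code is Hamming(7,4), which adds three parity bits to four data bits so that any single flipped bit can be located and corrected. The slides do not name a specific code, so this choice and the function names below are assumptions made for illustration.

```python
def hamming74_encode(d):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword.

    Codeword layout by position 1..7: p1, p2, d1, p3, d2, d3, d4.
    """
    d1, d2, d3, d4 = d
    p1 = d1 ^ d2 ^ d4   # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # checks positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]

def hamming74_decode(c):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(c)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_pos = s1 + 2 * s2 + 4 * s3   # 0 means no error detected
    if error_pos:
        c[error_pos - 1] ^= 1          # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]]

codeword = hamming74_encode([1, 0, 1, 1])
codeword[4] ^= 1                       # simulate a single-bit error in transit
print(hamming74_decode(codeword))
```

The code corrects one bit error per codeword; two flipped bits in the same codeword would be miscorrected, which is the trade-off for adding only three redundant bits.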
Error Detecting Technique
Parity checking
Checksum error detection
Cyclic Redundancy Check
Finding the error in the first place
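The three techniques can be compared on the same message. This is a rough sketch: the 8-bit additive checksum is only one of many checksum definitions, `zlib.crc32` from the Python standard library stands in for a CRC, and the message and error patterns are made up for illustration.

```python
import zlib

def even_parity_bit(data: bytes) -> int:
    """Return the bit that makes the total number of 1 bits even."""
    ones = sum(bin(byte).count("1") for byte in data)
    return ones % 2

def checksum8(data: bytes) -> int:
    """Simple additive checksum: sum of all bytes modulo 256."""
    return sum(data) % 256

message = b"fault tolerance"

# Single-bit error: flip the lowest bit of the first byte.
one_bit = bytes([message[0] ^ 0x01]) + message[1:]

# Two-bit error: flip the lowest bit of the first and second bytes.
two_bit = bytes([message[0] ^ 0x01, message[1] ^ 0x01]) + message[2:]

for name, received in [("one-bit", one_bit), ("two-bit", two_bit)]:
    print(
        name,
        "parity detects:", even_parity_bit(received) != even_parity_bit(message),
        "checksum detects:", checksum8(received) != checksum8(message),
        "crc detects:", zlib.crc32(received) != zlib.crc32(message),
    )
```

All three detect the single-bit error. For this particular two-bit error, parity misses it because the count of 1 bits stays even or odd as before, the additive checksum misses it because one byte grows by 1 while the other shrinks by 1, and the CRC still detects it.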
Error Detection
Other types: heartbeats etc.
Environmental detection
- hardware, e.g. illegal instruction
- OS/RTS, e.g. null pointer
Application detection
- Replication checks
- Timing checks
- Reversal checks
- Coding checks
- Reasonableness checks
Types
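Three of the application-level checks listed above can be sketched in a few lines. The class and function names, the temperature range, and the tolerance are hypothetical choices made for this illustration; a real watchdog is usually a hardware timer, which the `Watchdog` class here only imitates.

```python
import time

class Watchdog:
    """Timing check: the monitored task must call kick() before the timeout."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_kick = clock()

    def kick(self):
        """Record that the monitored task is still alive."""
        self.last_kick = self.clock()

    def expired(self):
        """Return True when the task has missed its deadline."""
        return self.clock() - self.last_kick > self.timeout_s

def reasonable_temperature(celsius, low=-40.0, high=125.0):
    """Reasonableness check: reject readings outside the plausible range."""
    return low <= celsius <= high

def reversal_check_sqrt(x, y, tol=1e-9):
    """Reversal check: recompute the input from the output and compare."""
    return abs(y * y - x) <= tol * max(1.0, abs(x))

print(reasonable_temperature(25.0), reasonable_temperature(900.0))
print(reversal_check_sqrt(2.0, 2.0 ** 0.5), reversal_check_sqrt(2.0, 1.5))
```

The `clock` parameter exists so that the timing check can be exercised with a simulated clock instead of waiting for real time to pass.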
Damage Confinement and Assessment
Damage assessment is closely related to the damage
confinement techniques used
Damage confinement is concerned with structuring the system so as to minimise the damage caused by a
faulty component (also known as firewalling)
Modular decomposition provides static damage confinement; it allows data to flow only through well-defined pathways
Atomic actions provide dynamic damage confinement
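Static confinement through modular decomposition can be illustrated with a module whose only outward pathway validates its data. This is a hedged sketch: `SensorModule`, `Controller`, the valid range, and the keep-last-good-value policy are all assumptions introduced here, not part of the slides.

```python
import math

class SensorModule:
    """Hypothetical producer whose internal state may become corrupted."""

    def __init__(self):
        self._raw = 21.5

    def corrupt(self):
        self._raw = float("nan")   # simulate a fault inside the module

    def read(self):
        # Firewall: the single pathway out of the module validates the data.
        value = self._raw
        if math.isnan(value) or not (-40.0 <= value <= 125.0):
            raise ValueError("sensor output rejected at module boundary")
        return value

class Controller:
    """Hypothetical consumer that never sees unvalidated sensor data."""

    def __init__(self, sensor):
        self.sensor = sensor
        self.last_good = None

    def step(self):
        try:
            self.last_good = self.sensor.read()
        except ValueError:
            pass   # damage confined: keep using the last good value
        return self.last_good

sensor = SensorModule()
controller = Controller(sensor)
print(controller.step())
sensor.corrupt()
print(controller.step())
```

The corrupted value never crosses the module boundary, so damage assessment after the error only needs to consider the sensor module itself.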