rtft15 unit 6
TRANSCRIPT
-
7/25/2019 RTFT15 Unit 6
1/20
UNIT 6 Practical Systems for Fault Tolerance:
Application: Ad-hoc wireless network
Application: NASA Remote Exploration & Experimentation
System Architecture: Fault tolerant computers
General purpose commercial systems
Fault tolerant multiprocessor and VLSI based communication architecture.
Fault tolerant software: Design, N-version
Roll No: 1
-
Ad hoc Wireless Networks
Ad-hoc network:
a LAN or other small network, with wireless connections
devices are part of the network only for the duration of a communications session
or while in close proximity to the network
Decentralized
E.g.: Bluetooth
Real Time and Fault Tolerance
Definition
-
Ad hoc Wireless Networks
The principle behind ad hoc networking is multi-hop relaying, in which messages are sent from the source to the destination by relaying through the intermediate hops (nodes).
In multi-hop wireless networks, communication between two end nodes is carried out through a number of intermediate nodes whose function is to relay information from one point to another.
Definition
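Multi-hop relaying can be illustrated with a small sketch. It assumes the network is given as a table of direct wireless links and uses a breadth-first search to find the relay path with the fewest hops. The function name `relay_path` and the node names `n0` to `n4` are hypothetical and do not come from the slides.

```python
from collections import deque

def relay_path(links, source, destination):
    """Return the nodes a message visits from source to destination
    using the fewest hops, or None if the destination is unreachable."""
    previous = {source: None}          # node -> node it was reached from
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == destination:
            path = []
            while node is not None:    # walk back to the source
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in links.get(node, ()):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None

# Hypothetical five-node chain: each node only hears its direct neighbours.
chain = {
    "n0": ["n1"],
    "n1": ["n0", "n2"],
    "n2": ["n1", "n3"],
    "n3": ["n2", "n4"],
    "n4": ["n3"],
}
path = relay_path(chain, "n0", "n4")
hops = len(path) - 1
```

Here `n1`, `n2` and `n3` act purely as relays between the two end nodes.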
-
A static string topology is an example of such a network:
[Figure: static string topology, a chain of five numbered nodes]
In the last few years, efforts have been focused on multi-hop "ad hoc" networks, in which relaying nodes are in general mobile, and communication needs are primarily between nodes within the same network.
-
Applications of Ad hoc Wireless Networks
Military applications
Ad hoc wireless networks are useful in establishing communication in a battlefield.
Collaborative and Distributed Computing
A group of people in a conference can share data in ad hoc networks.
Streaming of multimedia objects among the participating nodes.
Emergency Operations
Ad hoc wireless networks are useful in emergency operations such as search and rescue, and crowd control.
-
Issues in Protocol Design
Must run in a distributed environment
must provide loop-free routes
must be able to find multiple routes
must establish routes quickly
must minimize overhead in its communication and reaction to topology change
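Two of the requirements above, loop-free routes and multiple routes, can be shown together in a minimal sketch. It assumes a centralized view of the topology for illustration only; a real ad hoc routing protocol would have to discover routes in a distributed way. The name `loop_free_routes` and the four-node mesh are hypothetical.

```python
def loop_free_routes(links, source, destination, visited=None):
    """Enumerate every route from source to destination that never
    visits the same node twice (that is, every loop-free route)."""
    visited = (visited or []) + [source]
    if source == destination:
        return [visited]
    routes = []
    for neighbour in links.get(source, ()):
        if neighbour not in visited:   # skipping visited nodes prevents loops
            routes.extend(loop_free_routes(links, neighbour, destination, visited))
    return routes

# Hypothetical mesh: S can reach D through A, through B, or through both.
mesh = {
    "S": ["A", "B"],
    "A": ["S", "B", "D"],
    "B": ["S", "A", "D"],
    "D": ["A", "B"],
}
routes = sorted(loop_free_routes(mesh, "S", "D"), key=len)
```

Keeping more than one route means that if relay `A` moves out of range, traffic can switch to a route through `B` without a fresh route discovery.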
-
Remote Exploration & Experimentation
The goal of the Remote Exploration & Experimentation (REE) project is
to move supercomputing into space in a cost effective manner
to allow the use of inexpensive, state of the art, commercial-off-the-shelf (COTS) components and subsystems in these space-based supercomputers.
Introduction
somod@gmail.com
-
Some of the responsibilities of the REE system software include:
1. Managing system resources (maintaining state information about each node and about the global system, performing system resource diagnostics, etc.).
2. Job scheduling (globally scheduling jobs across the system, local job scheduling within the node, allocation of resources to jobs, etc.).
3. Managing the scientific applications (launching the applications, monitoring the applications for failure, initiating recovery for applications, etc.).
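The job scheduling responsibility can be sketched as a simple greedy allocator. This is only an illustration under stated assumptions: resources are reduced to a single number of free processor slots per node, and the largest jobs are placed first on the node with the most free slots. The function `schedule_jobs` and all node and job names are hypothetical and are not taken from the REE software.

```python
def schedule_jobs(node_capacity, jobs):
    """Place each job on the node with the most free slots.
    Returns (placement, remaining free slots); a job that fits
    nowhere is mapped to None."""
    free = dict(node_capacity)         # global state: free slots per node
    placement = {}
    # Largest jobs first, so big jobs are not starved by small ones.
    for name, demand in sorted(jobs.items(), key=lambda item: -item[1]):
        node = max(free, key=free.get)
        if free[node] >= demand:
            free[node] -= demand       # allocate resources to the job
            placement[name] = node
        else:
            placement[name] = None     # must wait for resources
    return placement, free

nodes = {"node0": 4, "node1": 2}
jobs = {"imaging": 3, "telemetry": 2, "compression": 2}
placement, free = schedule_jobs(nodes, jobs)
```

In this run `compression` is left unplaced, which is the kind of state the global scheduler has to track until a node frees up.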
-
The immediate concern of the Applications Manager is to oversee the execution of the scientific applications.
As the applications represent the ultimate "customer" of the REE environment, efficiently supporting their required dependability level is paramount.
The Applications Manager monitors the science application for externally visible signs of faulty behavior as well as for messages generated internally by the applications requesting fault tolerance services.
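The two monitoring channels described above can be sketched as follows. The sketch assumes that a missed heartbeat stands in for an "externally visible sign" of faulty behavior and that internal requests arrive as a list of application names. Both are illustrative assumptions, as are the function name and the application names.

```python
def find_applications_to_recover(last_heartbeat, requests, now, timeout):
    """Combine the two channels: applications that have gone silent
    (external sign) and applications that asked for help (internal message)."""
    silent = {app for app, seen in last_heartbeat.items() if now - seen > timeout}
    asked = set(requests)
    return sorted(silent | asked)

# Hypothetical state: time of the last heartbeat seen from each application.
last_heartbeat = {"spectrometer": 98.0, "imager": 91.5, "navigator": 99.2}
requests = ["navigator"]               # navigator explicitly requested recovery
to_recover = find_applications_to_recover(
    last_heartbeat, requests, now=100.0, timeout=5.0
)
```

Here `imager` is flagged because it has been silent for longer than the timeout, and `navigator` is flagged because it asked for fault tolerance services itself.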
-
Fault tolerant computers
Fault-tolerant computing is the art and science of building computing systems that continue to operate satisfactorily in the presence of faults.
A fault-tolerant system may be able to tolerate one or more fault types including
i) transient, intermittent or permanent hardware faults,
ii) software and hardware design errors,
iii) operator errors,
iv) externally induced upsets or physical damage.
-
Fault-tolerant computer systems
Fault-tolerant computer systems are systems designed around the concepts of fault tolerance.
In essence, they must be able to continue working to a level of satisfaction in the presence of faults.
-
Conceptual design of a segregated-component fault-tolerant computer
-
Fault Tolerant Multiprocessor
Fault tolerance is often considered a good additional feature for multiprocessor systems, but nowadays it is becoming an essential attribute.
Fault tolerance can be achieved by the use of dedicated customized hardware, which may have the disadvantage of large cost.
Another approach to fault tolerance is to exploit existing redundancy in multiprocessor systems via a task scheduling software strategy based on time redundancy.
Introduction
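One way to picture time redundancy is to run the same task twice on different processors and compare the results, spending a third run only when the first two disagree. The sketch below is one possible strategy, not the specific scheduling scheme the slide refers to. Processors are modeled as plain Python callables, and every name in it is hypothetical.

```python
def run_with_time_redundancy(task, argument, processors):
    """Execute the task twice; on a mismatch, run it a third time and
    accept whichever result is confirmed by a second execution."""
    first = processors[0](task, argument)
    second = processors[1](task, argument)
    if first == second:
        return first                   # no fault detected, no extra time spent
    third = processors[2](task, argument)
    if third in (first, second):
        return third                   # the faulty execution is outvoted
    raise RuntimeError("no two executions agree")

def square(x):
    return x * x

def healthy(task, x):
    return task(x)

def faulty(task, x):
    return task(x) + 1                 # models a processor that corrupts results

result = run_with_time_redundancy(square, 7, [healthy, faulty, healthy])
```

The cost is extra execution time instead of extra dedicated hardware, which is the trade-off the slide describes.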
-
Fault tolerant software
N-Version Programming
Recovery Block Approach
-
N-Version Programming
The N-version software concept attempts to parallel the traditional hardware fault tolerance concept of N-way redundant hardware.
In an N-version software system, each module is made with up to N different implementations. Each variant accomplishes the same task, but hopefully in a different way.
Each version then submits its answer to a voter or decider, which determines the correct answer.
-
This system can hopefully overcome the design faults present in most software by relying upon the design diversity concept.
An important distinction in N-version software is the fact that the system could include multiple types of hardware using multiple versions of software.
The goal is to increase the diversity in order to avoid common mode failures.
Using N-version software, it is encouraged that each different version be implemented in as diverse a manner as possible, including different tool sets, different programming languages, and possibly different environments.
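A minimal sketch of an N-version module with a majority voter follows. It assumes three versions written in one language for brevity, whereas the slides recommend far more diversity than that. The function `n_version_execute` and the three absolute-value variants are hypothetical examples.

```python
from collections import Counter

def n_version_execute(versions, value):
    """Run every version on the same input and return the answer that
    a strict majority of versions agree on."""
    answers = [version(value) for version in versions]
    answer, votes = Counter(answers).most_common(1)[0]
    if votes * 2 <= len(answers):
        raise RuntimeError("voter found no majority answer")
    return answer

# Three independently written variants of "absolute value".
def abs_builtin(x):
    return abs(x)

def abs_branch(x):
    return x if x >= 0 else -x

def abs_buggy(x):
    return x                           # design fault: wrong for negative input

result = n_version_execute([abs_builtin, abs_branch, abs_buggy], -3)
```

The voter masks the design fault in `abs_buggy` only because the other two versions do not share it, which is why common mode failures are the main threat to this scheme.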
-
The recovery block operates with an adjudicator which confirms the results of various implementations of the same algorithm.
In a system with recovery blocks, the system view is broken down into fault recoverable blocks.
The entire system is constructed of these fault tolerant blocks. Each block contains at least a primary, secondary, and exceptional case code along with an adjudicator.
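A single recovery block can be sketched as a loop over alternates guarded by an acceptance test that plays the role of the adjudicator. The sketch assumes the alternates are pure functions, so there is no state to roll back before trying the next one. A fuller implementation would restore a checkpoint at that point. All names below are hypothetical.

```python
import math

def recovery_block(alternates, acceptance_test, value):
    """Try the primary, then each further alternate, and return the first
    result the adjudicator accepts. Raise if every alternate fails."""
    for alternate in alternates:
        try:
            result = alternate(value)
        except Exception:
            continue                   # a crash counts as a failed alternate
        if acceptance_test(value, result):
            return result
    # Exceptional case code: no alternate produced an acceptable result.
    raise RuntimeError("all alternates failed the acceptance test")

def primary_sqrt(x):
    return x / 2                       # deliberately faulty primary

def secondary_sqrt(x):
    return math.sqrt(x)

def sqrt_is_acceptable(x, result):
    return abs(result * result - x) < 1e-9

result = recovery_block([primary_sqrt, secondary_sqrt], sqrt_is_acceptable, 9.0)
```

Unlike N-version programming, the alternates run one after another, and the secondary is used only when the primary's result is rejected.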
Recovery Block Approach
-
Fault tree diagrams
Figure 1 shows a simple fault tree diagram in which either A or B must occur in order for the output event to occur. In this diagram, the two events are connected to an OR gate.
Alternatively, if both input events must occur in order for the output event to occur, then they are connected by an AND gate.
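The two gate types can be evaluated numerically with a short sketch. It assumes the input events A and B are statistically independent, which the slides do not state, and the probabilities 0.1 and 0.2 are made-up values for illustration.

```python
def or_gate_probability(probabilities):
    """Output event occurs if at least one input event occurs:
    1 minus the probability that none of them occurs."""
    none_occur = 1.0
    for p in probabilities:
        none_occur *= 1.0 - p
    return 1.0 - none_occur

def and_gate_probability(probabilities):
    """Output event occurs only if every input event occurs."""
    all_occur = 1.0
    for p in probabilities:
        all_occur *= p
    return all_occur

p_a, p_b = 0.1, 0.2                    # hypothetical probabilities of A and B
p_or = or_gate_probability([p_a, p_b])
p_and = and_gate_probability([p_a, p_b])
```

With these inputs the OR gate gives 1 - 0.9 × 0.8 = 0.28 and the AND gate gives 0.1 × 0.2 = 0.02, so requiring both events makes the output event far less likely.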
-
THANK YOU