ccon - defense technical information · pdf filemastler of' operaz-ins .iesearc-veta 3...
TRANSCRIPT
DTIC_JUN ,I 51982
UNITED STATES AIR FORCE
AIR UNIVERSITY E
AIR FORCE INSTITUTE OF TECHNOLOGYWright-Patterson Air-force Base,Ohio
S 2 B 14Best Available Copy
CCON ?TDS'CE :IIITTERVAT-S- FOR SYST7,iREET BIL TmY ANTD AVA -A IL:: A OF"
.AINTAINED S7STEMS USINOI ,ONTE CARLO 7DBCHNIQIUES
THESIS
G;OR/.MA/81D-7 Mohamed S. 3. Hasaballait. Col. Exvztian Ar--Irw
Acprcvecý for public release; distribu't-on unlimi1ted
CONFIDENCE TINITERALS FOR SYSTEM RELIABIL TTY
AIND .'VAlLABlL=TY OFF MAI-N.TAINED SY'STEM"S
USITTG, MONTE CARLO TECHNIQUES
Presented t~the 7aculty of, the School c*-' Engin~eering
o,-: t're Air F:or-e -ns"it'.,,e of Technclogy
Air Uv-ýerz.tyj
in Partial '7ulfilIThent
Reýquirements fol- ,-hth D,ýýr.e c4"
MastLer of' OperaZ-ins .iesearc-
VeTA 3
MohamedO ''''Jo
SpD' Fas.abal-la
Lt. C ol. Egyplt.'.an Army
Gradua~'p ( ler ons Research
December 1981
Approcved fL~r pu>ý' rclea.se; distributiur unlAl. 1jt,ýd
Preface
With the steady increase in comp2exity of equipment,
n the stringency of operating conditions and in the positive
identif cation of system effectiveness recuirements, it is
becoming harder to satisfy the requirements; and more and
more emphasis is placed on preventive maintenance together
.th the speedy renn.por of pl at uii . L* as a means of
achievement. Test equipment is also becoming more and more
complex. Maintainability engineering is, therefore, an
attempt to achieve scme repair time objectives by specifyinga cor,.inat-or, o' desIn and human factors together wit, the
maintenance philosophy consistent with the other engineering
and cost constraints which exist.
The pi-pose of this study I. co present the confidence
intervals for system reliability and ava iability of main-
tained system using Mont Carlo techniques. The current lit-
erature indicates that Mont Carlo techniques offer the only
oractical method of analyzing large scale systems with general
failure and repair systems. it is hoped that this kind of
study will serve as a useful introduction to the var5.ous
types of problems which are of current interest.
I would like to thank my thesis Advisor, Professor
A. H. Moore, for his most valuable advice and guidance during
"this study, I would also like to thank Dr. joseph Cain,
my reader, for his help during the study. Also, i am
1i
grateful to Mrs. Barbara Folkeson for her help in typing
this thesis. I am also grateful to Mrs. Betty Marris and
Mrs. Mary Browning of the Air Force Institute of Technology
Library for their help in obtaining several references.
Finally I wish to reccgni%- the wonderful efforts of my wife,
Gawhra, who encouraged me to strive, to search, to study,
and, ultimately, to succeed.
Mohamed S. H. Hasaballa
li!i
TABLE OF CONTENTS
Page
Preface .................. .......................
List of Figures .............. ................. vi
List of Tables ........... .................. .. vii
Abstract viii
Chacrer
T. :ntroduction ............ .................
I'.r Survey of Technicues to Stu-ly Rieliability,.aintai.,abilty, and Availability of MaintainedSystems. .......... .................. .. L
?Reliability, Maintairab-ilit, and Ava211-abiit Def Ined .
The Basic Markovilan Fallure Models ....
Application of tuh Bas!M Yarkovlan Modelsto ReliabitLty .......... ............. 10
The Basic Markovian Repair Model. . .
Application of the Basic Markovian RepairModel to Malntainabillty. ........ 17
Specific Applications of Markov Processesto Reliability and Maintainability . ... 19
Redundant Systems ...........
Maintenance Systems .......... 38
Avallabilitv. ......... .............. .!
!I . Suxmmary of Point Estimates of Availability,l of- Mairtained Systems 51.....5
'omcu•-Iilon of Re!iability and Availabil!ity 5
. .. C - r•" e TA. . tF for Availab•..•ies ofMainta ied j stems, Exact Anai ytia! Methods 64
iv
Page
The Case of both Exponential Distributionzfor the Time-t-Fl and Time-to-Repair U.
Lowfer Confidence Limits Assuming Lowrncr-iallyDistributed Repair Times .... .......... .. 68
V. Yiont Cario Comparisons Confidence Limits forAvailability and Reliability .... .......... 73
Surmnary ............... ................... 73
Results . . . . . . . . . . . . . . . 75
Blioraph ............................ 78
Appendices ....................... 87
Appendix A - Major Reference Sources .. ....... .. 88
ApDendix B - General Reference Bibliogranhy ... 90
Acoendix C Supplemental Bibliography ... ...... 93
J t a . . . . . . . . . . . . . . . . . . . . . . . . Q9
7:t..............................................9
LIST OP FIGUFES
Figcure Page
1 Granh of the Basic Markovian Failure rModel
2 Graph of a Three-State Failure Model ......... 9
3 Graph of the Basic Markovian Repair Model . 16
4 n-th Extension of the Basic MarkovlanReoair Model .......... ................. ... 18
S Grap.. of Two-7 ..emet Non-Recalrable System 20
6 Graoh o' Two-Element RepalrabIe Sy.stem ... 2
Trans±iorzn Diagram for K-out-of-n System 28
8 Success Yodes for a Two-n1tt Standby Sysem 33
9 Failure State Diagram ....... ............ ... 35
0 ,va-. .labIty Graph for a Single ComponentSystem ................ .................... 42
ii Data Processor (#1) and Tape Units (*2) 46
12 Trar:siton-rat- Diazram for Computer System 47
13 Transition Diagrams for Two Component Systems
(a) Availability... ......... .............. 52
(b) Reliability ......... ............... 53
_ Availability Tran.sition Diagram Cor TwoComponent System with Two Repairmen (No Jo,'intEffort) ....................... ............. 56
15 Transition Diagrams for General System of TwoIdentical Components ...... ............. .. 57
1._ Mrkovan Rellabilty Model for Two ldentlcalParallel Elements and K Repairmen .. ...... 60
vI
LIST OF TAaLES
Table Page
3.1 System States fo- 2 Cnmporent -Ineparab• e
System . 51
3.2 Coefficients of / u' .w ... .......... 62
5.1 Results of the Double 'ont Carlo TechniqueExponental Failure and epair, "-Imes . . . 7 75
5.2 Results of the Double Mont Carlo TechniqueExponentia. Failure Time and LognormallyRepair T.me... ............ ................ 76
vl!~
ABSTRACT
This thesis presents the results of aSt extensive
literature search for finding the confidence intervals for
system reliability and availabIlity of mnIntained systems
using Mont Carlo techniques. The characteristics of system
reliabilIty and maintainability analysis are dIscussed. The
basic MIarkovlan failure ana repair models are developed. A
summary of the point estimates of ava'lab lity, relab-llt
of maintai-rd systems, and the exact analytical methods are
esent...d. 'Finally, the Bootstrap (Double Mont Carlo tech-
nique) is used to obtain the confidence ltmts for the avail-
ability of the maintained systems.
'.4i i
CONFIDENCE !NTERVALS FOR SYSTEM RELIABILITY
AND AVAILAEL•TYITY OF MAINTAINED SYSTEMS
USING MONT CARLO TECHNIQUES
I. introduction
The reliability and maintainability disciplines, as
the:.' exist today, has evolved primarily during the cast two
decades. 7.e impetus for these disciplines grew -arzely ouz
of military and space programs and was related to several
pr'ncpai considerations:
,a) the relatively high failure rates of equipment,
durinz the 1940's and early 1950's;
(b) the resultirn sharp increase in the cost of pro-
curement and maintenance;
(c' the continual increase in total oarts and func-
ticnal complexity; and
\d, the rco-gnized need to have an 'f't_'1-7,t _Fpproanh
to exclude or minimize the conditions that contributed to
fauity equipment.
Tn the evolution of the rellability and n--','..l
io'." dIsc1cZInes, many mathematical methodologies were devel-
oped to account for and explain the observations (failures
or -enars) generat•d by various equloments or systemts undzr
cons1.deratl.o, Sin e the coservations were (and cntinue
"to bel often random in nature, e. z , 3tocastic rather than1 • . . v ;.
II
deterninist-c, orobabllistic concepts were applied, to cor-
rectly analze the particular system. Since the Mont Carlo
technicues offer the best and valuable method of analyzing
larze scale s-:stems with general failure and re•air systems,
these technicues are used to calculate the confidence Inter-
vals for system reliability and availability of maintained
systems.
Rellabilty is a yardstick of the.. capabIl.ty of an
e-ulc-rr.ent tooperate without failures when put into service.
elIabili-.y predlcts mathematically the equipment's behavIor
under expecrt»; oce-ating conditions. More specifically, re-
il'biwiexcxresses in numbers the probability an equloment
will. onerate without failure for a given length of tIme in
an . . , r....r . •e ..... • it was * eA igne
Mi'intalnability is a more widely known term. Spe-
c.icallv, I7; is defined as the probabilIty that a failed
system can be made operable •n a specified interval of down-
time. Ma 4 ntenance actions can be classifiel into two cate-
gories. First, there is cff-schedule maIntenarnce necessi-
tated ty system in-service failure or maifunction. Its pur-
pose is to restore system operation as soon as possible by
replacing., repa.ring, or adjusting the component or compo-
nents which cause interruption of service. Second, there is
a scheduled maintenance at regular intervals; its purpose i.to keep the system In a condition consistent w•th Itsb
_n levels of cerformance, reliability, and, where applicable,
saey.
Availability is defined as the probability that a sys-
ter is operating satisfactorily at any poin.t in time. The
problem is to calculate the confidence intervals for systemriabiltv and availability of maIntained systems using
Mont Carlo techniques.
To achieve the study objectives, an extensive search
was made of the available literature. The results of this
search and the ensuing study are reported in the following
sectfons: Section Ii includes a survey of techniques to
study rela b ilIty , maintainability, and avallab 'ity of main-
tained systems; Section III contains a sumiary of point estI-
mates of availability and reliability of maintained systems;
Section IV contains the confidence limits for availabilities
of mainained systems - exact analytical methods; and final-
ly, Section V contains the Mont Carlo comparisons confidence
li-Its for avallabil ty and reliability. Appendix A contains
a list of major reference sources - that is, documents which
contained a majority of the references used in the study.
4pnendix 3 iq a reference bibliography containing literature
which is of a general nature, yet nertinent- to the second
objective of the thesis. This information has been included
as an aid to the reader for further study. Aopendix C Is a
sup'.emental bibliography containing two tyres of references:
those applicable• to the thesis but noc selected for detailed
analysis and those which appear to be applicable but were iot
readily obtainable.
Ii. Survey of Techniques to Study Reliability,
Maintainability, and Availability of
Maintained Systems
Reliability and maintainability engineering activi-
ties are performed for the purpose of determining and improv-
ing the reliability and maintainability of a particular sys-
tem. These activities often involve the analysis of physical
processes which have probabilistic transition events, in
which case the state of the orocess at any time t is a ran-
dom variable. Reliability and maintainability analyses, there-
fore, rely heavily upon the techniques of probability theory.
Reliability, Maintainability, and
Ayailability Defined
Reliability is the probability that a system will
perform its intended function satisfactorily for at least a
given time period under specified operating conditions. The
probabiIity of failure as a function of tLiae can be defined
by
p(t s t) = F(t), t ý 0 (2-1)
where t is a random variable denoting the failure time. Then
R(t) is the probability that the system will fail by time t.
if we define the reliability as the probability of success,
cr the probability that the system will perform its intended
function at a tIme t, we can write (Ref 43: 9)
1 I) P t >t 2 2
where R(t) is the reliability function. If the time to fail-
ure randor. variable t has a density function f(t), then
tR(t) = 1- F(t) = I- If(f)at = !f(T)aT (2-3)
0
Maintainability is a more widely known term. Spe-
cifically, maintainability is defined as the probability that
a failed system can be made operable in a specified interval
of downtime. Here the downtime includes the total time that
the system is out of service. Downtime Is a function of the
failure detection time, repair time, administrative time,
and the logistics time connected with the repair cycle.
Theoretically, for a product there exists a maintainability
function. The maintainability function describes probabilis-
tically how long a system remains in a failed state. Mathe-
matically
p(t 5 t) = m(t) (2-4)
where t is the total downtime.
Availability is defined as the probabilit, that a sys-
tem is operating satisf oily at any point in t and con-
siders only operating time and downtime, i.e., excluding the
idle time. Availability is a measure of the ratio of the
operating time of the system to the operating time plus down-
time. Thus it includes both reliability and maintainability.
,he relL•a_1ity and maintainability fields have
evolved as distinct disciplines since 1950. 3oth rely heav-
ily upon mathematical methodologies. Since they are concerned
*1
5e
with probabilistic concepts, many well-established mathemat-
ical concepts and techniques are applicable. One of" these
concepts is the theor:, of Markov processes. M~any of the
determinant-s of the reliability and maintainability of a sys-
tuem are random processes (sequence of' states), any given
state of which is dependent only upon the previous state in
the process. Such processes are, by definition, Markov pro-
cess; few reeecsthat present Markovr process theroy In-
clude applications of this technique to reliability and main-
tainability. Thus it is the purpose of this chapter to pro-
vide the Markov models for reliability and maintainability
analysis.
The Basic Markovian Failure Models
The basic Markovian failure model (Ref 30:18) consists
of two states and a single transition event. The system '-s
in state 0 if it is operating satisfactorily, and it is in
state I I~f it has failed. The transition event from state 0
to staite 1 is a system failure. Figure 1 shows the graph of
the basic Markovian model. At any point in time, if the sys-
tem is in state 0, it can either fail with pnrobability p.1
or not fail with probability p00 . System failure, state 1,
is considered to be the end of the process. Therefore, p,1
may either be ignored or set equal to I
The Markovian propertiet3 of the model result from the
dependence of the one-step transitfon probabilities on only
the inmmediatel', prrav~cns state. To amplify, the proballiit;
1711
Fig. 1
Graph of the Basic Markovian Failure Model ','rom Ref 13:18)
of the occurrence or nonoccurrence of failure in the time
interval, (t+At); it is assumed to depend only upon the sys-
tem being in the unfailed state at time t, regardless of the
past history of the system. The one-step transition proba-
bilities of the basic failure model depend upon and can be
derived from the failure characteristics of the system when
it is in state 0. The probability of a system failure in a
small time t is
I f(x)dxt
where f(x) is the probability density of the time to failure
cf the system. Since failure cannot occur unless the system
is operating, the conditional probability of failure in At,
given the system has not failed prior to t, is
t+At
I f(x)dxt (2-5)P O l cc
If(x)dxt
Equation (2-5) divided by At is the system failure rate (Ref
3:60)."
The hazard function is defined as the limit of the
failure rate as the interval approaches zero. Thus the haz-
ard function is the instantaneous failure rate:
h(t) =f(t)
also
f(t) = h(t) exp -h(-)a (2-7)0 i
Thus f(t), R(t), and h(t) are all related to each other and
any one implies the other two (Ref • 3:_).
Assuming stationary one-step transition probabilities
In the basic failure model, POI (qiAt), the failure rate
of the system is (qo0 .At)/At = Q,,, and h(t) = qOl. Letting
qCl X, if the time to failure is described by an exponen-
tial densit"y function, then
f(t) a e-t (2-8)
R(t) - !e-XTe Tt
-T
( t
i~e.,
and
f(t)R(t) (2-10)
Thus to achieve temporal homogeneity of the Markovian fail-
ure model, exponential failure times are usually assumed.
The basic Markovian model can be easily extended to
include system states other than total operability and total
failure. Figure 2 is the graph of such a 7jodel.
POO P 1 1 P 2 2
P02
Fig. 2
Graph of a Three-State Failure Model (frcm Ref 30:20)
State 0 and 2 are "system operating" and "system failed,"
respectively. State 1 is any state of system degradation
that changed the failure characteristics of the system but
does not cause total system failure. The transition event
from state 0 to state 1 is some form of system deterioration,
and the transition events 0 + 2 and 1 - 2 are the two differ-
ent modes of system failure that can occur. Because these
threeL tz- nsitiun events differ physically, it is usually the
9
F!
case that p0 1 ZP0 2 PI 2 Extensions of these basic failure
models to different system configurations require approximate
addition and redefinition of the possible system states and
their associated transition probabilities.
Application of the Basic Markovian
Models to Rell-ab--iity (-ef 30:21). System reliability,
R(t), is the probability that the system does not fail prior
to time t. in terms of the basic Markovian failure model,
Figure I, R(t) is the probability that the process does not
progress to state i prior t, time t for a contInucus-time
Markov failure model:
R(t) = 1- pl(t) :- Po(t) (2-l1)
Assuming stationary transition probabilities,
PO(t+•r) =p O(t) P po0 = PO(t)(I - qo 1 "AL)(2-12)
Po(t+At) = Po(t) - Po(t)qo].At
poct+tt) - o(t) -Pop(t)qo .0 At
t• 4- +o IAt = Po(t)qc~l
Using the definition of the derivatives of a function,
p'(t) = -qo! " PO(t) (2-13)
Laplace transforming of equation (2-13)
"•0'O -q0 PO 0S10 =•
Assunming p 0 (t 0 ) 1 and solving for p 0 (S)
PO (S) = (-1Ip0..C• - +q 0 1 (-a
and
r 1 *qlt (-5
L ![P 0 (s)] = p 0 (t) R(t) e (0-'5
for the corresponding discrete-time failure model
R(K) = p 0 (K) = p0 (p14 (2-16)
assuming stationary p,, and p0(O) = 1
R( = (1 - q 0 1 .At)K (2-17)
-he expression for the reliability of the three-state system
deascribed by the failure model in Figure 2 differs from that
cf the two-state model in that the number of different paths
resulting in system failure is increased. When the time
parameter of the model is continuous,
R(t) = p 0ot) + P1 (t) = 1 - P 2 (t) (2-18)
The probability P2(t) is determined as follows: from Figure
2 and the definition of Pij(At) (Ref 19)
(At)j II = ni (at) q ,2(At)
L 0 0 1
The n-
11
PO (t+At) p p0 (t)q 0 0 , At
-. o(t)[1 - (q0 .t + q~2 ~
o,(t+At) p0 (tY~')--t + r tqlL
-p0(t,)qo 1 .At + pD1 (t)( -
P (t+Lt) =pO(t)/q 'A U + P,(t)q2 'L + p(t)(2-19)
Taigderivatives,
po,'t) =-(qol + q0 2 )PO(t)
, t q0 p (t) -q 2 ,t
P 2 (t) qC*T2'0't) + cl2P 1(t) (-0
L~ap~lace trans~forming equation (2-20) and. assurn~ng p,(t.)=
and o It = 0, then
(s + c10 1 + q0 2\'00s) I
q0 1 p0 (s) - (s + q, 2 )F 1,(S\ =
q 0 2P0 (s) + q1 2PI(s) - sro.,(s) =0 (2-21)
By Cramer's Rule,
s+ c- ~02 0 1
-(s + q,.2 C
PS)q 0 2 q 1 2 0p2 (s s + q 1+q0 0 0
q1-(S + q 1 2 ) 0
q 0 2 q 12 -S
12
qoq`2,q02( +q1 (2-22)
P2 -5rs + q~Y + +ý
By partla2. ,,ractionls
B +
P2(s) s +s + q12 q2 +
q0q2 + q02 + )
A =- ( s + q,,)(ý5 'r qo q0 2 7
qCl 2+ q02 q12
B o - qo 1q 1 2 0 S + q2
+ q-', +q22
B s - -q12 +q12
+ ~ q 12'
,. obtain. the vaille of C we have
qo:Lql2 + '- 02 (s + q12)AB + 3c
......+ q 0 ( + 7 s $ + q 2 + q0
-- s-75-7 Bsos 02s++I + 'qc9 0
+ + (.9+ qs( j
A( 5 2 + sq0 1 + sq0 2 + sq.12
+q, 1 + a02 1 ) + 2
+ Bsqo1 + Bso0 2 + CS2 + Cs" 12
Taking the coefficients of s 2 in both sides
0 = As2 + 2 + 2
S A + B + C =0i
so C A + BN
q 0 1 + q2 -q 1 2
q-12 - q02 ] .
q0[ + q 0 2 - q!2
q1 2 - q 0 2
q1 2 - + q0 2 )
. 1, B - (q C -q! 2 + qO201+ -+12 q 002 - q 0 2'
(2-23'
pI (-q0) /q + q 0 2 - q 1 2 )2) s + q12
+ (-(q 1 2 t qO2)/ql2 - 02' • q 2S + a01 + q 0 2
[t 2 (s)] P2t) = 1 - B exp - q2t - C exp - (q 0 ! + q 0 9t
R(t) = 1 - p 2 (t)
R(t) - B exp - q 1 2 t + C exp - (q + q0 2 )t (2-214)
The reliability expression for the discrete-time model of
Figure 2 is
P0 (K) + p,(K) = - (K)
1P4
-~ -- == _ -- - _ _
where p 2 (K) is the sum of all possible paths that will cause
the ,stem to be in state 2 after the K-th step ocf the pro-
cess
p2 (K) a p 2 (O) + P (0')p 2 (K) + p 0 (0)P 0 2 (K) (2-25)
using the transition matrix method (Ref 19)
P2 (K)K) (2-2().
i.e., the third (last) element in the vector resulting from
t:,he multiplication of the initial-state probabilities vector
and the K-th step transition probability matrix. The relia-
bility expressions for more complex Markovian failure moaels
must be derived from the individaul model in terms of the
possible states and the failure characteristics of the sys-
tem being modeled. The general approacn, however, is the
same as shown above.
1he Basic Markovlan Reoair Model
The basic Markovian reoair model (Ref 3C:27) is shown
in Fure 3. -,The system states of this model are identical
to the states of the basic Markovlan failure model: state 0
if the system is operating and scate 1 if the system has
failed. The transition event of this model is any repair that
places a failed system in satisfactory operating condition.
From the failed state 1, the system can either be repaired
in t with crobabllity p., or not repaired in 6t with proba-
b ... y s,.... The repair process is completed when the system
Poo P!!
FIg. 3
Graph of the Basic Markovian Repair Model (fro- Ref 30:27)
is returned to operable condition. Thus, p0 0 = 1 the random
varlable, tine-to-repair of a particular system Is a funct-ion
of the repair characteristics (i.e., accessibillty, complex-
iby, 'u r .i i -urato. , etc ) of the s',, t. m and the -ch ........
tics of the repair activity (i.e., maintenance procedures,
non power, technical proficiency, etc.). The t-ansition prob-
ability, p0(t is derived from the probability density,
g't), of the system time-to-repair. When the repair-time
characteristics of the system remain unchanged regardless of
the past history of the system, the density, g(to, remains
unchanged and the repair model is Markovlan. The limit, as
t - 0, of the conditional probability that the system will
be reDaired in (t+6t) given that it is failed at t is the
instantaneous repair rate, analogous to the hazard rate of
the basic Markovian fallure model. To achieve stationary
p.,, exponential repair times are often assumed, in which
case, Pl, = w where q., Is a constant repair rate.
Because repair is the reverse process of failure,
except for the nature and direction of the transition events
as discussed above, the operation of the basic Markovian re-
pair model is similar to that of the basic Markovian model.
-- Application of the Basic Markovian
Repair Model to Maintainability (Ref 30:28). System
maintainability, M(t), is the probability that a failed sys-
is repaired within a specified time. In terms of the basic
Markovian repair model, Figure 3, M(t) is the probabillty
that the syste. transition from state 1 to state 0 within
the time interval (t 0 -t). For the continuous-time Markovian
repair model
M(t) = P 0 (t) = 1 - pl(t) (2-27)
Absuming p! 0 stationary and pl(tO) 1
N(t) = 1 - e(2-28)
The corresponding expression for M(t) of the basic discrete-
time repair model is
M(K) = 1 - pl(K) =J - (1 - ql 0 .t)• (2--29)
As it is used above, and consistent with the defini-
tion of maintainability, repair is corrective maintenance.
Each different mode of failure of the system requires a dif-
ferent repair action. Thus the basic Markovian repair model
can be extended to descrite the system at any desired level
of detaia. Figure t shows an n-th extension of the basic
17
'rL' ~ ~ ' __ ________ ____ ________.==
0 P0 Pl
?2)
n P22".nn
Fig. 4
n-th Extensicn of the Easic Markovian Repair Model
(from Ref 30:30)
Markovian. reair model. State 0 is the "system operating
satisfactorily" and states 1 through n represent system fall-
ure due to the possible modes -r malfunctions by which the
system can fail. It is reasonable t. assume that the time-
to-repair for each different mode of failure is different
and that P1. = P20= Pn0" For this model,,
nM(t) = D Pi(t0) p!)(t) (2-30)
Other extensions of the basic repair model depend upon sys-
tem configuration and system repair characteristics.
i8
Soecific Applications of Markov Processes
to Reliability and Maintainability
Numerous extensions and combinations of the basic Mar-
kovian failure and repair models were found during the liter-
ature search portion of this study. Representative models
showing the applicability of Markov process techniques to the
estimation of system reliabili'y and maintainability are pre-
sented in this section. Specific attention is given to Mar-
kov process applications in redundant systems and maintenance
systems. Examples of the combined failure and repair models
used to deter:mine the system availability are also discussed.
Redundant Systems. Shooman 'Ref 78) presents an exten-
sion of the basic Markovian failure model: a two element non-
repairable system. Figure 5 shows the graph of the system.
The author makes prov'ision in the model for failure hazards
which are a function of time, i.e., non-homogeneous. This
discussion, however, considers only the homogeneous model
with constant failure hazards, PJK= 2i'
The states of the system are defined as follows:
state 0 = both elements operating;
state 1 = element I failed, element 2 operating;
state 2 = element 2 failed, element 1 operating;
state 3 = both elements failed;
and the transitions may be described as follows:
= the failure rate associated with the •wo states
in question
19
~3a
Fig. 5
Graph of Two-Element Non-repairable System
XAt = the transition probability due to element fail-
ure in time interval Lt.
Using difference-differential equations procedures, the author
develos the following state probability expressions:
Po(t+At) -ri - (Al+ 2 )ItItPo(t) (2-D1)
P[lAt]pO(t) + [1- A 3 t] o(t) (2-32)
P 2 (t+At) = [3,2Lt]Po(t) + [1- YtItP 2 (t) (2-33)
't+At) = D .3•]i(t) + [4At]pp(t) + lp•(t) (2-34)
20
Th, solutions to these equations, with initial conditions
P(O) = ! and p,(O) py(O) = P3 (O) = 0, are
p0 (t) = e (2-35)
Pl(t) = 1 Fe-3 - e 1 (2-36)1 2 3 t
2 - e (2-37)P2+1 2 A1
p3(t) 2. - [p0 (t) + Pl(t) + p 2 (t)] (2-38)
For a two-element redundant (parallel) system, one failure
can be tolerated; and hence, there are three successful states:
Po(t), Pi(t), and p 2 (t). Since these states are mutually
exclusive, the expression for system reliability is
R(t) z po(t) + p!(t) + p2 (t)
or
R~ ) e-(X I + Xe2)t +' [e-X3 t I(k + X 2)t
R(t) =e + + kl X 1 [e+ 2 e3
+X 2 -X 4t - ( i + x2+ l+ f>- k4 e (2-39)
The author points out the complete generality of equa-
tion (2-39) and its usefullness in determining the reliabil-
ity of any two-element redundant system with constant hazard
elements (Ref 78:235). For example, if the hazard functions
are the same regardless o! tho stale of the system, i.e.,
21
I= x4 and X2 = 3) then
R(t) = e x + e e (X 2)t(2-40)
If the elements are identical and the failure rate with both
operating is Xb ard with a single element operating Ns, then
1 = x2 = xb and X3 = X4 & Xs which yields
-X t -2Xbt2•be b
R(t) - b s (2-41)2xb- s
Finally, i b X, then
R(t) = e-t (2 - e-t) (2.-42)
(Ref 43:217) shows that the expected time to system failure,
found by integrating equation (2-40) over the range of t, is
1 1 (2-43)Est = i-i+ - 2 -I+-
when all the units have the same failure rate k. Then equa-
tien (2-43) gives the nean time to failure for' a two compo-
nent system as
E (t) = 3/2X
s x 2
for a three component system, we have
Es(t) = !i/6x
and in genrai for n component •ystemr, the mean time to fail-
ure is
22
nE (t) z e/4. (2-44)s il
where 6 - Here it can be seen that the marginal gain
in the mean time to system failure decreases with each sub-
system added.
If equation (2-39) represents a two-element standby
system initially in state zero, then X, = x and X = 0A 2
since element 2 cannot fail prior to failure of element 1
because it is an unenergized state. Thus X3 = x B and X4 need
not be specified since if P0 (O) - 1 and X2 = 0, state 2 has
a probability which is zero. With these substitutions, equa-
tion (2-39) becomes
- Bt e•At
R(t) = (2-45)A- x B ýA- XB
in a similar application, Zorger (Ref 91) computes the relia-
bility of a "maintened" or repairable two-unit system. In
this model, the author assumes the failure rates are the same
regardless of the state of the system, e.g., X 1 - X and
2 = x 3" The repair rate of component i, ii, multiplied by
the time interval At gives the probability of transition due
to a repair. The graph of this system is shown in Figure 6.
Due to the fact that At is considered infinitesimal,
the probability of a double transition in at is understood
to be zero. System reliability is compcted on the basis of
the first time state 3 is reached; subsequent system repair
has no ef.ect on the reliability of the system. Hence,
23
,D 4
Fig. 6
Graph of Two-Element RF=pirable Syscem
p 3 2 (t) and p3 1 (t) are also considered to be zero. The author
shows that the reliability of the repairable system is
R(t) = 1 - L-p 0 (s) (2-46)
where
PO(s) - LSPo(S)/A(S)
3P(S) -- X! 2 A3 312s + xL + XID + 1j, + 12]
+ x2 A 22s2 + s(2X1 + X2 +
22
+ \+ XX, + +
2i
+ X AI[S' + s(% + 2> 2 + Ul) + X,21 C.
+ XIX2 + 1IX2 + i 1 Xl] + A0O[S3 +
+ s 2 (2X 1 + 2X2 + 1 +2 +s(
2 2 +X+ "2'2 + "1'1 1 2i+ + V2Xi + PlX2
+ 1 1 2) + AX 2X + X X2 +
+ Xl¾l'2 (2-47)
With A, representing the initial condition p (t 0 ) andii1
F= -s (s + X + 12)(s + X2 + V'N/(S + X + X2 )
U2 wX2 (s + X2 + Ul) - ll 1(s + XI + u2 )J (2-48)
letting AO , A = 1, A 0, 14 - 11 - u and X, =A0 0 A1 1 ~'22 '1 2 2.
X, the above expressions are greatly simplified
B(t)= .AeBt Be AtA- B (2-49)
where
A, B (-3X + v) ±--+ 6uV + v'T (2-50)
' 2
using equation (2-49), the author shcws that the maintained
system NTBF is (3X + u)/(2x 2 ).
Zorger makes no mention of system maintainability.
it can, however, be derived from the repairable system model
ty conslkierng only tne repair characteristics of the sys-
tem. Aszuming one repai.rman is available, the system is
25
repaired when either clement 1 or element 2 is operating.
Also, p 3 0 = 0 due to the infinitesimal probability of two
repairs in At. Hence,
M(t) = p 2 (t) + pl(t) - 1 - P 3 (t) (2-51)
By the methods used above,
P 3 (t+At) = P 3 (t)(i- .iat)(2 - i 2!t)
3 13 (ut + u2 At)P 3 (t) (2-52)
p'(t) = -(I. + uI 2 )p 3 (t) (2-53)
sP 3 (r) - A3 3 - -('l + u2)P3(s(2-54)
From the definition of maintainability, A3 3 = 1 and A2 2 =
Al • 0. Thus
sp 3 (S) - i = - + .i 2)p 3 (S)
i.e.,
P 3 (s)(s + I+ +2) 1
1
p 3 (s) = s + 1 + (2-55)3 S-+ Ij + u2)
p 3 (t) = e(Il + P t (2-56)
M(t) - 1 - e 1 2 (2-57)
Messinger (Ref 58) considers a "K-cut-of-h" redundant
ýt....... co-mosed uf identical eiements with constant
26
failure rates and no repair capability. The system will
operate when at least any K of its components are operating.
The states of the system are defined to be the number of
failed components; thus, the number of system states, n + 1,
increases linearly with the number of components, n, in the
structure. The component failure rate in the i-th state
(i.e., when i failures have occurred) is denoted by Xi. The
graph or transition diagram for this system is shown in
Figure 7. The reliability of the K-out-of-n structure is
computed by summing the probabilities of successful system
states 0,1.,2,...,n-k , i.e.,
n-KR(t) E p pi(t) (2-58)
i=0
The author points out that considerable computational effort
can be saved (especially if K is near n) by grouping all of
the system failed states into a single collective failed
state. (This is essentially equivalent to partitioning the
transition matrix into successful and unsuccessful states
as considered by Sandler (Ref 73:100).) This grouping is
permissible because the individual probabilities of being
in the failed states are Qf no interest in ca.culating
system reliability. Once the system enters a failed state,
it canno*, wIthcut repair, return to a good state. Thus,
the system can be modeled by n - K + 1 good states and one
collective falled state, a total of n - K + 2 states. using
a collective failed state, the transition diagram would
27
I 4-3
0
4.3)
* I1
r* cr
28I
appear as in Figure 7 with state (n-1) relabeled as (n-K)
and state (n) relabeled as (F) for the failed state. The
transition probabilities would also require appropriate
modification.
In the special case where the hazard rate is assumed
proportional to stress, i.e.,
S_ n •0 for i = 1,2,...,n-1 (2-59)i n - L "''
and the components are all assumed initially operational,
the author shows that the system reliability is
n-KR(t) = e t (2-60)
i=o
Reliability production formulas for standby redundant sys-
tems have been developed by numerous authors. Equation
(2-45) is an example of the reliability expression for a
two-element standby system. In general, the reliability of
a structure composed of N identical elements, of which N - 1
are in stanby (assuming exponential distributed failure times
and off-line failure rate equal to zero) is given by
R~) _Xt N (xt)JN Z (t (2-61)
J=0 jl
Benning (Ref 9) employs Markovian techniques to
derive the reliability expression for a structure involving
29
N identical components of which at least K (K < N) are re-
quired for system operation while N - K are in standby. The
author shows that
N- KR(t) e"K- "-K (Kt) (2-62) i
and that the mean life (ML) of this configuration is
ML (2-63)
For a similar K-out-of-N system 4n which all N components
a-e on-line, the mean life is
N-K ,ML= - (2-64)
i1 0 M 1
C...paring equation (2-63) and equaticn (2-64) term-by-term
shows that every term in equation (2-64) is less than or
equal to each term in equacion (2-63). This result, as the
author explains, shows that the K-out-of-N standby structure
is more reliable than the on-line structure providing switch-
ing is perfectly reliable.
Kapur (Ref 43:221) shows that for the perfect switch-
ing case Rn(t) - tne reliability function for a standby sys-s
tem that has n subsystems - is
Re- n (et) X /i. (2-65)i=0
wherc X is the failure rate and is identica! for eachn standb.y
30
unit. However, for larger k, the design of switcning cir-
cuitary would be quite complex if it were done automatically.
Hence, it appears that standby redundancy would be most use-
ful in those applications in which manual switching or re-
placement of components would be tolerated (Ref 13:137).
Enstein and Weinstock (Ref 28) consider a similar sys-
tem composed of N units of which m are on-line and N-m are
off-line (standby). Of the m on-line units, at least K
must be operating properly except for an "acceptable dcwn-
time" which is less than or equal to t. units of time. In
other words, even if less than K units are operational, the
system is still considered to be performing satisfactorily
as long as this condition persists for less than time to.
It is assumed that each element of the system has indepen-
dent exponential failure and repair times and that there are
N available repairmen. The authors consider the system to
be in either one of two states, the good state or the bad
state. If thp system has had at least one failure (less than
K units operational) for longer than to prior to the time
in question, it is considered to be in a bad state. "Con-
ditional availability" is defined by the authors as the prob-
ability of being in a particular state at time t conditioned
on the fact that the system is not in a bad state at time t.
Strinivason (Ref 8Q) applies a Markcv model to a stand-
by redundant system where the swltchover is not instantaneous.
Switcho.er time IS c ('-P tn be a random variable and
is accumulated from the Instant action is initiated to bring
31
the standby system to the active state, to the instant at
which the standby unit becomes operational. The author
assumes exponentially distributed failure times, an arbi-
trary repair distribution, and an arbitrary distribution for
the probability that switchover is completed in (0, t).
After deriving a lengthy equation for the probability dis-
tribution of first time to failure, the author then obtains
an explicit expression for the expected time to system fail-
ure. His results show that if cost factors are of no con-
sideration and if the objective is to have maximum MTBF,
"The best policy is to initiate switching action . " just
at the moment the subsystem becomes standby (Ref 80:176).
Kaour (Ref 43:221) considers the case of imperfect
sw.tchicg. . He first looked at a situation where the switch
simply fails to operate when called upon. The probability
that the switch performs when required is ps" For the two-
unit standby system, it was shown that
tR2(t)' = Rl(t) + ps ff(tl)R 2 (t - t )at 1 (2-66)
and for a three-unit system
R3(t = R 2 (t) + p 2r (2-67)
s s s'3
He considered that the switch is a complex piece of equip-
ment and has a constant failure rate of • Thus the re-s
liability function for the switch is
-t¾ (t) -A St s32
32
and the switch can fail before it is needed. When he con-
siders the two-unit standby system, the reliability at time
t is
Rs /U (t .5 t t s >t t 2 >= - t 1 )](2-68)
where tS is the random varlable representing time to switch
failure. Figure 8 shows the success models for a two-unit
standby system.
EI
• EE
o ( Eý
0
t t1 It
Time
Fig. 8
Success Modes for a Two-Unit Standby System (Ref 43:219)
Equation (2-68) becomes
t(t = R1 (t) + ifl(t )R (t )R( ( t - )t (2-69)
33
by substitution for R s(t )
2 / t -x tR R,(t) + ff,(t )e R2 (t - t 1 )•t 1 (2-70)
The author considers the special case where all subsystems
have a constant failure rate X, then equation (2-70) reduces
to
R2 (t) = e + - (l - e , t 2 0 (2-71)s Xs
He finally considers the case of three-unit standby systems
which have a constant failure rate X; he deduced
R(t) " = e EI + - - (1 - e s)]
-+ )[) -eXst
_xt - x te
+ e ( Xl Xs)2 e - ex
- s e, t a 0
(2-72)
Anderson (Ref 2) employs a Markov chain model to
analyze the reliability of a special class of redundant
systems. These systems include those which operate in a
standby mode for a long period of time in anticipation of
participation in a single mission. The system consists of
N modules of which q are active on-line spares. The system
is in operating condition when at least N - q modules are
operating; the system has failed when more than q modules
have failed. The following assumptions are made:
(1) The module rate (X) iS constant.
34~
(2) The module repair rate (p) is also constant and
independent of the number of failed modules (one module is
repaired at a time).
(3) Repairs may be made during the standby interval
but not during the mission interval.
(4) The system has been in the standby mode sufficient-
ly long to Insure its being in a steady-state condition,
with respect to reliability, when it enters the mission
interval.
The failure state diagram for this system is shown in Figure
9. The state number corresponds to the number of fdiled
modules. Transitions from state k to k + I occur at faii.-
ure rate XK and transition from state k to k - . occur atKL
repair rate uK.
•0 I 2 x k
U1 W2 ý13 Pq Uq-i UN
Threshold ofSystem failure
Fig. 9
Failure State Diagram (from Ref 2:22)
35
The general differential equation for this system is
the same as those developed by McGregor (Ref 57). Anderson
manipulates these general equations to determine the prob-
ability of' mission success, Pms' and the expected downtime
per year, DT/Yr, for the assumptions and constraints (nota-
bly the initial conditions) of the given system. Since
there is no recair during the mission interval, the proba-
bility of mission success is simply the sum of the individ-
ual probabilities of being in an unfailed state, i.e.,
qE PK(tm) (2-73)I K=o.P
The solution for pK(trm) involves the solution of the set
of a..multaneous differential equations which apply during
the mission interval. The author derives an explicit equa-
tion for p (tm) which is expressed in matrix form as
p 0 (tm) ý00 0 0 ... 0 0 (tm
Pl(tm) 010 .ll 0 0 IPl(tm r 0)p 2 (t-) 420 12 1 •22 "'" 0 P (tm = 0)
n(t i n0 nl n2 '"nn Pn m
where
36S
k-1 K -(n - r)xt mRKi = f(n- x1) z e (2-74)
x~i r~iT (r - y)
Applying these equations to a two-element redundant system
with module failure rate X, the author obtains
-2xtP0 (tm) e mp( 0 ) (2-75)
[r-Xt -2tm -Xtm(tp'(t ) 2 2e P0 (O) + e (2-76)
[ -Xt -2ýtm
P 2 (tm) - 2 e m + e P0(0)
+ - xtm] D( 0 ) + 1P2 (O) (2-77)
The probability of the system being available, PA'
(i.e., above the defined threshold of system failure) upon
reaching the steady state condition during the standby inter-
val, is given by
q A= ) (2-78)
K=O
The probability of being in state K at the end of the stand-
by interval is
PK(ts n for K = 0,1,...,n
(n K-
(n -J - (2-79)J 0
r,4ally, the expected system downtime per year is given by
37
0
DT/yr 8760(1 - PA) hours/year (2-80)
The author alc presents two sets of curves; one set
shows the probability of mission failure as a function of n,
MTBF/MTTR (mean time to repair) and t /MTEF fcr different
values of q. The second set of curves give the expected sys-
tem donwtime per year as a function of n, q, and MTBF/MTTR.
These curves are helpful in gaining insight regarding the
tradeoffs among the various system parameters. Several other
examples of Markov process applications to the reliability
analysis of redundant systems were found in the literature.
Ir most cases, these additional applications are extensions
or modifications of the examples already presented. The
interested reader may refer to Sandler (Ref 73) for an exten-
sive general discussion of the application of Markov proces-
ses to the reliability analysis of various system configura-
tions.
Maintenance Systems. Brosh (Ref 16) utilizes a Mar-
kovian model to analyze a multi-service maintenance system
with "first come, first served" repair (queue) discipline.
The system is inspected at fixed periods of time, classi-
fied into one of a finite number of decisions. The state
of the system is defined by the number of machines in the
system, i = 0,,.1..,N, where N is the maximum number of ma-
chines the system can contain. There is only one maintenance
crew. However, repairs may be performed in two different
ways. This results in two different probab'lities of
38
completing maintenance in a unit of time. A set of costs is
associated with being in each state and using each type of
service. An additional cost is incurred when a machine ar-
rives for service and the system is full. The author presents
a procedure for choosing the optimal policy for controlling
the system by deriving the steady-state transition probabil-
ities. Maintenance policies are considered by partitioning
the state zpace Into two sets - one containing those states
in which the unit is maintained with service type two. The
measure of effectiveness is the expected average cost per
observation period over an indefinite sequence of observa-
tions. Analysis of the maintenance policy space reveals the
interrelationship betwe,1 system variables and optiz.al main-
tenance policies; e.g., disconnected policies arp driinated
by connected policies except for one case (c2)1(c)11 (12/wl)
and (c 3 /c 1 ) =()/(I) for which any policy chosen will yield
the same value (cl, (X)/(ul)) for the cost function (Ref 16:78).
Natarajan (Ref 64) considers another important aspect
of the system maintenance problem - that o.i assigning repair
priority to a particular type of component. The author pre-
sents a Markovian characterization of an anti-aircraft system
consisting of redundant radars working in conjunction with
redundant computers. System failure occurs only when both
radars or both computers are in a failed condition. As soon
as any component fails, repairs are initiated. If, in the
interim, a component of the other parallel system fails, it
must either wait for repairs or it can preempt tne unit in
39
_________________________________ ____________________________________________________________
repair. The first case is known as first-in, first-out (FIPO)
discipline; the second case is known as preemptive di4scipline.
The author assumes exponential distributions for the compo-
nent failure rates (X and X,2 and repair rates (wI and v2).
He derives somewhat lengthy expressions for mean time to
systeim failure for both the FIFO repair policy and the pre-
emptive policy. When no repairs are permitted, the mean time
to system failure is
1X {5 2X1 - ^2 ,2-81)
E(T) = ( + ) -Ii + I 2 22( +2X + X
The author discusses the problem of optimal priority
assignments to which component ana what priority and compares
variatlons exhibited by the mean time to system failure with
a priority discipline imposed on the repair process of dif-
ferent components. Numerical results provide the author with
the following conclusions:
(i) FIFO discipline for repair of priority components
is more effective than a preemptive disciplIne when > 2
and v 1 >2 or when XI < X 2 and v < P2" Hence, the pre-
emptive disciplines should be used only under special circum-
stances governed by emerging or risk considerations.
(2) When a preemptive discipline is imposed on the
repair process, higher mean time to system failure will re-
sult If > X, and v < i2 (Raf 64:107).1 2
As a final example of the use of Markov processes in
a maintenance environment, Eckles (Ref 26) discusses a system
40
that deteriorates stocastically and is further complicated
by the fact that the state of the system is unobservable
unless an inspection is performed. The author develops a
model to determine the appropriate sequence of actions (re-,
placements, repair:, Inspections, etc.) that minimizes the
total cost of operation (including such factors as downtime,
inef.ficiency, and salvage value). Maintenance of the system
is characterized by a discrete-parameter, non-stationary
Markov process. Prior to each transition, the decision maker
selects one of a finite number of available actions. The
action selected and the system's age determine the subsequent
one-step transition probabilities and the conditional (on the
syst-m states) distribution of the next measurements. Cost
iz dependent on the action taken and on the system's state
assigned to each possible transition. The author shows that
the action that minimizes the discounted value of expected
immediate and future costs (assuming optimum future actio:.z)
is determined by the system's age and the posterior distri-
bution over the states (Ref 26:16). Optimum maintenance
policies can then be calculated using a dynamic programming
method.
Availability. The expression for system availability,
A(t), integrates the probabilities of reliability and main-
tainability. A(t) may be defined as the probability that
a predicted percentage of operations of time duration T will
not have any malfunctions which cannot, through maintainability,
41
I
be restored to service within the permissible repair time
constraint (Ref 5:8). Figure 10 shows the basic availabil-
ity graph for a single component system. The differential
equations for each state probability may be written directly
from the figure (Ref 58:33).
1!- Xat J.- ujAt
uAtj
Fig. 10
Availability Graph for a Single Component System
•Ot, -i po(t)l
IL (2-82)
Pi(t - Pl(t
A'suming the initial conditions p0 0) = 1 and pl(1) = 0, the
solution of equation (2-82) yields for state probabilities
PO +
-(+ + u)t.•(t) = 1- p 0 (t) = •k. - " eU-~t POt A (2-84)
42
The availability of the single component is the probability
that the system is in state zero, i.e., A(t) - p 0 (t). For
"large t, the availability approaches a steady state value
Ass = lim A) (2-85)t + A+
noting that X - !/(MTBF) and i (!)/(MTTR), As can be re-
written as
MTBF (2-86)Ass MTBF + MTTR
Kapur (Ref 43:227) attained the same results for the
availability as the above equations by stating that for the
exponential distribution and for some interval of time Lt,
he stated
posystem failure during At] = XAt (2-87)
and
p[repair during at/system failure] = w4 t (2-88)
He defines the availability function, A(t), by using equa-
tions (2-87) and (2-88) as follows
A(t+At) = A(t)(' - XAt) + [1 - A(t)]pAt
= A(t) - XA(t)At + IiAt - ijA(t)At
or
A(t+At) - A(t) = -(X + p)A(t) + vAt
43
Taking the limit as Lt 0
a A(t) -(x + v)A(t) +Dt
A'(t) = -(X + i,)A(t) + i.'
which is a d.fferential equation whose solution is
U + X e-( + •)A(t) - T- T V + U T X
A lim A(t) = Ut -) X +
Mean repair rateMean failure rate + Mean repair rate
or equivalently
Mean time to failure
A - - 7ean time to repair + Mean time to failure
(2-90)
He stated that, fortunately, it so happens that if the down-
time, p. d. f., is taken as something other than exponential,
the same steady state solution results. in practice, fre-
quently the log normal is used as the downtime, p. d. f.
Kapur defined the intrinsic availability in the sa-me way as
the availability. The only difference will ba that the mean
repair rate in the availability function will be replaced by
the mean active repair rate. Thus the steady state intrin-
sic availability is given by
Mean time to failureI ~Mean active repair time + Mean trime to failure
(2-91)
44
Dayon (Ref 23) utilizes the steady-state availability
concept to analyze a computer system consisting of a data
processor and tape units. The purpose of the analjsis is to
solve for the MTTR of the redundant system. The author
points out that defining the system states and formulating
the appropriate system steady-state availability transition
rate diagram is the step requiring the greatest a-gree of
ingenuity and expertise. By contrast, subsequent steps to
obtain a numeilcal solution for the system MITTR involves
only routine mathematical manipulations (Ref 23:153). The
system under discussion consists of a data processor (unit
#42 requiring two tape units (units #2) for data storage.
A third tape unit is on standby redundancy. A block diagram
of the system is shown in Figure 11. Implied in the model,
but not explicitly stated, are the assumptions of a Markovian
tape system with con3tant element failure and repair rates.
Other features of the system include:
(1) All three tape units are identical.
(2) Any tape unit can be removed off-line and repaired
while the system is energized.
(3) All units are de-energized when the system enters
a fall state.
(4) Repair is performed on a FIFO basis.
(5) The system is re-energized and operated as soon
as there are enough units repaired to have full system capa-
(6) There is one repairman.
45
* ~ # _ - -'.
Fig. U1
Data Processor (#1) and Tape Units (#2)
The assumed failure and repair rates for the respec-
tive units are:
X! = 0.01/hour 1i 1.0/hour
x 2 = 0.02/hour P2 - 2.0/hour
The states of the system are defined as:
State 0: all the units up
State 1: one tape unit down
State f 1: two tape units down
State f2: one tape and the data processor down
State f : data processor down
The steady state availability transition-rate diagram
for this system is shown in Figure 12. The loops showing
the uroba0ility of remaining in each state have been omitted
22
- 3l
Fig. 12
Transition-rate Diagram for Computer System
to enhance clarity. The repair transition from state f2, to
state f3 is indicative of the FIFO repair discipline. if
the policy had been to resume operation as soon as possible,
the repairman would have ceased work on the tape unit when
the system entered f2 and performed maintenance on the data
processor until the system was returned to state 1. The
author develops the availability matrix and substitutes the
appropriate failure and repair rates. Numerical solution
yields (after "normalizing" the state probabilities) a
value for MTTR equal to 1.46 hours.
47 .
11! 1 1 P
A-0.5 2.0 0 1.0 PI
0.04 -2.05 2.0 0 0 Pf (2-92)
0 0 0.04 -2.0 0 0 Pf 2l
0 0 0.01 0 -2.0 0 pf
Grace (Ref 34) discusses the evaluation of steady-state
system availability when there are a limited number of re-
pairable spares for the various types of on-line units. The
particular system considered consists of five on-line vnits
and two spares of a given type, with one maintenance man
(associated with a replacement rate 6) and one repairman
(associated with a repair rate u) assigned to the unit type.
Failure, repair, and replacement times are assumed to be
exponentially distributed. Hence, the availability of I units
of type a, Aail can be determined from a Markov model which
includes all possible states and transitions of the on-line
and spare units of type a. The author includes a complete
transition diagram (33 states) and discusses a procedure for
writing the resulting equations directly from the transition
diagram. For this system, the steady-state probability RiJ
of being in state iJ is obtained by solving the set of equa-
tions:
(nX + sa)l 0 0 J i0l + 6fi, -I
[nx + (s - 0)s]f0 1 = u 02 + 611, 0 (2-93)
etc.
48
Where n is the number of on-line positions, a is the off-line
(spare) failure rate, and s is the number of spares. The
normalizing condition
Z IT 1 (2-94)i ,j
is required for a unique solution. The steady-state proba-
bility that i on-line units are not operating is given by
,= ii (2-95)
if i units have failed, the probability that any 1 units are
good is
(I Z (2-96)
(n) (n)The author shows that the total probability that any 1 par-
ticular units are working is
Aa. I=p Et Z n (2-97)
Schick (Ref 76) employs a Markov process to determine
the availability or "operational readiness" of a system which
is subject to inspection and repair. Except for inspection
and repair periods, the system is kept in a normal mode from
which it is called if its operation is required. Failure of a
primary part causes immediate shutdown, inspection, and
4~9
repair. Failure during checkout is detectable only at peri-
odic inspections while failure of some equipment is liable
to two types of failure. Examples of such systems include
missile launch facilities and computer repair systems.
Grippo (Ref 37) also employs a Markov process to
evaluate the reliability and availability of large, complex
systems. The author's approach enables the analyst to intro-
duce system operation or maintenance constraints without
adding appreciably to the complexity of the solution. Al-
gorithms are derived for obtaining computer-aided transient
and steady-state solutions to both reliability and availa-
bility problems. The author also presents a unique method
for integrating the differential equations that can result
from a Markov process.
Sasaki (Ref 74) utilizes implied Markov process tech-
niques to develop a set of charts and nomographs which can
be used to evaluate trade-off characteristics between system
reliability and maintainability. The author presents a slide
methods (using the charts and nomographs) of achieving desired
values of system (mission) availability for duplex parallel
redundancy and duplex switch over redundancy.
Finally, Hevesh and Harrahy (Ref 38) discuss the
effect of failure on the availability or "readiness" of phased,
array radar systems under different structural and repair
caoacitv conditions. The author also developz . xpressions
for the availability of parallel operated equipment, whether
truly or quasi-redundant.
50
| !INMM
III. Summary of Point Estimates of
Availability, Reliability of
Maintained Systems
Computation of Reliability
and Availability
The techniques for obtaining the reliability and avail-
ability of a repairable structure with state dependent hazards
and repair rates is best illustrated by an example (Ref 58).
Consider the following two components parallel system with:
components xI and x
components hazards XI' X2
repair rates ul' 12
number of reoairmen: 1
In general, five states are needed as tabulated in Table 3.1.
Table 3.1
System States for 2 Component Repairable System
Good States Bad States
0 - x1, x2 3 - !, 2 (x1 failed before x
1 - XI X2 - ½2 1 (x 2 failed before xI)
2 - xl, x 2
The transition diagrams for calculating the system reliabil-
ity and availability are given in Figure 13. The distinction
between the availability and reliability transition diagrams
Is that the availabilitv transition diagram repairs are
51, • 6
I-
ca
4o ~0
"< • 0
" "• 0
.oN7 E-i .-
• " 52
Ix-41
0
Q.)
En0)
41)
ccli
V2
17
(-V'
N - CI X.
-53
allowed from all system states, while for the reliability
tradrsition diagram repairs are allowed only from the gocd sys-
tem states. Therefore, summing the good state probabilities
obtained from the availability diagram gives the probability
that the system is good at time t, while summing the good
state probabilities obtained from the reliability transition
diagram gives the probability that the system has never en-
tered a failed state. System states that distinguish the or-
der in whlch component failures occur are necessary for the
availability model since there is only one repairman, and
according to the repair policy, he will worn on the component
that fails first. If the system is in state number three (Ri'
x2), they, the repairman is working on component x: and the
next transition must be to state number two (x,, x2 ). If the
system is in state number four (x 2 s x 1 ), then the repairman
is working on- component x 2 and the next transition will be
to state one (R2l, x 2 ). Ordering of the component failures
is not necessary iui the reliability transitlon diagram since
once the system enters a state with more than one railie, no
repair is attempted.
The differential equations for the state probabi±ities
are given for availability analysis by
54
[P(2) + 1 0 0 P0 (t)
(t2 + ) '0p (t)p/(t) - A2 0 (A: + u 2 ) I 0 p2 (t)
p3(t)i 0 A. 0 -'u 0 P 3 (t't
. " ( t4( : W 0 - X u tj L "- 2 P4(t)
(3. la)
where
A(t) p 0 (t) + p (t) + P2 (t) (3.1b)
and for the reliability analysis by
FPOt F 1 + 2) 2 J0P"(t) -(A2 + U) 0 0 0 1 (t)
A. t 0 1+(t) + P 2 ) (3 .2b)P 3(t)020p ~ ( ) 0A 2 0 0 ) P 3 ( t )P'4 tj 0 0 01 p 4 tJ
(3.2a)
e ee
R(t) =p(t) +0 p(t) + p2 (t) (3.2b)
Ordering would not have been necessary if there weretwo repairmen. The availability transition diagram is givenIn Figure !1. The reliability transition diagram remainsunchanged.
Tf the components are identical (x =2 X2 = ) and2= U), then states 1. and 2 can be combined and states
3 and 4 can be combined resulting in a greatly simplified
55!
vQ~ 0iE-
0 -
- E- 0
00
56 O
transition diagram as in Figure 15.
I - X'A• 1 -I x + 'A tI
C 2
(a) Availability
I - Vxt1- t +Xt0 XA t x At
0 1 2failes ure failures
(b) Reliability
Fig. 15
Trar,,itioi' Diagram3 for General System of Two identical Components
It should be noted that for the case of two repairmen,
when one component is failed one has the option of assigning
only one of the lepairmen or assigning both repairmen permit-
ting joint effort. When a Joint effort is permitted, the
repair rate •i'= av. Sandler (Ref 73) uses 3 = 1.5.
The differential equatlons for system availability
is, therefore, given b!
5 7
0[PO
p•( 0 M -- " P(t _
where
A(t) p 0 (t) + p 1 (t) (3-3b)
and the differential equations for a system reliability is
given by
"P0' 0 P0O)
Plt 1• - • + L,/) 0 1P t) (3-4a)
#2tJ0 P2(t)
where
R(t) = po(t) + pl(t) (3-4b)
Equation (3-3) is readily solved for the system availability
yielding 2_=r ]A(t) = 1 - X - XX1 rt('
e 2 •_
r1r2 r1 r r r
where r1 and r 2 are the roots of the equation
r + + + X' + /'+ ,)r + (xx' I Vpi"+ W'•/) =0
where
r- + r 2 -(X + V+ •'+ u")
r = r X + xV'U+ (3-6)
58
-A -i
The steady state availability is given by
A = rn A(t) I iss t r, r r 2
Xu"+ (3-7)=xx' + x'W"+ •'•"'
Equation (3--4) is readily solved for system reLiability
ytelding
R(t) - r 3 r 3e -r ite3 (3-8)
where r3 and r4 are the 6olutions of
r 2 + (X + X'+ W r + 0 (3-9)
The roots r, and r4 are clearly negative real since the dis-
criminant of e-uation (3-9) satisfies
b2 - 4ac = + X 2 4 x
X (x - I) 2 + 2'( X + x') + •,2 > 0
Shoorman (Ref 73) gave the solution for the two identi-
cal paralle elements ad K reoairmen. He derived the fol-
lowIng equations with respect to Figure 15. The differential
equations associated with Figure 16 are
PsD" (t) + "- 5s (t) -
"P t) + OL + x)psI(t) =,si - - 10 •
r(0) C (3)3
5A*
1- A t 1t (x + A~t
XAt XAt
s o sI -2 + Xl12 1S2 = 2 12
,' 2X for an ordinary system
X for a standby system
iiu for one repairman
w'= Kv for more than one repairman (K > 1)
Fig. 16
Markovian Reliability Model for Two Identical
Parallel Elements and K Repairmen
Taking the laplace transform of this set of eqlations yields
(s + X')p sO(s) - *sp(s) = 1
-'psO(s) + (s + .,'+ X)PsI(s) = 0(3-11)
-XP ,(s) + sP 2 (s) = 0
Solving the set of equations of (3-11) using Cramer's Rule
yields
(s + X + L')P so(s) s 2 + (X + )/+ uOs + xx'
Ps(s)= 2 + (\ ' (3-12)
S e SLS 2 + ( + X, + 'jOs + XX,]
Soi,'ution for the roots of the denominator quadratic yields
60
r (+ + + + 4
Expanding equation (3-12) in partial fractions
P(s) = (s - rl)(s r
12-
i.e.,
(A + •'+ rI)/(r 1 - r 2 ) (I + u + r 2 )/(r2 - r 2 02
Pso (s - r 1 ) + (s - r')
(3-14)
__ _ __ _ _(rlr2) rX'_(r2 r_,P(s) = (s - r,)(s r2) s 1 + s - r2
ps (S) = ~ - A (3-15
= X /r r 2 XX\/rl(rl - r 2 ) XX/r 2 (r 2 - r1 )1 + 1- 1 + 2 .2s s - r 1 s - r 2
(3-16) "°,
As a check one can sum equations (3-14) to (3-36) noting that
r XX and + ÷ r -(x + X'+ j'). The result is
PsO(s) + Psl(s) + Ps 2 (s) -- (3-17)
Taking the inverse transforms, one obtains
Ps 0 (t) + psi(t) + Ps 2 t = s2
The inverse transforms of equations (3-14) to (3-!6) yield
- + v + r1 rt + -/+ r2 r2t-
P2 r1 -2 - -1 '
61
r rlt X r 2t
PSI~ = r1 r e - r - r e (3-19) I
ps 2 (t) = 1 + r 2 et rl 2 e 2 (3-20)
From equation (3-13) r and r are always negative real num- "kbets; therefore, the time functions are decaying exponentials.
For a system composed of two series elements, the system re--2ýt ". . • . - o '
liability is unaffected by repair, and R(t) = , whichcan be obtained by setting u"= 0 in the expression for ps 0 (t).
For a parallel system (standby or ordinary), the reliability
is given by p s(t) + p l(t). By appropriately choosing coef-
ficlents X/, P, and u' (as given in Table 3-2), a large number
of systems can be mcdeled.
Table 3.2 Coefficients of X u (Ref 59)
Repair/Non-repairable Type Repair Crew X • _
non-repairable parallel 0 2X 0 0
non-repairable standby 0 % 0 0
repairable parallel 1 2X P
repairable parallel 2, no 2X P 2PJoint effort
repairable parallel 2, 2X Bi 2vJoint effort
repairable standby 1 % V
repairable standby 2, no X p 2)Joint effort
repairable standby 2, X Bu 2vjoint effort t
62
Ae
Systems with more than two components are treated also by
Messenger (Ref 58).
p..4i
41
Sit-IL
,La.
el
S5
4,,
So63
41 ....P) ___ __ __ __ ___- -_____ __ __ ___ Z.
ilk
"IV. Confidence Limits for Availabilities
of Maintained Systems
Exact Analytical Methods
40.
An estimate of system availabilty calculated from
time-to-failure and time-to-repair test data will be subject
to some degree of uncertainty due to the uncertainty associ-
ated with the sample estimates of MTTF and MTTR. This chap- .
ter presents techniques for determining a lower confidence -
limit on system availability when time-to-failure and time-
to-repair are independent, exponentially distributed vari-
ables. Also, the case of log normally distributed repair
times will be included.
The Case of Both ExponentialDistributions for the Time-to-Failure and Time-to-Repair
Mary Thompson (Ref 83) presents the techniques for de-
termining a lower confidence limit by making use of the one-
to-one correspondence between availability and the ratio of
mean-time-to-reoair and mean-time-to-failure in order to de-
fine F-distributed variables upon which the confidence limits
are based.
The availability, as defined before, is usually defined
as the probability that the system is operating satisfactor-
Ily at any point in time. This probability can be expressed -7
mathematically es
+ +
~k. e+ ~ 1 (~/8
- -- =--= - -- -~, ~ - -
where
e = system mean-time-to-failure
4 = system mean-time-to-repair
The one-to-one correspondence between availability €/8 is
obvious. The usual sample estimate of availability is
eA (4-2)0+¢
where 0, the sample estimate of 6, is calculated from
n,e = E t 1 /n1 (4-3)i= 1I
where
t = time between the (i - l)th and the i-th failures
= number of failures
and 4, the sample estimate of €, is calculated fromn 2
Z lt 2 j /n2 (4-4)
where
t = time-to-repair associated with the J-th failure
2J
n2 = number of repair actions initiated
It is assumed that t1 (time-to-failure) and t 2 (time-to-
repair) are stocastically independent random variables with
probability density functions
fl(tl) = - e (4-5)
65
and
f 2 (t 2 ) = e 2 (4-6)
If we consider a random sample of n1 times-to-failure
and n 2 times-to-repair drawn from the above populations with
random sample means 8 and 0 calculated from equations (4-3)
and (4-4>. It is well known that 2n08/6 and 2n 2 •/4 are chi-
square distributed variables with 2nI and 2n 2 degrees of
freedom, respectively. Since they are independent due to
the independence of the variables t, and t 2 , it is possible
to define two new variables:
2n 1 /0 2n2/1= n ( - (4-7)
2n2
which is F-distributed with 2nl, 2n 2 degrees of freedom, and
z (4-8)2 =z
which is F-distributed with 2n 2, 2n1 degrees of freedom. The
variable z can be used to obtain a lower confidence limit
for availability A as follows (Ref 83)
pr F 1 _ 2nI 2n ) }i.e.,
S* ~~(2n 2n) -apr { --- F1 - a2 n -
66
pr 1+. 1+4F (2n,, 2n 2 ) = 1 - a
"~I + /+ F -a (2nl, 2n 2 ) +
pr 11A+e + 0 F 1 (2n., 2n 2 ) (4-9)
Most practical cases n 1 n 2 n and equation (4-9) becomes
•r @< A =1 - (4-10){ + € F 1 -(2n, 2n)
The (I - a) lower confidence limits is found from
LCL 6 (4-11)
+ 0 F1 _ (2n, 2n-
A two-sided (1 - a) confidence interval, derived in a similar
manner, is given by
LCL A ^ (4-12)
a + * F1 - aI2(2n, 2n)
UCL = F j.2(2n, 2n) (4-13)
SF! a12( 2 n, 2n) +
Confidence intervals calculated from equations (4-11), (4-12),
a: ! (4-13) cover the true value of availability 100 (1 - a)
percent of the time. The curves of the 0.9 and 0.95 lower
confidence limits on availability calculated from equation
'4-1!) for values (€/6) ranging from 0 to 0.3 and for
6 7
I ~ -=~~ ~ - - - -=~~ ----~-~----~-= -~
for selected values of n between 2 and 50 are given in (Ref
83,.
Suppose for example that during the field testing of
a com~munication system, five T'ailures were experienced. Thie
average time-to-failure e was 125 hours. The average time-
to repair € was 3 hours. Previous experience with similar
systems indicates that time-to-failure and time-to-repair are
exponentially distributed. Independence of the two -,ariables
assumed (in practical terminology, this means that the time
required to fil a failure does not depend on how long the
equipment operated prior to the failure) a point estimate of
the system availability.
0 125 -+125
To find the 90 percent lower confidence limit, compute
-4-= 3 - 0.024e 125
From the curves given (Ref 83), we can read directly the LCL
on availability with ;/^ = 0.024 and n = 5; the 90 percerc
LCL is 0.95.
Lower Confidýence Limits Assuming Lognormally
Distributed Repair Times
Gray and Schucany (Ref 36) introduced the lower confl-
dence limits in the case of lognormally distributed repair
times using the tables of H. L. Gray and T. 0. Lewis (Ref 35).
-thc r ,ndom variables x amA y denote the reoair times and
68
times to failure, respectively, since the availability is
defined by 7
A -I--14-
WY WV
where ux is the mean-time-to-repair (MTTR) and ýy is the mean-X y
time-between-failures CMTBF). 1` is difficult to work ana-..
lytically with the assumption of lognormal x and exponential
j in trying to establish confidence limits on the ratio of
the means. Gray and Lewis (Ref 35) tabulate to some extent
the distribution of the ratio of independent lognormal and
chi-square quantities for known variance of the lognormal
distribution. These tables enable us to establish a lower
confidence limit for availability based on the assumption of
lognormal repairs.
Cray and Schucany (Ref 36) derived the following LCL
for the availability assuming lognormally distributed repair
times. If x and y have lognormal and exponential distribu-
tion, the respective probability density functions are given
by
h(x , L 2) 1 exp nx 1 , x > 0x2 s ]B I (4-19)
0 elsewhere
f(y; Uy Yexp[ , Y 0Y yy (4-16)
0 eisewhere
69
for the lognormal, distribution, the parameter a and 82 are
the mean and variance of £n(x), respectively; that is,
E[inx] - a
var~tnx] -
since
lx= Etx] = exp [a + ½B21
It can be seen that for a random sample of size n from
h(x; a, 6%, the quantity defined by
Q = x/e
where
[n 1 1/nx R(x1-/7)
is the sample geometric mean, is distributed lognormally with
parameters 0 and 6 2 /n. Also for a random sample size m from
f(y), it is known that
n• = 2my/ly
where
• m i(Y)
is distributed as chi-square with 2m degrees of freedom. The
quantities Q1 and 2 are clearly independent. Consequently,
if we let
70
w Q 2Qi 2ecmyuX2 y
then it is known (Ref 35) that w has E. probability density
function g given by
/ exp 2 n 28z
r(r) 2m(2 it)½ 0 2a2g(w)
-- ½w x~wz)r - w~z <w
elsewhere
(4-20)
The variable w can now be used to obtain a lower confidence
limit for the availability:
1
1 + exp[a + 8 12]I
as follows. For a given n/B2 and m, we may obtain a number
b such that
p~w < b C p[2emy < b C
b exp[a 2 /2] -
exp[a + 82/2] < _ _ _ _
by algebraic manipulation
1 + b exp'12/2]R/2m 3Therefore, the 100C percent lower con-fidence limit (LCL) is
given by
LCL2m (4-?2)2,n + b0 exp[82/'2]•
where oc Is obtained from the tables of Cray and Lewis (Ref
35).
Gray and Schucany (Ref 36) gave some figures which show
the lower confidence limits versus the statistic / for
several values of m and n for the single value 82 1.0.
Also, they gave some figures for different 82, but these fig-
ures are insensitive of the LCL to the number of observed
failures m and repairs n that comprise the statistic. This
insensitivity makes the figures difficult to read, since for
any giver problem, it is unlikely that m., n, and 82 would
match those figures.
7L
V. Mont Carlo Comparisons
Confidence Limits for
Availability and Reliability
Summary
Reliability and maintainability engineering are recent
and related engineering disciplines that make extensive use
of mathematical techniques. In analyzing the dynamics of
physical systems, certain probabilistic concepts have been
developed in order to account for and explain random obser-
vations (i.e., failures or repairs).. One of these concepts
embodies the theory of Markov processes, the idea that the
past has no effect on the future except through the present.
The applicability of Markov process theory and techniques to
the study of reliability and.maintainability engineering has
been shown in earlier studies.
A system is designed to achieve a given performance
and its quality is the degree to which it meets this perfor-
mance specification. Performance is normally specified in
terms of the acceptable limits of such parameters .as the
maximum permissible noise or the minimum acceptable output
power of a speech transmission channel, the maximum number
of lost calls or the maximum switching time in a telephone
exchange, the stability of a pow&• supply or the frequency
limits of an oscillator, and so on. It follows that system
failure is defined as a departure from these specified limits.
73
Failure may be defined at many levels whereas only a system
failure gives rise to complete loss of system use. A unit
or sub-system failure may or may not give rise to a system
failure depending upon the presence, or otherwise, of redun-
dancy. Redundancy is a design configuration, as in the dupli-
cation or triplication of units within a control system,
whereby the failure of some part of the system-does not re-
sult in a system failure. Reliability is frequently enhanced
by the use of redundancy, but the total number of unit fail-
ures requiring repair, hence the amount of maintenance, is
usually increased due to the additional equipment. Since the
availability is a measure of the ratio of the operating time
of the system to the operating time plus the downtime, thus
it includes both reliabillty and maintainability.
This thesis concentrates mainly on the confidence lim-
its for the asymptotic'availability of maintained systems.
Confidence limits for the availability A(t) and reliability
R(t) of maintained systems may be obtained in exactly the
same method as applied to the two cases of steady state studies
in this thesis as long as equations for the case have been
derived (see Chapter III, equations 3-5, 3-7, 3-8). Using
Table 3.2, a large number of systems can be modeled.
74
Results
Case 1. Exponential.failure time
Exponential repair time
Number of repetitions = 500
Mean-time between.failure = 100 hours
Mean time to repair = 2 hours= MTBF
Exact availability MTBF + MTTR
1= 00/100 + 2 = .98
Table 5.1 Results of the Double Mont Carlo Technique
Exponential Failure and Repair Times
Sample No. of Trials, Actual Percentaie CoverageSize :per Repetition 95%- 90% 85% 80%
10 100 100 91.16 85.8 80.2
20 100 100 89.-6 85 79.2
30 100 100 89.8 85 80.2
10 200 95 89.2 84 . 2 79.4
20 200 95.6 92.4 87.8 82.2
30 200 95.6 90.8 87 82
10 500 94.4 90.4 85 80.4
20 500 94.8 90.6 84.8 78
30 500 95.4 90.8 86 82.2
75
Case 2. Exponential failure time
Lognormally repair time
Number of repetitions = 500
MTBF = 100 hours
MTTR = 2 hours
Exact availability = .98
Table 5.2 Results of the Double Mont Carlo Technique
Exponential Failure Time and Lognormally Repair Time
Sample No. of Trials Actual Percentage of CoverageSize per Repetition 95% 90% 85% .80%
10 100 100 92.1 91.3 87.4
20 100 100 88.9 82.3 81
30 100 100 91.4 86.2 80.5
10 200 96 92.3 88.5 82.3
20 200 99.3 92.2 86.1 82.8
30 200 99.3 93.6 88.1 81.4
10 500 99.3 93.8 87.4 83.6
20 500 99.3 93.6 86.7 82.8 '
30 500 -9.3 92.8 88.1 81.5
76
We get accurate results in the case of exponential failure M)
time and exponential repair time. A set of optimistic re-
suits was obtained in the cabe of the lognormal repair time
W.ith exact availability .98; that is, when repair times were
correctly assumed to be lognormally distributed, the coverage
was greater than in the situation of exponential repair time. A:4-
The same conclusion has been obtained by Gray and Schycany
(r.ef 3 etcer results car. be obtained wiuh smaller exact
availabil"ty.
€1
*Ark
-t,
ii
B=•,,±O.•A0dH
J l;
S:It
ii@
-:E . .I• .
1. Amstadter, Bertram L. Reliability Mathematics Fundamen-tals; Practices; Procedures. New York: McGraw-Hill, Inc.
2. Anderson, D. E. "Reliability of a Special Class of Redun-dant Systems," IEEE Transactions on Rel!abillty, R-18:21-28 (February-T9 9).
3. ARINC Research Corporation. Reliability Engineering,edited by William. H. Von Alven. Englewood Cliffs NJ:Prentice-Hall, Inc., 1964.
J. Arndt, K. and P. Franken. "A Random Point ProcessesAppl ied to A-alailt Anal- I.dtoAa~lab-liy Analysis of Redundant Systems
with .Repair," IEEE Transacti ons on Reliability, R-26:26-269 (197,.
•. Barber, C. 7. "Expanding Nantainability Concepto andTechniques, " 15,Th Transactions on Reliability, 0-16:5-9(Ma •:'a"967).
6. Ear-ow, R. E. and F. Proschan. Availability Theory forMuitfcomoonent System Mu-tivariate Analysis III. 1973.PP. 3I9-335.
7 Barrett E. H., R. N. Miller and M. J. Smith. "Comput-r z ed 'areov System Effectlveness Mcdels," Proceedings
of the 169 ".nnual SSmposium on Reliabilitvy 535-L2January, 1969. 'IEEE Cataog No. 9-8-R)
8. Bazovsky, i. Reliability Theory and Practice. EnglewoodC s : Prentice-Hall, Inc., 1
9. 7enning, C. •."Reliabit Prediction for Standby,-;" Redundant Structures," IEEE Transactions on Reliability',-- 6 " 36-137 (December 1967).
10. Etrrklu , K. F. an:d J. K. Dvers. " .,l.a.lilty:ixact Bayesian Intervals Compared wi'h Fid'icial Intervals."
ETransactions on Reliability, R-2P : 199-- (uust
1m1 1r -am, J. E . and A. H. Moore. "Estimation of :MlissionneliaDi!-Ltv from ultpe . ndeendent roued Cens-red,
Tvzi., " .týELcns on Re a1j 1,i-t,- 7z- r -7 -6
1. "Estimation of R'eliability fr - ul.tie indeuen-ltroued Censored Zamo_,es W.zh n.iurc r.;K-,,TOE ... C..... .... . !'" -•= . , • .. " --
I-rhe •.c ,
_.,." erich "Applications of
13 Blakney, K. Z. and F. L. Dietr ch. nMarkev Processes to Reliability and MaintainabilityEngineering." Unpublished master's thesis. GRE/MATH/66-3114. AFIT/EN, Wright-Patterson AFB OH, December 1969.
i•. Bovaird, R. L, "Characteristics of Octimal MaintenancePolicies," Mana__ement Science, 7: 238-253 (April 1961)
15. Brender, D. M. and M. Tainter. "A Markovian Model forPredicting the Reliability of an Electronic Circuitfrom Data on Component Drift and Failure," 1961 IREInternational Convention Record, Part 6. 230-24I7March, 19.
"_6. Br-sh, I. Maintenance Policies and Disciplines InMulti-Channel Syems and Markovian Multi-ServiceSystems. Technical report no. !-. ew York: CclumtlaUn1versity, November 1, 1966. AD 650792.
17. Carlen, Rowland. A Practical Approach to •e•eablilt•.London: Mercu. . '7
1• . Chowd, D. K. "Availability of Some Reoairabie ComnuterSystems," IEEE Transactions on Reliability, R-24: 64-66(April 1975-
19. Clarke, A. Bruce and Raalph L. Disney. Probability andRandom Processes for Engineers and' Scienists. NewYork: John Wiley and Sons, inc., 1970.
20. Constantinides, A. "An Oneratlcns Research Aonrcach toSystem Effectiveness," Proceedings of the 1969 AnnualS, osium on Reliability. 250-255. January 1Q99.kIEEE Catalocr No. -FQ.c . and .. -ler. The Thecry of Stochastic
Processes. New York: John Wiley and Sons, Inc., ,9b5.
2. ...•�-illon, E. S. and "har-on -'nir niering RelIability,New Technicues ana A..]ications. " York: Jo*n 1le:and Sons, Inc., 8i
23. Toan' L. R. "Solving for the .. TT of Comrriex Systems,"-rcc'eeoinzs of' the :c~9 Annual S'oosium • on'5 -, 1. January I E- -- ' a taf3a~o •o ""-"
and .... Sersenbru;:e S"Scvl.,o Comn-ýex 7e-1ia-b il!1t,' Models by Coriiutler," ' oc-se*ThflA.r~nual• Ev','rnoSlun on celi~h!!itv . / uR-1 4'3 li anur / '
,.'• =,c'c'__,.~ ~~ ,a•, _ nl u• aU t;rý-
23. Earrnest, C. A. "Estimating Reliability After CorrectiveAction: A -ayesian Viewpoint." Master's thesis.Monterey CA: U. S. Naval Postgraduate School, Mlay 1969.
26. EckIes, J. E. "Optimum Maintenance with IncompleteInformation," Ooerations Research, 16: 1058-1067(Sertember-October 1968).
27. Efron, B. "Eootstrap Method: Another Look at the Jack-knife," Annals of Statistics, 7: 1-2. (January 1979).
25. Epstein, M. and G. Weinstook. "Reliability with Allow-able Downtime," 1969 Annals of Assurance Sciences,E"ghth Reliability and MaIntainab-lity Conference.36-39. July 1969.
29. "Evaluatink h= KTI M-ont Carlo Method for System Relia-bi!!t; Calculations," IEEE Transactions on Reliability;,R-23: 363-372 (December-- 979).
3C. 7--Ferton, R. A. "Further Acplications of Markov Processes!o Reliability and Maintainability Enr•.ýeeri• n ."..ncu.-
!isned mast.er's thesis. GRE/MATH/69-2 AFT / NJ Wri Cght-
Patterson A,. OH, December 19.
32. Oatliffe, T. R. Accuracy; Analysis for a Lower ConfidenceLimit Procedure for Systemi ?-. ati 1-. .YMonterey CA:]iaval Postgraduate School, I " D. AO3il7. Aval nble
fromrNTIS (Nat4onal Technical information Service).3 2aer, D.. and 4. Mazundar. Statistical Estimation
in a Problem of $Sstem Reliability. Iqnragemeent SciencesResearch Report No. 69.' Pittsburgh: "arnegie Instituteof Technclog,:, February 1967. AD 648085.
33. Gerlach, r,. and "t.. Warmuth. "Some Remarks to SystemAvailabi-it,:," Mathematical Cnerations - Statistic, 7:
34. Grace, K. ".rproximate System Availability Models,"Proceedin=s of the 1969 Annual Symnosium on Reliabity1`5-2 aal 7T EE Catalog -c .... 9C6-R. ,
3 Gra-y, H. L. and T. 0. Lewis. "A Confidence Interval fcrthe Availability," Technomet r', 9: 465- 4 1 (August
- --- ald 1La, m S cn-ca n cy. "Lower CO '0 d"'ece LimitsPor A••a'a-!t issumln; LC'gnor,,m-all' [ ' d+''it'.e ... .Reoair
..... _, ran-L act' ipPs o-eliabIt y , Co[
.Io~-..
Qi!
37. Grippo, G. "Evaluation of ReliakAlity and AvailabilityMeasures for Large Complex Systemns," Proceedings of the1969 Sprin Seminar on Reliabiltty, Maintainability -Parts, Microelectronics, Syster.s. 33-49. April 1968.
38. Hevesh, A. H. and D. J. Harrahy. "Effects of Failureon Phased-Array Radar Syste'-_," 7EE Transactions onFeliab-tlbt, R-15: 22-31 (May 17M,.
39. 'Hillier, F. S. and G. J. Liberman. Introduction toOnerations Research, Third Edition. San Francisco:Holden-Day, Inc., 1980.
43. Jeiinek, F. Probabilistic Information Theory. New York:
McGraw-Hill Book Co., Inc., 1,66
41. Kamat, S. J. and V. W. Riley. "Determination of Relia-biit' Using. Event-based Mont Carlo Simulation, IE-Transactions on Reiibitv, R-24: 73-75 (April 1 )
2. Kamien, . I. and . L. "Maintenanceand Saleage for a Machine Subject to Failure," Manage-ment Science, 17: 495-504 (April 1971).
"3. Kapur, K. C. and L. R. Lamberson. Reliability in BEngb-neering Design. New York: John Wiley and Sons, inc.,
1 7.
" "4 Kaufmann,,, A., D. Grouchko and R. Cruon. Ma•hemati*alMoidels for the Study: of the Reliability of Systems.New York: Academic Press, Inc., 1977.
.•5. Kinderman, A. J. and J. G. Ramage. "Comcuter Generatýlcnof Normal Random Variables," Jourral of the AmericanS tatirtcal Association, 71: .
ance Protlems. Technical •nozrt no. 2". New York:Statist cal E ne e r in zrc-, Clbi" a . e rsit-December A96L A 6D 6 7•7;.
Tanon, • . " "AMon-. 'arlo Technique iXr A' r aiS......e~ ±n , lence Limiti s ,
istribu~tU- " Unublished master•s t-ei 7-
A.../ET , \ro- h• t -tte-son AFB OH, 19-_7. 7'
"Levey, L• .T d, IA- ooreW "A 'Mo't a] -1or ObSyain em Reliability Con. -e
C16:onen- Tes 7ata E Transac -s cn e3 7I a.-1i1n- i : 6•-_72 ( 6 )
¶9. loyd, D. A. "Reliability Growth," Bulletin of theInstitute for.M~athematical A~pplications, 12: 2'47-2;1
50 Tocks, M. 0. Mont Carlo Bayesian System 'Reliability-and MT-BP-Confidence Assessment. Tecn~c-al report,777L-uT-75-144, Air Force Flight Dynamics Laboratory,Wright-Patterson AFB CH. AD-AQ157068.
51.---------ont Carlo Bayesian System Reliability - and611TBF-Confizdence Assessment 11, Vol. 1: Theory, Vol. I:I:SPAR'S - 2 U'sers Manual. Tech:71cal report, AFFDL-TR-75-18, Air Force Flight Dynamic Laboratory, Wright-Patterson AFB OH. ADI-A025820.
--2 --- Reliablit-;', ',aintaIn~abilit,,,, and Arail ab~li tyAssessment. HoreePakY: iayden Book Co., inc.,
1973.
T3 notor, S. N. oA no Carl o T-chnixoe o'r Acporxlmat~np-S*2-z.tem- Reliabllt,: Conf--'aence Limits from Comoonens
Fali 're est- Da-!a. Unpublished nazter's thess.G/M4A ' -9. 1 AT / E, 1 -Wright-Patterson AFS OH.
Ma. i~ntainability Specilficatlcn of ?earTime Reouire-merits," IEIEE Transactions on Reliability, R-29: 13-16(April 1-980).
~.Mann, NN. R. and F. E. Qx-rubbs. "Approximately OptimumConfidence Bounds for System Reliabil-ity Based on Cn-,cnent Test Data," Technoinetrics, 16: 335-3i47 'Augu.st 1974).
,Ray E. Schafer and N'ozer D. Singpu-rw-a-la. Netho d sfor~ ~ a, s ca!Avsls of Reliability and, Lifeý rD'a ta .
Ne;.' "' o r'. ,Th-, Wiley and Sons, Tnc. , 197,11
57. "IAý. "Anoroximation Formula t'or Fýýe'iaib;ltT .WT Tansactions on. Reliabilit T.1
6!4-97 Deceriuter ?T
_____ t_____ -ailabil-t.. Research' Recort- TL~j> ok Poly'technic lnstitut~e
cOC j--n S' ''q.L
on! Carlo -ai r ; z!ncom Var hies,"1 3E 7Tr n s-
.)U -or-e '-"Rxe~~ru or*t "arlo Techr'ique GU;;ir;
6 E cI C DaL c, crced L~ ialna I e on ean.
61,-- H. Leon Harter and Robert C. Snead. "Comparison
of Mont Carlo Techniques for Obtaining System Rella-bility Confidence Limits," .EEE Transactions on Rella-bilit, -29: 327-331 (October 1960).
Munroe, A. J. "System Effectivenesz of Military SystemsMathematics and History," presented at George WashingtonUniversity, Washington DC. Program of Continuing Educa-tion - System Effectiveness Course. June 1968.
63. Nakagawa, T. and Yasui, K. "Approximate Calculation ofSystem Availability," IEEE Transactions on Reliability,R-26: 133-134 (June 1977.
64. Natarajan, R. "Assignment of Priority in improving Sys--tem. Reliability," EET ransactIons on Reliabilit:i, R-16:m34-110 (December TE77.
" "- .crking Eibi ozraphv -n ivarkov ?enewaProcesses and Their Applications. Mimeograph Series No._40. Lafayette I`D: Purdue University, Decartment ofStatistics, February 1-968. .. D -778.
F n . " iJat A'-.I na ysIs T sin F o wgraphs."
Unpublished master's thesis. GRE/1EiE/63-3. AFIT/EN,Wright-Patterson AFB CH, August .. 63.
67. "Ootima_ Periodic Maintenance Pc! icy for Machines Sub =e-to D,ýteroraticn and Random Breakdow_'n," !EEE 'ra.nsacti.nson leliability'1 R-28: •28-30 (O.tober ....
.?apzo-lcu, 1. A. and E. P. Gy-ftopoulos. ":,':rkov Processor .Reliability Analysis of Large Systems," ____ <r ns-
actions on Reliability, -26: 232-237 (Augut •97--,
69 .Put: :e. B. "A Univarla-e Mont Oario 7Technique tocrx amt•ate RPel1abiIity Confide-ce _ -.- c Sosteins
with "or'onents Characrerized bJ tI; I,,'i., -,tzcn. Unpublished master's t GC:/A,',9D-,. t.n T.
't,*r`l_,g-h-atterson aF'3 OH.
70. T- . "Reliab"ity Systoo in 'ai- Envlror men>"
7. "Reliabi•_ly and M•ntalna-tit .1 "1 1o," ... FE .actons cn Reliability Q:3 2 60 oEb_ e r -f .
I a7 '..._.2 8 26- r:- , .... ; 71i---7? -Ot b r Qe2 I"-.liaiit Evaluat ion. 12o r Iths. Comet o. "s;.
... i. s_ of CoT-ci et= e [ .. 3 k , zi-; ? • n 9c - • •• •
aýý :i R 2ý: 2
73. Sandler, 3. H. System System R.eliabillty "Engý_neering.Englewood. Cliffs NJ: Prentice--Hall, Inc., 19-53.
7~4. Sasaki, M1. "1Slide Methods for Redundant Mission~ Avail-abillty,," -rocceedini-s of th~e 1969 Annual S~ympo-Siim onR1iat.lity. _ý7,-07- January '96. (IEEE Catalog Nc.
75.---- and T. Yana'. "A~vailabilities for Fixed PeriodIcSy'stem (ýwIth Su~pp'ement)," IEEE Transactions on Relia-bility, R-26: 300-302 (October 1977).
76. SGik . J. "markov ?:oessand Some Ae-osoaceApplications." Paper presented at the 48th annual meet-Ing of Specific Division of American Asszociation for the
Advnceentof ýScie.1e, Los Angeles CAJn-210777. hw L. and ?.L. Shoom-an. "ConfidenceŽ Bounds arid Prcn-
acgatlon. of Ur.niertafnties in System AvaIlatilItv ari Re-ibIIty Computat' -ons," N-aval Research Ic-7ist"c~uatory, 3: 673-6E(1T)
7-. Mhcr. L. ' Frobabilist.ic Relialof11t: An E-cIneer--i Aotcroach. N.ew York: McGiraw-Hill Book Co. , Inlc.
7?. mth avsJ. and A.lex 11. Babb. Ma-*nrainabilit
7.n:-neei~ . Ye JoDk: ýhn Wiley and Sons, Inc. ,'Q73.
3:.~~ as ;~orn, -.~anOdb R~ecunulaii ', o hNn'mncu s w i '-; ~- e r T.EE.E Transactions on Pella-
tif- ty R-17: 175 17-n S te-mier 19'-1).
QI. ý'-e -e iab'o!'--,'ý'-a'ntalnabilitv. textbook, 'Sec-tion SI trouý-r` 1', a17 Wricght -'atterson CF H.
3.Thompson, Mary. "ILow~er Confidence -imits and a Test OfFHypothesis for System Availability," 71 msansacticns
~R e ab'4 1 it y , 7- 1 : 3 2- 36 ( Ma., I96D
Tin, Htun, L. "Rel11ab il _ty -- ed1 t ior) In, s zu'e½-
A, S cQo
0cs - ri an 'U~" ~ En ~ ~c a -c -s c lI 1 t v _ 1 i-
6. Tuokwell, F0. "On the First Exit- Time Problem fo,-r:emporarily H-omogenous Markov ProccsseS," Jourr.al oApplied Protbailit", 13: 3q-48 (196)
87. Tlumolillo, T. A.4 "et. for Calcuilatiz t.he Rýellab4l-1ity ?unctio'.i for Systems SubJected to Randiom Stresses,"IElEE Transactions on Reliability, 3-23: 256-262 (Octbber19%4).
83. Wash - ¼3I0 Reactor Safety Study - Appendix 2 - FaultTrees. 55. Octoberfl75. PB:2 48efl372G. (Avail-ffTeT-from. INational Tecnca fnormat'1 onzSr~e N)
Weiss, G. H. " Probem In au enM-tenaMl-anagement Scince 266-277 (Januart I,065Y.
9C. W~olman, W-. "rri-c bems inSytm eialw:Alss,
iae4ino 1%S~' e ,,n IverIs t, of WIs cocnsi r~ezss,1963, 4
91. longrer, P -i.and T. J.iarmv,-.ak<. L~tlt etf'4 , ied- to -",crn S.,tlems. ecnal eor %
5? 1Ž739. Baltimore MD: Mri-a~ta etme131,1962.
APPE-NDICES
APPENDIX A
MAJOR REFERENCE SOURCES
88
Guides tc Books
:. Card Catalog
Guides to Periodicals, Journals, Reports, etc.
I. Applied Science and Technology Index
2. Defense Documentation Center Bibliographic Search Facil-
ities
3. The Engineering Index (Engineering Index, Inc.)
4. Guide to library Research Resources (Air Force Instituteof Technology Library, August 1969)
5. international Abstracts in Operations Research (OperationsResearch Society of America)
6. International Aerospace Abstracts (American Institute of
Aeronautics and Astronautics, Inc.)
7. Mathematical Reviews (The American Mathematical Society)
8. National Aeronautics and Space Administration LiteratureSearch Facilities
9. PANDEX Current Index to Scientific and Technical Litera-ture (CCM Information Sciences, inc.)
10. Quality Control and Applied Statistics (Executive SciencesInstitute, Inc.)
11. Reliability Abstracts and Technical Reviews (NationalAeronautics and Space Administration)
12. Scientific and Techrnioal Aerospace Reports (National Aero-nautics and Space Administration)
13. Selected RAND Abstracts (The RAND Corporation)
!14. Statistical Theory and Method Abstracts (The InternationalStatistical Institute)
15. Tecnnical Abstract Bulletin (n-fense Documentation Center)
1. "•. S. uovernmernt Research and Development Renorts (Clear-ing House for Federal Scientific and Technl•cal Information)
89{
I'I
APENDIX B
GENERAL REFERENCE BIBLIOGRAPHY
90
i. Aggarwal, K. K., K. B. Misra and J. S. 3upta. "A FastAlgorithm for Reliability Evaluation," IEEe Transactionson Reliability, R-24: 83-85 (April 197--7.
2. Barlow, R. E. and F. Proschan. Mathematical Theory ofReliability. New York: John Wiley and Sons, Inc., 1965.
3. Breiman, L. Probability and Stochastic Processes. Boston:Hought on-Mifflfn, 1969.
4. Chambers, J. M., C. L. Mallows and B. . Stuck. "A NewMethod for Simulating Stable Random Variables," Journalof the American Statistical Association, 71: 340-344Apri l76.ATE)i_1 1-97, .
5. Chung, K. K. Markov Chains with Stationarv TransitionProbabilities, Second Edition. New York: Springer-Verlage, 1967.
Cramer, H. "Model Building with the Aid of StochasticProcesses," Techncmetrics. 6: 138-159 (May 196ý).
7. Endrenti, J. "Algorithms for Solving Reliability Modelsof System Exposed to a 2-State Environment," IEEE Trans-actions on Reliability, R-24: 281-285 (October 1 T
8. Feller, W. An Introduction to Probatility Theory and ItsApplications, Vol. I.. Third Edition. New York: JohnWiley and Sons, Inc., 1968.
9. Fussell, J. B. "How to Hand Calculate System Reliabilityand Safety Characteristics," IEEE Transactions on Relia-bility, R-24: 169-174 (August 1975T.
I). Karter, H. L. and A. H. Moo:e. "Asymptotic Variancesand Covariances of M... "u..um-LIkelihocd Estimates fromCensored Samples, of the Parameters of Weibull and GammaPopulations," Annals of Mathematical Statistics, 38:557-570 (April179--6•--
11. -------- "Local-Maximum-Likelihood Estimation of the Para-meters of Normal Populations from Singly and Doubly Cen-sored Samples," Biometrika, 53: 205-213 (August l966).
12. "Maximum-Likelihood Estimation from CensoredS.n.e.. .f thtI:• of a LoDLztic Di-tributilon,Journal of the American Statistical Association, 62:575777 T~ue. T--967).
13.----- "Maxmum-Likelihooa Estimation of the Parametersof Gamma and Welbull Populations from Complete anraCensored Samples," 7echnometrics, 7:639-643 (November1965).
91
14. Jayachandran, T. and L. R. Moore. "A Comparison ofReliability Growth Models," IEEE Transactions on Relia-Llity, R-25: 4 49-51 (April 1797)--
1=. Kemeny, J. G. and J L Snell. Finite narkov Cha•s.
Princeton NJ: D. Van Nostrand Co., inc., 19-66.
16. Miller, I. and J. E. Freund. Probability and Statisticsfor Fineers. IEnglewood Cliffs NJ: Prentice-Fall,Inc ., 1365.
17. Neuman, C. P. and N. M. Bonhomme. "Evaluating of Main-tenance Policies Using Markov Chains and Fault TreeAna>ysis," TEE Transactions on Reliability, R-24:37-45 (April 197-.
18. Olsen, D. E. "Estimating Reliability Growth," IEEETransactions on Re'Liabilty, R-26: 50-53 (Auril 1977).
19. O'Neil, T. S. "System Reliability Assessment from ItsCAomponerts, Apled Statistician, 21: 297-320 (172.
?0. Singpurwaila, N. D. "Estimating Reliability Growth (orDeterioration Using Tme Series Analysis)," Naval Research
-~stIcs Quarterly, 25: 1-14 (Aoril 1978).
21. Szringer, M. D. and W. E. Thompson. "Bayesian Confi-dence Limits for the Product of N Binomial Parameters,"Biometrika, 53: 611-613 (December 1966).
2.-- -"Bayesian Confidence nimits for the ReliabIlityof Cascade Exponential Systems," IEEE Transactions onReliability, R1-16: 86-89 (Septemb-er1967).
23.---- "Bayesian Confidence .imits -for the ReliabilityRedundant Syste:ns W-1hen Tests Terminated at First Fail-ure," Technometrics, 10: 29-36 (February 1968).
24. Thompson, W. E. and P. A. Palicio. "Bayesian ConfidenceL_-nits for the Availability of System," IEEE Transactionson Reliability, R-24: 118-120 (June 197-7-.
25. Vit, P. "On the Ecuivalence of Certain Markov Chains,"Journal of Apclied Probability, 13: 357-360 (April 1976).
92
APPENDIX C
SUPPLEMENTAL BIBLIOGRAPHY
93
1. Althaus, E. J. and H. D. Voegthen. "A Practical Relia-bility and Maintainability Model and its Applications,"Proceedings of the Eleventh National Symposium on Relia-bilitý, and Quality Control. 41-4b. January 195.
2. ARINC Research Corporation. Maintainability Prediction.Publication No. 267-02-6-42C. Washington DC: Aeronau-tical Radio, inc., 31 December 1963. AD 428817
3. Barlow, R. E. and E. M. Scheuer. "Reliability GrowthDuring a Testing Program," Technometrics, 8: 53-60(February 1966).
-. and R. Proschan. Statistical ThorZ of Reliabilityand Life Testing. New York: Wolt, Rinehart and Winston,
5. Barzilovich, E. Y. and V. F. Voskoboev. "Markovian Prob-lems in the Maintenance of Aging Systems (A Survey),"Automation and Control, 28: 1890-1904 (December 1967).
6. 5iondi4, E., et al. "Analysis of Discrete Markov SystemsUsing Stochastic Graphs," Automation and Control, 28:
7. Borgerson, B. R. and R. F. Freitas. "A Reliability Modelfor Gracefully Degrading and Standby Sparing System,"IEFE Transactions on Computers, EC-24: 517-525 (May 1975).
8. Bosinoff, i. and S. A. Greenberg. "System EffectivenessAnalysis: A Case Study," Proceedings of the 1967 AnnualSymposium on Reliability. 40-53. January 167.-(IEEECatalog No. 7-'50).
9. Butler, D. A. "The Use of Decomposition in the OptimalDesign of Reliable Systems," 0Oerations Research, 25:4509-468 (May-June 1977).
10. Chou, T. C. K. and J. A. Abraham. "Performance/Avail-ability Model of Shared Resource Multiprocessors," IEEETransactions on Reliability, 5-29: 70-74 (April 198t-.
11. Coleman, J. j. Evaluation of Automatic Checkout forSystem Availability, Reia-b-i-ity an-d M-intanab•!TFyConference Technical Papers. 474-482. May 1953.
12. Dal Cin, Mario. "Availability Analysis of a Fault-Toierant Computer System," IEEE Transactions on Reliabil-
_6 -2.. .. .. . I ug 'A_ ... •
9)4
3.Dermar., C. "On Optimal Replacement Rules When Changesof State Are Markovian," in Mathematical Optimiiationlechniues, pp. 201-210. Edited by Richard Bellman.Los Ang~eles CA1: universit.y of California Press, 1-963.
P4I. Eliot, C. C. and W. H. Sellers. "Autocmated MissionAnalysis by Markov Chain Techniques," IE7EE Transactionson Aerospace and Electronic Systems, S'u-PFeme-nt,- AES-2:27-209 Ziuly-7-67 T
15. Emoto, Sherrie E. "A Problem in Equipmnent, Mainternan:!e,"Manageement Science, 8' 266-2717 (April 1962).
`6.---------and Ray Z. Schafer. "Orn the Specification ofRepair Timne Requirement~s," IEEE Transactions on Relia-bifllt, R-.29,: 13-16 (April l9700- ___
1-7. Evans, I. G. and A. H. M. Nigrn. "Bayesian 1-Sample Pre-.diction for the 2-?arameter Weibull Distribution," IEEE
`rnatoni or. Reliabilit3,, R-29: 4'0-ý`4 (December
18.l F'inkelstein, Jack M."Starting and LimitIng Values forR-eliabllity Growth," IEEE Transactions on Reiiati:,R-28: 111-114 (June T9_79). ~lti
19. Plehinger, E. J. "A General %lodel for the ReliabilityAnalysis of System Under Various Preventive M-aintenancePolicies," Annals of Mathematical Statistics, 33: 137-156' (March 1962).
20. Friedman, H. D. "A Universal Stochastic Model for Cer-tain Reliability Queuing and Cont~rol Systems," Sulletinof' the Operations Research Society of America, 77 1.3--
21.Gayr, . F, Jr "'2Ime to Failure and Availability ofParalleled Systems with Repair," IEEE Transactions onReli-ability, R-12: 30-3z8 (J3une 19737.
22. Ge-ha~rt, L1. IS. and W. Wolman. "A Probabilistic Modelfor RelIabil~Ity Estimation f'or Space System Analysis,"Blulliet-n de l'Institute int~ernatioa eSaýt~e33e session (1961).
4'ý. Hatoyana, Tuklco. "Reliability Analysis of 3ý-St-ate S','s-tern," IEEE TrFensactions on Reliability, R-28: 3,86-393('ecember 1-979).
-4. Jennings, HI. A. "Reliability and Maintainabilitv Anal-Y sls of a Twc-'Lear Manned Spacecraft. Mission," Journaloýf 'Sr,,acecraf't* and Rockets, 6: 327-33ý-6 (.March 1_969.
95
25 Kig Ter T T.i
25. K~ng. Terry L., Charles E. Antle and Russell F. Kappennan."Samloe Size for Selecting Most Reliable of K Normal
(Lognormal) Populations," IEEE Transactionis on Reliabil-, R-29: L44- 6 (April 1897).
26. Klein, M. "Inspectlon-Xaintenance-Replacement Sched-"ules Under ý-arkcvlan Deterioratlor.," Management Science,9: 25-32 (October 1962).
27. Kilon, Jerome and Kenneth C. Klotzkin. "Analysis Tech-niques for Maintainability Specification," froceedingsof the 1967 Annual Symposium on Reliability. 50--5•7.
28. Kneale, S. G. "Reliability of Parallel Systems withRepair and Switching," Proceedings of the Seventh NationalSymposium on Reliability and Qulity Contr7ol.129-133.january 19-'l.
29. Kontoleon, J. M. and N. Kontoleon. "Reliability Analysisof a System Subject to Partial and Catastrophic Failures,"1ZEE T'ransactions orn Reliability, -_-23: 277-278 (Octoberl9•).
3C. Kraut. W. K. Probabilistic Derivation of Ootimal Main-tenance Schedules for Cyclical Eauioment. TechnicalReport No. IATF-EN---077.Lkehurst NJ: U. S. Naval AirTest Facility, April. 196. AD 4..040,.
31. Kumamoto, Hilrcmitsu, Kazuo Tenaka, Koichi Inoue andErnest J. Henley. "Dagger-Sampling Mont Carlo for Sys-tem Unavailability Evaluation," IEEE Transaction~s onRellability, R-29: 122-125 (June 19 0 .
32. Kumar, Ashok and Manju Agarwal. "A Review of StandbyRedundant Systems," 1EEE Transactions on Reliability,R-29: 290-294 (Octob•er1980).
.. Lee, Elsa T. and David R. Thomas. "Confidence Inter-vals for Comparing Two Life Distribution," IEEE Trans-on Reliability, R-29: 51-56 (April 1980).
34. Lewis, P. A. W. "Implications of a Failure Model forthe Usa and Maintenance of Computers," Journal of ApoliedProbabijity, I: 3ý`-368 (December 10,647.
35. Locks, Mitchell 0. "The Fall-Safe Feature of the Lapp 8?owers Fault Tree," IEEE Transactions o. RelLabillt,R-2_: 10-11 (April i 0to).
36. ML-STD-471A, Mainrtainability: Veri fcation/Demonstration/Evaluation. Washington DC: U. S. Government Printing~?Tc7 ... arch 1..732.
37. McCall, J. J. Some Recent Developments in Maintenance":olicies. Technical Report No. P-3239. Santa M"onicaCA: The RAND CorP, October 1965.
3.- --------- e:1ab-ilty Anprox`matIons for Complex Structures,"Proceedings of the 1967 Annual Symtosium or, Peliabilit..292-301. January 19-,. TIEEE Catalog No. 7C .
39. Moranda, Paul E. "Event-Altered Rate Models for GeneralReliability Analysis,"' IFEE Transactions on Reliability,R-28: 376-381 (Decembe.-r1979--5.
40. Murch1and, J. D. --%ýnramental Concepts and Relationsfor Reliability Analysis of Multi-State Systems,"Reliability and Fault Tree Analysis, SIAAM, 58-68. .Philadelphia: ,9,5.
4L. Muth, E. J. Analysis and Formalas for System Availabi!-it'_ Computations. Technical Report No. -R66AS2. DaytonaBe•%-h FL: General Electric Co., 1ý M.arch :.966.
42 S-------Stochastic Theory of Thenairable S-ters. Ph.D.dissertation in System Science. Brooklyn : rBrocklynPolytechnic Institute, June 1967.
-3. Nenoff, L. "Operational Effectiveness of Antenna Track-ing/Cormmunication System Drives," Proqceed•.ns of the1969 Annual Symposium on Reliability. 472-461. Januaryl3U•.•EB Catalog No .69 CT9F})
44. Port, S. C. and J. Fo.kman. Optimal Procedures forStochastically Failing Equipment. Memorandum No. -R%11d59 -NASA. Santa Monica CA: The RAND Corp., Octoter1965.
45. Proctor, C. L. and B. Singh. "The Analysis of a Four-S e System," Mlcroele tronics and Del t," 1553-55 (1976).
4'. RCA Service Company. Maintainability En•ineering.Technical Report No. RADC-TDR-b3-b5, Vol. II. GrIffissArFB NY: Rome Air Devec.zLment Center, Air Force S~stermCommand, February 1963. AD 404898.
47. Rigney, J. W. Maintainability Prediction: Methods ar:-Results. Technical Report. No. 40. Washirngton DC:Fsychological Services Division, Office of N!avail Resear-h,June 1064. AD 603241.
48. Schmee, Josef and Wayne Nelson. "Pred 'ct ng from 7arl-Failures the Last Failure Time of a Locnormal Sample,"
--EE Transactionson n l, R_: 23--26 (April 1979).
97
"9. ~ ~ ~ ~ ~ ~ .L Shoa. .L "e.blt putat-ons for Systemsw44th'- Dependent Falllires.." ;?roceedir.gs of the 1966 Arn~jal
Svpsi, - orn Peliabi~iit' 1 I T-76.anar(Y--ZWCataloz No. 7\26)
50. Singh, C."Calculating the -Ime-Spec~fic 7'recquenzy o"Sy-stem, F.%i"ure," IEETransactiorns on R.eliability, -5i2 4- 126 ýTulne 197~77
751- -- =--- and A. D. Patton. "A Method for Predicting SystemDowntime," IZEE Transactions on Reliability, R-17: 07=102 (June ~.
52.-----and AIt D. Patton~. "?trotectlon System Reliabilityoel ing: Unreadines3, Probability and !Mean D-uracior, of
L>~itectd Falts, FEE T--ransactions or. Fel abili' v:q- 29: 339-341 ý(OCtober ih)
5 3. Strivastava, Z-. 3- "Rellabil-it-y of CplxSystems-sqeview," Journal of" Mathematical Scieinc-e, 2:, 111-1-7-3
5 4. Cte~1 . T. "Sufficient Stat'.stics, in the Goti-murnCorntrc2. of' Stochastic Systems," journal of Mathematics
an l~~n,12: 576-592 (DTe c em ber- 1Q-- l9T.
r-55. -I' TIm~a-, F. A., C.L wang and W~ay Kuo ."SteEIf-tiveness Models: an Annotated EBI'3iogrlaphy," ThEE.- Trans-actions on Reliability, fM29: 295-3014 (October 09870-
56. Tiwari, FR. K. and Mlikki Verna. "An Algebraic Techniquefor Reliability Evaluation," IEEEL Transactions o'n Relia-biit, R-29ý: 311-313 (Octobe-r 970'..
5 17. Tornashiro, M its,,;c, "A Multi-State System, with GeneralRepajr Time Distýributlon, !ZEE Transactions on. Relia-b'LIit', R-29; 3-3 (-e,6ember 1960)-
583. Weiss, G. F-. "Feliability of a System in Which SpareParts Del-eriorate in Storage," Journal of~esearch ofthe National Bureau of Standards. 66B:- 57-160
(OctberDecmbe Via4I)a59. Winter, B. B. and j. A. Abraham. "Performarce/rvaLla
ability Model of Shared Resource lvllut'iorýcessors," IZEF.Transactions on Reliability, 52: 70-7 (April, 197ý0,
cC. elntsov, B. I.A eto foSsemReliabilityAnl
*.~S. Docum e ntNo FT -H7-23-~ 7. Wright-PatUtersonASSB QH: Foreign Techriology DI-vision, 1 August !19,67.An 6633:54.
98
VITA
,ýohamed Shib. Biltagy Hasaba'la was born on 5 December
194ý in Cairo, Egypt. He graduated from Ain Shams University,
Faculty of Engineering, Electrical Department iln 1968. He
received a Diploma in Computer Science from Cairo University
in 1976. He also received a Diploma in Operations Research
from the same university in 1978. He served as the Commander
of Electronic Warfare Battalion which has been chosen by the
Commander of the Army as the best electronic warfare unit in
-he A.rm. He served as the Commander of Electronic Warfare
Laboratory before he had come to the United States 'fn June
!,0 to obtain the Masters degree in Operations Research.
Permanent address: 52, Hader El Toni St.Nasr City, Cairo, Egypt
99
h.e
UNCLASSIFIEDSECURITY CLASSIFICATION Of TMIS PAGE (Whe~n Dalsa S.ln).
PAGE READ INSTRUCTIONSREPORT DOCUMENTATION PAEBEFORE COMIPLETING FORM'
1. P.(PORT NUMBER 17. GOVT ACCESSION NO. 3. RECIPIENTV5 CATALOG NUMBER
GOR/MA/81D-7 I f) -I// 2r4. TITLE (and Subtitle) S. TYPE OF REPORT A 0ERIO0 COVEREG
CCNNFIDENCE TNýTERVALS FOR S YSTEM?.EIIAEL-T71 AND AVAILABILITY OF MAIN,1TAINE MSThsiSYSTEMS USING MONT CAR~LO TECHNIQUEES 6. PERFORMING ORO. REPORT NUMBER
7. AU fNOA(s) 6. CONTRACT OR GRANT mumBIR(.)
Mohamned S. B. HasaballaLt . %Col. Egyptian Army
3. P11FFORMING 0R0AP4IZATION NAME ANO AOOinESS WC. PROGRAM ELEMENT. PROJECT. T ASiCAREA 6 WORK UNIT NUMBERS
Air Force lnstitute of Technology (AFIT)Wriht-?ater'son AFB OH i45L433
11. COPTROLLING, QVTICE N4 &ME ANO AOGRESS 12, REPORT GATR
December 1981I). NUMOER OF PAGES
____ ___ ___ ____ ___ ___ ____ ___ ___ ____ ___ ___ 9914. WAONITORING AGENCY~ NAME 6 AGORESSWf dlitle~enIP6401 CORIP6l11nd OffIC01) It. SECURITY CLASS. (at Ifh~a rP*ft)
Unclassified15A. Otc~ ASSIFICATiON, OOWNGPAOIING
Anproved for publ-Ac release; distribution unlimited
17, DISTRIBUTION STATEMENT (at the *boi,.ct enIto~d m~ block 20. 1( dillftent frV, Rspott)
15APRiU~~COlIJ
W* SUPI'LI[MENT ANY NOTIES ~ ti
Approved -pbi release; 1AW AFR 1 01- 117
19. KEY WOROS (Continue on ravets, oid* It n@40984ey and ldontitf, by block number)
Confiden~ce intervals Mont Carlo Techniques
MaintainabilityAvailability
20. AST(ýACT (Continue on I.tovrs side if necmaceire and jIdentify by, block rnustbt)
,,~This thesis presents the results of an extensive literatu~resppch for finding the confidence intervals for system rel2.a-bility and availability of mnainvtainedc systemms uzing Monti Carlo
tecni~.es. -he charact-eristics of s.ystem reliability and,-ajntai'na'b'-'"-t- analys-s a-e dis-cussed. The basil Markollianfallure anrd repatr models are developed. A su-m.ary of the
oitesti1mates of avallability, reliability of maintained
IJAN71 1473 90ITION OF 1 NOVA IS 13OGSOLETZ fjNCLASSIF! E-DSECURITY CLASISIFICATION OF THIS PAGC 'WI.n D.le garieod)
1;W' ~T-IT ---. SaCUIITY CILASSIFICATION OF ThIS PAGlEC(fe Dare Enteredt)
systems, and the exact analytical methods are presented. Finally,the B00-strap(Double Mont Carlo technique) is used to obtain theconfidence limits for the availability of the maintained systems.
UNCLASTC S I-:JStCjPIT " CLASSIFICATIO94 0 .', PA, ;C : at Es e,,d)