
Reliability Engineering 14 (1986) 107-121

Interval Estimates of Average Failure Rate and Unavailability per Demand

Julius Goodman

Bechtel Power Corporation, Western Power Division, 12400 E. Imperial Highway, Norwalk, CA 90650, USA

(Received: 27 August 1985)

ABSTRACT

A method of interval estimation of average failure rate and unavailability per demand when data are sparse, vague or bad is proposed. The method utilizes the Bayesian formula with uniform and nonuniform prior distributions depending on available information. The principle of maximum entropy for obtaining 'objective' prior distributions was used.

1 INTRODUCTION

The failure rate, λ, and the unavailability per demand, U, are important characteristics needed for reliability and availability analysis. Unavailability per demand is defined as the limit of the ratio of the number of failures, K, to the number of demands, N, as N tends to infinity:

U = \lim_{N \to \infty} \frac{K}{N} \quad (1)

Similarly, the average failure rate λ can be defined as:

\lambda = \lim_{T \to \infty} \frac{K}{T} \quad (2)

where T is the total number of component-hours. (Note that the instantaneous failure rate λ(t) requires more detailed information for its evaluation (see, for example, Nieuwhof 1 on this subject) and is not considered in this paper.)

The methods for estimation of reliability parameters when data are plentiful are well known. 2 Sometimes, there is little specific data and much generic data. In this case the Bayesian approach is very instrumental. 3 However, on many occasions when we deal with rare failure modes, or with new equipment or components for which sufficient experience has not yet been accumulated, we face the problem of inference and estimation based on sparse, rare and vague data.

In this paper we will address the problem of the evaluation of the average failure rate and unavailability per demand when few data are available. We consider, separately, cases when data are good or bad. The case when no data are available is beyond the scope of this paper. Nevertheless, some elements of the proposed methodology can be useful for this case also.

2 LIKELIHOOD DENSITY FUNCTION APPROACH

Let x be evidence and θ be an unknown population parameter. Then the likelihood L(x|θ) is the probability of the evidence x given the parameter θ. On many occasions, we need an evaluation of the parameter θ given the evidence. The probability density function f(θ|x) for the parameter θ given evidence x is proportional to the likelihood function L(x|θ). However, because the function L(x|θ) is not normalized relative to the parameter θ, the corresponding density function f(θ|x), as one possible option, can be derived by normalizing the likelihood:

f(\theta \mid x) = \frac{L(x \mid \theta)}{\int L(x \mid \theta)\,d\theta} \quad (3)

It is implicitly assumed that we have no prior information whatsoever about the population parameter θ. Mathematically, formula (3) can be considered as a particular case of the Bayesian formula for a noninformative uniform prior distribution π(θ):

f(\theta \mid x) = \frac{\pi(\theta)\,L(x \mid \theta)}{\int \pi(\theta)\,L(x \mid \theta)\,d\theta} \quad (4)


If π(θ) is constant then eqn (4) reduces to eqn (3). However, philosophically there is a difference between the two formulae. Formula (3) is based only on objective statistical information, while formula (4) allows incorporation of both objective and subjective information about the prior distribution. That is why we prefer to use a separate name for the approach based on formula (3), namely, the likelihood density function approach.

The likelihood function L(k, n | U) for unavailability per demand U given k observed failures during n demands can be presented as a binomial distribution:

L(k, n \mid U) = \binom{n}{k} U^{k} (1 - U)^{n-k} \quad (5)

The corresponding likelihood density function f(U | k, n) takes the form of the beta distribution:

f(U \mid k, n) = \frac{(n+1)!}{k!\,(n-k)!}\, U^{k} (1 - U)^{n-k} \quad (6)

Similarly, the likelihood function L(k, t | λ) for the average failure rate λ for the time period t and the corresponding likelihood density function f(λ | k, t) can be presented in the form of the Poisson and gamma distributions, respectively:

L(k, t \mid \lambda) = \frac{e^{-\lambda t} (\lambda t)^{k}}{k!} \quad (7)

f(\lambda \mid k, t) = \frac{t\,(\lambda t)^{k} e^{-\lambda t}}{k!} \quad (8)

Using the likelihood density functions f(U | k, n) and f(λ | k, t) we can develop interval estimates for the unavailability U and the failure rate λ. An interval estimate is based on a confidence interval corresponding to the confidence level γ defined as:

\int_{\theta_{low}}^{\theta_{up}} f(\theta \mid x)\,d\theta = \gamma \quad (9)

Substituting f(U | k, n) or f(λ | k, t) for f(θ | x), we obtain estimates for U or λ, respectively.
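
As an illustration (added here, not part of the original paper), the likelihood density (6) is a beta density with parameters k + 1 and n − k + 1, and (8) is a gamma density with shape k + 1 and scale 1/t, so the confidence statement (9) can be evaluated directly with a statistics library. The values of k, n and t below are hypothetical.

```python
# Minimal sketch of the likelihood density functions (6) and (8) and the
# confidence statement (9); k, n, t are illustrative only.
from scipy import stats

k, n, t = 2, 50, 1.0e4          # e.g. 2 failures in 50 demands, or in 10^4 hours

f_U = stats.beta(k + 1, n - k + 1)           # eqn (6): beta density for U
f_lam = stats.gamma(k + 1, scale=1.0 / t)    # eqn (8): gamma density for lambda

# eqn (9): the probability mass between chosen bounds is the confidence level
U_low, U_up = 0.01, 0.12
print("confidence for [0.01, 0.12]:", f_U.cdf(U_up) - f_U.cdf(U_low))
print("median of lambda:", f_lam.ppf(0.5))
```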


Fig. 1. A, the best confidence interval; B, the standard confidence interval.

The standard confidence interval is usually selected such that the areas of the tails are equal. For U and λ this yields:

\int_{0}^{U_{low}} f(U \mid k, n)\,dU = \int_{U_{up}}^{1} f(U \mid k, n)\,dU \quad (10)

\int_{0}^{\lambda_{low}} f(\lambda \mid k, t)\,d\lambda = \int_{\lambda_{up}}^{\infty} f(\lambda \mid k, t)\,d\lambda \quad (11)

However, this interval is not the shortest one for a given confidence level γ. Therefore, we have not used our evidence completely and efficiently. The shortest interval estimate can be obtained on the basis of the best confidence interval. 4 The best confidence interval is selected such that the probability density function is the same at the ends:

f(U_{low} \mid k, n) = f(U_{up} \mid k, n) \quad (12)

f(\lambda_{low} \mid k, t) = f(\lambda_{up} \mid k, t) \quad (13)

TABLE 1  Comparison of the Best and Standard Confidence Intervals (numerical entries not recoverable from the scan)


The best point estimate is based on the maximum likelihood method, and for U and λ it yields:

U_{best} = \frac{k}{n} \quad (14)

\lambda_{best} = \frac{k}{t} \quad (15)

For unavailability per demand U, the best and standard confidence intervals are shown in Fig. 1. The comparison of the best and standard confidence intervals is given in Table 1. As the number of observed failures increases, the distribution (6) becomes more symmetrical and the difference between the best and standard intervals diminishes.
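
The following numerical sketch (added for illustration; k, n and the confidence level are hypothetical) compares the standard equal-tail interval (10) with the best interval (12) for U: the best interval is the shortest one with the required coverage, and for a unimodal density its endpoints have equal density.

```python
from scipy import stats, optimize

k, n, gamma_conf = 2, 50, 0.90
f_U = stats.beta(k + 1, n - k + 1)                 # likelihood density (6)

# standard interval: equal tail areas (1 - gamma)/2, cf. eqn (10)
alpha = (1.0 - gamma_conf) / 2.0
std = (f_U.ppf(alpha), f_U.ppf(1.0 - alpha))

# best interval: shortest interval with coverage gamma, cf. eqn (12)
def length(p_low):
    # p_low is the probability mass below the interval
    return f_U.ppf(p_low + gamma_conf) - f_U.ppf(p_low)

res = optimize.minimize_scalar(length, method="bounded",
                               bounds=(1e-9, 1.0 - gamma_conf - 1e-9))
best = (f_U.ppf(res.x), f_U.ppf(res.x + gamma_conf))

print("standard:", std, "length", std[1] - std[0])
print("best:    ", best, "length", best[1] - best[0])
print("densities at best endpoints:", f_U.pdf(best[0]), f_U.pdf(best[1]))
```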

3 PRIOR DISTRIBUTIONS

Sometimes our knowledge about a prior distribution is not enough to develop the exact distribution, but is more than enough to justify that the prior distribution is not uniform. We assume that we have sufficient information to estimate, with some degree of uncertainty, the first two moments, the mean and variance; however, it is not sufficient to establish the type of the distribution. In this situation, the use of the principle of maximum entropy (see, for example, Ref. 5) is instrumental.

In information theory, entropy H is a measure of uncertainty, and for a one-dimensional probability density function f(x) it can be written in the form:

H = -\int_{-\infty}^{+\infty} f(x) \ln[\,l\,f(x)\,]\,dx \quad (16)

where l is an arbitrary interval that is included in formula (16) to make the argument of the logarithm dimensionless.

When only the range of the variable x is known, the distribution that maximizes entropy is the uniform distribution. When the range of the variable x is unlimited but the first two moments are known, the distribution that maximizes entropy is the normal distribution. 6
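
The following sketch (added here; the paper only states the results) records the standard variational argument behind these statements, with Lagrange multipliers λ₀, λ₁, λ₂ for the normalization and moment constraints:

```latex
% Maximize the entropy subject to normalization and the first two moments:
\[
  \max_{f}\; -\int f(x)\,\ln[\,l\,f(x)\,]\,dx
  \quad\text{s.t.}\quad
  \int f\,dx = 1,\quad \int x f\,dx = \mu,\quad \int x^{2} f\,dx = \sigma^{2}+\mu^{2}
\]
% Setting the variational derivative of the Lagrangian to zero yields
\[
  f(x) = \exp\!\left(\lambda_{0} + \lambda_{1} x + \lambda_{2} x^{2}\right),
\]
% which on an unbounded range is a normal density; with no moment constraints
% (\lambda_{1} = \lambda_{2} = 0) it is constant over the admissible range,
% i.e. the uniform distribution.
```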

Why is maximizing entropy so important? Consider another distribution with the same mean and variance as the normal distribution. Obviously, the entropy (or uncertainty) of the latter distribution is less than that of the normal distribution. This implies that this distribution contains some additional information relative to the normal distribution (for example, information about skewness and kurtosis); however, according to our assumption we know nothing but the mean and variance. Therefore, any distribution other than the normal will be in conflict with this assumption. This principle can be applied to the choice of the prior distribution: the prior distribution should provide the maximum entropy, given the available information.

According to Goodman, 7 if we know that the range of the random variable x is the positive half-axis and the first two moments are assessable, then the prior distribution is lognormal. In the case when the range of the random variable is finite and the first two moments are also known, the prior distribution is the Johnson distribution. 8 Relative to our case, the prior distributions π(λ) and π(U) for the average failure rate λ and the unavailability per demand U are:

\pi(\lambda) = \frac{1}{\sqrt{2\pi}\,\sigma\,\lambda} \exp\!\left[-\frac{(\ln\lambda - \mu)^{2}}{2\sigma^{2}}\right] \quad (17)

\pi(U) = \frac{1}{\sqrt{2\pi}\,\sigma\,U(1-U)} \exp\!\left[-\frac{\left(\ln\dfrac{U}{1-U} - \mu\right)^{2}}{2\sigma^{2}}\right] \quad (18)

The distribution (17) is a lognormal distribution and the distribution (18) is a particular case of the Johnson distribution. Both distributions are particular cases of the Edgeworth-Kapteyn distribution: 6

f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\,\frac{dy}{dx}\,\exp\!\left[-\frac{(y(x) - \mu)^{2}}{2\sigma^{2}}\right] \quad (19)

For the normal distribution y(x) = x; for the lognormal distribution y(x) = ln x; for the Johnson distribution within a range (a, b), y(x) = ln [(x − a)/(b − x)]. The particular case of the Johnson distribution applied to unavailability per demand corresponds to a = 0 and b = 1.
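
For illustration (added; the parameter values μ and σ below are hypothetical), both priors can be evaluated numerically as transformed-normal densities of the form (19) and checked for normalization:

```python
import numpy as np

def lognormal_prior(lam, mu, sigma):
    # eqn (17): y(lambda) = ln(lambda), dy/dlambda = 1/lambda
    y = np.log(lam)
    return np.exp(-(y - mu)**2 / (2.0*sigma**2)) / (np.sqrt(2.0*np.pi)*sigma*lam)

def johnson_prior(u, mu, sigma):
    # eqn (18): y(U) = ln(U/(1-U)), dy/dU = 1/(U(1-U)), range (0, 1)
    y = np.log(u / (1.0 - u))
    return np.exp(-(y - mu)**2 / (2.0*sigma**2)) / (np.sqrt(2.0*np.pi)*sigma*u*(1.0 - u))

# quick normalization check: both densities should integrate to about 1
lam = np.linspace(1e-9, 1e-3, 200_000)
u = np.linspace(1e-6, 1.0 - 1e-6, 200_000)
print(np.sum(lognormal_prior(lam, mu=np.log(1e-5), sigma=1.0)) * (lam[1] - lam[0]))
print(np.sum(johnson_prior(u, mu=-4.0, sigma=1.0)) * (u[1] - u[0]))
```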

4 UNCERTAINTY OF PRIOR DISTRIBUTION PARAMETERS

In the case of sparse generic data, the parameters μ and σ of the prior distributions (17) and (18) have large uncertainty. Actually, we know sample estimates m and S of μ and σ based on a sample of size ν. Now, using m, S and ν, we have to develop the joint distribution ψ(μ, σ | m, S, ν) for the unknown parameters μ and σ. A methodology for obtaining ψ(μ, σ | m, S, ν) was developed elsewhere. 5 Here we consider briefly the major steps.

Let y_i (i = 1, 2, ..., ν) be numbers determined from our data base according to the formulae:

y_{i} = \ln \lambda_{i} \quad (20)

for average failure rates and

y_{i} = \ln\!\left(\frac{U_{i}}{1 - U_{i}}\right) \quad (21)

for unavailabilities per demand. Then the sampling estimates m and S of μ and σ are:

m = \frac{1}{\nu} \sum_{i=1}^{\nu} y_{i} \quad (22)

S^{2} = \frac{1}{\nu} \sum_{i=1}^{\nu} y_{i}^{2} - m^{2} \quad (23)

The likelihood function L(m, S, ν | μ, σ) can be presented in the form:

L(m, S, \nu \mid \mu, \sigma) = (2\pi\sigma^{2})^{-\nu/2} \exp\!\left\{-\frac{\nu}{2\sigma^{2}}\left[S^{2} + (m - \mu)^{2}\right]\right\} \quad (24)

and the likelihood density function ψ(μ, σ | m, S, ν), determined according to formula (3), is transformed to the new variables x and z as follows:

x = \frac{\nu S^{2}}{\sigma^{2}} \quad (25)

z = \frac{\sqrt{\nu}\,(\mu - m)}{\sigma} \quad (26)

The resulting function has the form:

\psi(\mu, \sigma \mid m, S, \nu) \rightarrow \varphi(x, z) = \chi^{2}_{\nu-2}(x)\,\phi(z) \quad (27)

where φ(z) is the standardized normal distribution of the argument z and χ²_{ν−2}(x) is the chi-square distribution of the argument x with ν − 2 degrees of freedom.
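
A minimal sketch (added for illustration; m, S and ν are hypothetical values) of drawing the uncertain prior parameters (μ, σ) from the distribution (27): sample x from a chi-square with ν − 2 degrees of freedom and z from a standard normal, then invert the transformations (25) and (26).

```python
import numpy as np

rng = np.random.default_rng(1)
m, S, nu = -11.5, 1.2, 8                     # sample estimates from generic data

def sample_mu_sigma(size):
    x = rng.chisquare(nu - 2, size)          # eqn (25): x = nu * S^2 / sigma^2
    z = rng.standard_normal(size)            # eqn (26): z = sqrt(nu)*(mu - m)/sigma
    sigma = S * np.sqrt(nu / x)              # invert (25)
    mu = m + z * sigma / np.sqrt(nu)         # invert (26)
    return mu, sigma

mu, sigma = sample_mu_sigma(10_000)
print("spread of mu:   ", np.quantile(mu, [0.05, 0.95]))
print("spread of sigma:", np.quantile(sigma, [0.05, 0.95]))
```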


The distribution (27) is a distribution of the unknown population parameters μ and σ given the evidence m, S, ν. We should not confuse this distribution with the distribution of the observed m and S given a sample size ν and known population parameters μ and σ. The latter has a form apparently similar to that of the distribution (27). However, there are some substantial differences. First, the chi-square distribution for S² (or x) has ν − 1 degrees of freedom. Secondly, the random variables m and S are completely independent, while the random variables μ and σ are correlated.

Using the likelihood functions (5) or (7) and the prior distribution (17) or (18), one may present the posterior distributions for λ and U in the form:

p(\lambda) = f_{\lambda}(\lambda \mid \mu, \sigma \mid m, S, \nu \mid k, t) \quad (28)

p(U) = f_{U}(U \mid \mu, \sigma \mid m, S, \nu \mid k, n) \quad (29)

The distributions p(λ) and p(U) are conditional distributions of the variables λ and U given generic evidence m, S, ν; specific evidence k, t (or k, n); and the unknown population parameters μ and σ of the prior distributions. The parameters μ and σ have the distributions ψ_λ(μ, σ | m, S, ν) and ψ_U(μ, σ | m, S, ν) for the functions p(λ) and p(U), respectively.
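
A grid-based numerical sketch (added for illustration; k, t, μ and σ are hypothetical) of the posterior (28) for λ with fixed prior parameters: the Poisson likelihood (7) is multiplied by the lognormal prior (17) and normalized as in formula (4). In the full scheme μ and σ would themselves be drawn from the distribution (27).

```python
import numpy as np

k, t = 1, 2.0e5                              # specific evidence
mu, sigma = np.log(1.0e-5), 1.0              # prior parameters

lam = np.geomspace(1e-10, 1e-3, 4000)
prior = np.exp(-(np.log(lam) - mu)**2 / (2*sigma**2)) / (np.sqrt(2*np.pi)*sigma*lam)
likelihood = np.exp(-lam*t) * (lam*t)**k     # eqn (7), constant k! omitted

post = prior * likelihood
post /= np.sum(post * np.gradient(lam))      # normalization, cf. formula (4)

cdf = np.cumsum(post * np.gradient(lam))
print("posterior 95% upper bound ~", lam[np.searchsorted(cdf, 0.95)], "per hour")
```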

5 PROBABILISTIC ESTIMATION OF λ AND U

Upper and lower estimates of λ and U can be defined as:

\Pr(\lambda \le \lambda_{up}) = \gamma_{up} \quad (30)

\Pr(\lambda \le \lambda_{low}) = \gamma_{low} \quad (31)

\Pr(U \le U_{up}) = \gamma_{up} \quad (32)

\Pr(U \le U_{low}) = \gamma_{low} \quad (33)

where γ_up and γ_low are preset confidence levels. All probabilities can be calculated using the probability density functions (28) and (29). The boundary estimates λ_up, λ_low, U_up and U_low are functions of the parameters μ and σ:

\lambda_{up} = F_{\lambda}^{-1}(\gamma_{up} \mid \mu, \sigma) \quad (34)

\lambda_{low} = F_{\lambda}^{-1}(\gamma_{low} \mid \mu, \sigma) \quad (35)

U_{up} = F_{U}^{-1}(\gamma_{up} \mid \mu, \sigma) \quad (36)

U_{low} = F_{U}^{-1}(\gamma_{low} \mid \mu, \sigma) \quad (37)

where F_λ^{-1} and F_U^{-1} are the inverse cumulative functions corresponding to the density functions (28) and (29).

Because of the uncertainty of the parameters μ and σ, our boundary estimates also have uncertainty. Therefore, one may treat them as chance variables. We will call the procedure of evaluation of uncertain boundary estimates a probabilistic estimation. Using the uncertainty distributions ψ_λ(μ, σ | m, S, ν) and ψ_U(μ, σ | m, S, ν) one can generate uncertainty distributions f_λ(λ|γ) and f_U(U|γ). If we insert γ_up or γ_low instead of γ we will obtain distributions of the upper and lower bounds. Based on the distributions f_λ(λ|γ) and f_U(U|γ) one may report the mean, the median, some percentile, or an interval estimate of the boundary estimate under consideration. Let us denote the β-fractiles of the distributions f_λ(λ|γ) and f_U(U|γ) as λ_{γ,β} and U_{γ,β}. The probabilistic estimates λ_{γ,β} and U_{γ,β} have the following meaning: there is a 100 × β per cent chance that the inequalities

\Pr(\lambda \le \lambda_{\gamma,\beta}) \ge \gamma \quad (38)

\Pr(U \le U_{\gamma,\beta}) \ge \gamma \quad (39)

hold. Thus, we have here two levels of confidence. To avoid awkward combinations such as 'confidence of confidence' we will use the term 'assurance' for the higher level of confidence. Therefore, we may say that there is 100 × β per cent assurance in 100 × γ per cent confidence that, say, the estimate of λ will not exceed λ_{γ,β}.

Knowing the boundary estimates, we can construct interval estimates. For example, λ_{low,β} and λ_{up,β} are the ends of a confidence interval with confidence level γ₀ = γ_up − γ_low and assurance boundary level β. There are many confidence intervals with confidence γ₀ and assurance boundary level β. However, it is always possible to select the shortest confidence interval. 4

A hierarchic structure of the probabilistic estimates with confidence and assurance is not always practical. Sometimes we prefer a simpler, one-level assurance interval. To obtain it, one should randomize the confidence level γ. In fact, consider the estimates λ and U as functions of γ:

\lambda = F_{\lambda}^{-1}(\gamma \mid \mu, \sigma) \quad (40)

U = F_{U}^{-1}(\gamma \mid \mu, \sigma) \quad (41)

For a fixed value of γ, the uncertainty of μ and σ will create conditional uncertainty distributions of λ and U which allow one to determine assurance in a given confidence. If we now consider γ as a uniform random variable between 0 and 1, then the distributions for γ, μ and σ will create unconditional distributions of λ and U which allow one to estimate the unconditional assurance interval. We denote them as f_λ(λ) and f_U(U). These distributions 'envelop' the conditional distributions f_λ(λ|γ) and f_U(U|γ).
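
A Monte Carlo sketch of both formats for the generic-data case is given below (added for illustration; it is not the ASCON/PROZ codes mentioned in Section 6, and all numerical values are hypothetical). Each draw of (μ, σ) from (27) defines a posterior (28), whose γ-quantile (40) is recorded; the β-fractile of these quantiles is the 'assurance of confidence' estimate, while randomizing γ gives the unconditional assurance distribution.

```python
import numpy as np

rng = np.random.default_rng(3)
m, S, nu = np.log(1.0e-5), 1.0, 8            # generic evidence
k, t = 0, 2.0e5                              # specific evidence
gamma_conf, beta_assur = 0.95, 0.95

lam = np.geomspace(1e-12, 1e-2, 3000)
dlam = np.gradient(lam)

def posterior_quantile(mu, sigma, gamma):
    """gamma-quantile of p(lambda), cf. eqns (28) and (40)."""
    prior = np.exp(-(np.log(lam) - mu)**2 / (2*sigma**2)) / (sigma*lam)  # (17), const dropped
    like = np.exp(-lam*t) * (lam*t)**k                                   # (7), const dropped
    cdf = np.cumsum(prior * like * dlam)
    return lam[np.searchsorted(cdf / cdf[-1], gamma)]

n_trials = 2000
x = rng.chisquare(nu - 2, n_trials)                       # eqn (25)
z = rng.standard_normal(n_trials)                         # eqn (26)
sigmas = S * np.sqrt(nu / x)
mus = m + z * sigmas / np.sqrt(nu)

lam_up = [posterior_quantile(mus[i], sigmas[i], gamma_conf) for i in range(n_trials)]
print("assurance-of-confidence bound:", np.quantile(lam_up, beta_assur))

g = rng.uniform(size=n_trials)                            # randomized confidence level
lam_unc = [posterior_quantile(mus[i], sigmas[i], g[i]) for i in range(n_trials)]
print("unconditional 95th percentile:", np.quantile(lam_unc, 0.95))
```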

6 BAD DATA TREATMENT

Estimates (14) and (15) in Section 2 assume that component-hours t, number of demands n and number of failures k per n demands (or per t component-hours) are known exactly. However, in some cases component-hours or number of demands are known only approximately, or a record of failures is incomplete or incorrect. Therefore, our evidence is vague and we have to deal with bad data.

We can account for bad data by assuming uncertainty in the evidence parameters t, n and k. These uncertainties can be treated in the same fashion as the uncertainties of the generic parameters μ and σ in the previous section. First of all, we have to develop uncertainty distributions ψ_λ(k, t) and ψ_U(k, n). To do this we may use the principle of maximum entropy. Then, we have to write the likelihood density functions (6) and (8) for given evidence parameters k, t or k, n and the corresponding cumulative functions:

F_{\lambda}(\lambda \mid k, t) = \gamma \quad (42)

F_{U}(U \mid k, n) = \gamma \quad (43)

Inverting these functions we obtain probabilistic estimates for λ and U:

\lambda = F_{\lambda}^{-1}(\gamma \mid k, t) \quad (44)

U = F_{U}^{-1}(\gamma \mid k, n) \quad (45)

We may present these estimates in two formats: the 'assurance of confidence' format or the 'unconditional assurance' format. To illustrate this we consider a numerical example. Assume that our record shows that a component has not failed for 9 × 10⁶ component-hours. Analysis has shown that the accuracy of the component-hours record is about 10 per cent. However, the record of component failures is exact, because reporting of component failures is required by regulations.

Based on this information and taking into account the principle of maximum entropy, we adopt a lognormal distribution for the component-hours t with the following parameters:

\mu = \ln(9 \times 10^{6}) = 16.01274 \quad (46)

\sigma = 0.1 \quad (47)

The likelihood density function according to formula (8) for k = 0 can be written as:

f(\lambda \mid t) = t\,e^{-\lambda t} \quad (48)

and the corresponding cumulative function is:

F(\lambda \mid t) = 1 - e^{-\lambda t} \quad (49)

In the 'assurance of confidence' format we are searching for a number λ_{γ,β} such that the probability that F(λ_{γ,β} | t) ≥ γ is equal to β:

\Pr\{[1 - \exp(-\lambda_{\gamma,\beta}\,t)] \ge \gamma\} = \beta \quad (50)

For computing λ_{γ,β} the computer code ASCON, utilizing the Monte Carlo simulation technique, was applied. In particular, for γ = 0.95 and β = 0.95 we obtained:

\lambda_{0.95,\,0.95} = 3.924 \times 10^{-7} \ \text{per component-hour} \quad (51)
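
A short Monte Carlo sketch of this computation is given below (added for illustration; it is not the ASCON code itself, but follows eqn (50) directly). With k = 0 the condition 1 − exp(−λt) ≥ γ is equivalent to λ ≥ −ln(1 − γ)/t, so λ_{γ,β} is the β-fractile of −ln(1 − γ)/t over the lognormal uncertainty of t:

```python
import numpy as np

rng = np.random.default_rng(0)
gamma_conf, beta_assur = 0.95, 0.95
t = rng.lognormal(np.log(9.0e6), 0.1, size=200_000)     # eqns (46) and (47)
lam_gamma = -np.log(1.0 - gamma_conf) / t                # solve F(lambda | t) = gamma, eqn (49)
lam_gb = np.quantile(lam_gamma, beta_assur)
print(f"lambda_(0.95,0.95) ~ {lam_gb:.3e} per component-hour")   # approx. 3.9E-7, cf. eqn (51)
```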

In the 'unconditional assurance' format we have to present λ as a function of the confidence level γ and the parameter t. Using formula (49) we obtain:

\lambda = -\frac{1}{t} \ln(1 - \gamma) \quad (52)

The parameter t has a lognormal distribution and the confidence level γ has the uniform distribution. Utilizing the computer code PROZ, based on the Monte Carlo method, the distribution of the failure rate λ was calculated. The result of the computation is shown in Table 2. It is interesting to compare the results of our evaluation with a so-called 'conservative point estimate'. Conservatively, assuming that k equals 1 and using formula (15) with t = 9 × 10⁶ component-hours we obtain

\lambda = 1.111 \times 10^{-7} \ \text{per component-hour} \quad (53)

This estimate is slightly less than the mean and three times less than the upper bound presented in Table 2. Even using a lower limit for t we come up with λ = 1.310 × 10⁻⁷ per component-hour, which is about the 67th percentile of the distribution shown in Table 2 and three times less than λ_{0.95, 0.95}.
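
The following short sketch (added for illustration; not the PROZ code) reproduces the spirit of the unconditional-assurance calculation: t is drawn from the lognormal distribution with parameters (46) and (47), γ from the uniform distribution, and λ follows from formula (52).

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500_000
t = rng.lognormal(np.log(9.0e6), 0.1, size=n)
g = rng.uniform(size=n)
lam = -np.log(1.0 - g) / t                     # eqn (52)

for p in (5, 50, 95):
    print(f"{p}th percentile ~ {np.percentile(lam, p):.3e} per component-hour")
print(f"mean ~ {lam.mean():.3e}")              # cf. Table 2
```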


TABLE 2
The Distribution of the Failure Rate λ for Parameters k = 0, μ = 16.01274, σ = 0.1

Percentile      Frequency per component-hour
5               6.060927E-09
10              1.15164E-08
15              1.945753E-08
20              2.574697E-08
25              3.321197E-08
30              4.04932E-08
35              4.829082E-08
40              5.525823E-08
45              6.632483E-08
50              7.908995E-08
55              9.294959E-08
60              1.061352E-07
65              1.202278E-07
70              1.355501E-07
75              1.620578E-07
80              1.883321E-07
85              2.234405E-07
90              2.783998E-07
95              3.364264E-07
Mean            1.159953E-07

We considered above the case with the uniform prior distribution.

However, we may consider the case with a vague prior distribution in the same manner as was done in Section 5. Then formulae (44) and (45) will be generalized as:

\lambda = F_{\lambda}^{-1}(\gamma \mid k, t \mid \mu, \sigma) \quad (54)

U = F_{U}^{-1}(\gamma \mid k, n \mid \mu, \sigma) \quad (55)

and an additional distribution for μ and σ should be applied. For example, formula (50) takes the form:

\Pr\{F_{\lambda}(\lambda_{\gamma,\beta} \mid k, t \mid \mu, \sigma) \ge \gamma\} = \beta \quad (56)


where the probability is computed with the joint distribution function of the random variables k, t, μ and σ.

7 SUMMARY

In this paper we considered several typical cases of failure rate and unavailability per demand estimation when few data are available. The first case refers to the situation when few data are known and there is no prior information about λ or U. In this case, the application of the likelihood density function is recommended to obtain the interval estimate.

In the second case, we have additional prior information. This information is not complete enough to develop a well-defined empirical prior distribution, but it is definite enough to abandon the noninformative uniform prior distribution. In this case use of the principle of maximum entropy is recommended to obtain the analytical form of the prior distribution and then to apply the Bayesian formula. However, there are two kinds of uncertainty we have to deal with: uncertainty of specific and of generic data. The first kind of uncertainty is reflected by the likelihood function and the second kind by the distributions of the prior distribution parameters. Two formats for treating these uncertainties were proposed.

The third case addresses a situation with vague or bad data. It was proposed to treat bad data as uncertainty of the evidence. Therefore, instead of exact numbers for the failures, k, the number of demands, n, and the component-hours, t, the corresponding uncertainty distributions are used. The likelihood function becomes conditioned on random values of these numbers. It is possible to use uniform and nonuniform but vague prior distributions to obtain a posterior distribution. The handling of uncertainties is the same as for the second case.

The proposed methodology points to the fact that the uncertainties of the estimates for λ and U are larger than is usually recognized. This increased uncertainty is the price one pays for sparse, vague and bad data.

REFERENCES

1. Nieuwhof, G. W. E. Matching a failure probability distribution to test data, Reliability Engineering, 11 (1985), pp. 163-74.


2. von Alven, W. H. (Ed.) Reliability Engineering, prepared by ARINC Research Corporation, Prentice-Hall, Englewood Cliffs, New Jersey, 1964.

3. Apostolakis, G., Kaplan, S., Garrick, B. J. and Duphily, R. J. Data specialization for plant specific risk studies, Nuclear Engineering and Design, 56 (1980), pp. 321-9.

4. Goodman, J. On the definition of the 'best' confidence interval, Reliability Engineering, 7 (1984), pp. 213-28.

5. Goodman, J. Estimating fragility curves using few experimental data, Proceedings of the Symposium on Advances in Probabilistic Structural Mechanics at the 1984 Pressure Vessel and Piping Conference and Exhibition, San Antonio, Texas, 17-21 June, 1984, PVP Vol. 93, pp. 41-52.

6. Korn, G. A. and Korn, T. M. Mathematical Handbook for Scientists and Engineers, McGraw-Hill, New York, 1968.

7. Goodman, J. Structural fragility and principle of maximum entropy, Structural Safety (to be published).

8. Johnson, N. L. Systems of frequency curves generated by methods of translation, Biometrika, 36 (1949), p. 149.