presented by: mark e. sims reliability s&t engineer aviation and missile research, development...

Presented by:

Mark E. Sims, Reliability S&T Engineer

Aviation and Missile Research, Development and Engineering Center

UNCLASSIFIED

Intro Reliability Growth

"Approved for public release; distribution unlimited. Review completed by the AMRDEC Public Affairs Office 11 Oct 2013; PR0073."

2

MIL-HDBK-189 Definition

Reliability Growth: The positive improvement in a reliability parameter over a period of time due to changes in product design or the manufacturing process.

MIL-HDBK-189 is a Department of Army Handbook for Reliability Growth Management

3

Beginnings

J.T. Duane was an engineer at the Aerospace Electronics Department of the General Electric Company.

He published a paper in 1964 that applied a “learning curve approach” to reliability monitoring.

He observed that the cumulative MTBF versus cumulative operating time followed a straight line when plotted on log-log paper.

The learning (i.e., growing) is accomplished through a “test, analyze, and fix” (TAAF) process.

[Diagram: TAAF loop with boxes labeled Design, Test, Failure Analysis, and Identified Deficiencies]

4

Graphs

[Figure: two panels titled "Reliability Growth Chart", each plotting Cumulative Duane and Instantaneous Duane MTBF against Test Hours. Log-log paper graphing: Test Hours 1 to 10000, MTBF 1 to 1000. Normal graphing: Test Hours 0 to 1600, MTBF 0 to 120.]

Duane Postulate: The cumulative MTBF versus cumulative operating time is a straight line on log-log paper.

5

Continuous Growth

Continuous means time.

You can plot failure rate or MTBF against the total test hours.

[Figure: two panels titled "Reliability Growth Chart". One plots Failure Rate (0.000 to 0.009) against Test Hours (0 to 4500); the other plots MTBF (0 to 700) against Test Hours (0 to 4500).]

6

Discrete Growth

Discrete means trials.

[Figure: "Reliability Chart" plotting Reliability (60.0% to 95.0%) against Trials (0 to 200)]

7

Discrete Growth

[Figure: "Reliability Chart" plotting Reliability (60.0% to 95.0%) against Trials (0 to 200)]

Reliability Growth follows a Learning Curve approach.

Note: More rapid growth occurs earlier in the process, then flattens out!

8

Why Reliability Growth?

9

Example

A System has 18 Failures in 177 Trials

[Figure: grid of the trials, numbered 1 to 177]

10

Example

A system has 18 failures in 177 trials. The failures are listed in the table below.

Failure  Trial
1        6
2        7
3        14
4        16
5        26
6        30
7        38
8        39
9        51
10       55
11       64
12       71
13       79
14       98
15       108
16       129
17       145
18       148

11

Example

Failure  Trial  Trials Between Failures
1        6      6
2        7      1
3        14     7
4        16     2
5        26     10
6        30     4
7        38     8
8        39     1
9        51     12
10       55     4
11       64     9
12       71     7
13       79     8
14       98     19
15       108    10
16       129    21
17       145    16
18       148    3

Failures 1 to 9: fewer trials between failures.

Failures 10 to 18: more trials between failures.

There appears to be reliability growth.
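The "trials between failures" column can be recomputed directly from the failure trial numbers. The sketch below does that and compares the average gap for failures 1 to 9 with the average gap for failures 10 to 18. The function and variable names are illustrative, not from the slides, and comparing two halves is only a quick informal check, not a growth model.

```python
# Failure trial numbers from the example table (18 failures in 177 trials).
failure_trials = [6, 7, 14, 16, 26, 30, 38, 39, 51,
                  55, 64, 71, 79, 98, 108, 129, 145, 148]


def trials_between_failures(trials):
    """Gap between each failure and the previous one (the first gap counts from trial 0)."""
    previous = [0] + trials[:-1]
    return [t - p for t, p in zip(trials, previous)]


gaps = trials_between_failures(failure_trials)
first_half, second_half = gaps[:9], gaps[9:]

mean_first = sum(first_half) / len(first_half)
mean_second = sum(second_half) / len(second_half)

print(gaps)
print(f"mean gap, failures 1-9:   {mean_first:.2f}")
print(f"mean gap, failures 10-18: {mean_second:.2f}")
```

A larger average gap in the second half is consistent with the slide's remark that there appears to be reliability growth.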

12

Example

Applying Reliability Growth Methodology, we get the following curve:

[Figure: Reliability (0 to 1) plotted against Trials (0 to 1200); the curve is annotated with the value 0.9254]

13

Example

Applying Reliability Growth Methodology, we get the following curve:

[Figure: Reliability (0 to 1) plotted against Trials (0 to 1200); the curve is annotated with the value 0.9254]

Note: Reliability without applying growth is 1 – (18 / 177) = 0.8983

14

Why Reliability Growth?

Saves Assets

Reduces Test Time

Saves $$$$$$$

15

Duane Model

Power Law Formulation for reliability growth

16

Duane Postulate

During reliability growth, graphing the log of time (or tests) against the corresponding log of MTBF will give a straight line with slope α.

$\mathrm{MTBF}_{\mathrm{Cum}} = K\,t^{\alpha}$

MTBFCum = Cumulative Mean-Time-Between-Failure

t = Time

K = Constant for Power Law Equation

α = Growth parameter

17


[Figure: straight line with slope α through the points (Ln(t1), Ln(M1)), (Ln(t2), Ln(M2)), and (Ln(t3), Ln(M3)); x-axis Time (or Trial), t; y-axis MTBF]

Times  MTBFCum
t1     M1
t2     M2
t3     M3

18


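Linear relationship: y = αx + b

Taking logs of the power law gives $\ln(\mathrm{MTBF}_{\mathrm{Cum}}) = \alpha \ln(t) + \ln(K)$, so MTBFCum has a linear log-log relationship with t!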

19

Calculating α, the growth rate

20

Calculating α (the growth rate)


               Time (hrs)  Total Failures
First reading  500         5
Last reading   4000        20

We will determine α from these two readings.




22

First calculate the cumulative MTBF for each reading.

               Time (hrs)  Total Failures  MTBF
First reading  500         5               100
Last reading   4000        20              200



23

Take logs of the readings.

               Time (hrs)  Total Failures  MTBF  Ln(Time)  Ln(MTBF)
First reading  500         5               100   Ln(500)   Ln(100)
Last reading   4000        20              200   Ln(4000)  Ln(200)



24

Plot the logs of the readings.

[Figure: straight line with slope α through the points ( Ln(500) , Ln(100) ) and ( Ln(4000) , Ln(200) ); x-axis Ln(Time), y-axis Ln(MTBF)]



25

[Figure: the same line through the points ( 6.215 , 4.605 ) and ( 8.294 , 5.298 )]

α = (5.298 – 4.605) / (8.294 – 6.215) = 0.33





26

α = 0.33

Growth is indicated when 0 < α < 1
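The two-reading calculation of α can be checked numerically. The sketch below follows the steps on the preceding slides: compute the cumulative MTBF at each reading, take natural logs, and take the slope between the two points. The function name `duane_alpha` is an illustrative choice, not something from the slides.

```python
import math


def duane_alpha(t_first, failures_first, t_last, failures_last):
    """Slope of ln(cumulative MTBF) versus ln(time) between two readings."""
    mtbf_first = t_first / failures_first   # cumulative MTBF at the first reading
    mtbf_last = t_last / failures_last      # cumulative MTBF at the last reading
    rise = math.log(mtbf_last) - math.log(mtbf_first)
    run = math.log(t_last) - math.log(t_first)
    return rise / run


# Readings from the slide: 5 failures by 500 hours, 20 failures by 4000 hours.
alpha = duane_alpha(500, 5, 4000, 20)
print(f"alpha = {alpha:.2f}")
print("growth indicated" if 0 < alpha < 1 else "no growth indicated")
```

With these readings the slope works out to ln(2) / ln(8), which is one third and rounds to the 0.33 shown on the slide.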



27

Duane Parameters

α = Growth parameter

TI = Initial test time

MI = Initial MTBF

MF = Final MTBF

Ttotal = Total time

• These parameters go into the Duane equation.

• If you know 4 of the parameters, you can calculate the fifth.

28

Sensitivity of α


What is the Total Test time if we are given these 4 parameters?

α .40

TI 100

MI 50

MF 150

TTotal ?

29



How does changing the growth parameter α affect the total test time?

α .40

TI 100

MI 50

MF 150

Ttotal 435

30

α .40 .27 .46 .64

TI 100 100 100 100

MI 50 50 50 50

MF 150 150 150 150

Ttotal 435




31

α .40 .27 .46 .64

TI 100 100 100 100

MI 50 50 50 50

MF 150 150 150 150

Ttotal 435 1823 285 113



The Total Time is very sensitive to α!
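The Duane equation itself did not survive on these slides, so the sketch below assumes the common planning form MF = MI / (1 – α) × (Ttotal / TI)^α, solved for Ttotal. That assumption is checked against the slide's own table: it gives the four Ttotal values shown, to within rounding. The function name `duane_total_time` is hypothetical.

```python
def duane_total_time(alpha, t_initial, m_initial, m_final):
    """Total test time needed to grow from m_initial (at t_initial) to m_final.

    Assumes the Duane planning form M_F = M_I / (1 - alpha) * (T / T_I) ** alpha.
    """
    ratio = m_final * (1.0 - alpha) / m_initial
    return t_initial * ratio ** (1.0 / alpha)


# Parameters from the slide: TI = 100, MI = 50, MF = 150, with four growth rates.
for alpha in (0.40, 0.27, 0.46, 0.64):
    total = duane_total_time(alpha, t_initial=100, m_initial=50, m_final=150)
    print(f"alpha = {alpha:.2f}  ->  Ttotal = {total:.1f}")
```

Because α appears as the exponent 1/α, a modest change in the assumed growth rate changes the required test time by a large factor, which is the point of the table.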


32

Instantaneous vs Cumulative

Duane MTBF Equation

Finding the true estimate of a system’s MTBF using reliability growth.

33

Inst vs. Cum MTBF

Failure Number  Failure Time
1               10
2               40
3               90
4               160
5               250

What is the true estimate of the MTBF at 250 hours?

34

Inst vs. Cum MTBF

Failure Number  Failure Time  MTBFCum
1               10            10
2               40            20
3               90            30
4               160           40
5               250           50

Is the MTBF 50 at time 250?

35

Inst vs. Cum MTBF

Failure Number  Failure Time  MTBFCum  Time Between Failures
1               10            10       10
2               40            20       30
3               90            30       50
4               160           40       70
5               250           50       90

Or would you say the MTBF is 90 at 250 hours?

36

Inst vs. Cum MTBF

Failure Number  Failure Time  MTBFCum  Time Between Failures  MTBFInst
1               10            10       10                     31
2               40            20       30                     43
3               90            30       50                     52
4               160           40       70                     59
5               250           50       90                     66

Applying a Reliability Growth Tracking Model from AMSAA or ReliaSoft’s RGA software tool will give these numbers.

37


Inst vs. Cum MTBF


So, 66 is the estimated instantaneous MTBF at 250 operating hours, if reliability growth is occurring.
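The slides do not say which estimator the tracking tool used. As a rough illustration only, the sketch below fits the power-law (Crow-AMSAA) model by maximum likelihood to the five failure times, treating the test as ending at the last failure, and evaluates the instantaneous MTBF at each failure time. That estimator choice is an assumption, and all function names are hypothetical. The results land close to the MTBFInst column but are not guaranteed to match it digit for digit.

```python
import math

# Failure times (hours) from the table.
failure_times = [10, 40, 90, 160, 250]


def crow_amsaa_fit(times):
    """Maximum-likelihood estimates (lam, beta) for expected failures lam * t ** beta.

    Assumes testing stops at the last failure time.
    """
    n = len(times)
    t_end = times[-1]
    beta = n / sum(math.log(t_end / t) for t in times)
    lam = n / t_end ** beta
    return lam, beta


def instantaneous_mtbf(t, lam, beta):
    """Reciprocal of the instantaneous failure intensity lam * beta * t ** (beta - 1)."""
    return 1.0 / (lam * beta * t ** (beta - 1.0))


lam, beta = crow_amsaa_fit(failure_times)
print(f"beta = {beta:.3f} (growth rate 1 - beta = {1 - beta:.3f})")
for number, t in enumerate(failure_times, start=1):
    cumulative = t / number
    instantaneous = instantaneous_mtbf(t, lam, beta)
    print(f"t = {t:3d}  MTBFCum = {cumulative:5.1f}  MTBFInst = {instantaneous:5.1f}")
```

In this fit the instantaneous MTBF sits above the cumulative MTBF at every failure time, which is the contrast the table is drawing.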

38

Inst vs. Cum MTBF

[Figure: on log-log graph paper, MTBFInst and MTBFCum plotted against Time (or Test), t; time axis 10 to 10,000, MTBF axis 10 to 100]

39

Inst vs. Cum MTBF

This is how the graphs look in standard Cartesian coordinates.

[Figure: MTBFInst and MTBFCum plotted against Time (or Test), t; time axis 500 to 2000, MTBF axis 100 to 300]

40

Exercise

10 system failures occurred after 500 hours of reliability growth testing, with a calculated growth parameter of 0.40.

What is the system’s instantaneous MTBF?
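The worked solution did not survive on the slides. A minimal sketch, assuming the usual Duane relationship in which the instantaneous MTBF equals the cumulative MTBF divided by (1 – α), is shown below; the function names are illustrative.

```python
def cumulative_mtbf(total_hours, failures):
    """Cumulative MTBF: total test time divided by total failures."""
    return total_hours / failures


def instantaneous_mtbf(m_cumulative, alpha):
    """Duane relationship (assumed here): M_inst = M_cum / (1 - alpha)."""
    return m_cumulative / (1.0 - alpha)


# Exercise values: 10 failures after 500 hours, growth parameter 0.40.
m_cum = cumulative_mtbf(500, 10)
m_inst = instantaneous_mtbf(m_cum, 0.40)
print(f"MTBFCum  = {m_cum:.1f} hours")
print(f"MTBFInst = {m_inst:.1f} hours")
```

Under that assumption the cumulative MTBF is 50 hours and the instantaneous MTBF is about 83.3 hours.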


43

Reliability Growth Formulas

Failure Rate

MTBF

Reliability

44

M(t) = 1 / r(t)

MTBF is the reciprocal of the failure rate.

45

Failure Rate Formula

Initial Conditions

rI = Initial failure rate

tI = Initial time corresponding to rI

α = Growth rate parameter

46

MTBF Formula

Initial Conditions

MI = Initial MTBF

tI = Initial time corresponding to MI

47

Reliability (Discrete)

Initial Conditions

RI = Initial Reliability

NI = Initial number of trials corresponding to RI

48

Deriving r(t) Formula

r(t) is sometimes called the Hazard Rate.

49


K = Constant for Power Law Equation

First, start with the Duane Postulate.

50

Insert the initial conditions, MI at tI, and solve for K.

tI is the Initial Test Time.

MI is the Initial MTBF at time tI.

51

Now substitute for K.


52

The failure rate, r, is the inverse of the MTBF, so r(t) = 1 / M(t).


53


Now we will simplify and take the derivative.


55

Deriving M(t) Formula

MI = Initial MTBF

tI = Initial time corresponding to MI

56

Deriving M(t) Formula

Recall MTBF = 1/r, so take the inverse of r(t).
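The equations for this derivation were images and are missing from the text. The steps described (start with the Duane postulate, solve for K, substitute, invert, take the derivative, invert again) can be written out as below. This is a reconstruction under the assumption that the derivative is taken of the expected number of failures; the symbols $r_{\mathrm{cum}}$ and $E(t)$ are introduced here for illustration and do not appear on the slides.

```latex
\begin{align*}
M_{\mathrm{cum}}(t) &= K\,t^{\alpha}
  && \text{(Duane postulate)} \\
K &= \frac{M_I}{t_I^{\alpha}}
  && \text{(initial condition } M_{\mathrm{cum}}(t_I) = M_I\text{)} \\
M_{\mathrm{cum}}(t) &= M_I \left(\frac{t}{t_I}\right)^{\alpha}
  && \text{(substitute for } K\text{)} \\
r_{\mathrm{cum}}(t) &= \frac{1}{M_{\mathrm{cum}}(t)}
  = r_I \left(\frac{t}{t_I}\right)^{-\alpha},
  \qquad r_I = \frac{1}{M_I} \\
E(t) &= t\,r_{\mathrm{cum}}(t) = r_I\,t_I^{\alpha}\,t^{\,1-\alpha}
  && \text{(expected failures by time } t\text{)} \\
r(t) &= \frac{dE}{dt}
  = (1-\alpha)\,r_I \left(\frac{t}{t_I}\right)^{-\alpha} \\
M(t) &= \frac{1}{r(t)}
  = \frac{M_I}{1-\alpha} \left(\frac{t}{t_I}\right)^{\alpha}
\end{align*}
```

The last line agrees with the sensitivity tables in this deck: with α = 0.40, tI = 100, and MI = 50, setting M(t) = 150 gives t of about 435.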

57

The Sensitivity of the Total Test Time to Duane’s Initial Conditions TI and MI.

58

Sensitivity of Initial Time

What if we increase the initial time for a planning curve?

TI      100  150  200  250
α       .40  .40  .40  .40
MI      50   50   50   50
MF      150  150  150  150
Ttotal  435  ???  ???  ???

59

Sensitivity of Initial Time

What if we increase the initial time for a planning curve?

A higher initial time significantly increases Ttotal!

TI 100 150 200 250

α .40 .40 .40 .40

MI 50 50 50 50

MF 150 150 150 150

Ttotal 435 652 869 1087

Why?

60

Sensitivity of Initial Time

TI      100  250
α       .40  .40
MI      50   50
MF      150  150
Ttotal  435  1087

[Figure: MTBF (50 to 150) plotted against Time (250 to 1000) for the two planning curves, with TI marked on the time axis]

61

Sensitivity of Initial Time

TI      100  250
α       .40  .40
MI      50   50
MF      150  150
Ttotal  435  1087

[Figure: MTBF (50 to 150) plotted against Time (250 to 1000) for the two planning curves, with TI marked on the time axis]

Growth is more rapid the smaller TI is!

62

MI 50 25 70 85

α .40 .40 .40 .40

TI 100 100 100 100

MF 150 150 150 150

Ttotal 435 ??? ??? ???


What if we change the initial MTBF for a planning curve?

Sensitivity of Initial MTBF

63

What if we change the initial MTBF for a planning curve?

A higher initial MTBF significantly decreases Ttotal!

MI 50 25 70 85

α .40 .40 .40 .40

TI 100 100 100 100

MF 150 150 150 150

Ttotal 435 2459 187 115
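Both sensitivity tables (initial time and initial MTBF) can be regenerated with one function. As before, the Duane equation is missing from the slides, so this sketch assumes the planning form MF = MI / (1 – α) × (Ttotal / TI)^α; the check is that it returns the Ttotal rows shown on the slides to within rounding. The function name is hypothetical.

```python
def duane_total_time(alpha, t_initial, m_initial, m_final):
    """Total test time under the assumed form M_F = M_I / (1 - alpha) * (T / T_I) ** alpha."""
    ratio = m_final * (1.0 - alpha) / m_initial
    return t_initial * ratio ** (1.0 / alpha)


ALPHA, M_FINAL = 0.40, 150

# Vary the initial time, holding MI = 50.
for t_initial in (100, 150, 200, 250):
    total = duane_total_time(ALPHA, t_initial, 50, M_FINAL)
    print(f"TI = {t_initial:3d}  ->  Ttotal = {total:7.1f}")

# Vary the initial MTBF, holding TI = 100.
for m_initial in (50, 25, 70, 85):
    total = duane_total_time(ALPHA, 100, m_initial, M_FINAL)
    print(f"MI = {m_initial:3d}  ->  Ttotal = {total:7.1f}")
```

In this form Ttotal scales linearly with TI but with the power 1/α of the ratio MF/MI, so a low starting MTBF is far more costly than a late starting point.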

Sensitivity of Initial MTBF

64


Deriving Reliability Formula

NI = Initial number of trials corresponding to RI

65


Rcum = Cumulative Reliability

F = Number of Failures

N = Number of Trials

r is the failure rate.

66


Recall failure rate formula.

67


Subtract from 1.

68


Make substitutions.

69

Initially, System A has 3 failures after 100 firings.

If you expect a growth rate of 0.25, what would be the expected reliability after 1000 flight tests?

Exercise
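The reliability formula and the worked answer are missing from the slides. The sketch below follows the steps named in the derivation (recall the failure rate formula, subtract from 1, make substitutions) and therefore assumes R(N) = 1 – (1 – α)(1 – RI)(N / NI)^(–α), with RI taken as the observed initial reliability 1 – 3/100. Treat both the formula and the resulting number as assumptions, not as the presenter's answer.

```python
def duane_reliability(n_trials, alpha, r_initial, n_initial):
    """Assumed discrete Duane form: R(N) = 1 - (1 - alpha) * (1 - R_I) * (N / N_I) ** (-alpha)."""
    initial_failure_rate = 1.0 - r_initial
    return 1.0 - (1.0 - alpha) * initial_failure_rate * (n_trials / n_initial) ** (-alpha)


# Exercise values: 3 failures in the first 100 firings, growth rate 0.25, 1000 tests.
r_initial = 1.0 - 3 / 100
expected = duane_reliability(1000, alpha=0.25, r_initial=r_initial, n_initial=100)
print(f"initial reliability          = {r_initial:.4f}")
print(f"expected reliability at 1000 = {expected:.4f}")
```

Under these assumptions the expected reliability after 1000 tests is roughly 0.987.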


71

Inst / Cum Conversions

72

AMSAA-Crow Model

Projection Method for reliability growth planning

73

Discrete PM2 Growth Plan Example

[Figure: "PM2-Discrete Reliability Growth Planning Curve", plotting Reliability (0.84 to 1.00) against Trials (0 to 300). Series: Idealized Curve, DT1, DT2, DT3, LUT, IOT, Requirement. Corrective action periods CAP1 to CAP4 and TL are marked. Annotated values: RDT1 = 0.8987, RDT2 = 0.9260, RDT3 = 0.9455, RLUT = 0.9568, RG = 0.9639, RR = 0.9200, RG Potential RGP = 0.9747.]

041712-Sims-Reliability Growth (TE Class)

74

Continuous PM2 Growth Plan Example

[Figure: "PM2 Continuous Reliability Growth Planning Curve", plotting MTBF (0 to 700) against Test Time (hours) (0 to 4,500). Series include the Idealized Curve, Hypothetical Last Step, IOT, and Requirement. Test phases DT1, DT2, DT3, and LUT and corrective action periods CAP1 to CAP10 are marked. Annotated values: MI = 190, MR = 200, MG,DT = 581, MG,OT = 523, MGP = 782; the values 322, 415, and 500 also appear on the chart.]

75

Continuous Curve Equation

Continuous curve is plotted using this equation.

MTBF(T) = System Mean-Time-Between-Failures at time T

MTBFI = Initial MTBF

MS = Management Strategy

µ = Average Fix Effectiveness Factor (FEF)

β = Shape parameter

76

Discrete Curve Equation

Discrete curve is plotted using this equation.

R(N) = System Reliability at trial N.

RA = The portion of the system reliability not impacted by the corrective action effort

RB = The portion of the system reliability addressed by the corrective action effort

MS = Management Strategy

µ = Average Fix Effectiveness Factor (FEF)

n = Shape parameter of the beta distribution representing pseudo trials

77

Management Strategy Factor

Management Strategy (MS) is the fraction of the overall system failure rate to be addressed by the corrective action plan.

λ = Failure rate.

For various reasons (prohibitive cost, improbability of reoccurrence), some failure modes will not have a corrective action.

78


Failure Rates

A-Mode: Failures that are not fixed.

B-Mode: Failures that will have a fix.

A “fix” means a reliability improvement corrective action, not just a remove and replace of the same component.

79


Failure Rates

λA = Failure rate of A-modes

λB = Failure rate of B-modes

λA + λB = Overall system failure rate

80


Failure Rates

Example: What is the MS here?

Failure mode  Failure mode rate  Mode Type
1             0.027              B
2             0.015              B
3             0.033              B
4             0.001              A
5             0.013              B

81


Failure Rates

Example: What is the MS here?

Failure mode   Failure mode rate  Mode Type
1              0.027              B
2              0.015              B
3              0.033              B
4              0.001              A
5              0.013              B
Total B-modes  0.088
Total System   0.089
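The slide's answer is not in the text, but the definition gives it directly: MS is the B-mode failure rate divided by the overall system failure rate. A short sketch using the table's values follows; the data structure and variable names are illustrative.

```python
# (failure mode number, failure mode rate, mode type) from the table.
failure_modes = [
    (1, 0.027, "B"),
    (2, 0.015, "B"),
    (3, 0.033, "B"),
    (4, 0.001, "A"),
    (5, 0.013, "B"),
]

lambda_b = sum(rate for _, rate, kind in failure_modes if kind == "B")
lambda_system = sum(rate for _, rate, _ in failure_modes)

# Management Strategy: fraction of the system failure rate that will be addressed.
management_strategy = lambda_b / lambda_system

print(f"lambda_B      = {lambda_b:.3f}")
print(f"lambda_System = {lambda_system:.3f}")
print(f"MS            = {management_strategy:.3f}")
```

Here MS is 0.088 / 0.089, or about 0.989, because only the small A-mode rate is left unaddressed.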

82

μ, Fix Effectiveness Factor

MIL-HDBK-189 Definition:

Fix Effectiveness Factor, μ = A fraction representing the reduction in an individual initial mode failure rate due to implementation of a corrective action.

Essentially, Fix Effectiveness Factors discount failures. A couple of examples follow.

83

Number of tests = 20

Successful tests = 18

Failures (marked X): one Hardware Failure, one Software Failure

What is the reliability?


84



[Figure: the 20 tests, with the Software Failure and the Hardware Failure each marked X]


85



[Figure: the 20 tests, with the Software Failure and the Hardware Failure each marked X]

What is the updated reliability?

μ1 = 100%

μ2 = 75%


86



[Figure: the Hardware and Software failures marked X, one with a 100% Fix and the other with a 75% Fix]
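The answers to this example did not survive in the text. One way to read "Fix Effectiveness Factors discount failures" is that a failure with FEF μ counts as (1 – μ) of a failure after the fix. The sketch below applies that reading to the 20-test example; the function names are illustrative and the discounting rule is an assumption about what the slide showed.

```python
def reliability(trials, failures):
    """Point estimate: fraction of trials that did not fail."""
    return 1.0 - failures / trials


def discounted_failures(fix_effectiveness_factors):
    """Each observed failure counts as (1 - mu) of a failure after its fix."""
    return sum(1.0 - mu for mu in fix_effectiveness_factors)


trials = 20
fefs = [1.00, 0.75]   # mu1 = 100%, mu2 = 75% (one per observed failure)

initial = reliability(trials, len(fefs))
updated = reliability(trials, discounted_failures(fefs))

print(f"initial reliability = {initial:.4f}")
print(f"updated reliability = {updated:.4f}")
```

On this reading the two observed failures shrink to 0.25 of a failure, moving the estimate from 0.90 to 0.9875.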


87

Another Example: Say the average μ is 0.75 (or 75%). What is the updated System Failure Rate?

Failure mode  Failure mode rate  Mode Type
1             0.027              B
2             0.015              B
3             0.033              B
4             0.001              A
5             0.013              B

λA = 0.001

λB = 0.088

λSystem = 0.089

88

Another Example: Say the average μ is 0.75 (or 75%). What is the updated System Failure Rate?

Failure mode  Failure mode rate  Mode Type
1             0.027              B
2             0.015              B
3             0.033              B
4             0.001              A
5             0.013              B

Original:

λA = 0.001

λB = 0.088

λSystem = 0.089

Updated:

λA = 0.001

λB = 0.088 * (1 - 0.75) = 0.022

λSystem = 0.023
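The update above is simple enough to script: A-mode failure rates are left alone and B-mode failure rates are scaled by (1 – μ). The sketch below reproduces the slide's numbers; the function name is illustrative.

```python
def updated_system_failure_rate(lambda_a, lambda_b, average_fef):
    """A-modes are not fixed; B-modes are reduced by the average fix effectiveness factor."""
    return lambda_a + lambda_b * (1.0 - average_fef)


lambda_a, lambda_b, mu = 0.001, 0.088, 0.75

original = lambda_a + lambda_b
updated = updated_system_failure_rate(lambda_a, lambda_b, mu)

print(f"original system failure rate = {original:.3f}")
print(f"updated system failure rate  = {updated:.3f}")
```

The system failure rate drops from 0.089 to 0.023, matching the slide.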

89

Shape Parameter, β

β = Shape parameter

TT = Total Test Time

MG = MTBF Goal

MGP = MTBF Growth Potential

MI = Initial MTBF
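The equations for β and for the continuous planning curve are missing from the text. The sketch below assumes the PM2-Continuous forms as they are usually stated: MGP = MI / (1 – MS·μ), β = (1 / TT) × (1 – MI/MG) / (MI/MG – MI/MGP), and MTBF(T) = MI / (1 – MS·μ·βT / (1 + βT)). MI, MS, and μ are taken from this deck's growth potential example; the goal MG = 500 and total test time TT = 4000 are hypothetical values chosen only to exercise the formulas.

```python
def growth_potential(m_initial, ms, mu):
    """Assumed PM2 form: M_GP = M_I / (1 - MS * mu)."""
    return m_initial / (1.0 - ms * mu)


def pm2_beta(total_time, m_initial, m_goal, m_gp):
    """Shape parameter chosen so that the curve reaches m_goal at total_time."""
    ratio_goal = m_initial / m_goal
    ratio_gp = m_initial / m_gp
    return (1.0 - ratio_goal) / (total_time * (ratio_goal - ratio_gp))


def pm2_mtbf(t, m_initial, ms, mu, beta):
    """Assumed PM2 continuous curve: M(t) = M_I / (1 - MS * mu * beta*t / (1 + beta*t))."""
    fraction = beta * t / (1.0 + beta * t)
    return m_initial / (1.0 - ms * mu * fraction)


M_I, MS, MU = 190.0, 0.95, 0.80   # from the deck's growth potential example
M_G, T_T = 500.0, 4000.0          # hypothetical goal and total test time

m_gp = growth_potential(M_I, MS, MU)
beta = pm2_beta(T_T, M_I, M_G, m_gp)

print(f"MGP = {m_gp:.1f}, beta = {beta:.6f}")
for t in (0, 1000, 2000, 4000):
    print(f"T = {t:4d}  MTBF = {pm2_mtbf(t, M_I, MS, MU, beta):6.1f}")
```

Two sanity checks fall out of these forms: the curve starts at MI when T = 0 and reaches MG at T = TT, and it approaches but never exceeds MGP as T grows.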

90

η = Shape parameter of the beta distribution representing pseudo trials

NT = Total Number of Trials

RG = Reliability Goal

RGP = Reliability Growth Potential


Shape Parameter, β

91

Growth Potential

MGP = MTBF Growth Potential

The theoretical upper limit on MTBF

92

Growth Potential

MGP = MTBF Growth Potential

The theoretical upper limit on MTBF

For example:

MS = 0.95

μ = 0.80

MI = 190
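The growth potential equation and the example's result are missing from the text. A minimal sketch, assuming the commonly used form MGP = MI / (1 – MS·μ), is shown below with the example's inputs; the function name is illustrative and the computed value is not taken from the slides.

```python
def mtbf_growth_potential(m_initial, ms, mu):
    """Assumed form: M_GP = M_I / (1 - MS * mu), the limit if every B-mode were found and fixed."""
    return m_initial / (1.0 - ms * mu)


# Inputs from the example: MS = 0.95, mu = 0.80, MI = 190.
m_gp = mtbf_growth_potential(190.0, ms=0.95, mu=0.80)
print(f"MGP = {m_gp:.1f}")
```

With these inputs MS·μ is 0.76, so under the assumed form the initial MTBF can grow by at most a factor of 1 / 0.24, to roughly 792.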

93

PM2 Curve Equation

RA = The portion of the system reliability not impacted by the corrective action effort

MS = Management Strategy. Fraction of failures to be addressed by corrective action.

Medium Risk Range 0.90 – 0.96.


94

PM2 Curve Equation

RB = The portion of the system reliability addressed by the corrective action effort

MS = Management Strategy. Fraction of failures to be addressed by corrective action.

Medium Risk Range 0.90 – 0.96.


95

Management Strategy Factor

Fraction of failures to be addressed by the corrective action plan.

A-Mode: Failures that are not fixed.

B-Mode: Failures that will have a fix.

λ = Failure rate.

96

PM2 Growth Plan

n = Shape parameter of the beta distribution representing pseudo trials

RGP = Reliability Growth Potential

RG = Reliability Goal (to meet requirement)

NT = Total trials before going into IOT phase

97

Reliability Growth Potential

RGP = Reliability Growth Potential

The theoretical upper limit on system reliability

98

Reliability Growth Potential

RGP = Reliability Growth Potential

The theoretical upper limit on system reliability

For example:

MS = 0.95

μ = 0.80

MI = 190

99

Summary

• Reliability Growth applies a “Learning Curve” Approach

• System must undergo Test-Analyze-And-Fix for reliability to grow.

• A growth plan is sensitive to its Initial Conditions.

100

AMSAA-Crow/Duane Equations

1. Single Shot Systems Expected Failures:

$EF(N) = \dfrac{N\,[1 - R(N)]}{1 - \alpha}$

2. Continuously Operating Systems Expected Failures:

$EF(T) = \dfrac{1}{1 - \alpha} \cdot \dfrac{T}{M(T)}$
