system reliability analysis-uofwaterloo

UN 0701 Engineering Risk and Reliability © M. Pandey, University of Waterloo

NSERC-UNENE Industrial Research Chair, Institute for Risk Research, University of Waterloo

System Reliability Analysis

Mahesh Pandey and Mikko Jyrkama

Fundamentals of Reliability © M. Pandey, University of Waterloo

Outline

Introduction
  Probabilistic safety analysis (PSA)
  System reliability analysis
Failure Modes and Effects Analysis (FMEA)
Reliability Block Diagrams
  Series systems
  Parallel systems


Introduction


Most engineering systems consist of many elements or components.

Need to consider multiple failure modes and/or multiple component failures.

Analysis is fairly complicated. Need to consider:

1. The contribution of the component failure events to the system’s failure
2. The redundancy of the system
3. The post-failure behaviour of a component and the rest of the system
4. The statistical correlation between failure events
5. The progressive failure of components


Probabilistic Safety Analysis (PSA)

System reliability analysis is an integral part of probabilistic safety analysis (PSA) in a nuclear power plant.

The main objective of PSA is to provide a reasonable risk-based framework for making decisions regarding nuclear power plant design, operation, and siting.

The main task is to conduct a reliability analysis for all systems and components in the plant. This requires:
- analysis of all possible failure mechanisms and failure rates for all systems and components involved
- quantifying the interaction of the failure mechanisms and their contribution to overall plant reliability (and safety)

PSA also involves other aspects, such as consequence analysis and uncertainty and sensitivity analyses.


System Reliability Analysis

System reliability analysis is conducted in terms of probabilities.

The probabilities of events can be modelled as logical combinations or logical outcomes of other random events.

The two main methods are:
- Fault tree analysis
- Event tree analysis

Other qualitative and graphical methods include:
- Failure Modes and Effects Analysis (FMEA)
- Reliability Block Diagrams (RBD)
- Functional Logic Diagrams


Failure Modes and Effects Analysis (FMEA)


Failure modes and effects analysis (FMEA) is a qualitative technique for understanding the behaviour of components in an engineered system.

The objective is to determine the influence of component failure on other components, and on the system as a whole.

It is often used as a preliminary system reliability analysis to assist the development of a more quantitative event tree/fault tree analysis.

FMEA can also be used as a stand-alone screening tool that ranks failure modes relative to one another according to risk.


FMEA (cont’d)

As a risk evaluation technique, FMEA treats risk in its true sense as the combination of likelihood and consequences.

However, strictly speaking, it is not a probabilistic method because it does not generally use quantified probability statements. Rather:
- failure mode occurrences are described using qualitative statements of likelihood (e.g., rare vs. frequent)
- consequences are also ranked qualitatively using levels or categories (e.g., ranging from safe to catastrophic)

FMEA uses a rank-ordered scale of likelihood with respect to failure mode occurrence, so that together with the consequence categories, a rank-ordered level of relative risk can be derived for each failure mode.


FMEA (cont’d)

FMEA consists of sequentially tabulating each component with:
- all associated possible failure modes
- impacts on other components and the system
- consequence ranking
- failure likelihood
- detection methods
- compensating provisions

Failure modes, effects and criticality analysis (FMECA) is similar to FMEA except that the criticality of failure is analyzed in greater detail.


Example

Consider the following water heater system used in a residential home. The objective is to conduct a failure modes and effects analysis (FMEA) for the system.

[Figure: water heater system schematic (not reproduced)]


Solution (cont’d)

Define consequence categories as:

I. Safe – no effect on system
II. Marginal – failure will degrade system to some extent but will not cause major system damage or injury to personnel
III. Critical – failure will degrade system performance and/or cause personnel injury, and if immediate action is not taken, serious injuries or deaths to personnel and/or loss of system will occur
IV. Catastrophic – failure will produce severe system degradation causing loss of system and/or multiple deaths or injuries

The FMEA is shown in the following table


Solution

| Component | Failure Mode | Effects on Other Components | Effects on Whole System | Consequence Category | Failure Likelihood | Detection Method | Compensating Provisions |
|---|---|---|---|---|---|---|---|
| Pressure relief valve | Jammed open | Increased gas flow and thermostat operation | Loss of hot water, more cold water input and gas | I - Safe | Reasonably probable | Observe at pressure relief valve | Shut off water supply, reseal or replace relief valve |
| Pressure relief valve | Jammed closed | None | None | I - Safe | Probable | Manual testing | No consequence unless combined with other failure modes |
| Gas valve | Jammed open | Burner continues to operate, pressure relief valve opens | Water temperature and pressure increase; water turns to steam | III - Critical | Reasonably probable | Water at faucet too hot; pressure relief valve open (observed) | Open hot water faucet to relieve pressure, shut off gas; pressure relief valve compensates |
| Gas valve | Jammed closed | Burner ceases to operate | System fails to produce hot water | I - Safe | Remote | Observe at faucet (cold water) | |
| Thermostat | Fails to react to temperature rise | Burner continues to operate, pressure relief valve opens | Water temperature rises; water turns to steam | III - Critical | Remote | Water at faucet too hot | Open hot water faucet to relieve pressure; pressure relief valve compensates |
| Thermostat | Fails to react to temperature drop | Burner fails to function | Water temperature too low | I - Safe | Remote | Observe at faucet (cold water) | |
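The relative ranking that FMEA derives from the consequence categories and likelihood labels can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the numeric ranks, the ordering of "Reasonably probable" below "Probable", and the rule of sorting by consequence first and likelihood second are hypothetical choices, not part of the slides. The names `FailureMode`, `WATER_HEATER_FMEA` and `rank_failure_modes` are likewise invented for this example.

```python
from dataclasses import dataclass

# Assumed ordinal scales (hypothetical; the slides give only qualitative labels).
CONSEQUENCE_RANK = {"I": 1, "II": 2, "III": 3, "IV": 4}
LIKELIHOOD_RANK = {"Remote": 1, "Reasonably probable": 2, "Probable": 3}


@dataclass(frozen=True)
class FailureMode:
    component: str
    mode: str
    consequence: str  # consequence category I to IV
    likelihood: str   # qualitative likelihood label from the table

    def risk_key(self):
        """Sort key: consequence category first, then likelihood (an assumed rule)."""
        return (CONSEQUENCE_RANK[self.consequence], LIKELIHOOD_RANK[self.likelihood])


# Rows transcribed from the water heater FMEA table.
WATER_HEATER_FMEA = [
    FailureMode("Pressure relief valve", "Jammed open", "I", "Reasonably probable"),
    FailureMode("Pressure relief valve", "Jammed closed", "I", "Probable"),
    FailureMode("Gas valve", "Jammed open", "III", "Reasonably probable"),
    FailureMode("Gas valve", "Jammed closed", "I", "Remote"),
    FailureMode("Thermostat", "Fails to react to temp. rise", "III", "Remote"),
    FailureMode("Thermostat", "Fails to react to temp. drop", "I", "Remote"),
]


def rank_failure_modes(rows):
    """Return the failure modes ordered from highest to lowest relative risk."""
    return sorted(rows, key=FailureMode.risk_key, reverse=True)


if __name__ == "__main__":
    for fm in rank_failure_modes(WATER_HEATER_FMEA):
        print(f"{fm.consequence:>3}  {fm.likelihood:<20} {fm.component}: {fm.mode}")
```

Under these assumptions the two Critical failure modes (gas valve jammed open, thermostat failing to react to a temperature rise) come out on top, which is the kind of screening result FMEA is meant to provide.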


Reliability Block Diagrams


Most systems are defined through a combination of both series and parallel connections of subsystems.

Reliability block diagrams (RBD) represent a system using interconnected blocks arranged in combinations of series and/or parallel configurations.

They can be used to analyze the reliability of a system quantitatively.

Reliability block diagrams can consider active and stand-by states to get estimates of the reliability and availability (or unavailability) of the system.

Reliability block diagrams may be difficult to construct for very complex systems.


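Series Systems

Series systems are also referred to as weakest link or chain systems.

System failure is caused by the failure of any one component.

Consider two components in series:

[Figure: block diagram of components 1 and 2 connected in series]

Failure is defined as the union of the individual component failures:

$$Q_{SYS} = P(F_1 \cup F_2) = P(F_1) + P(F_2) - P(F_1 \cap F_2)$$

For small failure probabilities, the intersection term is negligible:

$$Q_{SYS} \approx Q_1 + Q_2$$

where $Q$ denotes the probability of failure and $F_i$ is the failure event of component $i$, so that $Q_i = P(F_i)$.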


Series Systems (cont’d)

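For n components in series, the probability of failure is then

$$Q_{SYS} \approx \sum_{i=1}^{n} Q_i$$

Therefore, for a series system with small failure probabilities, the system probability of failure is the sum of the individual component probabilities.

In case the component probabilities are not small, the system probability of failure can be expressed (assuming independent component failures) as

$$Q_{SYS} = Q_1 + Q_2 - Q_1 Q_2$$

For n components in series

$$Q_{SYS} = 1 - \prod_{i=1}^{n} (1 - Q_i)$$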

Series Systems (cont’d)

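Reliability is the complement of the probability of failure.

For the two components in series, the system reliability can be expressed as

$$R_{SYS} = 1 - Q_{SYS} = P(\bar{F}_1 \cap \bar{F}_2)$$

where $\bar{F}_i$ is the survival (non-failure) event of component $i$.

Assuming independence

$$R_{SYS} = (1 - Q_1)(1 - Q_2) = R_1 R_2$$

For n components in series

$$R_{SYS} = \prod_{i=1}^{n} R_i$$

Therefore, for a series system of independent components, the reliability of the system is the product of the individual component reliabilities.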


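Parallel Systems

Parallel systems are also referred to as redundant.

The system fails only if all of the components fail.

Consider two components in parallel:

[Figure: block diagram of components 1 and 2 connected in parallel]

Failure is defined by the intersection of the individual (component) failure events:

$$Q_{SYS} = P(F_1 \cap F_2)$$

where $F_i$ is the failure event of component $i$ and $Q_i = P(F_i)$.

Assuming independence

$$Q_{SYS} = Q_1 Q_2$$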


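Parallel Systems

For n components in parallel, the probability of failure is then

$$Q_{SYS} = \prod_{i=1}^{n} Q_i$$

Therefore, for a parallel system of independent components, the system probability of failure is the product of the individual component probabilities.

The reliability of the parallel system is

$$R_{SYS} = 1 - Q_{SYS} = 1 - Q_1 Q_2 = R_1 + R_2 - R_1 R_2$$

For n components in parallel, the system reliability is

$$R_{SYS} = 1 - \prod_{i=1}^{n} (1 - R_i)$$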
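The effect of redundancy can be illustrated with a short Python sketch, assuming independent component failures, so that the parallel-system failure probability is the product of the component failure probabilities. The function names and the value Q = 0.1 are hypothetical, used only for illustration.

```python
import math


def parallel_failure_probability(q):
    """Parallel-system failure probability, prod(Q_i).

    Assumes the component failures are statistically independent.
    """
    return math.prod(q)


def parallel_reliability(q):
    """Parallel-system reliability, 1 - prod(Q_i)."""
    return 1.0 - parallel_failure_probability(q)


if __name__ == "__main__":
    q_component = 0.1  # illustrative value, not from the slides
    for n in (1, 2, 3):
        q_sys = parallel_failure_probability([q_component] * n)
        print(f"n = {n}: Q_SYS = {q_sys:.4f}, R_SYS = {1.0 - q_sys:.4f}")
```

Each added redundant component multiplies the system failure probability by another factor of Q. This holds only as long as the failures really are independent; the statistical correlation between failure events noted in the introduction would reduce the benefit.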


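Example Problem

Example: Compute the reliability and probability of failure for the following system. Assume the failure probabilities for the components are $Q_1 = 0.01$, $Q_2 = 0.02$ and $Q_3 = 0.03$.

[Figure: block diagram with component 1 in series with the parallel pair of components 2 and 3]

Solution: First combine the parallel components 2 and 3. The probability of failure is

$$Q_{23} = Q_2 Q_3 = (0.02)(0.03) = 0.0006$$

The reliability is

$$R_{23} = 1 - Q_{23} = 0.9994$$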


Solution (cont’d)

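Next, combine component 1 and the sub-system (2,3) in series.

The probability of failure for the system is then

$$Q_{SYS} = Q_1 + Q_{23} - Q_1 Q_{23}$$

The system reliability is

$$R_{SYS} = R_1 R_{23}$$

where $Q_{23} = Q_2 Q_3$ and $R_{23} = 1 - Q_{23}$ are the probability of failure and the reliability of the parallel sub-system (2,3).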


Solution (cont’d)

The system probability of failure is equal to

$$Q_{SYS} = 0.01 + 0.0006 - (0.01)(0.0006) = 0.010594$$

The system reliability is

$$R_{SYS} = (0.99)(0.9994) = 0.989406$$

which is also equal to $R_{SYS} = 1 - Q_{SYS}$.

As shown in this example, the system probability of failure and reliability are dominated by the series component 1, i.e., a series system is only as good as its weakest link.
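The step-by-step reduction used in this example can be checked numerically. The sketch below assumes independent components and uses two small helper functions, `series` and `parallel` (hypothetical names), that each combine reliabilities into an equivalent single block.

```python
import math


def series(*reliabilities):
    """Equivalent reliability of independent blocks in series: prod(R_i)."""
    return math.prod(reliabilities)


def parallel(*reliabilities):
    """Equivalent reliability of independent blocks in parallel: 1 - prod(1 - R_i)."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)


# Component failure probabilities from the example problem.
Q1, Q2, Q3 = 0.01, 0.02, 0.03

# Step 1: reduce the parallel pair (2, 3) to one equivalent block.
R23 = parallel(1.0 - Q2, 1.0 - Q3)

# Step 2: combine component 1 and the equivalent block in series.
R_SYS = series(1.0 - Q1, R23)
Q_SYS = 1.0 - R_SYS

if __name__ == "__main__":
    print(f"R23   = {R23:.6f}")
    print(f"R_SYS = {R_SYS:.6f}")
    print(f"Q_SYS = {Q_SYS:.6f}")
```

Because each helper returns a single equivalent reliability, the calls can be nested to mirror the block diagram directly, for example `series(1 - Q1, parallel(1 - Q2, 1 - Q3))`.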


Things to Consider

Reliability block diagrams can also be used to assess:
- Voting systems (k-out-of-n logic)
- Standby systems (load sharing or sequential operation)

Simple systems can be assessed by gradually reducing them to equivalent series/parallel configurations.

More complex systems would require the use of a more comprehensive approach, such as conditional probabilities or imaginary components.

For complex systems, great effort is needed to identify the ways in which the system fails or survives:
- Fault trees can be used to decompose the main failure event into unions and intersections of sub-events
- Event trees can be used to identify the possible sequence of events (also failures)
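As an illustration of the voting (k-out-of-n) logic mentioned above, the following Python sketch computes the reliability of a system that works when at least k of its n components work. It assumes identical, independent components with a common reliability r, so the number of working components follows a binomial distribution. This formula is a standard result rather than something given in the slides, and the function name is hypothetical.

```python
from math import comb


def k_out_of_n_reliability(k, n, r):
    """Reliability of a k-out-of-n system of identical, independent components.

    The system works if at least k of the n components work; each component
    works with probability r.
    """
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    return sum(comb(n, j) * r**j * (1.0 - r) ** (n - j) for j in range(k, n + 1))


if __name__ == "__main__":
    r = 0.9  # illustrative component reliability, not from the slides
    print(f"1-out-of-3 (parallel): {k_out_of_n_reliability(1, 3, r):.6f}")
    print(f"2-out-of-3 (voting):   {k_out_of_n_reliability(2, 3, r):.6f}")
    print(f"3-out-of-3 (series):   {k_out_of_n_reliability(3, 3, r):.6f}")
```

The two limiting cases recover the earlier results: k = 1 gives the parallel system and k = n gives the series system.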
