clemens fta tutorial

96
Sve rd rup 1 FAULT TREE ANALYSIS P. L. Clemens May 1993 4 th  Edition provided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

Upload: alenov5

Post on 13-Oct-2015

152 views

Category:

Documents


22 download

DESCRIPTION

fault tree theory

TRANSCRIPT

  • Sverdrup1

    FAULTTREE

    ANALYSIS

    P. L. ClemensMay 1993

    4th Edition

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • TOPICS COVERED . . .Fault Tree DefinitionDeveloping the Fault TreeStructural Significance of the AnalysisQuantitative Significance of the AnalysisDiagnostic Aids and ShortcutsFinding and Interpreting Cut Sets and Path SetsSuccessDomain Counterpart AnalysisAssembling the Fault Tree Analysis ReportFault Tree Analysis versus AlternativesFault Tree Shortcomings / Pitfalls / Abuses

    2

    All fault trees appearing in this training module have been drawn, analyzed, and printedusing FaultrEASE , a computer application available from: Arthur D. Little, Inc. / AcornPark / Cambridge, Massachusetts, 02140-2390 Phone (617) 864-5770.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • FIRST A BIT OFBACKGROUND . . .3

    Origins of the techniqueFault Tree Analysis definedWhere best to apply the techniqueWhat the analysis producesSymbols and conventions

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • ORIGINS . . .

    Fault tree analysis was developed in1962 for the U.S. Air Force by BellTelephone Laboratories for use withthe Minuteman systemwas lateradopted and extensively applied bythe Boeing Companyis one of manysymbolic logic analytical techniquesfound in the operations researchdiscipline.

    4

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • THE FAULT TREE IS . . .

    a graphic model of the pathways within asystem that can lead to a foreseeable, undesirableloss event. The pathways interconnectcontributory events and conditions, using standardlogic symbols. Numerical probabilities ofoccurrence can be entered and propagated throughthe model to evaluate probability of theforeseeable, undesirable event.

    only one of many System Safety analytical toolsand techniques.

    5

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • FAULT TREE ANALYSIS IS BESTAPPLIED TO CASES WITH . . .

    Large, perceived threats of lossi.e., high risk.Numerous potential contributors to a mishap.Complex or multi-element systems/processes.Already-identified undesirable events. (A must!)Indiscernible mishap causes (i.e., autopsies).

    6

    CAVEAT: Large fault trees are resource-hungry and shouldnot be undertaken without reasonable assurance of need.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • FAULT TREE ANALYSIS PRODUCES . . .Graphic display of chains of events/conditions

    leading to the loss event.

    Identification of those potential contributors tofailure that are critical.

    Improved understanding of systemcharacteristics.

    Qualitative/quantitative insight into probability ofthe loss event selected for analysis.

    Identification of resources committed topreventing failure.

    Guidance for redeploying resources to optimizecontrol of risk.

    Documentation of analytical results.

    7

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • SOME DEFINITIONS . . .FAULT: An abnormal, undesirable state of a system or a system element* induced (1)by presence of an improper command or absence of a proper one, or (2) by a failure (seebelow). All failures cause faults; not all faults are caused by failures. A system whichhas been shut down by safety features has not faulted.FAILURE: Loss, by a system or system element*, of functional integrity to perform asintended. E.g., relay contacts corrode and will not pass rated current when closed, or therelay coil has burned out and will not close the contacts when commanded the relayhas failed ; a pressure vessel bursts the vessel fails. A protective device whichfunctions as intended has not failed e.g., a blown fuse.

    Primary (or Basic) Failure: The failed element has seen no exposure toenvironmental or service stresses exceeding its ratings to perform. E.g., fatiguefailure of a relay spring within its rated lifetime; leakage of a valve seal within itspressure rating.Secondary Failure: Failure induced by exposure of the failed element toenvironmental and/or service stresses exceeding its intended ratings. E.g., thefailed element has been improperly designed, or selected, or installed, or calibratedfor the application; the failed element is overstressed/underqualified for its burden.

    8

    *System element: a subsystem, assembly, component, piece part, etc.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • ASSUMPTIONS AND LIMITATIONS . . .

    Nonrepairable system.No sabotage.Markov

    Fault rates are constant = = KThe future is independent of the past i.e.,

    future states available to the system dependonly upon its present state and pathways nowavailable to it, not upon how it got where it is.

    BernoulliEach system element analyzed has two,

    mutually exclusive states.

    9

    MTBF1

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • THE LOGIC SYMBOLS . . .

    10

    TOP Eventforeseeable, undesirable event, towardwhich all fault tree logic paths flowor

    Intermediate Eventdescribing a system stateproduced by antecedent events.

    AND

    OR

    AND Gateproduces output if all inputs co-exist. All inputs,collectively, must be (1) necessary and (2) sufficient tocause the output event.

    OR Gateproduces output if any input exists. Any input,individually, must be (1) necessary and (2) sufficient to causethe output event.

    Basic EventInitiating fault/failure, not developedfurther. (Called Leaf, Initiator, or Basic.)The Basic Event marks the limit of resolutionof the analysis.

    EVENTS and GATES are NOT component parts of the system being analyzed. They aresymbols representing the logic of the analysis. They are bi-modal. They function flawlessly.

    Most Fault Tree Analysescan be carried out usingonly these four symbols.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • STEPS IN FAULT TREE ANALYSIS . . .

    11

    6. Repeat / continue

    4. Identify second-levelcontributors.

    5. Link second-levelcontributors to TOP bylogic gates.

    1. Identify undesirable TOPevent.

    3. Link contributors to TOPby logic gates.

    2. Identify first-levelcontributors.

    Basic Event(Leaf,Initiator, or Basic)

    indicates limit of analyticalresolution.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • SOME RULES AND CONVENTIONS . . .

    12

    NOYES

    Do usesingle-stem

    gate-feed inputs.

    Don't let gatesfeed gates.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • MORE RULES AND CONVENTIONS . . .

    Be CONSISTENT in naming fault events/conditionsuse same name for sameevent/condition, throughout the analysis.(Use index numbering for large trees.)

    Say WHAT failed/faulted and HOWe.g.,Switch Sw418 Contacts Fail Closed

    Don't expect miracles to save thesystem. Lightning will not recharge thebatterya large bass will not plug thehole in the hull.

    13

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • SOME CONVENTIONS ILLUSTRATED . . .

    14

    Initiators must be Statistically Independentof one another.NAME BASICS CONSISTENTLY !

    FLAT TIREAIRESCAPESFROMCASING TIREPRESSUREDROPS TIREDEFLATES

    ?

    NO MIRACLES !

    MAYBE a gust of wind will come

    along and correct theskid.

    a sudden cloudburstwill extinguish theignition source.

    therell be a poweroutage when theworkers hand contactsthe high-voltageconductor.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • IDENTIFYING TOP EVENTS . . .

    Explore historical records (own & others).Look to energy sources.Identify potential mission failurecontributors.

    Develop whatif scenarios.Use shopping lists.

    15

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • EXAMPLE TOP EVENTS . . .

    16

    Wheels-up landing

    Mid-air collision

    Subway derailment

    Turbine engine FOD

    Rocket failure to ignite

    Dengue fever pandemic

    Sting failure

    Inadvertent nuke launch

    Reactor loss of cooling

    Uncommanded ignition

    Irretrievable loss of primary test data

    Inability to dewater buoyancy tanks

    TOP events represent potential high-penalty losses (i.e., high risk). Eitherseverity of the outcome or frequency of occurrence can produce high risk.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • SCOPE THE TREE TOP . . .

    17

    TOO BROAD IMPROVED Computer Outage Outage of Primary Data

    Collection computer, exceeding8 hours, from external causes

    Exposed Conductor Unprotected body contact withpotential greater than 40 volts

    Foreign Object Ingestion Foreign object damage byobject weighing more than 5grams and having densitygreater than 3.2 gm/cc

    Jet Fuel Dispensing Leak Fuel dispensing fire resulting inloss exceeding $2500

    Scoping reduces effort spent in the analysis by confining it to relevant considerations. Toscope, describe the level of penalty or the circumstances for which the event becomesintolerable use modifiers to narrow the event description.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • ADDING CONTRIBUTORS TO THE TREE . . .

    18

    (3) AND, eachelement must be an

    immediatecontributor to the

    level above.

    (1) EachCONTRIBUTING

    ELEMENT

    (2) must be anINDEPENDENT* FAULTor FAILURE CONDITION(typically described by anoun, an action verb, andspecifying modifiers) EFFECT

    CAUSE

    NOTE: As a group under an AND gate, and individually under an OR gate,contributing elements must be both necessary and sufficient to serve asimmediate cause for the output event.

    independent ofall others. However, the samefault may appear at other pointson the tree.

    *At a givenlevel, undera givengate, eachfault mustbe

    EXAMPLES : Electrical Power

    Fails Off

    Low-Temp. Alarm

    Fails Off

    Solar q > 0.043BTU/ft2/sec

    Relay K-28Contacts FreezeClosed

    Transducer CaseRuptures

    Proc. Step 42Omitted

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • EXAMPLEFAULT TREEDEVELOPMENT . . .

    19

    Constructing the logicSpotting / correcting somecommon errors

    Adding quantitative data

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • OVERSLEEP

    AN EXAMPLE FAULT TREE . . .

    20

    LATE FOR WORK

    SEQUENCEINITIATIONFAILURES TRANSPORTFAILURES LIFESUPPORTFAILURES PROCESS &MISC. SYSTEMMALFUNCTIONS

    UNDESIRABLEEVENT

    CAUSATIVEMODALITIES*

    *Partitioned aspects of system function, subdivided as topurpose, physical arrangement, or sequence of operation.?

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • SEQUENCE INITIATION FAILURES . . .

    21

    OVERSLEEP

    NO STARTPULSE

    BIO-RHYTHM

    FAILS

    ARTIFICIALWAKEUP FAILS

    NATURALAPATHY

    ?

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • VERIFYING LOGIC . . .

    22

    ?

    OVERSLEEP

    NO STARTPULSE

    BIO-RHYTHM

    FAILS

    ARTIFICIALWAKEUP FAILS

    NATURALAPATHY

    Does thislook correct?

    Should thegate be

    OR?

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • TEST LOGIC IN SUCCESS DOMAIN . . .

    23

    ?

    OVERSLEEP

    NO STARTPULSE

    BIO-RHYTHM

    FAILS

    ARTIFICIALWAKEUP FAILS

    NATURALAPATHY

    Redraw invert allstatements and gates.

    ?

    WAKEUPSUCCEEDS

    STARTPULSEWORKS

    BIO-RHYTHMWORKS

    ARTIFICIALWAKEUPWORKS

    NATURALHIGH

    TORQUE

    FailureDomain

    SuccessDomain

    If it was wrong here . . . . . . it'll be wrong here, too!

    motivationtrigger

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • ARTIFICIALWAKEUPFAILS . . .

    24

    ARTIFICIALWAKEUPFAILSALARMCLOCKSFAIL

    MAIN,PLUG-INCLOCK FAILSPOWEROUTAGE FAULTYINNARDS

    ELECTRICALFAULT MECHANICALFAULTHOURHANDFALLSOFF

    HOURHANDJAMSWORKS

    FORGET TOSET

    BACKUP(WINDUP)CLOCK FAILSFAULTYMECHANISM FORGET TOSET FORGET TOWIND

    NOCTURNALDEAFNESS

    What does the treetell us about systemvulnerability at this

    point?

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • BACKGROUNDFORNUMERICALMETHODS . . .

    25

    Relating PF to RThe Bathtub CurveExponential Failure DistributionPropagation through GatesPF Sources

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • RELIABILITY & FAILURE PROBABILITYRELATIONSHIPS . . .

    26

    1

    S = Successes

    F = Failures

    Reliability . . . . . . . . . R =

    Failure Probability . . . PF =

    R + PF =

    = Fault Rate =

    S(S + F)

    (S + F)F

    S(S + F) (S + F)

    F+

    MTBF1

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • SIGNIFICANCE OF PF . . .

    27

    BURN ININFANT MORTALITY RANDOM FAILURE BURN OUT

    t

    00

    0

    T

    The Bathtub Curve

    Most system elements have constantfault rates ( ) over long periods of usefullife. During these periods, faults occurat random times.Fault probability is modeledacceptably well as a function ofexposure interval (T) by theexponential. For exposure intervalsthat are brief (T 0.2 MTBF), PF isapproximated within 2% by T. ePF = 1 -T

    PF T (within 2%, for T 20%)

    R = e -T

    1.0

    0 00.63

    1 MTBF T

    Exponentially Modeled Failure

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • PF = 1 RTPF = 1 (RARB)PF = 1 [(1 PA)(1 PB)]

    R & PF THROUGH GATES . . .

    28

    R + PF 1

    OR Gate AND GateEither of two, independent,element failures produces

    system failure.

    Both of two, independentelements must fail to produce

    system failure.RT = RA + RB RARBRT = RARB

    PF = 1 RTPF = 1 (RA + RB RARB)PF = 1 [(1 PA) + (1 PB) (1 PA)(1 PB)]

    PF = PAPBPF = PA + PB PAPBfor PA,B 0.2PF PA + PB

    with error 11%

    RareEvent

    Approximation

    PF = PAPBPCOmit forapproximation

    For 3 Inputs

    For 2 Inputs

    PF = PA + PB + PC PAPB PAPC PBPC

    + PAPBPC

    [Union / ] [Intersection / ]

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • Intersection /

    PF PROPAGATION THROUGH GATES . . .

    29

    PT = P

    1 + P

    2 P

    1P

    2P

    T = P

    1P

    2

    AND Gate

    TOP

    P1 P2

    PT = P

    1P

    2

    1 2

    OR Gate

    TOP

    P1

    P2

    PT P

    1 + P

    2

    1 2

    1 & 2are

    INDEPENDENTevents

    Usually negligible

    PT = Pe PT PeUnion /

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • The ip operator (C) is the co-function of pi ( ). It provides an exact solution for propagating probabilities through the OR gate. Its use is rarely justifiable.

    IPPING GIVES EXACT OR GATE SOLUTIONS . . .

    30

    PT = Pe = 1 P (1 Pe)PT = 1 [(1 P1) (1 P2) (1 P3) (1 Pn)]

    TOP

    P1 P2

    1 2 3

    P3

    PT = Pe

    TOP

    P1 P2

    1 2 3

    P3

    PT = ?

    TOP

    1 2 3

    PT = P (1 Pe)

    P1 = (1 P1)P2 = (1 P2)

    P3 = (1 P3)

    Failure FailureSuccess

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • MORE GATES AND SYMBOLS . . .

    31

    M

    INCLUSIVE OR GateP

    T = P

    1 + P

    2 (P

    1 x P

    2)Opens when any one or more events occur.

    EXCLUSIVE OR GateP

    T = P

    1 + P

    2 2(P

    1 x P

    2)Opens when any one (but only one) event occurs.

    MUTUALLY EXCLUSIVE OR GateP

    T = P

    1 + P

    2Opens when any one of two or more eventsoccur. All other events are then precluded.P

    T Pe

    For all OR Gate cases, theRare Event Approximationmay be used for small values of P e.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • STILL MORE GATES AND SYMBOLS . . .

    32

    PRIORITY AND GateP

    T = P

    1 x P

    2Opens when input events occurin predetermined sequence.INHIBIT GateOpens when (single) inputevent occurs in presenceof enabling condition.

    CONDITIONING EVENTApplies conditions orrestrictions to other symbols.UNDEVELOPED EVENTAn event not further developed.

    EXTERNAL EVENTAn event normally expectedto occur.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • SOME FAILURE PROBABILITY SOURCES . . .Manufacturers DataIndustry Consensus StandardsMIL StandardsHistorical Evidence Same or Similar

    Systems

    Simulation / TestingDelphi EstimatesERDA Log Average Method

    33

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • LOG AVERAGE METHOD* . . .

    34

    *REFERENCE: Briscoe, Glen J.; Risk Management Guide; System Safety Development Center; SSDC-11; DOE 76-45/11; September 1982.

    0.10.0316+

    0.01 0.02 0.03 0.04 0.05 0.07PLLowerProbabilityBound

    102

    PUUpperProbabilityBound101

    Log P L + Log P U2 (2) + (1)2Log Average = Antilog = Antilog = 101.5 = 0.0316228

    If probability is not estimated easily, but upper and lower credible bounds can be judged Estimate upper and lower credible bounds of probability for the phenomenon in question.

    Average the logarithms of the upper and lower bounds.

    The antilogarithm of the average of the logarithms of the upper and lower bounds is lessthan the upper bound and greater than the lower bound by the same factor. Thus, it isgeometrically midway between the limits of estimation.

    Note that, for the example shown, the arithmetic average would be

    i.e., 5.5 times the lower bound and 0.55 times the upper bound.

    0.01 + 0.12 = 0.055

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • MORE FAILURE PROBABILITY SOURCES . . .WASH1400 (NUREG75/014); Reactor Safety Study AnAssessment of Accident Risks in US Commercial Nuclear PowerPlants; 1975IEEE Standard 500GovernmentIndustry Data Exchange Program (GIDEP)Rome Air Development Center TablesNUREG/CR1278; Handbook of Human Reliability Analysis withEmphasis on Nuclear Power Plant Applications; 1980NUREG0492; Fault Tree Handbook; (Table XI1); 1986many others, including numerous industryspecific proprietarylistings

    35

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrupprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • Semiconductor Diodes 0.10 1.0 10.0

    Transistors 0.10 3.0 12.0

    Microwave Diodes 3.0 10.0 22.0

    MIL-R-11 Resistors 0.0035 0.0048 0.16

    MIL-R-22097 Resistors29.0 41.0 80.0

    Rotary Electrical Motors0.60 5.0 500.0

    Connectors 0.01 0.10 10.0

    SOURCE: Willie Hammer, Handbook of System and Product Safety, Prentice Hall

    TYPICAL COMPONENT FAILURE RATES . . .

    36

    DEVICE FAILURES PER 106 HOURSMINIMUM AVERAGE MAXIMUM

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • TYPICAL HUMAN OPERATOR FAILURE RATES . . .

    37

    * Error of Omission/Item Embedded in Procedure3 x 10-3

    * Simple Arithmetic Error with Self-Checking 3 x 10-2

    * Inspector Error of Operator Oversight 10-1

    * General Rate/High Stress/Dangerous Activity0.2 - 0.3

    ** Checkoff Provision Improperly Used 0.1 - 0.9 (0.5 avg.)

    ** Error of Omission/10-Item Checkoff List 0.0001 - 0.005 (0.001 avg.)

    ** Carry out Plant Policy/No Check on Operator0.005 - 0.05 (0.01 avg.)

    ** Select Wrong Control/Group of Identical,

    Labeled, Controls 0.001 - 0.01 (0.003 avg.)

    SOURCES: * WASH-1400 (NUREG-75/014); Reactor Safety Study An Assessment of Accident

    Risks in U.S. Commercial Nuclear Power Plants, 1975

    ** NUREG/CR-1278; Handbook of Human Reliability Analysis with Emphasis on

    Nuclear Power Plant Applications, 1980

    ACTIVITY ERROR RATE

    Allen Long

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • SOME FACTORS INFLUENCING HUMAN OPERATORFAILURE PROBABILITY . . .ExperienceStressTrainingIndividual self discipline / conscientiousnessFatiguePerception of Error Consequences (to self / others)Use of Guides and ChecklistsRealization of Failure on Prior AttemptCharacter of Task Complexity/Repetitiveness

    38

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • ARTIFICIALWAKEUPFAILS . . .

    39

    Faults/Operation8. x 10-3

    Rate, Faults/Year2/1

    Assume 260 Operations/Year

    KEY:

    ARTIFICIALWAKEUPFAILS3.34 x 10-4ALARMCLOCKSFAIL

    3.34 x 10-4MAIN,PLUG-INCLOCK FAILS1.82 x 10

    -2

    POWEROUTAGE1. x 10-2

    FAULTYINNARDS3. x 10-4

    ELECTRICALFAULT3. x 10

    -4

    MECHANICALFAULT8. x 10-8

    HOURHANDFALLSOFF4. x 10-4

    HOURHANDJAMSWORKS2. x 10-4

    FORGET TOSET8. x 10-3

    BACKUP(WINDUP)CLOCK FAILS1.83 x 10

    -2

    FAULTYMECHANISM4. x 10-4

    FORGET TOSET8. x 10-3

    FORGET TOWIND1. x 10-2

    NOCTURNALDEAFNESSNegligible

    Approx. 0.1/yr

    3/12/11/102/1

    1/15

    1/10

    3/1

    1/20

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • HOW MUCH PT IS TOO MUCH . . . ?

    40

    Consider bootstrapping comparisons with known risks

    Browning, R. L., The Loss Rate Concept in Safety Engineering* National Safety Council, Accident Facts Kopecek, J. T., Analytical Methods Applicable to Risk Assessment & Prevention, Tenth

    International System Safety Conference

    102 - 103 / exp MH

    103 / exp hr

    104 / exp hr

    105 / exp hr

    106 / exp MH

    106 / exp MH

    107 - 108 / exp MH

    109 / exp MH*

    1010 / exp hr

    1014 / exp hr

    Human Operator Error (Responseto Repetitive Stimulus) Internal Combustion Engine Failure(Spark Ignition) Pneumatic Instrument Recorder Failure Distribution Transformer Failure U.S. Motor Vehicle Fatalities Death by Disease (U.S. Lifetime Avg.) U.S. Employment Fatalities Death by Lightning Meteorite (>1 lb) Hit on 10 3 x 103 ft areaof U.S. Earth Destroyed by Extraterrestrial Hit

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • APPLY SCOPING . . .

    41

    WHAT power outages are of concern?

    NOT ALL OF THEM!

    Only those that

    1. are undetected / uncompensated

    2. occur during the hours of sleep

    3. have sufficient duration to fault thesystem

    This probability must reflect these conditions!

    1. x 10-2

    POWEROUTAGE3/1

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • SINGLEPOINT FAILURE . . .

    A Failure of one independentelement of a system which causesan immediate hazard to occurand/or causes the whole system tofail.

    42

    Professional Safety March 1980

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • SOME AND GATE PROPERTIES . . .

    43

    TOP

    1 2

    PT = P1 x P2

    COST:Assume two identicalelements having P = 0.1.

    PT = 0.01Two elements having P = 0.1may cost much less than oneelement having P = 0.01.

    FREEDOM FROM SINGLE POINT FAILURE:Redundancy ensures that either 1 or 2 may failwithout inducing TOP.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • DON'T

    FAILURES AT ANY ANALYSIS LEVEL MUST BE . . .

    44

    ALARM FAILURE

    ALARMCLOCKFAILS

    BACKUPCLOCKFAILS

    ALARM FAILURE

    ALARMCLOCKFAILS

    TOASTBURNS

    BACKUPCLOCKFAILS

    FAULTYINNARDS

    ELECT.FAULT GEARING

    FAILS

    OTHERMECH.FAULT

    HANDFALLS/JAMS

    WORKS

    MECHANICALFAULT

    HANDFALLSOFF

    HANDJAMS

    WORKS

    INDEPENDENT OF EACH OTHER TRUE CONTRIBUTORS TO THE LEVEL ABOVE

    TRUE CONTRIBUTORS

    INDEPENDENT

    DO

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • COMMON CAUSE EVENTS / PHENOMENA . . .A Common Cause is an event or aphenomenon which, if it occurs, willinduce the occurrence of two ormore fault tree elements.

    45

    Oversight of Common Causes isa frequently found fault tree flaw!

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • COMMON CAUSE OVERSIGHT an example . . .

    46

    Four, wholly independent alarm systems are provided to detectand annunciate intrusion. No two of them share a commonoperating principle. Redundancy appears to be absolute. TheAND gate to the TOP event seems appropriate. BUTsupposethe four systems share a single source of operating power, andthat source fails, and there are no backup sources?

    UNANNUNCIATEDINTRUSION BY

    BURGLAR

    MICROWAVE ELECTRO-OPTICALDETECTOR / ALARM FAILURES

    SEISMICFOOTFALL ACOUSTIC

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • MICROWAVE ELECTRO-OPTICAL SEISMIC FOOTFALL ACOUSTIC

    BASIC POWER FAILURE EMERGENCY POWER FAILURE

    COMMON CAUSE OVERSIGHT CORRECTION . . .

    47

    UNANNUNCIATEDINTRUSION BYBURGLARDETECTOR/ALARMFAILURE DETECTOR/ALARMPOWER FAILURE

    Here, power source failure has been recognized as an eventwhich, if it occurs, will induce failure of all four alarm systems.Power failure has been accounted for as a Common Causeevent, leading to the TOP event through an OR gate. OTHERCOMMON CAUSES SHOULD ALSO BE SEARCHED FOR!

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • EXAMPLE COMMON CAUSEFAULT/FAILURE SOURCES . . .

    Utility OutageElectricityCooling WaterPneumatic PressureSteametc

    MoistureCorrosionSeismic Disturbance

    48

    Dust/Grit

    Temperature

    Effects (Freezing/

    Overheat)

    Electromagnetic

    Disturbance

    Single Operator

    Oversight

    many others

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • EXAMPLE COMMON CAUSESUPPRESSION METHODS . . .

    Separation / Isolation / Insulation / Sealing /Shielding of System Elements.

    Using redundant elements having differingoperating principles.

    Separately powering/servicing/maintainingredundant elements.

    Using independent operators/inspectors.

    49

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • MISSING ELEMENTS . . .?

    50

    Contributingelements must

    combine to satisfy allconditions essentialto the TOP event.The logic criteria of

    necessity andsufficiency must be

    satisfied!

    UNANNUNCIATEDINTRUSION BYBURGLARANNUNCIATIONFAILURE

    DETECTOR /ALARM SYSTEMFAILURE DETECTOR /ALARM POWERFAILUREINTRUSIONBYBURGLAR

    BURGLARPRESENT BARRIERSFAIL ACOUS SEISMIC FOO ELECTRO-OPTICAL MICROWAVE EMERGENCY POW BASIC POWER

    SYSTEMCHALLENGE

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • EXAMPLE PROBLEM Sclerotic Scurvy . . .(The Astronaut's Scourge)

    51

    BACKGROUND:Sclerotic scurvy infects ten percent of all returning astronauts. Incubationperiod is 13 days. For a week thereafter, victims of the disease displaysymptoms which include malaise, lassitude, and a very crabby outlook. Atest can be used during the incubation period to determine whether anastronaut has been infected. Anti-toxin, administered during theincubation period, is 100% effective in preventing the disease whenadministered to an infected astronaut. However, for an uninfectedastronaut, it produces disorientation, confusion, and intensifies allundesirable personality traits for about seven days. The test for infectionproduces a false positive result in two percent of all uninfected astronautsand a false negative result in one percent of all infected astronauts. Bothtreatment of an uninfected astronaut and failure to treat an infectedastronaut constitute malpractice.

    PROBLEM:Using the test for infection and the anti-toxin, if the test indicates need forit, what is the probability that a returning astronaut will be a victim of mal-practice?

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • SCLEROTIC SCURVY MALPRACTICE . . .

    52

    1% of infected cases test falsely negative,receive no treatment, succumb to disease

    MALPRACTICEFAIL TO TREATINFECTION(DISEASE)

    INFECTEDASTRONAUTFALSENEGATIVETEST

    TREATNEEDLESSLY(SIDE EFFECTS)HEALTHYASTRONAUT FALSEPOSITIVETEST

    0.10.01 0.02

    0.001

    0.019

    0.018

    0.9

    2% of uninfected cases test falsely positive,receive treatment, succumb to side effects

    10% of returnees are infected 90% are not infected

    What is the greatestcontributor to this

    probability?

    Should the test beused?

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • CUTSETS AIDS TO . . .

    53

    System DiagnosisReducing VulnerabilityLinking to Success Domain

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • CUT SETS . . .

    A CUT SETis any group of faulttree initiators which, if all occur, willcause the TOP event to occur.

    A MINIMAL CUT SETis a leastgroup of fault tree initiators which, ifall occur, will cause the TOP eventto occur.

    54

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • FINDING CUT SETS . . .

    55

    1. Ignore all tree elements except the initiators (leaves / basics).2. Starting immediately below the TOP event, assign a unique letter to

    each gate, and assign a unique number to each initiator.3. Proceeding stepwise from the TOP event downward, construct a matrix

    using the letters and numbers. The letter representing the TOP eventgate becomes the initial matrix entry. As the construction progresses:

    Replace the letter for each AND gate by the letter(s)/number(s)for all gates/initiators which are its inputs. Display thesehorizontally, in matrix rows.

    Replace the letter for each OR gate by the letter(s)/number(s) forall gates/initiators which are its inputs. Display these vertically, inmatrix columns. Each newly formed OR gate replacement rowmust also contain all other entries found in the original parent row.

    4. A final matrix results, displaying only numbers representing initiators.Each row of this matrix is a Boolean Indicated Cut Set. By inspection,eliminate any row that contains all elements found in a lesser row. Alsoeliminate redundant elements within rows and rows that duplicate otherrows. The rows that remain are Minimal Cut Sets.

    The method for obtaining Cut Sets used here is based on the MOCUS computer algorithm,attributed to J. B. Fussell.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • A CUT SET EXAMPLE . . .

    56

    PROCEDURE:

    Assign letters togates. (TOP gateis A.) Do notrepeat letters.

    Assign numbersto basic initiators.If a basic appearsmore than once,represent it by thesame number ateach appearance.

    Construct a matrix, starting with the TOP A gate

    TOP

    C

    DB

    A

    1

    2 3

    42

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • A CUT SET EXAMPLE (cont) . . .

    57

    2 2 31 2

    1 42 4 3

    1 22 3

    1 4

    A B D

    C D

    1 D

    2 D 3

    1 D

    2 D 31 4

    1 2

    TOP event gateis A, the

    initial matrixentry.

    A is an ANDgate. B & D,its inputs, re-place it hori-

    zontally.

    B is an ORgate. 1 & C,its inputs, re-place it verti-cally. Each

    requires a newrow.

    C is an ANDgate. 2 & 3,its inputs,replace it

    horizontally.

    D (top row), isan OR gate.2 & 4, its in-puts, replaceit vertically.

    Each requires anew row.

    D (2nd row), isan OR gate.Replace as

    before.

    These Boolean-IndicatedCut Sets

    reduce tothese Minimal

    Cut Sets.

    Minimal Cut Setrows are least

    groups ofinitiators which

    will induce TOP.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • BooleanEquivalentFault Tree

    AN EQUIVALENT FAULT TREE . . .

    58

    TOP

    1 22 31 4

    An Equivalent FaultTree can be constructedfrom Minimal CutSets. For example,these Minimal CutSets

    represent this Fault Treeand this Fault Tree is a Logic Equivalent of theoriginal, for which the Minimal Cut Sets were derived.

    1 2 341 2

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • EQUIVALENT TREES AREN'T ALWAYS SIMPLER . . .

    59

    This Fault Tree

    has this Logic Equivalent.

    TOP

    123 45 6

    123 45 6

    12

    3 45 612

    3 45 6

    TOP4 Gates

    6 Initiators

    1 2 3 4 5 61 / 3 / 5

    1 / 3 / 6

    1 / 4 / 5

    1 / 4 / 6

    2 / 3 / 5

    2 / 3 / 6

    2 / 4 / 5

    2 / 4 / 6

    MINIMAL CUT SETS 9 Gates24 Initiators

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • ANOTHER CUT SET EXAMPLE . . .

    60

    Compare this caseto the first Cut Setexample notedifferences. TOPgate here is OR. Inthe first example,TOP gate was AND.

    Proceed as with first example

    TOP

    1

    2

    3

    3

    4

    6

    G

    F

    E

    D

    CB

    A

    5

    4 1

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • ANOTHER CUT SET EXAMPLE (cont) . . .

    61

    1 2

    F 6

    1 E

    1 2

    3 5 G 6

    1 E

    1 2

    3 5 4 6

    1 3

    1 4

    3 5 1 6

    3 4 5 6

    1 2

    1 31 4

    A B

    C

    1 D

    F 6

    1 2

    3 5 G 6

    1 3

    1 4

    Note that there are fourMinimal Cut Sets.

    Co-existence of all of theinitiators in any one of them

    will precipitate the TOP event.

    An Equivalent Fault Tree can again be constructed

    Construct matrix make step-by-step substitutions

    Boolean-Indicated Cut Sets

    Minimal Cut Sets

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • ANOTHER EQUIVALENT FAULT TREE . . .

    62

    1 21 31 43 4 5 6

    These Minimal Cut Setsrepresent this Fault Tree a Logic Equivalent ofthe original tree.

    TOP

    1 2 1 3 1 4 3 4 5 6

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • FROM TREE TO RELIABILITYBLOCK DIAGRAM . . .

    63

    TOP

    1

    2

    3

    3

    4

    6

    G

    F

    E

    D

    CB

    A

    5

    4 1

    TOP

    Blocks represent functions of system elements.Paths through them represent success.

    The tree models a system fault, in failure domain.Let that fault be System Fails to Function asIntended . Its opposite, System Succeeds toFunction as Intended , can be represented by aReliability Block Diagram in which success flowsthrough system element functions from left toright. Any path through the block diagram, notinterrupted by a fault of an element, results insystem success.

    2 133

    4 5614Barring terms denotes consid-eration of their success properties.(n)

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • CUT SETS AND RELIABILITY BLOCKS . . .

    64

    TOP

    1

    2

    3

    3

    4

    6

    G

    F

    E

    D

    CB

    A

    5

    4 11 21 31 43 4 5 6Minimal Cut SetsEach Cut Set (horizontal rows in thematrix) interrupts all left-to-right pathsthrough the Reliability Block Diagram.

    Note that 3/5/1/6 is aCut Set, but not aMinimal Cut Set. (Itcontains 1/3, a trueMinimal Cut Set.)6

    2 133

    4 51 4

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • CUT SETUSES . . .

    65

    Evaluating PTFinding Vulnerability to Common Causes

    Analyzing Common Cause Probability

    Evaluating Structural Cut Set Importance

    Evaluating Quantitative Cut Set ImportanceEvaluating Item Importance

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • CUT SET USES / Evaluating P T . . .

    66

    PT Pk =P1 x P2 +P1 x P3 +P1 x P4 +P3 x P4 x P5 x P61 21 31 43 4 5 6

    Minimal Cut SetsTOP1

    2

    3

    3

    4

    6

    G

    F

    E

    D

    CB

    A

    5

    4 1

    PT

    Cut Set Probability (P k ), the product of probabilitiesfor events within the Cut Set, is the probability thatthe Cut Set being considered will induce TOP.Pk = Pe = P1 x P2 x P3 x . . . Pn 1 23 5 4 61 31 43 5 1 6Note that propagating probabilitiesthrough an unpruned tree, i.e.,using Boolean-indicated Cut Setsrather than Minimal Cut Sets,would produce a falsely high P T.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • CUT SET USES / Common Cause Vulnerability . . .

    67

    3m 4m

    Uniquely subscript Initiators, using letter indicators of Common Cause susceptibility, e.g.

    llll = location(code where)

    m = moistureh = human operatorq = heatf = coldv = vibrationetc.

    Some Initiators may be vulnerable to several Common Causes and receive several corresponding subscript designators. Some may have no Common Cause vulnerability receive no subscripts.

    TOP

    1v

    2h

    6m

    G

    F

    E

    D

    CB

    A

    5m

    PT

    1v

    3m

    4m

    All Initiators in this Cut Set are vulnerable to moisture. Mois ture is a Common Cause and can induce TOP.ADVICE: Moisture proof one or more items.

    1v

    2h1

    v3

    m

    1v

    4m

    3m

    4m

    5m

    6m

    Minimal Cut Sets

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • Thesemust beOR

    ANALYZING COMMON CAUSE PROBABILITY . . .

    68

    TOP PT Common-CauseInduced FaultSystemFaultMoisture Vibration HumanOperator Heat othersAnalyzeas usual

    Introduce each Common Causeidentified as a Cut Set Killerat its individual probability levelof both (1) occurring, and (2)inducing all terms within theaffected Cut Set.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • CUT SET STRUCTURAL IMPORTANCE . . .

    69

    TOP

    1

    2

    3

    3

    4

    6

    G

    F

    E

    D

    CB

    A

    5

    4 1

    1 21 31 43 4 5 6

    Minimal Cut SetsPTAll other things being equal

    a LONG Cut Set signalslow vulnerability

    a SHORT Cut Set signalshigher vulnerability

    presence of NUMEROUSCut Sets signals highvulnerability

    and a singlet Cut Set signals a Potential Single-Point Failure.Analyzing Structural Importance enables qualitative ranking of contributions to System Failure.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • CUT SET QUANTITATIVE IMPORTANCE . . .

    70

    1 21 31 43 4 5 6

    Minimal Cut SetsTOP

    1

    2

    3

    3

    4

    6

    G

    F

    E

    D

    CB

    A

    5

    4 1

    PT The quantitative Importance of a Cut Set(Ik) is the numerical probability that,given that TOP has occured, that Cut Sethas induced it.where P

    k = P

    e= P

    3 x P

    4 x P

    5 x P

    6

    PT

    Ik =

    Pk

    To reduce system vulnerability most effectively, attack Cut Sets having greater Importance.Generally, short Cut Sets have greater Importance, long Cut Sets have lesser Importance.Analyzing Quantitative Importance enables numerical ranking of contributions to System Failure.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • ITEM IMPORTANCE . . .

    71

    Ie I

    ke

    Ne

    1 21 31 43 4 5 6

    Minimal Cut Sets

    The quantitative Importance of an Item (Ie) is the

    numerical probability that, given that TOP hasoccured, that Item has contributed to it.

    Ne = Number of Minimal CutSets containing Item e

    Ike = Importance of the MinimalCut Sets containing Item e

    ExampleImportance of Item 1

    I1

    PT

    (P1 x P

    2 ) + (P

    1 x P

    3) + (P

    1 x P

    4)

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • 72

    PATHSETS AIDS TO . . .

    Further Diagnostic MeasuresLinking to Success DomainTrade / Cost Studies

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • PATH SETS . . .

    A PATH SETis a group of fault tree initiatorswhich, if none of them occurs, will guarantee thatthe TOP event cannot occur.

    TO FIND PATH SETS*change all AND gates toOR gates, and all OR gates to AND. Then proceedusing matrix construction as for cut sets. Pathsets will be the result.

    73

    * This Cut Set-to-Path Set conversion takes advantage of de Morgan's dualitytheorem. Path Sets are complements of Cut Sets.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • A PATH SETEXAMPLE . . .

    74

    1 21 31 43 4 5 6

    1 31 41 51 62 3 4

    ThisFaultTreehas

    these MinimalCut Sets

    and these PathSets.

    Path Sets areleast groups of

    initiators which, ifthey cannot

    occur, guaranteeagainst TOPoccurring.

    TOP

    1

    2

    3

    3

    4

    6

    G

    F

    E

    D

    CB

    A

    5

    4 1

    Barring terms denotes consid-eration of their success properties.(n)

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • PATH SETS AND RELIABILITY BLOCKS . . .

    75

    TOP

    1

    2

    3

    3

    4

    6

    G

    F

    E

    D

    CB

    A

    5

    4 1

    1 31 41 51 62 3 4Path Sets

    Each Path Set(horizontal rowsin the matrix)represents aleft-to-right paththrough theReliability BlockDiagram.6

    2 133

    4 51 4

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • PATH SETS AND TRADE STUDIES . . .

    76

    1 31 41 51 62 3 4

    Path Sets Sprinkle countermeasure resourcesamongst the Path Sets. Compute theprobability decrement for each newlyadjusted Path Set option. Pick thecountermeasure ensemble(s) givingthe most favorable PP/$. (Selectionresults can be verified by computingPT/$ for competing candidates.)

    PP P

    e

    Path Set probability (P P)is the probability thatthe system will suffer afault at one or morePP $

    $a$b$c$d$e

    PPaPPbPPcPPdPPe

    points along the operational routemodeled by the path. To minimizefailure probability, minimize Path Setprobability.a -b -c -d -e -

    62 13

    34 51 4

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • REDUCING VULNERABILITY A SUMMARY . . . Inspect tree find/operate on major PT contributors

    - Add intervenors / redundancy (lengthen cut sets).- Derate components (increase robustness / reduce P

    e).

    - Fortify maintenance / parts replacement (increase MTBF). Examine / alter system architecture increase path set/cut set ratio. Evaluate Cut Set Importance. Rank items using Ik.

    Identify items amenable to improvement. Evaluate Item Importance. Rank items using I

    e.

    Identify items amenable to improvement. Evaluate Path Set Probability.

    Reduce PP at most favorable D DDD P/D DDD $.

    77

    Ie @ @@@ S SSS IkeN

    e

    Ik = PTPk

    PP @ @@@ S SSS Pe

    For all new countermeasures, THINK COST EFFECTIVENESS FEASIBILITY (incl. schedule)AND

    Does the new countermeasure Introduce new HAZARDS? Cripple the System?

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • SOME DIAGNOSTIC ANDANALYTICAL GIMMICKS . . .A Conceptual Probabilistic ModelSensitivity TestingFinding a P T Upper LimitLimit of Resolution Shutting off Tree GrowthStateofComponent MethodWhen to Use Another Technique FMECA

    78

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • SOME DIAGNOSTIC GIMMICKS . . .

    79

    TOP

    PT

    1 2 3 4 56 7 8 910 11 12 13 14 15

    16 17 18 19 20 2122 23 24 25 26 27 28 29

    30 31 32 33 34

    using a genericall-purpose fault tree

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • THINK ROULETTE WHEELS . . .

    80

    TOP

    PT

    1 2 3 4 56 910 11 12 13 14 15

    16 17 18 19 20 2122 23 24 25 26 27 28 29

    30 31 32 33 34

    A convenient, thought-toolmodel of probabilistic treemodeling7 Imagine a roulette wheel representing each initiator.The peg count ratio for each wheel is determinedby probability for that initiator. Spin all initiatorwheels once for each system exposure interval.Wheels winning in gate-opening combinationsprovide a path to the TOP.

    P22

    = 3 x 10-3

    1000 peg spaces997 white

    3 red

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • USE SENSITIVITY TESTS . . .

    81

    Gaging the nastinessofuntrustworthy initiatorsTOPPT

    1 2 3 4 56 7 8 910 11 12 13 14 15

    16 17 18 19 20 2122 23 24 25 26 27 28 29

    30 31 32 33 34

    Embedded within the tree, there's a bothersome initiatorwith an uncertain P e. Perform a crude sensitivity test toobtain quick relief from worry or , to justify the urgencyof need for more exact input data:1.) Compute P T for a nominal value of P e. Then,recompute P T for a new P e = Pe+ Pe.If this sensitivity exceeds 0.1 in a large tree, work to finda value for P e having less uncertainty or 2.) Compute P T for a value of P e at its upper credible limit.Is the corresponding P T acceptable? If not, get a better P e.

    P10 = ? now, compute the Sensitivity of P e = PePT

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • FIND A MAX P T LIMIT QUICKLY . . .

    82

    TOP

    PT

    1 2 3 4 56 7 8 910 11 12 13 14 15

    16 17 18 19 20 2122 23 24 25 26 27 28 29

    30 31 32 33 34

    The parts-count approachgives a sometimes-usefulearly estimate of P T PT cannot exceed an upper bound given by:PT(max) = Pe = P1 + P2 + P3 + Pn

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • ?HOW FAR DOWN SHOULD A FAULT TREE GROW . . . ?

    83

    TOP

    PT

    1 2 3 4 56 7 8 910 11 12 13 14 15

    16 17 18 19 20 21?

    Where do you stop the analysis? The analysis is a Risk Managemententerprise. The TOP statement gives severity . The tree analysis providesprobability . ANALYZE NO FURTHER DOWN THAN IS NECESSARY TOENTER PROBABILITY DATA WITH CONFIDENCE. Is risk acceptable? If YES,stop. If NO, use the tree to guide risk reduction. SOME EXCEPTIONS1.) An event within the tree has alarmingly high probability. Dig deeperbeneath it to find the source(s) of the high probability.2.) Mishap autopsies must sometimes analyze down to the cotter-pin level toproduce a credible cause list.

    Severity Probability

    Initiators / leaves / basics define theLIMIT OF RESOLUTIONof the analysis.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • STATEOFCOMPONENT METHOD . . .

    84

    RELAY K-28CONTACTS FAILCLOSEDBASICFAILURE/RELAYK-28 RELAYK-28COMMANDFAULT RELAYK-28 SECONDARYFAULT

    WHEN: Analysis has proceded to thedevice level i.e., valves, pumps,switches, relays, etc.

    HOW: Show device fault/failure in the mode

    needed for upward propagation. Install an OR gate. Place these 3 events beneath the OR.

    This represents internalself failures under normalenvironmental and servicestresses e.g., coilburnout, spring failure,contacts drop off

    Analyze further to find thesource of the fault condition,induced by presence/absence of externalcommand signals. (Omitfor most passive devices e.g., piping.)

    This represents faults fromenvironmental and servicestresses for which thedevice is not qualified e.g., component struck byforeign object, wrongcomponent selection/installation. (Omit, ifnegligible.)

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • EXECUTIVE SUMMARY (Abstract of complete report)SCOPE of the analysis

    Brief system descriptionTOP Description/Severity BoundingAnalysis Boundaries

    Physical Boundaries Interfaces TreatedOperational Boundaries Resolution LimitOperational Phases Exposure IntervalHuman Operator In/Out Others

    THE ANALYSISDiscussion of Method (Cite Refs.)Software UsedPresentation/Discussion of the TreeSource(s) of Probability Data (If quantified)Common Cause Search (If done)Sensitivity Test(s) (If conducted)Cut Sets (Structural and/or Quantitative Importance, if analyzed)Path Sets (If analyzed)Trade Studies (If done)

    FINDINGSTOP Probability (Give Confidence Limits)Comments on System Vulnerability

    Chief ContributorsCandidate Reduction Approaches (If appropriate)

    CONCLUSIONS AND RECOMMENDATIONS Risk Comparisons (Bootstrapping data, if appropriate)Is further analysis needed? by what method(s)?

    THE FAULT TREE ANALYSIS REPORT . . .

    85

    Say what is analyzedandwhat is not analyzed. T I T L ECOMP ANYA uthorDateetc.Show Tree as Figure.Include Data Sources,Cut Sets, Path Sets,etc. as Tables.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • CHARACTERISTIC FTA FMECASELECTION PREFERRED

    FTA vs FMECA SELECTION CRITERIA* . . .

    86

    *Adapted from Fault Tree Analysis Application Guide, Reliability Analysis Center, Rome Air Development Center.

    Safety of public/operating/maintenance personnel 3

    Small number/clearly defined TOP events 3

    Indistinctly defined TOP events 3

    Full-Mission completion critically important 3

    Many, potentially successful missions possible 3

    All possible failure modes are of concern 3

    High potential for human error contributions 3

    High potential for software error contributions 3

    Numerical risk evaluation needed 3

    Very complex system architecture/many functional paths3

    Linear system architecture with little human/software influence3

    System irrepairable after mission starts 3

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • FAULT TREE CONSTRAINTS ANDSHORTCOMINGS . . .Undesirable events must be foreseen and are only

    analyzed singly.

    All significant contributors to fault/failure must beanticipated.

    Each fault/failure initiator must be constrained totwo conditional modes when modeled in the tree.

    Initiators at a given analysis level beneath acommon gate must be independent of each other.

    Events/conditions at any analysis level must betrue, immediate contributors to nextlevel events/conditions.

    Each initiator's failure rate must be a predictableconstant.

    87

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • COMMON FAULT TREE ABUSES . . . Overanalysis Fault Kudzu Unjustified confidence in numerical results 6.0232 x 105 ? Credence in preposterously low probabilities1.66 x 1024/hour Unpreparedness to deal with results (particularly

    quantitative)Is 4.3 x 107/hour acceptable for a catastrophe? Overlooking common causesWill a roof leak or a shaking floor

    wipe you out? MisapplicationWould Event Tree Analysis (or another

    technique) serve better? Scoping changes in midtree

    88

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • FAULT TREE PAYOFFS . . .

    Gaging/quantifying system failure probability.Assessing system Common Cause vulnerability.Optimizing resource deployment to control

    vulnerability.

    Guiding system reconfiguration to reducevulnerability.

    Identifying Man Paths to disaster.Identifying potential single point failures.Supporting trade studies with differential analyses.

    89

    FAULT TREE ANALYSIS is a Risk Assessmententerprise. Risk Severity is defined by the TOP event.Risk Probability is the result of the tree analysis.

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • CLOSING CAVEATS . . .

    Be wary of the ILLUSION of SAFETY! Low probabilitydoes not mean that a mishap won't happen!

    THERE IS NO ABSOLUTE SAFETY! An enterprise issafe only to the degree that its risks are tolerable!

    Apply broad confidence limits to probabilitiesrepresenting human performance!

    A large number of systems having low probabilities offailure means that A MISHAP WILL HAPPEN . . .somewhere among them!

    90

    P1 + P

    2 + P

    3 + P

    4 + - - - - - P

    n 1 more . . .

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • CAVEATS (conc.) . . .

    Don't drive the numbers into the ground!

    91

    Do you REALLY have enough data to justify QUANTITATIVE ANALYSIS?For 95% ConfidenceWe must have NO FAILURES in to give PF and R

    1000 tests 3 x 103 0.997

    300 tests 102 0.99

    100 tests 3 x 102 0.97

    30 tests 101 0.9

    10 tests 3 x 101 0.7

    ASSUMPTIONS: Stochastic System Behavoir Constant System Properties Constant Service Stresses Constant EnvironmentalStresses

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • ANALYZE ONLY TO TURN RESULTS INTODECISIONS . . .

    92

    Perform an analysis only to reach adecision. Do not perform an analysis if thatdecision can be reached without it. It is noteffective to do so. It is a waste ofresources.

    Dr. V. L. GroseGeorge Washington University

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • TOP

    P1 P2

    P3 >> P1 P2

    P3 > P1

    P2

  • EXPONENTIAL NOTATION

    94

    103 = 1000. 103 = = 0.0011

    1000Negative exponentindicates reciprocal.

    105 = 100000. 105 = = 0.000011

    100000

    Exponent showsdecimal location.

    CONVENTION

    6.28 x 103 = 6.28 x = = 0.006286.281000110006.28 x 103 = 6.28 x 1000 = 6280EXAMPLES

    MULTIPLICATION(multiply coefficients, and add exponents)

    DIVISION(divide coefficients, and subtract exponents)

    (2.3 x 105) x (4.5 x 102) = 10.35 x 103 = 1.035 x 104

    (2.3 x 105) x (4.5 x 102) = 10.35 x 103 = 1.035 x 102

    (2.3 x 105) x (4.5 x 102) = 10.35 x 107 = 1.035 x 106

    RULE EXAMPLES

    RULE

    (2.3 x 105) x (4.5 x 102) = 10.35 x 107 = 1.035 x 108

    10.35 x 107

    (4.5 x 102)= (2.3 x 105) 10.35 x 10

    3

    (4.5 x 102)= (2.3 x 105)

    EXAMPLES

    EXAMPLESRULERAISING TO POWERS/EXTRACTING ROOTS(multiply/divide exponents)

    (2 x 102)3 = 8 x 106 (9 x 108) = 3 x 104

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • USEFUL MATHEMATICAL OPERATORS

    95

    A = ae = a1 + a2 + a3 + ane = 1

    e = n

    A = ae = a1 x a2 x a3 x an e = n

    e = 1

    A = ae = 1 [(1a1)x(1a2)x (1a3)x (1an)]

    e = 1

    e = n

    SIMPLE SUM

    SIMPLE PRODUCT

    COFUNCTION SUM

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

  • 96

    REDUCING SYSTEM VULNERABILITY

    Fortify maintenanceDerate componentsAlter architecture

    Relocate existing componentsAdd redundancyLoosen coupling / make system forgiving

    Suppress common causesReduce stresses

    Service stressesEnvironmental stresses

    Allen Longprovided as a free service by www.fault-tree.net -- Pat L. Clemens and Jacobs Sverdrup

    Fault Tree AnalysisTopics CoveredFirst - A Bit of BackgroundOriginsThe Fault Tree IsFault Tree Analysis is Best Applied to Cases WithFault Tree Analysis ProducesSome DefinitionsAssumptions and LimitationsThe Logic SymbolsSteps in Fault Tree AnalysisSome Rules and ConventionsMore Rules and Conventions

    Some Conventions IllustratedIdentifying Top EventsExample Top Events"Scope" The Tree TopAdding Contributors to the TreeExample Fault Tree DevelopmentAn Example Fault TreeSequence Initiation FailuresVerifying LogicTest Logic in Success DomainArtificial Wakeup FailsBackground for Numerical MethodsReliability & Failure Probability RelationshipsSignificance of PfR & Pf Through GatesPf Propogation Through Gates"Ipping" Gives Exact or Gate SolutionsMore Gates and SymbolsStill More Gates and Symbols

    Some Failure Probability SourcesLog Average MethodMore Failure Probability SourcesTypical Component Failure RatesTypical Human Operator Failure RatesSome Factors Influencing Human Operator Failure ProbabilityArtificial Wakeup FailsHow Much Pt is Too Much?Apply ScopingSingle-Point FailureSome "and" Gate PropertiesFailures at Any Analysis Level Must BeCommon Cause Events/PhenomenaCommon Cause OversightCommon Cause Oversight CorrectionExample Common Cause Fault/Failure SourcesExample Common Cause Suppression MethodsMissing Elements?Example Problem - Sclerotic ScurvySclerotic Scurby Malpractice

    Cut SetsCut Sets

    Finding Cut SetsA Cut Set ExampleA Cut Set Example (cont)

    An "Equivalent" Fault TreeEquivalent Trees Aren't Always SimplerAnother Cut Set ExampleAnother Cut Set Example (cont)

    Another "Equivalent" Fault TreeFrom Tree to Reliability Block DiagramCut Sets and Reliability BlocksCut Set UsesCut Set Uses/Evaluating PtCut Set Uses/Common Cause VulnerabilityAnalyzing Common Cause ProbabilityCut Set Structural "Importance"Cut Set Quantitative "Importance"Item "Importance"Path SetsPath Sets

    A Path Set ExamplePath Sets and Reliability BlocksPath Sets and Trade StudiesReducing Vulnerability - A SummarySome Diagnostic and Analytical GimmicksSome Diagnostic GimmicksThink "Roulette Wheels"Use Sensitivity TestsFind a Max Pt Limit QuicklyHow Far Down Should a Fault Tree GrowState-of-Component MethodThe Fault Tree Analysis ReportFTA vs FMECA Selection CriteriaFault Tree Constraints and ShortcomingsCommon Fault Tree AbusesFault Tree PayoffsClosing CaveatsCaveats (conc)

    Analyze Only to Turn Results Into DecisionsUseful Numerical SimplificationsExponential NotationUseful Mathematical OperatorsReducing System Vulnerability