Root Cause Analysis - Kepner-Tregoe

Post on 29-Oct-2021

Page 1: Root Cause Analysis - Kepner-Tregoe

Leaders in Problem Solving

Root Cause Analysis

Michael W. Curran-Hays, Principal

John Ager, Consultant

The Critical Thinking and Tools that Support It

Page 2: Root Cause Analysis - Kepner-Tregoe

Copyright © 2019 Kepner-Tregoe, Inc. All Rights Reserved.

“If you can't describe what you are doing as a process, you don't know what you're doing.” ― W. Edwards Deming

Page 3: Root Cause Analysis - Kepner-Tregoe

8D

What is the problem or change? What do we need to know to resolve it?
What did change to cause a change in performance?
What should change to address a change in expectations?
What could change that would jeopardize our chosen actions or plans?
What could change that would enhance our chosen actions or plans?

8D steps:
D0: Deviation(s) Defined
D1: Assemble a Team
D2: Describe the Problem
D3: Implement Containment Actions
D4: Identify and Verify Root Causes
D5: Select and Verify Corrective Actions
D6: Implement Permanent Corrective Actions
D7: Prevent Recurrence
D8: Identify Opportunities for Continuous Improvement; Recognize Team Efforts

Process Key: Situation Appraisal, Problem Analysis, Decision Analysis, Potential Problem Analysis, Potential Opportunity Analysis, Performance System

Page 4: Root Cause Analysis - Kepner-Tregoe

RCA Tools

8D Steps: Describe the Problem; Identify and Verify Root Cause

ROOT CAUSE ANALYSIS PRINCIPLES and INFORMATION NEEDED

State the Problem: FACTS
Describe the Problem: FACTS
Identify Possible Causes: HYPOTHESES
Evaluate Possible Causes: FACTS & ASSUMPTIONS
Confirm True Cause: FACTS

RCA TOOLS

Problem Statement X

5 Whys X

Cause & Effect Chart/ Fault Tree X X

IS and IS NOTs X

Distinctions & Changes X X

Cause & Effect Chart/ Fault Tree X

Fishbone/ Ishikawa X

FMEA/ PPA X

Evaluate Possible Causes X

Confirm True Cause X

8D/A3 X X X X X

Page 5: Root Cause Analysis - Kepner-Tregoe

What two RCA tools do you use the most?

1. 5 Whys
2. Fishbone
3. KT Is/Is Not
4. Fault tree
5. Cause & Effect Charting
6. Other (Please record in the question box from the Q&A tab)

Page 6: Root Cause Analysis - Kepner-Tregoe

What is the Problem?

The 5 Whys

Chunks of cement falling from the Jefferson Memorial
Why? Soap and water mixing with the cement and creating a destructive acid
Why? Monument is washed every day
Why? There were an unusually large number of bird droppings on the memorial
Why? There were an unusually large number of sparrows
Why? The sparrows were feasting on a plentiful supply of spiders
Why? The spiders were feasting on an abundance of midges
Why? Midges were flying into the walls of the Jefferson Memorial
Why? Jefferson Memorial lights turned on at dusk

Photograph from Wikipedia, the free encyclopedia
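
The 5 Whys chain on this slide can be sketched as a simple ordered list in which each entry answers "why?" for the entry before it. This is only an illustrative sketch: the ordering is a reading of the slide's boxes in causal sequence, and the names `why_chain`, `ask_whys`, and `deepest_cause` are hypothetical, not part of any Kepner-Tregoe tool.

```python
# Each entry answers "why?" for the entry before it.
# Wording is taken from the slide; the ordering is an assumed causal reading.
why_chain = [
    "Chunks of cement falling from the Jefferson Memorial",
    "Soap and water mixing with the cement and creating a destructive acid",
    "Monument is washed every day",
    "Unusually large number of bird droppings on the memorial",
    "Unusually large number of sparrows",
    "Sparrows feasting on a plentiful supply of spiders",
    "Spiders feasting on an abundance of midges",
    "Midges flying into the walls of the Jefferson Memorial",
    "Jefferson Memorial lights turned on at dusk",
]


def ask_whys(chain):
    """Pair each statement with the answer to 'why?' that follows it."""
    return list(zip(chain, chain[1:]))


def deepest_cause(chain):
    """Return the last answer in the chain, the deepest cause reached so far."""
    return chain[-1]


for effect, cause in ask_whys(why_chain):
    print(f"{effect}\n  Why? -> {cause}")

print("Deepest cause reached:", deepest_cause(why_chain))
```

Note that the chain here takes eight "why?" steps, so "5 Whys" names the habit of asking repeatedly rather than a fixed count.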

Page 7: Root Cause Analysis - Kepner-Tregoe

What is the Problem?

Cause & Effect Charting

Page 8: Root Cause Analysis - Kepner-Tregoe

What is the Problem?

The Performance System

“Eighty-five percent of the reasons for failure are deficiencies in the systems and process rather than the employee. The role of management is to change the process rather than badgering individuals to do better.” - W. Edwards Deming

“Put a good person in a bad system and the bad system wins, no contest.” - W. Edwards Deming

Page 9: Root Cause Analysis - Kepner-Tregoe

What is the Problem?

The Performance System

Page 10: Root Cause Analysis - Kepner-Tregoe

Which elements of the Performance System do you most frequently consider as part of your investigations?

1. Situation – Expectations (How clear and well understood?)
2. Situation – Signal to Perform (How easy to recognize when to perform?)
3. Situation – Inputs (What tools, information, processes support the work?)
4. Situation – Priority (How are responsibilities prioritized?)
5. Performer’s Capability (What skills do they have?)
6. Consequences (What’s In It For Me? From the perspective of the performer)
7. Feedback (How do they know how well they are doing?)
8. Other (Please record in the question box from the Q&A tab)

Page 11: Root Cause Analysis - Kepner-Tregoe

“What is the problem?”

"Okay, Houston, we've had a problem here" -Jack Swigert, Apollo 13

Page 12: Root Cause Analysis - Kepner-Tregoe

8D

What is the problem or change? What do we need to know to resolve it?
What did change to cause a loss of oxygen and power?
What should change to restore oxygen and power and bring the astronauts home?
What could change on the way home that would jeopardize their safe arrival?
What could change on the way home that would enhance their safe arrival?

Page 13: Root Cause Analysis - Kepner-Tregoe

What is the Problem?

Cause & Effect Chart/FTA of post-Launch events

Page 14: Root Cause Analysis - Kepner-Tregoe

Describe the Problem

IS and IS NOT

“Two oxygen tanks essentially identical to oxygen tank no. 2 on Apollo 13, and two hydrogen tanks of similar design, operated satisfactorily on several unmanned Apollo flights and on the Apollo 7, 8, 9, 10, 11, and 12 manned missions. With this in mind, the Board placed particular emphasis on each difference in the history of oxygen tank no. 2 from the history of the earlier tanks, in addition to reviewing the design, assembly, and test history.” – Report of Apollo 13 Review Board

Page 15: Root Cause Analysis - Kepner-Tregoe

Describe the Problem

IS and IS NOT

WHAT: Object
IS: Service Module cryogenic oxygen tank no. 2 - # 10024XTA0008
IS NOT: Service Module cryogenic oxygen tank no. 1 - # 10024XTA0009; Hydrogen Tanks
IS: Apollo 13
IS NOT: Apollo 7, 8, 9, 10, 11, and 12

WHAT: Deviation
IS: Rapid expulsion of high-pressure oxygen
IS NOT: Slow leak

WHERE
IS: In lunar orbit
IS NOT: At Kennedy Space Center

WHEN: First
IS: 55:54:53.555
IS NOT: 16 Mar 1970 – 11 Apr 1970; 00:00:00.000 - 55:54:53.554

WHEN: In the Life Cycle
IS: During 4th tank stir (fan); immediately after pressure and temperature within oxygen tank no. 2 rose abnormally
IS NOT: During Countdown Demonstration Test; during 1st – 3rd tank stirs (fan)
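
An IS / IS NOT specification like the one on this slide can be held as plain data so that gaps are easy to spot before looking for causes. The sketch below is an assumption-laden illustration: `SpecRow` and `missing_contrast` are hypothetical names, and only a subset of the slide's rows is copied in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecRow:
    dimension: str   # WHAT, WHERE, or WHEN
    aspect: str      # sub-question within the dimension (may be empty)
    is_: tuple       # what IS observed
    is_not: tuple    # what could have been observed but IS NOT


# Rows copied from the slide (a subset, for illustration only).
spec = [
    SpecRow("WHAT", "Object",
            ("Service Module cryogenic oxygen tank no. 2", "Apollo 13"),
            ("Service Module cryogenic oxygen tank no. 1", "Hydrogen Tanks",
             "Apollo 7, 8, 9, 10, 11, and 12")),
    SpecRow("WHAT", "Deviation",
            ("Rapid expulsion of high-pressure oxygen",),
            ("Slow leak",)),
    SpecRow("WHERE", "",
            ("In lunar orbit",),
            ("At Kennedy Space Center",)),
    SpecRow("WHEN", "In the Life Cycle",
            ("During 4th tank stir (fan)",),
            ("During Countdown Demonstration Test",
             "During 1st - 3rd tank stirs (fan)")),
]


def missing_contrast(rows):
    """Return rows lacking an IS or an IS NOT entry; they offer no contrast."""
    return [row for row in rows if not row.is_ or not row.is_not]


for row in spec:
    label = f"{row.dimension} {row.aspect}".strip()
    print(f"{label}: IS {'; '.join(row.is_)} | IS NOT {'; '.join(row.is_not)}")

print("Rows without a contrast:", missing_contrast(spec))
```

Keeping both sides of every row filled matters because the later search for distinctions and changes works from the contrast between IS and IS NOT.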

Page 16: Root Cause Analysis - Kepner-Tregoe

8D

Page 17: Root Cause Analysis - Kepner-Tregoe

Identify Possible Causes

Ishikawa/Fishbone Diagram

Effect: Tank released oxygen

Possible causes:
Management controls not defined in as great detail
Personnel don't know how to install oxygen tanks
Teflon in oxygen tank
Probing with a hand tool in manufacturing
Tank corroded
Personnel don't know how to test oxygen tanks
"Shelf drop" in prime contractor's plant
Fan short circuits
Assembly of equipment essentially "blind"
Filter line ruptures
Debris strikes module
Insulation degraded
Valve fails to open
Tank assembled with loose fill tube parts
Personnel not able to install oxygen tanks

Page 18: Root Cause Analysis - Kepner-Tregoe

Identify Possible Causes

Cause & Effect Chart/Fault Tree Analysis

“Beginning from the defined undesired event, ‘Fuel cell power not available on SM buses’, the causative factors have been shown by means of logic diagramming. Given that a specified event can occur, all possible causes for that event are arrayed under it. It is important to note that this listing includes all possible ways in which the event can occur. Next, the relationship of these causative factors to one another and to the ultimate event is evaluated and determination as to whether the defined causes are mutually independent, or are required to coexist, is made.” – Report of Apollo 13 Review Board: Appendix F – Special Tests and Analyses

Actual NASA Fault Tree Analysis

Dated June 5th, 1970
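
The distinction the Review Board draws between causes that are "mutually independent" and causes that are "required to coexist" corresponds to OR and AND gates in a fault tree. The sketch below is a minimal illustration of that logic only. The top event name comes from the quotation; the lower-level events A, B, and C are hypothetical placeholders and do not reproduce NASA's actual tree.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    name: str
    gate: str = "BASIC"                      # "BASIC", "OR", or "AND"
    children: list = field(default_factory=list)


def occurs(event, observed):
    """Return True if the event occurs given a set of observed basic events."""
    if event.gate == "BASIC":
        return event.name in observed
    results = [occurs(child, observed) for child in event.children]
    if event.gate == "OR":                   # independent causes: any one suffices
        return any(results)
    if event.gate == "AND":                  # causes required to coexist
        return all(results)
    raise ValueError(f"unknown gate: {event.gate}")


# Hypothetical structure: A alone suffices, or B and C together.
top = Event("Fuel cell power not available on SM buses", "OR", [
    Event("A"),
    Event("B and C together", "AND", [Event("B"), Event("C")]),
])

print(occurs(top, {"A"}))        # A alone triggers the top event
print(occurs(top, {"B"}))        # B without C does not
print(occurs(top, {"B", "C"}))   # B and C coexisting do
```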

Page 19: Root Cause Analysis - Kepner-Tregoe

The Actual and Complete NASA Fault Tree Analysis

As shown on the Apollo 13 Review Board Report

Dated 1970

Page 20: Root Cause Analysis - Kepner-Tregoe

Identify Possible Causes - Distinctions and Changes

WHAT: Object
IS: SM cryogenic oxygen tank no. 2 - # 10024XTA0008
IS NOT: SM cryogenic oxygen tank no. 1 - # 10024XTA0009; Hydrogen Tanks
Distinctions: Detanked using 65 V heater
Changes: Temperature inside tank @ 1000° F
IS: Apollo 13
IS NOT: Apollo 7, 8, 9, 10, 11, and 12

WHAT: Deviation
IS: Rapid expulsion of high-pressure oxygen
IS NOT: Slow leak

WHERE
IS: In lunar orbit
IS NOT: At Kennedy Space Center

WHEN: First
IS: 55:54:53.555
IS NOT: 16 Mar 1970 – 11 Apr 1970; 00:00:00.000 - 55:54:53.554

WHEN: In the Life Cycle
IS: During 4th tank stir (fan); immediately after pressure and temperature within oxygen tank no. 2 rose abnormally
IS NOT: During Countdown Demonstration Test; during 1st – 3rd tank stirs (fan)

Page 21: Root Cause Analysis - Kepner-Tregoe

8D

Page 22: Root Cause Analysis - Kepner-Tregoe

Evaluate Possible Causes

f. Data indicate that in flight the tank heaters located in oxygen tanks no. 1 and no. 2 operated normally prior to the accident, and they were not on at the time of the accident.

g. The electrical circuit for the quantity probe would generate only about 7 millijoules in the event of a short circuit and the temperature sensor wires less than 3 millijoules per second.

h. Telemetry data immediately prior to the accident indicate electrical disturbances of a character which would be caused by short circuits accompanied by electrical arcs in the fan motor or its leads in oxygen tank no. 2.

– Report of Apollo 13 Review Board
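
Findings f through h show the evaluation step at work: each possible cause is tested against the facts, and causes that cannot account for them drop out. The sketch below is one hedged way to express that test. The `Check` class and `most_probable` function are hypothetical names, and the True/False verdicts are an illustrative reading of the findings, not conclusions stated by the Review Board.

```python
from dataclasses import dataclass


@dataclass
class Check:
    fact: str              # finding the cause is tested against
    explains: bool         # assumed verdict: does the cause account for it?
    assumption: str = ""   # extra condition needed for the cause to fit, if any


def most_probable(candidates):
    """Keep causes that explain every fact, then prefer the fewest assumptions."""
    surviving = {
        name: checks for name, checks in candidates.items()
        if all(c.explains for c in checks)
    }
    return min(
        surviving,
        key=lambda name: sum(bool(c.assumption) for c in surviving[name]),
        default=None,
    )


# Verdicts below are an illustrative reading of findings f, g, and h.
candidates = {
    "Tank heater fault": [
        Check("f: heaters operated normally and were not on at the time", False),
    ],
    "Quantity probe short circuit": [
        Check("g: circuit would generate only about 7 millijoules", False),
    ],
    "Fan motor wiring short circuit": [
        Check("h: telemetry shows disturbances of a character caused by "
              "short circuits with arcs in the fan motor or its leads", True),
    ],
}

print(most_probable(candidates))
```

Preferring the cause with the fewest assumptions mirrors the slide's earlier note that this step relies on facts and assumptions, while confirmation of the true cause relies on facts alone.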

Page 23: Root Cause Analysis - Kepner-Tregoe

8D

Page 24: Root Cause Analysis - Kepner-Tregoe

Confirm True Cause

l. As shown by subsequent tests, failure of the thermostatic switches probably permitted the temperature of the heater tube assembly to reach about 1000° F in spots during the continuous 8-hour period of heater operation. Such heating has been shown by tests to severely damage the Teflon insulation on the fan motor wires in the vicinity of the heater assembly. From that time on, including pad occupancy, the oxygen tank no. 2 was in a hazardous condition when filled with oxygen and electrically powered.

m. It was not until nearly 56 hours into the mission, however, that the fan motor wiring, possibly moved by the fan stirring, short circuited and ignited its insulation by means of an electric arc. The resulting combustion in the oxygen tank probably overheated and failed the wiring conduit where it enters the tank, and possibly a portion of the tank itself.

– Report of Apollo 13 Review Board

Page 25: Root Cause Analysis - Kepner-Tregoe

Root Cause Analysis

Page 26: Root Cause Analysis - Kepner-Tregoe

What is the problem?

Cause & Effect Chart / FTA of pre-launch events

“It was found that the accident was not the result of a chance malfunction in a statistical sense, but rather resulted from an unusual combination of mistakes, coupled with a somewhat deficient and unforgiving design.” - Report of Apollo 13 Review Board

Page 27: Root Cause Analysis - Kepner-Tregoe

What is the problem?

Cause & Effect Chart / FTA of pre-launch events

Page 28: Root Cause Analysis - Kepner-Tregoe

What is the problem?

The Performance System

Page 29: Root Cause Analysis - Kepner-Tregoe

What is the Problem?

The Performance System

9. The Manned Spacecraft Center should reassess all Apollo spacecraft subsystems, and the engineering organizations responsible for them at MSC and at its prime contractors, to insure adequate understanding and control of the engineering and manufacturing details of these subsystems at the subcontractor and vendor level. Where necessary, organizational elements should be strengthened and in-depth reviews conducted on selected subsystems with emphasis on soundness of design, quality of manufacturing, adequacy of test, and operational experience.

- Report of Apollo 13 Review Board

Page 30: Root Cause Analysis - Kepner-Tregoe

What do we need to know? What is the problem or change we need to understand and resolve?
What did change to allow the oxygen tank to be heated to 1000° F?
What should change to prevent the oxygen tank from being heated to 1000° F?
What could change that would jeopardize preventing the oxygen tank from being heated to 1000° F?
What could change that would enhance preventing the oxygen tank from being heated to 1000° F?

Copyright 2017 Kepner-Tregoe, Inc. All Rights Reserved.

Page 31: Root Cause Analysis - Kepner-Tregoe

Understanding variation is the key to success in quality and business. - W. Edwards Deming

Confusing common causes with special causes will only make things worse. - W. Edwards Deming

Page 32: Root Cause Analysis - Kepner-Tregoe

Kepner-Tregoe Critical Thinking Processes and Change Management

Cause of Gap: Change in Performance; Change in Expectations
Response Needed: Corrective and Preventive Actions; Product & Process Improvements
Constructs: KT; 8D; A3; DMAIC

KT: What do we need to know?
8D: D0: Deviation Defined; D1: Assemble a Team
A3: Background
DMAIC: Define – Understand Customer; Measure – Understand Process

KT: What did change?
8D: D2: Describe the Problem
A3: Problem Statement

KT: What should change?
8D: D3: Implement Containment Actions
A3: Goal Statement

KT: What did change?
8D: D4: Identify and Verify Root Causes…
A3: Root Cause Analysis

KT: What should change?
8D: D5: Select and Verify Corrective Actions
A3: Recommendations
DMAIC: Analyze – Identify Improvement Opportunities

KT: What could change?
8D: D6: Implement Corrective Actions; D7: Prevent Recurrence
A3: Implementation Plan
DMAIC: Improve – Achieve Improvement

KT: What could change?
8D: D8: Identify Opportunities for Continuous Improvement; D8: Congratulate Your Team
A3: Follow Up Actions
DMAIC: Control – Sustain Improvements

Page 33: Root Cause Analysis - Kepner-Tregoe

A3

Page 34: Root Cause Analysis - Kepner-Tregoe

Leaders in Problem Solving

twitter.com@kepnertregoe

facebook.com/KepnerTregoe

linkedin.com/company/kepner-tregoe

John Ager, Consultant

Michael Curran-Hays, Principal