LSCITS SUMMER INTERNSHIP 2012 - ACCIDENT MODELING ROSS APTED

Upload: stargate1280

Post on 13-May-2015


Page 2: Final

AIM

To compare existing methods for modeling (and/or predicting) failure in real world, complex systems.

Research and summarize a number of failures and accidents involving complex socio-technical systems.

Research and summarize several modeling approaches

Apply selected modeling approaches to chosen accident

Page 3: Final

BREAKDOWN

3 Weeks: Researching failures and accidents.

- Columbia and Challenger Disasters

- 2010 Flash Crash

- Aviation accidents and incidents

4 Weeks: Researching accident modeling approaches

5 Weeks: Modeling a well-documented accident using various systemic and sequential accident modeling techniques.

Page 4: Final

Above image shows the Columbia disintegrating over Texas

FAILURES AND ACCIDENTS

Page 5: Final

CRITERIA FOR SELECTION

The accident was well documented

Widely discussed in academic literature.

Page 6: Final

SPACE SHUTTLE COLUMBIA DISASTER

On 1st February 2003 a critical systems failure occurred on the space shuttle Columbia (STS-107) during its re-entry into the Earth’s atmosphere.

This caused the disintegration of the shuttle leading to the death of all seven crew members.

STS-107 flight insignia

Page 7: Final

FLASH CRASH 2010

At approximately 2:45 p.m. on 6th May 2010, prices on the United States stock market fell sharply, only to recover minutes later.

The Dow Jones dropped 600 points during the crash, adding to a 300 point drop earlier that day (due to the Greek debt crisis).

Most of the 600 point drop was recovered within tens of minutes.

Dow Jones – an important index of the stock of 30 large companies that are representative of the United States economy. It represents the state of the market.

(The staffs of the U.S. Commodity Futures Trading Commission and the U.S. Securities and Exchange Commission)

Page 8: Final

KEGWORTH AIR DISASTER

On 8th January 1989 British Midland Flight 92 crashed while attempting an emergency landing.

Crash site: M1 embankment near the village of Kegworth.

The Boeing 737-400 aircraft was severely damaged; 79 of the 126 people aboard survived.

An investigation was carried out by the Air Accidents Investigation Branch (AAIB) (Air Accidents Investigation Branch, 1989).

Page 9: Final

EVENTS OF CRASH

1. Moments after reaching cruising altitude, a fan blade broke off in the left engine, causing a decrease in power and an increase in vibrations. This caused the left engine to produce a jet of flames.

2. Smoke flooded into the cabin. The captain shut down the engine on the right.

3. The smell of smoke and the vibrations reduced.

4. The crew diverted to East Midlands Airport. The left engine failed completely during the descent for the emergency landing.

Page 10: Final

Right engine was shut down

Inadequate training

Left engine failed

Insufficient knowledge of aircraft

Crash

Improper design testing

Page 11: Final

CONTRIBUTING FACTORS

Inadequate knowledge of the aircraft

Flight crew observed smoke in the cabin.

Believed they could not trust the vibration sensors. This was true of the old Boeing 737 but not of the new 737-400.

The sensors indicate the state of the engines.

Fell back on general knowledge of the aircraft, which was wrong: they thought that bleed air (pressure and heating) was taken from the right engine.

In fact, the air conditioning system drew on both engines in the new model.

Page 12: Final

BOEING 737 (OLD)

[Diagram: bleed air for the air conditioning is taken from the right engine.]

Page 13: Final

BOEING 737-400 (NEW)

[Diagram: bleed air for the air conditioning is taken from both the right engine and the left engine.]

Page 14: Final

CONTRIBUTING FACTORS

Inadequate training

The combination of violent engine vibrations and the smell of smoke while climbing to cruising altitude was not covered in training.

A separate protocol existed for each event, but none covered both occurring together.

There was no simulator training for engine failure of this kind, or for what to do when a situation falls outside the bounds of standard procedures.

Differences between the Boeing 737 and the 737-400 were not adequately taught.

Page 15: Final

WHY THE MISTAKE WAS NOT FOUND

By chance the smoke dissipated and the vibrations reduced – this was actually due to standard procedure reducing fuel flow to both engines.

The pilots did not communicate with the cabin crew, who had visual confirmation of which engine was damaged.

The immediate diversion to East Midlands Airport created a high cabin workload; this resulted in an incorrect review procedure after the right engine was shut down.

Page 16: Final

SELECTED METHODS FOR ACCIDENT INVESTIGATION

Page 17: Final

TYPES OF ACCIDENT MODEL

Main types of accident model (Hollnagel, 2002):

Sequential

Epidemiological

Systemic

Page 18: Final

SEQUENTIAL ACCIDENT MODELS

Simplest form of accident modeling.

Describes the accident as a series of events that occur in a particular order.

Events occur along a linear timeline.

Analysis: Identifies the specific cause and the broken links in the accident chain. The goal is to eliminate the broken links.

Examples: Fault tree analysis, Domino Model of accident causation, Events and causal factors charting, Event tree analysis, Management and Oversight Risk Tree (MORT), Sequential Timed Events Plotting (STEP), Man, Technology and Organization (MTO) analysis, TRIPOD
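As a rough illustration of the sequential view, the sketch below stores an accident as an ordered list of events and picks out the broken links. The `Event` class, the `broken_links` helper and the failed/not-failed flags are assumptions made for this example; the event wording is loosely based on the Kegworth timeline in this deck.

```python
from dataclasses import dataclass


@dataclass
class Event:
    description: str
    failed: bool  # True marks a "broken link" in the chain (assumed flag)


def broken_links(chain):
    """Return the events flagged as failures, in timeline order."""
    return [event for event in chain if event.failed]


# Illustrative chain; which events count as broken links is an assumption.
kegworth_chain = [
    Event("Fan blade breaks off in left engine", True),
    Event("Smoke enters the cabin", False),
    Event("Right engine shut down", True),
    Event("Crew diverts to airport", False),
    Event("Left engine fails during descent", True),
]

for event in broken_links(kegworth_chain):
    print(event.description)
```

A flat list like this is easy to read, but it has no place for latent factors or context, which is the limitation the later model types try to address.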

Page 19: Final

SEQUENTIAL ACCIDENT MODELS SUMMARY

Advantages:

Human readable; easy to communicate the chain of events.

Can identify the root cause or the break in the chain of events that led to the accident.

Good starting point.

Disadvantages:

Do not take latent factors into account.

Inadequate for modeling the variability of sociotechnical systems.

Page 20: Final

EPIDEMIOLOGICAL ACCIDENT MODEL SUMMARY

Accident is described as a disease.

Some factors that affect the accident act right away, while others are latent.

Takes into account that events can manifest over time

Swiss cheese Model (Reason, 1997)
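One way to picture the Swiss cheese idea is to treat each defensive layer as the set of hazards it fails to stop (its "holes"); an accident path exists only for hazards that pass through every layer. The sketch below is a simplified illustration under that assumption. The function name `aligned_holes` and the layer contents are hypothetical and not taken from Reason (1997).

```python
def aligned_holes(layers):
    """Return the hazards that pass through a hole in every layer."""
    if not layers:
        return set()
    aligned = set(layers[0])
    for holes in layers[1:]:
        aligned &= set(holes)  # keep only hazards this layer also misses
    return aligned


# Hypothetical layers: each set lists hazards that the layer does NOT stop.
layers = [
    {"engine_fault", "sensor_fault"},  # design (latent condition)
    {"engine_fault", "fatigue"},       # training (latent condition)
    {"engine_fault"},                  # crew procedures (active failure)
]

print(aligned_holes(layers))  # {'engine_fault'}
```

Closing the hole in any single layer (removing `"engine_fault"` from one set) empties the result, which mirrors the idea that latent and active failures must line up for an accident to occur.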

Page 21: Final

EPIDEMIOLOGICAL ACCIDENT MODEL SUMMARY

Overcome Limitations:

Superior to sequential models as latent events can be taken into account.

More suited to modeling complex systems.

Lack of detail:

Allowed the identification of the general events that occurred, but could not go deeper.

Page 22: Final

SYSTEMIC ACCIDENT MODEL SUMMARY

Accidents emerge naturally; they are expected to occur, as detailed in Perrow’s Normal Accidents (Perrow, 1984).

Focus:

Systemic models focus on the characteristics of a system as opposed to the series of events that cause the accident in the system.

Difficult but powerful:

Ideal for complex systems but hard to represent graphically.

Page 23: Final

SYSTEMIC ACCIDENT MODEL SUMMARY

Considers the performance of the system as a whole.

Organization

Environmental

Human

Technical

The system is viewed as many interacting components that hold it in equilibrium.

Systems can evolve dynamically.

Flawed interactions between components can throw the system out of balance

Accident

Page 24: Final

SYSTEMIC ACCIDENT MODEL SUMMARY

Cognitive Reliability and Error Analysis Method (CREAM) (Hollnagel, 1998)

The Functional Resonance Analysis Method (FRAM) (Hollnagel, 2012)

AcciMap (Rasmussen, 1997)

Systems-Theoretic Accident Model and Processes (STAMP) (Leveson, 2004)

Page 25: Final

APPLY SELECTED MODELING APPROACHES TO CHOSEN ACCIDENT

Page 26: Final

FAULT TREE ANALYSIS

Graphical representation of normal events, system failures, human errors and environmental factors.

Logic gates are used to construct chains of events.

Used to identify sequences of failure.

Identifies root cause.

(Høyland & Rausand, 1994)
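A minimal sketch of how logic gates combine events in a fault tree is shown below. The `Node` class and `occurs` function are hypothetical helpers, and the choice of AND/OR gates is an assumption made for illustration; the deck does not state which gate types its Kegworth tree uses.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    gate: str = "BASIC"  # "AND", "OR" or "BASIC" (a leaf event)
    children: list = field(default_factory=list)


def occurs(node, basic_events):
    """Evaluate whether a node's event occurs given the set of basic events."""
    if node.gate == "BASIC":
        return node.name in basic_events
    results = [occurs(child, basic_events) for child in node.children]
    return all(results) if node.gate == "AND" else any(results)


# Assumed structure: the crash needs BOTH branches; each branch needs ANY cause.
engine_failure = Node("Engine failure", "OR", [
    Node("Fan blade fracture"),
    Node("Flawed engine design"),
])
wrong_shutdown = Node("Wrong engine shutdown", "OR", [
    Node("Inadequate training"),
    Node("Insufficient protocols"),
])
top_event = Node("Crash landing", "AND", [engine_failure, wrong_shutdown])

print(occurs(top_event, {"Fan blade fracture", "Inadequate training"}))  # True
print(occurs(top_event, {"Fan blade fracture"}))                         # False
```

Evaluating the tree for different sets of basic events shows which combinations are sufficient for the top event, which is how the method points to root causes.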

Page 27: Final

Engine failure

High power setting in flight

Fan blade fracture

Heavy vibrations

Metal fatigue

Heavy vibrations

Flawed engine design

Inadequate testing in high

Engine vibrations sensor failure

Wrong engine shutdown (right engine)

No protocols in place to deal with simultaneous symptoms of vibration and smoke.

Equipment failure

Judgment error

Insufficient protocols

Poor aircraft design

No way to get visual confirmation from cockpit

Inadequate maintenance

Inadequate training

Pilots did not know that the aircraft had a different air-conditioning system

Other tasks

Pilots did not re-evaluate the engine switch-off decision due to high cabin workload.

British Midland Flight BD 92 crash landing

Page 31: Final

ADVANTAGES AND DISADVANTAGES

Advantages:

Root cause can easily be identified.

Human readable; easy to communicate the events that led to the accident.

Disadvantages:

Does not take into account latent conditions.

Does not take into account the environment in which the accident occurred.

Page 32: Final

CREAM - COGNITIVE RELIABILITY AND ERROR ANALYSIS METHOD  

Background:

Developed by Erik Hollnagel in 1998

Cognitive system engineering approach

design of human-machine systems accounting for factors of the environment in which the system exists.

Key idea:

Cognitive modeling of human performance for accident analysis or performance predictions

(Hollnagel, 1998)

Page 33: Final

HOW CAN IT BE USED

CREAM is a bidirectional analysis method.

Retrospective analysis – the analysis of error. Used for accident analysis.

Prospective analysis – predicting possible error. Used for accident prediction.

Page 34: Final

COMMON PERFORMANCE CONDITIONS

Human actions can be correct or incorrect, but they always occur within the context of a situation.

Context can greatly affect a person’s actions. CREAM breaks context down into 9 criteria:

Adequacy of organization

Working conditions

Adequacy of MMI and operational support

Availability of procedures/ plans

Number of simultaneous goals

Available time

Time of day (circadian rhythm)

Adequacy of training and expertise

Crew collaboration quality

After context has been established analysis can begin
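The context step can be sketched as a simple rating table over the nine conditions listed above. The three-level rating scale and the `summarize_context` helper are assumptions made for this illustration, and the example ratings are hypothetical rather than taken from the Kegworth analysis in this deck.

```python
# The nine Common Performance Conditions listed on this slide.
CPCS = (
    "Adequacy of organization",
    "Working conditions",
    "Adequacy of MMI and operational support",
    "Availability of procedures/plans",
    "Number of simultaneous goals",
    "Available time",
    "Time of day (circadian rhythm)",
    "Adequacy of training and expertise",
    "Crew collaboration quality",
)

# Assumed rating scale for the expected effect on performance reliability.
RATINGS = ("improved", "not significant", "reduced")


def summarize_context(ratings):
    """Count how many CPCs fall under each rating; every CPC must be rated."""
    missing = [cpc for cpc in CPCS if cpc not in ratings]
    if missing:
        raise ValueError(f"unrated CPCs: {missing}")
    counts = {rating: 0 for rating in RATINGS}
    for cpc in CPCS:
        counts[ratings[cpc]] += 1
    return counts


# Hypothetical ratings for illustration only.
example = {cpc: "not significant" for cpc in CPCS}
example["Adequacy of training and expertise"] = "reduced"
example["Availability of procedures/plans"] = "reduced"
example["Number of simultaneous goals"] = "reduced"

print(summarize_context(example))
```

Forcing every condition to be rated before analysis begins reflects the slide's point that context is established first.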

Page 35: Final

COMMON PERFORMANCE CONDITIONS

Page 36: Final

ANALYSIS

CREAM defines error as follows:

Phenotype – An error that is a physical action that can be measured and observed.

Genotype – The error’s possible cause, influenced by context.

These boundaries greatly reduce the inconsistency between different analysts.

Page 37: Final

ANALYSIS

CREAM describes how errors happen through the following terminology:

Antecedent – the cause of the error.

Consequent – the effect of the error.

Each antecedent may have one or more consequents, and each consequent may have one or more antecedents.

Using a table of various antecedents and consequents, an analysis of the accident can be built.

(Serwy, Rantanen, & Hollnagel)
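The many-to-many relationship between antecedents and consequents can be sketched as a pair of lookup tables built from a list of links. The `build_index` helper and the link entries below are hypothetical; they are not CREAM's published classification tables.

```python
from collections import defaultdict


def build_index(links):
    """Index (antecedent, consequent) pairs in both directions."""
    antecedents_of = defaultdict(set)
    consequents_of = defaultdict(set)
    for antecedent, consequent in links:
        antecedents_of[consequent].add(antecedent)
        consequents_of[antecedent].add(consequent)
    return antecedents_of, consequents_of


# Hypothetical links for illustration only.
links = [
    ("Inadequate training", "Faulty diagnosis"),
    ("Insufficient knowledge", "Faulty diagnosis"),
    ("Inadequate training", "Missed procedure"),
]

antecedents_of, consequents_of = build_index(links)
print(sorted(antecedents_of["Faulty diagnosis"]))
print(sorted(consequents_of["Inadequate training"]))
```

Looking up a consequent gives its candidate causes (the retrospective direction), while looking up an antecedent gives its possible effects (the prospective direction).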

Page 38: Final

MAN-TECHNOLOGY-ORGANIZATION (MTO) TRIAD

The contextual antecedents and consequents are split into three categories:

Man – physical and cognitive limitations of person.

Technology – technological failure

Organization – failure of the organization in which the situation exists.

At each stage of the analysis there are several ways to proceed; because of the context established earlier, some of these options are more likely than others.

This simplifies the analysis process.

Page 39: Final

HOW TO DO CREAM

The CREAM technique can be used for both retrospective and prospective analysis. Here is how to use it:

1) Identify the Common Performance Conditions, under 'CPC'

2) Start with a phenotype "Error Mode" (for retrospective analysis) or a genotype "MTO triad" (for prospective analysis) under 'Workspace'

3) For each step, select a Specific Consequent to better explain the step.

4) For retrospective analysis, if there is enough information to select a specific antecedent, then do so. The analysis stops for that branch.

5) Continue with each step of the analysis, exploring all the likely paths as shown in the left panel of the Workspace.

(Serwy, Rantanen, & Hollnagel)
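The retrospective procedure above can be sketched as a backward search that starts from an observed error mode and stops each branch when a specific antecedent is reached. The `trace_back` function, the table entries and the set of specific antecedents are hypothetical and only illustrate the stopping rule; they are not the tables used by the CREAM software tool.

```python
def trace_back(consequent, antecedents_of, specific, path=()):
    """Yield each path from an error mode back to a specific antecedent."""
    path = path + (consequent,)
    for antecedent in antecedents_of.get(consequent, ()):
        if antecedent in path:
            continue  # guard against cycles in the table
        if antecedent in specific:
            yield path + (antecedent,)  # analysis stops for this branch
        else:
            yield from trace_back(antecedent, antecedents_of, specific, path)


# Hypothetical table: consequent -> list of candidate antecedents.
antecedents_of = {
    "Wrong engine shut down": ["Faulty diagnosis", "Inattention"],
    "Faulty diagnosis": ["Insufficient knowledge", "Inadequate training"],
    "Inattention": ["Competing tasks"],
}
specific = {"Insufficient knowledge", "Inadequate training", "Competing tasks"}

for found in trace_back("Wrong engine shut down", antecedents_of, specific):
    print(" <- ".join(found))
```

Each printed line is one explored path; in a real analysis the context established in step 1 would be used to decide which candidate antecedents are likely enough to follow.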

Page 40: Final

COMMON PERFORMANCE CONDITIONS –KEGWORTH

Page 41: Final

RETROSPECTIVE ANALYSIS - KEGWORTH

Page 42: Final

RETROSPECTIVE ANALYSIS - KEGWORTH

Page 43: Final

EVALUATION

Specific antecedents were found to be:

• Lack of knowledge of the aircraft

• Inadequate training of the flight crew

• Design failure of the aircraft (no visibility of engines)

• Competing tasks – cabin workload too high.

Page 44: Final

ADVANTAGES OF CREAM

Allows the context of the accident to be taken into account. Shows how the context in which people work affects their actions.

Can effectively do both retrospective and prospective analysis. The method only needs to be learned once, as both use the same simple principles.

A good structure that keeps inconsistency between different analysts low.

Page 45: Final

DISADVANTAGES OF CREAM

Resource hungry, requires a long period of time to complete.

Requires a good level of experience in accident analysis, in particular human factors.

No guidance on how the errors you have found can be reduced.

Page 46: Final

USEFUL RESOURCES

Software tool for CREAM analysis.

http://www.ews.uiuc.edu/~serwy/cream/v0.6.1/

Evaluation of the software (tells you how to use it)

Page 47: Final

FRAM - FUNCTIONAL RESONANCE ANALYSIS METHOD

Background:

Developed by Erik Hollnagel in 2004

Performance variability

Performance in a system, whether internal or external, fluctuates dynamically. Variability in complex systems is normal.

Key idea:

Models how the components of a system resonate and interact with each other, causing the system to lose balance and leading to accidents.

(Hollnagel, 2012)

Page 48: Final

FRAM ANALYSIS

0. Define the purpose of modeling and describe the situation being analyzed: an event that has occurred (incident/accident) or a possible future scenario (risk).

1. Identify the essential functions in the event ('foreground' functions when things go right); characterize each by six basic aspects.

2. Characterize the actual/potential variability of 'foreground' functions and 'background' functions (context). Consider both normal and worst-case variability.

3. Define functional resonance based on potential/actual dependencies (couplings) among functions.

4. Propose ways to monitor and dampen performance variability (indicators, barriers, design/modification, etc.)

Page 49: Final

Non-normal event(Engine Failure)

Non-normal procedures

Air conditioning smoke

High engine vibrations

Engine shutdown checklist

High engine vibrations procedures

Air conditioning smoke procedures

Engine shutdown

Divert to nearest airport

Landing procedure

Landing

Review any engine shutdown decisions

Page 50: Final

[Diagram: twelve FRAM functions, each drawn as a hexagon with its six aspects labelled Time (T), Control (C), Output (O), Resource (R), Precondition (P) and Input (I): Non-normal event (Engine Failure), Non-normal procedures, Air conditioning smoke, High engine vibrations, Air conditioning smoke procedures, Engine shutdown, Engine shutdown checklist, Divert to nearest airport, Landing procedure, Landing, High engine vibrations procedures, Review any engine shutdown decisions.]

Page 51: Final

CHARACTERISTICS OF FUNCTION VII

Function: Divert to nearest airport

Input: Air conditioning smoke procedures

Output: Input to: Landing procedure

Precondition: Air traffic control clearance

Resource: Commander’s and first officer’s attention and time, cabin crew’s attention, air traffic control and ground crew manpower

Control: Non-normal procedures, commander’s and first officer’s actions, Boeing 737 operations manual.

Time: Must divert immediately, top priority.
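The six aspects of a FRAM function can be held in a small record, and couplings can then be read off from what each function names as its input and output. The sketch below fills in the aspects given on this slide for "Divert to nearest airport"; the `FramFunction` class and the `couplings` helper are hypothetical, and treating inputs and outputs as the names of neighbouring functions is a simplification that follows the slide's wording.

```python
from dataclasses import dataclass, field


@dataclass
class FramFunction:
    name: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    preconditions: list = field(default_factory=list)
    resources: list = field(default_factory=list)
    controls: list = field(default_factory=list)
    time: list = field(default_factory=list)


def couplings(functions):
    """Return sorted (upstream, downstream) pairs implied by inputs/outputs."""
    edges = set()
    for function in functions:
        for upstream in function.inputs:
            edges.add((upstream, function.name))
        for downstream in function.outputs:
            edges.add((function.name, downstream))
    return sorted(edges)


divert = FramFunction(
    name="Divert to nearest airport",
    inputs=["Air conditioning smoke procedures"],
    outputs=["Landing procedure"],
    preconditions=["Air traffic control clearance"],
    resources=["Flight crew attention and time", "Cabin crew attention",
               "Air traffic control and ground crew manpower"],
    controls=["Non-normal procedures", "Boeing 737 operations manual"],
    time=["Must divert immediately, top priority"],
)

for upstream, downstream in couplings([divert]):
    print(f"{upstream} -> {downstream}")
```

With all twelve functions described this way, the list of couplings gives the network in which variability can propagate and resonate.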

Page 52: Final

[Diagram: the same twelve FRAM functions, each drawn as a hexagon with aspects T, O, C, R, P and I, now annotated with the couplings observed in the Kegworth accident:]

Left engine malfunctioned

Engine vibration procedures were not carried out

Pilots experienced symptoms of engine failure

High cabin workload

Shutdown of right engine was not reviewed

Pilots did not know of newly introduced engine vibration procedure

Pilots were required to land and review engine shutdown decisions

Flight crew’s attention focused on diverting to nearest airport

Flight crew chose to deal with smoke

Determined that right engine was cause

Symptoms stopped

Page 53: Final

EVALUATION

Harmful interactions were found to be:

• Inadequate training of the flight crew, who did not know of certain protocols

• Competing tasks – cabin workload too high.

Page 54: Final

ADVANTAGES OF FRAM

Guides the investigation team to ask more questions rather than just looking for answers.

Can effectively do both retrospective and prospective analysis. The method only needs to be learned once, as both use the same simple principles.

Takes into account the system in which the accident occurred.

Page 55: Final

DISADVANTAGES OF FRAM

Resource hungry, requires a long period of time to complete.

Requires a good level of experience in accident analysis, in particular human factors.

Does not find the root cause; further analysis is needed to determine this.

Page 56: Final

REFERENCES

Marais, K., Dulac, N., & Leveson, N. (2004). Beyond Normal Accidents and High Reliability Organizations: The Need for an Alternative Approach to Safety in Complex Systems. Cambridge.

Air Accidents Investigation Branch. (2012). June 2012 Bulletin. Aldershot: Air Accidents Investigation Branch.

Air Accidents Investigation Branch. (1989). Report on the Accident to Boeing 737-400 G-OBME near Kegworth, Leicestershire on 8 January 1989. Aldershot: Air Accidents Investigation Branch.

Amalberti, R. (1996). La conduite des systèmes à risques. Paris: PUF.

Australian Transport Safety Bureau. (2008). In-flight upset 154 km west of Learmonth, WA 7 October 2008 VH-QPA Airbus A330-303. Canberra: Australian Transport Safety Bureau.

Board, Columbia Accident Investigation. (2003). Columbia Accident Investigation Board Vol 1. Washington, D.C: Columbia Accident Investigation Board.

CME Group. (2010). What Happened on May 6th? Chicago: CME Group.

Department of Energy. (1999). DOE Workbook, Conducting Accident Investigations. Washington: Department of Energy.

Dulac, N. (2007). A Framework for Dynamic Safety and Risk Management Modeling in Complex Engineering Systems. Cambridge: MIT.

Easley, D., Lopez de Prado, M. M., & O'Hara, M. (2012). Flow Toxicity and Liquidity in a High Frequency World. Review of Financial Studies , 1457-1493.

Easley, D., Lopez de Prado, M. M., & O'Hara, M. (2010). The Microstructure of the ‘Flash Crash’: Flow Toxicity, Liquidity Crashes and the Probability of Informed Trading. The Journal of Portfolio Management, 118-128.

Ferry, T. (1988). Modern Accident Investigation and Analysis. Second Edition. New York: Wiley.

Gouran, D. S., Hirokawa, R. Y., & Martz, A. E. (1986). A critical analysis of factors related to decisional processes involved in the Challenger disaster. Central States Speech Journal, 37.

Høyland, A., & Rausand, M. (1994). System Reliability Theory: Models and Statistical Methods. New York: Wiley.

Heimann, C. F. (1993). Understanding the Challenger Disaster: Organizational Structure and the Design of Reliable Systems. The American Political Science Review , 87, 421-435.

Page 57: Final

Hollnagel, E. (1998). Cognitive Reliability and Error Analysis Method. Oxford: Elsevier Science Ltd.

Hollnagel, E. (2012). FRAM – The Functional Resonance Analysis Method. Farnham: Ashgate.

Hollnagel, E. (2005). Functional Resonance Accident Model Method and examples. COGNITIVE SYSTEMS ENGINEERING LABORATORY . University of Linköping.

Hollnagel, E. (2002). Understanding accidents: from root causes to performance variability. Proceedings of the 2002 IEEE 7th Conference on Human Factors and Power Plants (pp. 1-1 to 1-6).

Hopkins, A. (2006, December). Studying organisational cultures and their effects on safety. Safety Science , 44, pp. 875-889.

Keong, T. H. (1997, July 9). Risk Analysis Methodologies. Retrieved June 8, 2012, from pacific.net.sg: http://home1.pacific.net.sg/~thk/risk.html

Kim, M., Seong, P., & Hollnagel, E. (2006). A probabilistic approach for determining the control mode in CREAM. Reliability Engineering and System Safety , 191-199.

Lehto, M. (1991). Models of accident causation and their application: Review and reappraisal. journal of engineering and technology management , 173.

Leveson, N. G. (2004). A new accident model for engineering safer systems. Safety Science , 237-270.

Perrow, C. (1984). Normal Accidents: Living With High-Risk Technologies. New York: Basic Books.

PRESIDENTIAL COMMISSION on the Space Shuttle Challenger Accident. (1986). Report of the PRESIDENTIAL COMMISSION on the Space Shuttle Challenger Accident. Washington, D.C.: PRESIDENTIAL COMMISSION on the Space Shuttle Challenger Accident.

Qureshi, Z. H. (2007). A review of accident modelling approaches for complex socio-technical systems. SCS '07 Proceedings of the twelfth Australian workshop on Safety critical systems and software and safety-related programmable systems (pp. 47-59). Darlinghurst: Australian Computer Society.

Rasmussen, J. (1997). Risk management in a dynamic society: a modelling problem. Safety Sci. , 183–213.

Reason, J. (1997). Managing the Risks of Organizational Accidents. Aldershot: Ashgate.

Serwy, R. D., Rantanen, E. M., & Hollnagel, E. (n.d.). How to do CREAM. Retrieved August 3, 2012, from Cognitive Reliability Error Analysis Method Web Demonstration Version 0.6: http://www.ews.uiuc.edu/~serwy/cream/v0.6.1/

Sklet, S. (2003). Comparison of some selected methods for accident investigation. Journal of hazardous materials , 29-37.

Smith, D. (2000). On a wing and a prayer? Exploring the human components of technological failure. Syst. Res. , 543–559.

Svedung, I., & Rasmussen, J. (2002). Graphic representation of accident scenarios: mapping system structure and the causation of accidents. Safety Science, 397–417.

Svenson, O. (2001). Accident and Incident Analysis Based on the Accident Evolution and Barrier Function (AEB) Model. Cognition, Technology & Work, 42-52.

Svenson, O. (1991). The Accident Evolution and Barrier Function (AEB) Model Applied to Incident Analysis in the Processing Industries. Risk Analysis , 499–507.

The staffs of the U.S. Commodity Futures Trading Commission and the U.S. Securities and Exchange Commission. Findings Regarding the Market Events of May 6, 2010. Washington, D.C.: U.S. Commodity Futures Trading Commission and the U.S. Securities and Exchange Commission.

Øien, K. (2001). Risk indicators as a tool for risk control. Reliability Engineering & System Safety , 129–145.