LSCITS SUMMER INTERNSHIP 2012 - ACCIDENT MODELING ROSS APTED
AIM
To compare existing methods for modeling (and/or predicting) failure in real-world complex systems.
Research and summarize a number of failures and accidents involving complex socio-technical systems.
Research and summarize several modeling approaches.
Apply selected modeling approaches to a chosen accident.
BREAKDOWN
3 Weeks: Researching failures and accidents.
- Columbia and Challenger Disasters
- 2010 Flash Crash
- Aviation accidents and incidents
4 Weeks: Researching accident modeling approaches
5 Weeks: Modeled a well-documented accident using various systemic and sequential accident modeling techniques.
Above image shows the Columbia disintegrating over Texas
FAILURES AND ACCIDENTS
CRITERIA FOR SELECTION
The accident was well documented
Widely discussed in academic literature.
SPACE SHUTTLE COLUMBIA DISASTER
On 1 February 2003 a critical systems failure occurred on the space shuttle Columbia (STS-107) during its re-entry into the Earth's atmosphere.
This caused the shuttle to disintegrate, killing all seven crew members.
STS-107 flight insignia
FLASH CRASH 2010
At approximately 2:45 pm on 6 May 2010, prices on the United States stock market fell sharply, only to recover minutes later.
The Dow Jones dropped 600 points during the crash, on top of a 300-point drop earlier that day (due to the Greek debt crisis).
Most of the 600-point drop was recovered within tens of minutes.
Dow Jones: an important index of the stock of 30 large companies that are representative of the United States economy. It represents the state of the market.
(The staffs of the U.S. Commodity Futures Trading Commission and the U.S. Securities and Exchange Commission)
KEGWORTH AIR DISASTER
On 8 January 1989 British Midland Flight 92 crashed while attempting an emergency landing.
Crash site: M1 embankment near the village of Kegworth.
The Boeing 737-400 aircraft was severely damaged; 79 of the 126 people aboard survived.
An investigation was carried out by the Air Accidents Investigation Branch (AAIB). (Air Accidents Investigation Branch, 1989)
EVENTS OF CRASH
1. While climbing to cruising altitude, a fan blade broke off in the left engine, causing a decrease in power and an increase in vibration. This caused the left engine to produce a jet of flames.
2. Smoke flooded into the cabin. The captain shut down the right engine.
3. The smell of smoke and the vibrations reduced.
4. The crew diverted to East Midlands Airport. The left engine failed completely during the descent for the emergency landing.
Diagram (causes and events leading to the crash): inadequate training; insufficient knowledge of aircraft; improper design testing; right engine was shut down; left engine failed; crash.
CONTRIBUTING FACTORS
Inadequate knowledge of the aircraft
The flight crew observed smoke in the cabin.
They believed they could not trust the vibration sensors, which indicate the state of the engines. This was true of the old Boeing 737 but not of the new 737-400.
They fell back on general knowledge of the aircraft, which was wrong: they thought that bleed air (pressure and heating) was taken from the right engine.
In fact, the air conditioning system used both engines in the new model.
BOEING 737 (OLD)
Diagram: bleed air for air conditioning is drawn from the right engine.
BOEING 737-400 (NEW)
Diagram: bleed air for air conditioning is drawn from both the right engine and the left engine.
CONTRIBUTING FACTORS
Inadequate training
The combination of violent engine vibrations and the smell of smoke while climbing to cruising altitude was not covered in training.
Separate protocols existed for each event, but none for the two in conjunction.
There was no simulator training for engine failure of this kind, or for what to do when a situation falls outside standard procedures.
Differences between the Boeing 737 and the 737-400 were not adequately taught.
WHY THE MISTAKE WAS NOT FOUND
By chance the smoke dissipated and the vibrations reduced; this was actually due to a standard procedure reducing fuel flow to both engines.
The pilots did not communicate with the cabin crew, who had visual confirmation of which engine was damaged.
The immediate diversion to East Midlands Airport created a high cabin workload, which resulted in an incorrect review procedure after the right engine was shut down.
SELECTED METHODS FOR ACCIDENT INVESTIGATION
TYPES OF ACCIDENT MODEL
Main types of Accident model. (Hollnagel, 2002)
Sequential
Epidemiological
Systemic
SEQUENTIAL ACCIDENT MODELS
Simplest form of accident modeling.
Describes the accident as a series of events that occur in a particular order.
Events occur along a linear timeline.
Analysis: identifies the specific cause and the broken links in the accident chain. The goal is to eliminate the broken links.
Examples: Fault tree analysis, Domino model of accident causation, Events and causal factors charting, Event tree analysis, Management and Oversight Risk Tree (MORT), Sequential Timed Events Plotting (STEP), Man, Technology and Organization (MTO) analysis, TRIPOD.
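A sequential model can be sketched as an ordered chain of events in which the analyst looks for the first link that broke. The sketch below is illustrative only: the `Event` class, the `first_broken_link` helper and the judgment about which Kegworth events count as failures are assumptions made for this example, not part of any cited method.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Event:
    """One step on the linear accident timeline."""
    description: str
    failed: bool  # True if this step went wrong (the analyst's judgment)


def first_broken_link(chain: Sequence[Event]) -> Optional[Event]:
    """Return the earliest failed event in the chain, or None if none failed."""
    for event in chain:
        if event.failed:
            return event
    return None


# Hypothetical encoding of the Kegworth event sequence from these slides.
kegworth_chain = [
    Event("Normal flight", failed=False),
    Event("Fan blade breaks off in left engine", failed=True),
    Event("Crew shuts down right engine", failed=True),
    Event("Left engine fails during descent", failed=True),
]

root = first_broken_link(kegworth_chain)
print(root.description if root else "no broken link")
```

The linear scan mirrors the model's core assumption: events are ordered, and the earliest failure is treated as the cause. That is also why latent factors have no natural place in it.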
SEQUENTIAL ACCIDENT MODELS SUMMARY
Advantages:
Human readable; easy to communicate the chain of events.
Can identify the root cause or the break in the chain of events that led to the accident.
Good starting point.
Disadvantages:
Does not take latent factors into account.
Inadequate for modeling the variability of sociotechnical systems.
EPIDEMIOLOGICAL ACCIDENT MODEL SUMMARY
Accident is described as a disease.
Some factors that affect the accident occur right away, while others are latent.
Takes into account that events can manifest over time.
Swiss cheese Model (Reason, 1997)
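The Swiss cheese picture can be sketched as a stack of defensive layers, where an accident occurs only when every layer has a hole, and where some holes are latent (present long before the event) while others are active. The layer names and the latent/active classification below are illustrative assumptions drawn loosely from the Kegworth factors in these slides, not an analysis taken from Reason (1997).

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Layer:
    """One defensive layer in the Swiss cheese picture."""
    name: str
    hole: bool    # True if this defence failed to stop the hazard
    latent: bool  # True if the weakness existed long before the accident


def accident_occurs(layers: Sequence[Layer]) -> bool:
    """The hazard gets through only if every layer has a hole."""
    return all(layer.hole for layer in layers)


def latent_conditions(layers: Sequence[Layer]) -> List[str]:
    """Names of failed layers whose weakness was latent."""
    return [layer.name for layer in layers if layer.hole and layer.latent]


# Hypothetical classification, for illustration only.
kegworth_layers = [
    Layer("Engine design testing", hole=True, latent=True),
    Layer("Crew training on the 737-400", hole=True, latent=True),
    Layer("Combined smoke and vibration procedure", hole=True, latent=True),
    Layer("Review of the shutdown decision", hole=True, latent=False),
]

print(accident_occurs(kegworth_layers))
print(latent_conditions(kegworth_layers))
```

Closing any single hole makes `accident_occurs` return False. That is the practical message of the model: strengthening one layer is enough to block this particular trajectory.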
EPIDEMIOLOGICAL ACCIDENT MODEL SUMMARY
Overcomes limitations:
Superior to sequential models, as latent events can be taken into account.
More suited to modeling complex systems.
Lack of detail:
Allows the identification of general events that occurred, but cannot go deeper.
SYSTEMIC ACCIDENT MODEL SUMMARY
Accidents emerge naturally; they are expected to occur, as detailed in Perrow's Normal Accidents (Perrow, 1984).
Focus:
Systemic models focus on the characteristics of a system, as opposed to a series of events in the system that cause the accident.
Difficult but powerful:
Ideal for complex systems but hard to represent graphically.
SYSTEMIC ACCIDENT MODEL SUMMARY
Considers the performance of the system as a whole.
Organization
Environmental
Human
Technical
The system is viewed as many interacting components that maintain an equilibrium.
Systems can evolve dynamically.
Flawed interactions between components can throw the system out of balance, resulting in an accident.
SYSTEMIC ACCIDENT MODEL SUMMARY
Cognitive Reliability and Error Analysis Method (CREAM) (Hollnagel, 1998)
Functional Resonance Analysis Method (FRAM) (Hollnagel, 2012)
AcciMap (Rasmussen, 1997)
Systems-Theoretic Accident Model and Processes (STAMP) (Leveson, 2004)
APPLY SELECTED MODELING APPROACHES TO CHOSEN ACCIDENT
FAULT TREE ANALYSIS
Graphical representation of normal events, system failures, human errors and environmental factors.
Logic gates are used to construct chains of events.
Used to identify sequences of failure.
Identifies the root cause.
(Høyland & Rausand, 1994)
Fault tree diagram for the Kegworth accident (node labels only)
Top event: British Midland Flight BD 92 crash landing
Branch: Engine failure
- High power setting in flight
- Fan blade fracture
- Heavy vibrations
- Metal fatigue
- Flawed engine design
- Inadequate testing in high [...]
Branch: Wrong engine shutdown (right engine)
- Engine vibration sensor failure
- Equipment failure
- Inadequate maintenance
- Judgment error
- Insufficient protocols
- No protocols in place to deal with simultaneous symptoms of vibration and smoke.
- Poor aircraft design
- No way to get visual confirmation from cockpit
- Inadequate training
- Pilots did not know that the aircraft had a different air-conditioning system
- Other tasks
- Pilots did not re-evaluate engine switch-off decision due to high cabin workload.
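The logic-gate idea behind a fault tree can be sketched as a small tree of AND/OR nodes evaluated from the basic events upwards. The `Node` class is a hypothetical helper for this example. The gate types chosen for each branch are assumptions, since the gates in the slide's diagram are not recoverable from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A fault tree node: a basic event (no children) or a logic gate."""
    name: str
    gate: str = "BASIC"  # "BASIC", "AND" or "OR"
    children: List["Node"] = field(default_factory=list)
    occurred: bool = False  # only used by basic events

    def evaluate(self) -> bool:
        """Return True if this event occurs given the basic events below it."""
        if self.gate == "BASIC":
            return self.occurred
        results = [child.evaluate() for child in self.children]
        if self.gate == "AND":
            return all(results)
        if self.gate == "OR":
            return any(results)
        raise ValueError(f"unknown gate type: {self.gate}")


# Gate choices below are assumptions made for illustration.
fan_blade = Node("Fan blade fracture", occurred=True)
high_power = Node("High power setting in flight", occurred=True)
engine_failure = Node("Engine failure", "AND", [high_power, fan_blade])

judgment = Node("Judgment error", occurred=True)
equipment = Node("Equipment failure", occurred=False)
wrong_shutdown = Node("Wrong engine shutdown (right engine)", "OR",
                      [judgment, equipment])

top = Node("British Midland Flight BD 92 crash landing", "AND",
           [engine_failure, wrong_shutdown])

print(top.evaluate())
```

Flipping a basic event to `False` and re-evaluating shows which events the top event depends on. This is a simple form of the root-cause search that fault tree analysis supports.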
ADVANTAGES AND DISADVANTAGES
Advantages:
The root cause can easily be identified.
Human readable; easy to communicate the events that led to the accident.
Disadvantages:
Does not take latent conditions into account.
Does not take into account the environment in which the accident occurred.
CREAM - COGNITIVE RELIABILITY AND ERROR ANALYSIS METHOD
Background:
Developed by Erik Hollnagel in 1998
Cognitive systems engineering approach:
the design of human-machine systems, accounting for factors of the environment in which the system exists.
Key idea:
Cognitive modeling of human performance for accident analysis or performance prediction.
(Hollnagel, 1998)
HOW CAN IT BE USED
CREAM is a bi-directional analysis method.
Retrospective analysis – the analysis of error. Used for accident analysis.
Prospective analysis – predicting possible error. Used for accident prediction.
COMMON PERFORMANCE CONDITIONS
Human actions can be correct or incorrect, but they always occur within the context of a situation.
Context can greatly affect a person's actions. CREAM breaks context down into 9 criteria.
Adequacy of organization
Working conditions
Adequacy of MMI and operational support
Availability of procedures/ plans
Number of simultaneous goals
Available time
Time of day (circadian rhythm)
Adequacy of training and expertise
Crew collaboration quality
After context has been established analysis can begin
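Establishing context can be sketched as rating each of the nine Common Performance Conditions and tallying the results. The three-level effect scale (improved, not significant, reduced) follows the way CREAM is commonly described. The ratings themselves are illustrative assumptions loosely based on the contributing factors in these slides, not the outcome of a real assessment.

```python
from collections import Counter
from enum import Enum
from typing import Dict


class Effect(Enum):
    """Expected effect of a condition on performance reliability."""
    IMPROVED = "improved"
    NOT_SIGNIFICANT = "not significant"
    REDUCED = "reduced"


# Hypothetical ratings for Kegworth, for illustration only.
cpc_ratings: Dict[str, Effect] = {
    "Adequacy of organization": Effect.NOT_SIGNIFICANT,
    "Working conditions": Effect.NOT_SIGNIFICANT,
    "Adequacy of MMI and operational support": Effect.NOT_SIGNIFICANT,
    "Availability of procedures/plans": Effect.REDUCED,
    "Number of simultaneous goals": Effect.REDUCED,
    "Available time": Effect.NOT_SIGNIFICANT,
    "Time of day (circadian rhythm)": Effect.NOT_SIGNIFICANT,
    "Adequacy of training and expertise": Effect.REDUCED,
    "Crew collaboration quality": Effect.REDUCED,
}


def tally(ratings: Dict[str, Effect]) -> Counter:
    """Count how many conditions fall under each effect level."""
    return Counter(ratings.values())


counts = tally(cpc_ratings)
print(counts[Effect.REDUCED], "conditions reduce reliability")
```

The tally is only a summary of the context. How it steers the later analysis is defined by the method itself and is not reproduced here.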
COMMON PERFORMANCE CONDITIONS
ANALYSIS
CREAM defines error as follows:
Phenotype – An error that is a physical action that can be measured and observed.
Genotype – The error's possible cause, influenced by context.
These boundaries greatly reduce the inconsistency between different analysts.
ANALYSIS
CREAM describes how errors happen using the following terminology:
Antecedent – the cause of the error.
Consequent – the effect of the error.
Each antecedent may have one or more consequents, and each consequent may have one or more antecedents.
Using a table of various antecedents and consequents, an analysis of the accident can be built.
(Serwy, Rantanen, & Hollnagel)
MAN-TECHNOLOGY-ORGANIZATION (MTO) TRIAD
The contextual antecedents and consequents are split into three categories:
Man – physical and cognitive limitations of the person.
Technology – technological failure.
Organization – failure of the organization in which the situation exists.
At each stage of the analysis there are several options for how to proceed; depending on the context, some of these options are more likely than others.
This simplifies the analysis process.
HOW TO DO CREAM
The CREAM technique can be used for both retrospective and prospective analysis. Here is how to use it:
1) Identify the Common Performance Conditions, under 'CPC'.
2) Start with a phenotype "Error Mode" (for retrospective analysis) or a genotype "MTO triad" (for prospective analysis) under 'Workspace'.
3) For each step, select a Specific Consequent to better explain the step.
4) For retrospective analysis, if there is enough information to select a specific antecedent, then do so. The analysis stops for that branch.
5) Continue with each step of the analysis, exploring all the likely paths as shown in the left panel of the Workspace.
(Serwy, Rantanen, & Hollnagel)
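The retrospective procedure can be sketched as a backward walk over a consequent-to-antecedent table, starting from the observed error and stopping each branch at an antecedent that has no further cause. The table entries and the `specific_antecedents` helper are hypothetical. They are not taken from CREAM's published classification tables or from the software tool.

```python
from typing import Dict, List, Set

# Hypothetical consequent -> antecedents table (many-to-many in general).
ANTECEDENTS: Dict[str, List[str]] = {
    "Wrong engine shut down": ["Faulty diagnosis", "Missed review of decision"],
    "Faulty diagnosis": ["Inadequate training",
                         "Lack of knowledge of the aircraft"],
    "Missed review of decision": ["Competing tasks"],
}


def specific_antecedents(error_mode: str,
                         table: Dict[str, List[str]]) -> Set[str]:
    """Walk backwards from an observed error to antecedents with no further cause."""
    found: Set[str] = set()
    visited: Set[str] = set()
    stack = [error_mode]
    while stack:
        current = stack.pop()
        if current in visited:
            continue  # guard against cycles in the table
        visited.add(current)
        causes = table.get(current, [])
        if not causes and current != error_mode:
            found.add(current)  # this branch of the analysis stops here
        stack.extend(causes)
    return found


print(sorted(specific_antecedents("Wrong engine shut down", ANTECEDENTS)))
```

A real analysis would also prune branches using the established context, so that only the likely paths are explored.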
COMMON PERFORMANCE CONDITIONS –KEGWORTH
RETROSPECTIVE ANALYSIS - KEGWORTH
EVALUATION
Specific antecedents were found to be:
• Lack of knowledge of the aircraft
• Inadequate training of the flight crew
• Design failure of the aircraft (no visibility of engines)
• Competing tasks – cabin workload too high.
ADVANTAGES OF CREAM
Allows the context of the accident to be taken into account. Shows how the context in which people work affects their actions.
Can effectively do both retrospective and prospective analysis. Only needs to be learned once, as both use the same simple principles.
A good structure that keeps inconsistency between different analysts low.
DISADVANTAGES OF CREAM
Resource hungry, requires a long period of time to complete.
Requires a good level of exposure to accident analysis, in particular human factors.
No guidance on how the errors you have found can be reduced.
USEFUL RESOURCES
Software tool for CREAM analysis.
http://www.ews.uiuc.edu/~serwy/cream/v0.6.1/
Evaluation of software (tells you how to use it)
FRAM - FUNCTIONAL RESONANCE ANALYSIS METHOD
Background:
Developed by Erik Hollnagel in 2004
Performance variability
Performance in a system, whether internal or external, fluctuates dynamically. Variability in complex systems is normal.
Key idea:
Models how components of a system resonate and interact with each other, causing the system to lose balance and leading to accidents.
(Hollnagel E. , FRAM – The Functional Resonance Analysis Method, 2012)
FRAM ANALYSIS
0. Define the purpose of modeling and describe the situation being analyzed: an event that has occurred (incident/accident) or a possible future scenario (risk).
1. Identify the essential functions in the event ('foreground' functions when things go right); characterize each by six basic aspects.
2. Characterize the actual / potential variability of 'foreground' functions and 'background' functions (context). Consider both normal and worst case variability.
3. Define functional resonance based on potential / actual dependencies (couplings) among functions.
4. Propose ways to monitor and dampen performance variability (indicators, barriers, design / modification, etc.).
Non-normal event(Engine Failure)
Non-normal procedures
Air conditioning smoke
High engine vibrations
Engine shutdown checklist
High engine vibrations procedures
Air conditioning smoke procedures
Engine shutdown
Divert to nearest airport
Landing procedure
Landing
Review any engine shutdown decisions
Diagram: each function listed above drawn as a FRAM hexagon, with its six aspects labeled Time (T), Control (C), Output (O), Resource (R), Precondition (P) and Input (I).
CHARACTERISTICS OF FUNCTION VII
Function: Divert to nearest airport
Input: Air conditioning smoke procedures
Output: input to Landing procedure
Precondition: Air traffic control clearance
Resource: Commander's and first officer's attention and time, cabin crew's attention, air traffic control and ground crew manpower
Control: Non-normal procedures, commander's and first officer's actions, Boeing 737 operations manual
Time: Must divert immediately, top priority.
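A FRAM function and its six aspects can be sketched as a small data structure, with couplings found by matching one function's output against the other aspects of the remaining functions. The `FramFunction` class, the `couplings` helper and the output labels ("Decision to divert", "Aircraft on approach") are hypothetical names introduced for this example. Only the function names and the precondition and time entries come from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Tuple

ASPECTS = ("input", "output", "precondition", "resource", "control", "time")


@dataclass
class FramFunction:
    """A FRAM function described by (some of) its six aspects."""
    name: str
    aspects: Dict[str, List[str]] = field(default_factory=dict)


def couplings(functions: Sequence[FramFunction]) -> List[Tuple[str, str, str]]:
    """Return (upstream, downstream, aspect) wherever an output feeds another function."""
    links = []
    for up in functions:
        for out in up.aspects.get("output", []):
            for down in functions:
                if down is up:
                    continue
                for aspect in ASPECTS:
                    if aspect != "output" and out in down.aspects.get(aspect, []):
                        links.append((up.name, down.name, aspect))
    return links


smoke = FramFunction("Air conditioning smoke procedures",
                     {"output": ["Decision to divert"]})
divert = FramFunction("Divert to nearest airport", {
    "input": ["Decision to divert"],
    "precondition": ["Air traffic control clearance"],
    "time": ["Must divert immediately"],
    "output": ["Aircraft on approach"],
})
landing = FramFunction("Landing procedure",
                       {"input": ["Aircraft on approach"]})

for link in couplings([smoke, divert, landing]):
    print(link)
```

Listing couplings this way only exposes the potential dependencies. Judging which of them carried harmful variability remains the analyst's task.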
Diagram: FRAM model of the Kegworth accident, annotated with the variability and couplings found:
- Left engine malfunctioned
- Engine vibration procedures were not carried out
- Pilots experienced symptoms of engine failure
- High cabin workload
- Shutdown of right engine was not reviewed
- Pilots did not know of newly introduced engine vibration procedure
- Pilots were required to land and review engine shutdown decisions
- Flight crew's attention focused on diverting to nearest airport
- Flight crew chose to deal with smoke
- Determined that right engine was the cause
- Symptoms stopped
EVALUATION
Harmful interactions were found to be:
• Inadequate training of the flight crew, who did not know of certain protocols
• Competing tasks – cabin workload too high.
ADVANTAGES OF FRAM
Guides the investigation team to ask more questions rather than just looking for answers.
Can effectively do both retrospective and prospective analysis. Only needs to be learned once, as both use the same simple principles.
Takes into account the system in which the accident occurred.
DISADVANTAGES OF FRAM
Resource hungry; requires a long period of time to complete.
Requires a good level of exposure to accident analysis, in particular human factors.
Does not find the root cause; further analysis is needed to determine this.
REFERENCES
Marais, K., Dulac, N., & Leveson, N. (2004). Beyond Normal Accidents and High Reliability Organizations: The Need for an Alternative Approach to Safety in Complex Systems. Cambridge.
Air Accidents Investigation Branch. (2012). June 2012 Bulletin. Aldershot: Air Accidents Investigation Branch.
Air Accidents Investigation Branch. (1989). Report on the Accident to Boeing 737-400 G-OBME near Kegworth, Leicestershire on 8 January 1989. Aldershot: Air Accidents Investigation Branch.
Amalberti, R. (1996). La conduite des systèmes à risques. Paris: PUF.
Australian Transport Safety Bureau. (2008). In-flight upset 154 km west of Learmonth, WA 7 October 2008 VH-QPA Airbus A330-303. Canberra: Australian Transport Safety Bureau.
Board, Columbia Accident Investigation. (2003). Columbia Accident Investigation Board Vol 1. Washington, D.C: Columbia Accident Investigation Board.
CME Group. (2010). What Happened on May 6th? Chicago: CME Group.
Department of Energy. (1999). DOE Workbook, Conducting Accident Investigations. Washington, D.C.: Department of Energy.
Dulac, N. (2007). A Framework for Dynamic Safety and Risk Management Modeling in Complex Engineering Systems. Cambridge: MIT.
Easley, D., Lopez de Prado, M. M., & O'Hara, M. (2012). Flow Toxicity and Liquidity in a High Frequency World. Review of Financial Studies , 1457-1493.
Easley, D., Lopez de Prado, M. M., & O'Hara, M. (2010). The Microstructure of the ‘Flash Crash’: Flow Toxicity, Liquidity Crashes and the Probability of Informed Trading. The Journal of Portfolio Management, 118-128.
Ferry, T. (1988). Modern Accident Investigation and Analysis. Second Edition. New York: Wiley.
Gouran, D. S., Hirokawa, R. Y., & Martz, A. E. (1986). A critical analysis of factors related to decisional processes involved in the Challenger disaster. Central States Speech Journal, 37.
Høyland, A., & Rausand, M. (1994). System reliability Theory: Models and Statistical Methods. New York: Wiley.
Heimann, C. F. (1993). Understanding the Challenger Disaster: Organizational Structure and the Design of Reliable Systems. The American Political Science Review , 87, 421-435.
Hollnagel, E. (1998). Cognitive Reliability and Error Analysis Method. Oxford: Elsevier Science Ltd.
Hollnagel, E. (2012). FRAM – The Functional Resonance Analysis Method. Farnham: Ashgate.
Hollnagel, E. (2005). Functional Resonance Accident Model Method and examples. COGNITIVE SYSTEMS ENGINEERING LABORATORY . University of Linköping.
Hollnagel, E. (2002). Understanding accidents-from root causes to performance variability. Human Factors and Power Plants, 2002. Proceedings of the 2002 IEEE 7th Conference on , (pp. 1 - 1-6 ).
Hopkins, A. (2006, December). Studying organisational cultures and their effects on safety. Safety Science , 44, pp. 875-889.
Keong, T. H. (1997, July 9). Risk Analysis Methodologies. Retrieved June 8, 2012, from pacific.net.sg: http://home1.pacific.net.sg/~thk/risk.html
Kim, M., Seong, P., & Hollnagel, E. (2006). A probabilistic approach for determining the control mode in CREAM. Reliability Engineering and System Safety , 191-199.
Lehto, M. (1991). Models of accident causation and their application: Review and reappraisal. Journal of Engineering and Technology Management, 173.
Leveson, N. G. (2004). A new accident model for engineering safer systems. Safety Science , 237-270.
Perrow, C. (1984). Normal Accidents: Living With High-Risk Technologies. New York: Basic Books.
Presidential Commission on the Space Shuttle Challenger Accident. (1986). Report of the Presidential Commission on the Space Shuttle Challenger Accident. Washington, D.C.: Presidential Commission on the Space Shuttle Challenger Accident.
Qureshi, Z. H. (2007). A review of accident modelling approaches for complex socio-technical systems. SCS '07 Proceedings of the twelfth Australian workshop on Safety critical systems and software and safety-related programmable systems (pp. 47-59). Darlinghurst: Australian Computer Society.
Rasmussen, J. (1997). Risk management in a dynamic society: a modelling problem. Safety Science, 183–213.
Reason, J. (1997). Managing the Risks of Organizational Accidents. Aldershot: Ashgate.
Serwy, R. D., Rantanen, E. M., & Hollnagel, E. (n.d.). How to do CREAM. Retrieved August 3, 2012, from Cognitive Reliability Error Analysis Method Web Demonstration Version 0.6: http://www.ews.uiuc.edu/~serwy/cream/v0.6.1/
Sklet, S. (2003). Comparison of some selected methods for accident investigation. Journal of hazardous materials , 29-37.
Smith, D. (2000). On a wing and a prayer? Exploring the human components of technological failure. Syst. Res. , 543–559.
Svedung, I., & Rasmussen, J. (2002). Graphic representation of accident scenarios: mapping system structure and the causation of accident. Safety Science, 397–417.
Svenson, O. (2001). Accident and Incident Analysis Based on the Accident Evolution and Barrier Function (AEB) Model. Cognition, Technology & Work, 42-52.
Svenson, O. (1991). The Accident Evolution and Barrier Function (AEB) Model Applied to Incident Analysis in the Processing Industries. Risk Analysis , 499–507.
The staffs of the U.S. Commodity Futures Trading Commission and the U.S. Securities and Exchange Commission. (2010). Findings Regarding the Market Events of May 6, 2010. Washington, D.C.: U.S. Commodity Futures Trading Commission and the U.S. Securities and Exchange Commission.
Øien, K. (2001). Risk indicators as a tool for risk control. Reliability Engineering & System Safety , 129–145.