Supervisor's Field Guide to Human Error Investigations
TRANSCRIPT
When things go wrong in organizations, one thing is almost always found in the post-mortem: "human error" (in various guises). But one only needs to scratch the surface of system failures to understand that things are not so straightforward. What seems to make sense as a causal catch-all for our everyday slips and blunders snaps when stretched; it fails to explain the context and complexity of our work and systems.

There is a better way. It is both humanistic and systemic; it treats people holistically and non-judgmentally, while considering system conditions and dynamics in context. If you are prepared to suspend your own preconceptions and reactions to failure, this book will repay you with a practical, highly readable, and deeply humane approach to dealing with failure.
So, are you repairing or merely fixing your incidents?
Real progress in safety lies in seeing the similarities between events, which may highlight particular patterns of breakdown.
Two Views of Error
• There are two views of error: the Old View and the New View
• The Old View:
– Human error is a cause of accidents; in the past, investigations would conclude that "the cause of the accident was the pilot's failure to…" (human error)
– To explain failure, investigations must seek failure
– They must find people's inaccurate assessments, wrong decisions, and bad judgments
The Bad Apple Theory
• A way of summarizing the Old View might be called the Bad Apple Theory:
– Complex systems (e.g., drilling rigs) would be fine were it not for the erratic behavior of some unreliable people (bad apples)
– Human errors cause accidents; humans are the dominant contributor to more than two thirds of them
– Failures come as unpleasant surprises; they do not belong in the system and are caused by the inherent unreliability of people
Terms
• Sharp end: that part of an organization where failures play out, such as in the maintenance shop
• Blunt end: that part of an organization that supports, drives, and shapes activities at the sharp end (scheduling department, crew training, personnel, etc.)
“Fixes” proposed by the old view
• Tighten procedures
• Introduce more technology
• Get rid of the bad apples- fire them or move them around
Why, then, is this view prevalent?
• It requires little effort and little thought; for example, it is easy to fire someone
• The illusion of omnipotence: the idea that people can simply choose between making or not making errors, independent of their environment
How to counter the bad apple theory
• Understand the concept of local rationality- that people usually perform their tasks in a manner that seems logical, reasonable, and rational at the time- they do not intend to fly into a mountain or over-run a runway
• The astute investigator needs to attempt to determine why erroneous actions made sense at the time.
Reactions to Failure
• It is difficult for those on the outside of an accident not to "react" to a failure: "How could they not have seen that?"
• We must understand that reactions are:
– Retrospective: we can look back and see the outcome
– Proximal: it is easy to focus on those who were closest to the event
– Counterfactual: it is easy to lay out in detail what should have been done differently, but knowing the outcome destroys our objectivity
– Judgmental: "They SHOULD have done…"
Retrospective
• Looking back you can:
– Know the outcome
– Know which cues were the critical cues
– Know what could have been done to prevent the occurrence
• Recognize that events look different as they unfold
Cause
• In any organization, after an accident, there are usually significant pressures to find "cause":
– People want to know how to avoid the same trouble
– People want to start investigating countermeasures
– People may seek retribution or justice
Two myths driving the causal search
• It is thought that there is always "the" cause; in reality, cause is something you construct, not something you find
• It is often thought that we can make a clean distinction between human cause and mechanical cause; in practice, the pathways are quite blurred
Human Error by any other name
• Often "human error" is given other names which are almost as useless:
– Loss of CRM (crew resource management)
– Loss of situational awareness
– Complacency
– Non-compliance
• These labels all identify the "what" but not the "why"
Beneath the labels
• Investigators need to understand what is behind the labels, for example:
– How perception shifts based on earlier assessments or future expectations
– The trade-offs people are forced to make between operational goals
– How people are forced to deal with complex, cumbersome technology
Error in the Head or World
• Where should investigators begin looking for the source of the error?
– In the head of the person committing the error?
– In the situation in which the person was working?
• If we start with the head, what good does that do?
Looking in the Environment
• If we look at the environment, connections are revealed:
– We see how the evolving situation changed people's behavior: it provided new evidence and new cues, updated people's understanding, and presented new difficulties
– This opens or closes pathways to recovery
Looking in the environment we can:
• Show how the situation changed over time
• Show how people's assessments and actions evolved in parallel with their changing situation
• Show how features of people's tools, tasks, and environment (both organizational and operational) influenced their assessments and actions inside that situation
Putting data in context
• Two concepts are important here: micro-matching and cherry-picking
– Micro-matching: placing people's actions against a world you now know to be true (an after-the-fact world with little relation to the actual world at the time)
– Cherry-picking: lumping selected bits of information together under one condition you have identified in hindsight
Cherry Picking
• Understand that there is a difference between data availability and data observability:
– Data availability: what can be shown to have been physically available somewhere in the situation
– Data observability: what would have been observable given the features of the interface and the multiple integrated tasks, goals, interests, knowledge, and even the culture of the people looking at it
The New View
• Human Error is a symptom of trouble deeper inside a system
• To explain failure, do not try to explain where people went wrong
• Instead, investigate how people’s assessments and actions would have made sense at the time, given the circumstances that surrounded them
New View
• Human error is not the cause, it is the effect or symptom of deeper trouble
• Human error is not random, it is systematically connected to features of people’s tools, tasks and operating environment
• Human error is not the conclusion of an investigation, it is the beginning
New View
• Safety is never the only goal in systems that people operate. Goals are multiple (schedules, economic, competition, etc.)
• Trade-offs between safety and other goals often must be made under uncertainty and ambiguity. People decide to “borrow” from the safety goal to accomplish these other goals
• Systems are not basically safe, people create safety by adapting under pressure and acting under uncertainty
New View- People
• People are vital to “negotiating” safety under these circumstances
• Under these conditions, human error should be expected
New View of Error
• Errors and failures should be treated as:
– A window on a problem which might happen again
– A red flag in the everyday operation of a system, and an opportunity to learn about the conditions which caused the failure potential
New View Recommendations
• Seldom focus on individuals- everyone is potentially vulnerable
• Do not focus on tightening procedures- individuals need discretion to deal with complex operations
• Do not get trapped in the promise of new technology (which will present new opportunities for error)
• Speak in systemic terms- organizational conditions, operational conditions, or technological features
Human Data
• The problem with the previous method lies in human memory:
– Memory is not like a tape which can be rewound
– Often it is impossible to separate actual events and cues which were observed from later inputs
– Human memory tends to order and structure events more than they actually were; we add plausibility to fill in gaps
Human Data
• Participants should be allowed to tell their story, with questions from the investigator such as:
– What were you seeing?
– What were you focusing on?
– What were you expecting to happen?
– What pressures were you experiencing?
– Were you making any operational trade-offs?
– Were you trained to deal with this situation?
– Were you reminded of any previous experience?
Reconstructing the Unfolding Mindset
• Lay out the sequence of events in time
• Divide the sequence of events into episodes
• Find the data you now know to have been available to people during each episode: was the right data available? Was it complete?
• Identify what was observed during the event and why it made sense (particularly harsh or salient cues will attract attention even if they are little understood at the time); this is the hard part
Patterns of Failure
• Technology: new technology does not eliminate human error, it changes it; attention slips from managing the process to managing the automation interface
• Automation relies on monitoring, something humans are not good at for infrequent events
• Many automated systems provide users with little of the feedback that would allow operators to detect discrepancies
Drift
• Accidents do not just occur; they are the result of an erosion of margins that went unnoticed, and less defended systems are more vulnerable
• Often the absence of adverse consequences of violations leads people down the wrong path ("the normalization of deviance"); to understand why, we need to understand the complexity behind the violation
• Recognize that safety is not a constant: what causes an accident today may not tomorrow
Writing Recommendations
• Can be “high end” (recommending the reallocation of resources) or “low end” (changing a procedure)
• The easier a recommendation can be sold, the less effective it will be- true solutions are seldom simple and are usually costly
• Recommendations should focus on change not “diagnosis”
Learning from Failure
• Use outside, "objective" auditors
• Avoid accepting errors as "just human"
• Avoid "setting an example" of individual failures; this just makes people avoid reporting errors
• Avoid compartmentalization: seek to find commonalities in failure
• Avoid passing the buck- safety is everyone’s problem
In Summary
• You cannot use the outcome of a sequence of events to assess the quality of the decisions and actions that led up to it
• Do not mix elements from your own reality now into the reality that surrounded people at the time; re-situate performance in the circumstances that brought it forth
Summary
• To understand human performance, you must understand how the situation unfolded around people at the time- try to understand how people’s actions made sense at the time
• Remember: the point of a human error investigation is to understand the "why," not to judge people for what they did not do.