
#43 July/August/September 2013 - www.skyguide.ch/en/company/vision-mission/safety/
editor: [email protected] - intranet: skyline/issue/safety/safety bulletin

Inside this issue

Editorial
by Monika Baumgarten, SRI

Share the experience
I instructed a go-around – Am I a bad ATCO?
Slow IFR departures
What makes CISM so invaluable?

Other “hot” stuff
Drift into failure
Human Factors Ladder

Information
OIR statistics
Safety prize

safety bulletin – editorial

One year as SRI

Exactly one year ago I took over the newly established position of Head Investigation Management (SRI). In the October 2012 edition of the Safety Bulletin you can find my first editorial, titled “Passion makes the difference”. In this new editorial I would like to share with you why I am still so passionate about my job and what has been achieved so far.

I started with the recruitment of additional ATCO investigators and their training: both the basic education with Eurocontrol and skyguide’s internal on-the-job training, with the great support of the RIT GVA and ZRH. Every unit now has at least one ATCO investigator, Pierre Oberson is the Safety Occurrences Expert for the technical part, and Roland Streule was hired as an external consultant for Human Factors.

Today the team consists of 9 people plus one consultant. Let me take the opportunity to emphasize that I am very proud of them. It is simply a pleasure to work with my team and to benefit from the diversity of their strengths and characters. A meaningful methodology (Safety Occurrence Analysis Methodology = SOAM), a new template for investigations and new internal processes are now in place in order to increase the added value of an investigation.

But what exactly is the goal of an occurrence investigation? Simply put, it is the improvement of the safety of a system. Put more precisely, and translated to “sharp end” realities, value is added for me when our findings, conclusions and especially safety recommendations are taken as valuable input and not as unnecessary information overload. After one year we are on the right track. Thanks to the commitment of OZ-S, Claudio Di Palma, who created a template for the so-called Management Response, every investigation is followed up by a written response from the management concerned. That means recommendations are no longer just statements on paper (and paper does not blush); they are converted into actions with information about who is responsible for doing what by when. In order to work more transparently, the responses are attached to every report that is released and made available via the safety page on our intranet.

For example, one report¹ covered three different but similar occurrences. This incredible investigation work was performed by Christian Crocoll (and Roland Streule for the Human Factors part). Please read more about this investigation, the recommendations, the management actions and the lessons learnt in the present edition of the safety bulletin. But I am aware that too many recommendations can potentially damage a steady system. Therefore management has the possibility to reject a recommendation. The commitment to discuss reports and to take the recommendations seriously was given not only by OPS management but also by T management, which is highly appreciated. It is now my task to establish formal work instructions for the different process steps, roles and responsibilities. You may now ask yourself how many reports have been written within the last year: 8 in total have been released and published, and 7 are in progress.

What has also been achieved was a fruitful trial run with SWISS. Together with their safety department we chose a non-critical separation minima infringement to establish a joint investigation. Basil Duppenthaler (ATCO investigator) was mandated with this demanding task. We wanted to find out whether we can improve the findings and recommendations by broadening the scope of an analysis. The answer is obvious: yes, we can! I will strive for more joint investigations in the future, but this needs a clear and agreed process, which I will discuss with the trade unions as well.

On a personal level I achieved an improvement in my time management and in leading and motivating people thanks to the basic management course carried out by a consulting company called SWISSNOVA.

It’s been a busy year and a lot has been achieved. But as in daily life, every step forward means you recognize more things that have to be done. Therefore I am very much looking forward to continuing my work as SRI.

With best regards,

Yours,

Monika Baumgarten
Head Investigation Management, SRI

1 Runway SMI GVA Summer 2012

safety bulletin  –  share the experience

I instructed a go-around – Am I a bad ATCO?

Instructing a “go-around” in order to gain more safety buffer generates significant follow-up tasks. The aircraft is still in the air and additional handling is required: abandoning the original plan, extra communication, new planning, more time. But time is money. More flight time means more expenses for the airline, more overall delay and most probably damage to skyguide’s reputation and public image, right? So do these undesired costs outweigh the gain in safety? Of course, this is a rhetorical question. But why is a go-around mostly an unfavoured option for an ATCO? Let’s have a look at some concrete examples.

The occurrences

In July and August 2012, three incidents with inadequate separation on concrete runway 23/05 occurred at Geneva International Airport. In each of those incidents the distance between departing and arriving aircraft fell below the minimum separation criteria.

The first occurrence took place on a late afternoon in July 2012 between a Swiss Avro 100 departing to Zürich and a KLM Boeing 737 landing in Geneva. The ATCO in charge of aerodrome control cleared the flight for immediate take-off on runway 05 after having received confirmation that the crew of the departing Avro was ready for departure. The KLM flight was approaching relatively fast due to tailwind, and the distance between the two aircraft constantly decreased until they reached the closest point at 1480 meters (minimum required runway separation: 2400 meters). Both aircraft crews received traffic information. KLM landed while Swiss was getting airborne.

The second occurrence took place just one week later on a late afternoon between a Royal Air Maroc Boeing 737 departing to Casablanca and an Alitalia Airbus 320 landing in Geneva. This time runway 23 was in use. RAM received a clearance to take off with no delay and was starting its take-off roll while Alitalia was on short final. The ATCO issued traffic information about the departing traffic ahead and instructed Alitalia to continue its approach. Assessing that the distance between the two aircraft was decreasing rapidly, the ATCO instructed Alitalia to go around. As the crew didn’t acknowledge this instruction, the ATCO instructed RAM to abort take-off. The RAM crew acknowledged and the ATCO again instructed Alitalia to go around. The crew of Alitalia acknowledged and initiated a go-around.

The third occurrence took place at the beginning of August at night between an Edelweiss Airbus 320 departing to Reykjavik and an Easy Jet Airbus 320 landing in Geneva on runway 23. Easy Jet was instructed to reduce to final approach speed to permit one departure. Edelweiss was instructed to line up and keep ready for a rapid departure. Easy Jet was still approaching fast when Edelweiss was cleared for immediate take-off and received traffic information on the incoming aircraft. The ATCO briefly considered stopping the departing aircraft but decided that it was safer to let Edelweiss depart. The minimum distance between the two aircraft reached 1639 meters (minimum required runway separation: 2400 meters).
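To put the two reported distances in relation to the required runway separation, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function and variable names are made up for this example, and only the figures in meters come from the occurrence descriptions (no closest distance is given for the second occurrence, so it is left out).

```python
# Minimum required runway separation quoted in the occurrence descriptions (meters).
REQUIRED_RUNWAY_SEPARATION_M = 2400

# Closest distances reported for the first and third occurrence (meters).
# The labels are illustrative; the second occurrence has no reported figure.
closest_distance_m = {
    "first occurrence (Swiss / KLM)": 1480,
    "third occurrence (Edelweiss / Easy Jet)": 1639,
}


def separation_shortfall(closest_m, required_m=REQUIRED_RUNWAY_SEPARATION_M):
    """Return (missing meters, fraction of the required separation achieved)."""
    return required_m - closest_m, closest_m / required_m


for name, closest in closest_distance_m.items():
    missing, share = separation_shortfall(closest)
    print(f"{name}: {missing} m short, {share:.0%} of the required separation")
```

In both cases roughly a third of the required distance was missing, which says nothing yet about why the situations developed the way they did.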

Same result, same reason?

At first sight, the occurrences had the same “result” and it looked as if there was a systematic shortcoming somewhere in the system. But there wasn’t! The analysis of the incidents showed that a cascade and the interplay of several relevant aspects led to the incidents, not just one single factor. In particular, the history and causative influences of several elements of the system were different, at least to some extent. As almost always, dangerous situations (and non-dangerous ones, of course) are the final outcome of interdependent factors in a dynamic system. What do you think were the main contributing factors to a go-around being instructed too late or not at all?

Safety investigation is detective work - Part one: What happened?

Finding out which factors contributed to the incident, and why, in order to deduce useful “lessons learnt” is a complex and time-consuming task. It is comparable to the work of a detective or a scientist. The required basis is detailed documentation of the chronology of the incidents: what happened when, and who was involved? Christian Crocoll, who was in charge of this step, gathered comprehensive data from SAMAX radar plots, audio recordings of radio communication and supplementary systems like RIMCAS. His data collection also covered the overall situation, such as the amount of traffic and weather conditions, as well as interviews with the people involved. Furthermore, official guidelines and relevant application procedures were documented.

Safety investigation is detective work - Part two: What were the contributing and contextual factors?

Now comes the even more demanding part. As scientists do when they want to explain an effect, the first step is to draw assumptions or, more precisely, to form hypotheses, followed by checking whether the available data support each hypothesis. For example, have there been technical malfunctions, or are the application procedures inappropriate?

After eliminating malfunctions on the technical side, the “human factor” remained. With hindsight, instructing an (early) go-around would have avoided the incidents. But why did the OJTI or ADC ATCO not do that? Every ATCO knows about this possibility and has certainly trained it or even instructed a go-around in practice, and still they did not do it. To be able to explain this, cognitive psychologist Roland Streule was asked to shed more light on the story. With the support of Christian Crocoll, he focused on the decisions the ATCOs had, or had not, taken.

Amongst the contextual factors that contributed to these occurrences, weather and more specifically changing wind conditions were involved in all three incidents. The location of Geneva airport, between the Jura Mountains and the Alps with Lake Geneva in the middle, is well known to be quite challenging for ATCOs and pilots: natural obstacles impose steep vertical profiles, and airspace complexity and traffic density create a quite stressful environment. These conditions are generally more penalising in the summer season (thunderstorms, wind shear, turbulence) and in the winter season (icing conditions, bad visibility and low clouds).

The fact that Geneva airport has one main concrete runway with a high traffic density means that every actor must contribute actively, and most of the time under time constraints, to achieve the best efficiency and to optimise flight operations to and from the airport. This implies that available margins are often reduced to the absolute minimum and that sometimes those margins even cease to exist. Pilots and ATCOs accept this fact and are aware that they can sometimes be pushed outside of their comfort zone, without necessarily compromising safety, just to be efficient and minimise delays. Crew compliance and rapid crew response are thus a must, but operating a complex system like an aircraft means that this rapidity cannot always be assured.

All cases showed that the ATCOs’ expectation that the crew of the departing aircraft would be more responsive was probably too high. However, the planning was correct, and in most similar cases the outcome is a success and runway separation is not infringed. Unfortunately, it is impossible to be absolutely certain that similar traffic situations involving a departure in front of an arriving aircraft will all unfold the same way, simply because far too many variable factors can influence the outcome.

In the second case (Royal Air Maroc/Alitalia) the initial situation was very similar to the other two occurrences (slow crew response, tailwind on final), but the outcome was different. The ATCO changed his plan: a go-around was imposed on the arrival and an aborted take-off on the departure. This decision was most probably triggered by the RIMCAS alert. Interestingly, RIMCAS alerts were present in all three occurrences, but only one incited the ATCO to change his initial plan. The consequences of the aborted take-off were quite significant though, because Royal Air Maroc needed technical assistance to cool down the brakes and departed with more than 90 minutes’ delay, while Alitalia had to fly a second approach and landed with 20 minutes’ delay.

Safety investigation is detective work - Part three: Why did it happen?

Roland Streule came to the conclusion that three main reasons most probably influenced the behaviour of the ATCOs.

Firstly, contextual conditions which affect the overall attitude of an ATCO: (social) pressure. Operating as many aircraft as possible in a given time period is a key factor of working efficiently (and economically). But a go-around usually causes follow-up tasks which come into conflict with this goal. ATCOs might fear consequences such as more complexity in the subsequent situation and a higher workload or, on the social or reputational level, “being a bad ATCO”. These negative consequences undoubtedly affect the attitude of ATCOs when they are assessing the situation and significantly influence their final decision. Most likely, ATCOs are not conscious of having this attitude. A go-around therefore normally has low priority in the behavioural repertoire, which tends to increase the risk of incidents. Furthermore, as described above, Geneva International Airport with its single concrete runway, its geographical surroundings with the Jura Mountains and its unstable weather conditions reduces the room for perfectly predictable manoeuvres, thus lowering the fault tolerance even more.

Secondly, individual conditions of human cognition: confirmation bias and cognitive inertia. In the interviews, ATCOs stated that they had the option of instructing a go-around in mind quite early, but still they did not choose this option. A cognitive “law” says that once humans have chosen a strategy to reach a certain goal, the cognitive flexibility to deviate from this plan is reduced, especially if we have a lot of routine. We are cognitively inflexible because we like to finish a task as planned or as we did it in the past. To reach this goal, we even exclude information which contradicts the original plan. There were obvious cues which clearly pointed to the higher risk of inadequate separation (such as unstable wind conditions, uncertainty about rapid crew response or compliance, the RIMCAS alert), but the ATCOs did not give these cues enough weight.

Thirdly, individual conditions of fast decision making: lack of experience. In one case, the ATCO experienced quite a lot of stress and was tired. A thunderstorm increased the subjective perception of the workload. The ATCO was also not so experienced, and a lack of routine generally makes the work more demanding. In contrast to the other cases, the ATCO was cognitively flexible, but due to the stress he was not able to assess the situation and options clearly and fast enough to make the best decision.

Where to take action?

In order to prevent future incidents, it is necessary to increase the fault tolerance of the system by adapting the system components, if possible. Clearly, it is not possible to blast away the Jura Mountains in order to have more space and stable wind conditions with no thunderstorms and difficult tailwind conditions. Also, it is almost impossible to reduce the number of aircraft overnight, at least as long as you don’t have an absolute monopoly in aviation. What remained in these cases was to take action on the human factor. Learning through explicit training that a go-around is a valuable and sometimes favourable option can change the attitude towards this safer option and lift it from the bottom of the priority ranking. And for beginners, of course, this will provide the necessary experience. Go-around has to become routine by training. The management fully supported these recommendations. And by doing so, the social pressure on ATCOs is automatically reduced to some extent.

Last but not least

Christian Crocoll and Roland Streule spent many hours analysing these cases. Quite a big effort for such a simple recommendation, you may ask? Yes, indeed it was, and sometimes the explanation of the incident and the lessons learnt are not as obvious as in these cases. But how can you know that in advance? Don’t underestimate the effect of such an analysis, either. The management has to take decisions about new rules, guidelines and actions based on the recommendations from the safety investigators. But they do not have the time to get into all the details of the everyday work of ATCOs and still have to take the “best” decisions. This is quite a heavy responsibility. Providing them with comprehensive and reasoned data, as Christian Crocoll and Roland Streule did, allows them to clearly see the usefulness of recommendations. And if they follow them, the ATCOs get the support they need to work safely and stress-free.

Christian Crocoll ATCO investigator, SRI

Roland Streule Dr. phil., Streule Consulting

Slow IFR departures

At SRI we are confronted with numerous reports of incidents and deviations from ATC clearances. Some have a greater impact, and an immediate investigation becomes necessary to find out about the background and possibly identify systemic problems which can be addressed. Many others have only a minor impact, which does not mean that they are neglected or not looked into. In some cases records are continuously updated to find evidence of drift or areas of repetitive problems. It is then decided to have a closer look and subsequently distribute recommendations to ATCOs, the O department, procedure experts, or others.

At more or less regular intervals we received reports of SID deviations by slow IFR flights such as Pipers, Cessnas or Cirrus. It soon became clear that there are numerous reasons for these, over which we as an ANSP have no influence: the pilots were insufficiently prepared or misinterpreted the SID; strong winds were ineffectively corrected and made the aircraft drift off the desired track; the pilot was distracted by communication, operation or other tasks.

But the problems for ATC began with the outcome of these deviations. As the following graphic shows, the deviation from the SID brought the slow and weak-performing IFR flight close and opposite to the inbounds of the landing runway.

In this particular incident, as in others, the transfer of radio contact from Tower (ADC) to Radar (DEP) was initiated shortly after departure, while the aircraft was only about to overfly the end of the runway. In this phase of flight it took the single pilot some time to switch frequencies and establish contact while being occupied with the operational needs of his aircraft.

This delay took valuable time away from the controllers to inform the pilot about his deviation, issue essential local traffic information or provide additional support. It should be added that detecting a deviation by radar takes more time than visual observation by the tower controller.

SRI identified a recommendation to keep the aircraft on the ADC frequency: no additional benefit is seen in transferring the aircraft immediately, whereas ADC can react quickly when it observes a deviation. It then had to be decided how to inform and sensitise the ATCOs concerned. Together with the O department it was agreed to use the distribution channel of the monthly OZTinfo for this matter.

The lesson learned: as ADC, maintain radio contact with slow IFR departures until reasonable assurance exists that the aircraft is following the prescribed departure procedure.

York Schreiber
ATCO investigator, SRI


What makes CISM so invaluable?


Critical Incident Stress Management (CISM) is a programme that is maintained not only at skyguide but by many other institutions (rescue services, transport companies and similar) which might be involved, directly or indirectly, in a major incident with catastrophic consequences or ramifications. In such situations, CISM is offered by specially trained personnel (so-called “CISM peers”) to their fellow employees, to help them cope with the kind of stress reactions that are frequently caused by incidents of this kind. In doing so, CISM can generally help the people involved in such incidents to return to their workplace sooner and prevent long-term post-traumatic complications, which not only saves the company money but also ensures that its business and operations can return more swiftly to a more “normal” state of affairs.

But what happens exactly when CISM is initiated? The following genuine example should cast a little more light on what CISM is and what kind of impact it can have. The account has been intentionally anonymized, because anonymity is always guaranteed in all CISM activities.

In June of this year, an incident occurred in which the separation minima were violated between two aircraft on final approach that were being managed and monitored by a skyguide controller. The controller concerned was relieved of his duties shortly after the incident and met up with a trained and qualified CISM peer. The Management Of Serious Incidents (MOSI) procedure was not activated.

C = controller, P = CISM peer

What exactly happened here?
C: Prevailing winds were from the east, and I turned an aircraft onto the ILS too soon. I’d been a bit off with my spacing on final all afternoon, and I wanted to offer a better service and get closer to the 3 NM we’re supposed to aim for. But my misjudgment here brought the aircraft too close to the preceding traffic.

When were you relieved?
C: Shortly after I’d made the mistake.

What happened next?
C: Well, this was the first time I’d violated separation minima, so I was a little confused: I didn’t really know what I had to do next. My colleagues helped me here, and did various things that needed to be done: call the supervisor, have me relieved, activate CISM. They gave me a lot of advice, too: to watch the live recording at the SMC, and secure the data for the OIR. Then I just waited till the CISM peer arrived.

How long did this take?
C: It was about 40 minutes after the incident that he arrived at the CIR.
P: Sometimes a CISM peer is already on duty. But they’re only allowed to help in a CISM function if they weren’t involved in the incident themselves: if they were, they’d be too affected to assist. It’s not always easy for the person who’s waiting for a CISM peer to understand why it can take a while for them to arrive. But the peer may need a peer, too: we can’t help ourselves!

In this particular case there wasn’t a peer in the vicinity, and I got a call from the supervisor saying that someone needed CISM assistance. And I set off at once.

How was the discussion with the peer?
C: He began by telling me how the discussion would be structured, and asked me if that was OK with me. Then we talked about what exactly had happened: not just the incident itself, but the whole context it occurred in: the way I’d been feeling that day, the weather conditions, the breaks I’d had, how I’d felt the day was going, things like that. After that we talked about my reactions and emotions at the time of the incident and after, and the best ways of dealing with these. My peer also gave me some advice on ways and means of alleviating the stress reaction: avoiding unhealthy behavior, taking exercise and so on.

What concerned you most about the incident?
C: I found it very difficult at first not to feel overwhelmed by all the possible consequences. What would the SAIB and the FOCA make of it? What would my colleagues think? What would my superiors say? Would I lose my licence completely?
P: To have these kinds of thoughts after an incident like this is a perfectly normal reaction of a normal person to an abnormal situation. We tend to be bombarded with thoughts and ideas, some of them completely wrong, and they spin around in our head and make it phenomenally difficult to concentrate and take effective decisions. It’s a kind of mental chaos. And that’s why it’s very important to relieve the person involved of their duties as soon as possible, to reduce the risk of another incident occurring.

How did you find the peer discussion, and how did it help you?
C: I found the discussion very helpful, because it gave me the chance to talk about what had happened and how I felt openly, privately and without any judgment being passed. And the advice my peer gave me really helped me come to terms with the stress I was feeling. Hearing the outside and more objective view of a colleague with a lot more experience than I have really helped me understand what had happened and its consequences.
P: The peer will always start off by establishing what actually happened and how the victim experienced it. This is not a question of finding out who did what incorrectly: it’s all about establishing the facts. Once that’s been done, we try to explain how the person involved might react on a physical and emotional level, and we offer them some advice on coping with these reactions. What we’re seeking to do in all this is bring structure and clarity to the chaos, and help the person to process what they’ve experienced.

When did you start feeling better again, and how long did it take to really get over this incident?
C: The peer and the supervisor agreed that I should take the rest of the day off and go home. But I was still a little apprehensive when I came back to work. How would my colleagues treat me? Would they have doubts now about the way I worked? I was very careful during work that day. And I’d say it was about a week before I felt reasonably OK again; but it took a full six weeks for me to regain my confidence.
P: As I said, it’s better to relieve the person involved of their duties for the rest of the day. An incident like this can unsettle you for quite a while, and it’s perfectly normal for it to stay on your mind for a few days. That’s actually part of the natural recovery process, which the peer helps to initiate and support.

When did you go back to work?
C: The next day.
P: It’s actually important to go back to work as soon as possible. It’s like getting back on a horse straightaway if you’ve fallen off. If you don’t, you may start accumulating too many negative thoughts and emotions, and never get back on again.

Would you use the services of a CISM peer again in a similar situation?
C: Absolutely! I’ve read a lot about stress management and dealing with problems of this kind. And we also had TRM modules as part of our training where the whole issue was addressed. But when you’re really confronted with a genuine critical situation, all the concepts and the theories tend to fly out the window. And at that moment, the support of a peer is truly invaluable. In my particular case, and with the help of my peer, I’m convinced that I’ve tangibly improved my working approach, my perception skills and my ability to deal with acute stress.
P: I’m personally delighted whenever I see someone accept our help and really benefit from it. Our CISM team plays a vital role within our company, and it’s greatly appreciated by our colleagues, too. That gives us a lot of satisfaction in what we do; and it’s why we’ll always be there to help whenever we’re needed.

As the above example shows, it is strongly advisable to talk to a CISM peer if you have been involved in a critical incident. We may know all about stress management and feel that we can handle things; but at the crucial moment it may still all be too much. Our thoughts and emotions can well up at times like these; and having someone to talk things through with and bring structure and order to the situation can be an invaluable help. That’s what our CISM peers do. And that’s why they, and our Critical Incident Stress Management, are so important for our company!

Lutz Löffler
CISM Peer

safety bulletin  –  other “hot” stuff

Drift into failure

Introduction

On January 31, 2000, Alaska Airlines Flight 261, an MD-83, crashed into open water close to Los Angeles airport. The investigation showed that a worn-out jackscrew-nut assembly led to a loose horizontal tailplane and consequently to an uncontrollable pitch axis, ending in an accident that killed all 88 people on board.

What initially looked like a simple technical problem developed into a much more critical issue during the investigation. In short: while the initially prescribed maintenance interval was every 300h (1966), the control and especially the lubricating intervals were extended step by step to about every 2550h (1996). In 1996, the lubrication was moved from the A-check interval to a so-called task card that specified lubrication every 8 months without any flight hour limit. As the change was introduced for the whole fleet at once, it is possible that the aircraft concerned was at the end of its 2550 flying hours when the new procedure was introduced, meaning that the interval might have been much longer than 2550 hours.

Between 300h and 2550h+ lies a long road towards failure. With hindsight it appears pretty questionable how things could have evolved like that. It is therefore essential to remind ourselves that each extension made local sense and was only an increment away from the previously established norm; no rules were bent or violated, no laws broken. It was just normal people doing normal work around seemingly normal technology.
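The size of that drift is easy to lose sight of when every single step looks small. A minimal sketch in Python, assuming nothing beyond the two interval values and the two years quoted above (the function name is hypothetical, and the per-year figure is only an average: the text says the extensions came in steps, not evenly):

```python
def interval_growth(initial_h, extended_h, start_year, end_year):
    """Return (growth factor, average extension in hours per year)."""
    factor = extended_h / initial_h
    avg_per_year = (extended_h - initial_h) / (end_year - start_year)
    return factor, avg_per_year


# 300h prescribed in 1966, about 2550h reached by 1996.
factor, avg_per_year = interval_growth(300, 2550, 1966, 1996)
print(f"interval grew by a factor of {factor}")
print(f"average extension: {avg_per_year} h per year")
```

Spread over three decades, an increase by a factor of 8.5 looks like a modest yearly change, which is one way to see how each individual extension could make local sense.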

Could not happen to us, could it?!

Too stupid! This should have been obvious to everybody! How could they be so ignorant as not to detect the problem! At least it cannot happen to us, as we are in a completely different business!...

Is our business really so different? Are we really immune to such events? Typical answers to these questions are, for example, “we are not in the same territory at all”, “that happened to them but could not happen to us”, “everything is fine because nothing happened so far” or “no need to worry, as this has nothing to do with drift into failure”. The Alaska Airlines management might have given these answers shortly before the accident, too. However, such statements are actually part of the problem itself, as they do not answer questions about a system in a worrying state but rather deflect or prevent answers to them.

Problems with denying a worrying system state

Any answer going in the directions mentioned above is very human and quite understandable – but not really viable in a high-risk environment. There, this way of dealing with problems leads to some serious safety-related issues:

1. The system is in a worrying state, otherwise the question would not have been raised.

2. The denial does not refer to the system but to the asking person.

3. The system remains the same without being officially noticed.

4. The reporting person might no longer report as it does not seem to be appreciated.

5. We are in the same territory, and it could happen to us.

A closer look based on an imaginary problem

Let us suppose a seemingly unproblematic safety issue is raised in a high reliability organization (HRO). What are the options to tackle it?

• Option 1: Denial – Best option to save time, effort and money. What about safety?

It is reasonable to think that prior to the Alaska Airlines accident the parties involved, such as the regulator, the airline’s management, McDonnell Douglas and others, would have denied, or did deny, a problem with the lubricating intervals.

What about our own business? As long as there are no major issues with a certain kind of problem, it is quite often considered safe enough to leave the situation as it is instead of bringing it one step ahead. No problems so far – so it is safe. However, the mere absence of unwanted outcomes does not tell us much about the level of safety.

Let us assume an airport/ANSP has several reported and non-reported approaches with 3.9 or 3.8 nm instead of the legal minimum of 4 nm because of ATCOs’ recurrent misjudgements on the final turn. As the deviation is only about 0.1 to 0.2 nm, the problem is not considered relevant, especially because safety would never be endangered as aircraft are following each other in line, which is not to be compared to crossing traffic at higher levels, for example. The airport/ANSP declares that it is regarded as a non-event and that it is not necessary for the controller ending up with 3.9 nm to file a report.

Reminder: These were deviations that were neither legal nor officially considered as safe at the time the problems occurred.

• Option 2: Change the rules – Best option to get rid of the particular problem. What about potentially adverse side effects?

At the time it was not only Alaska Airlines but rather everybody flying the MD80 who extended maintenance intervals for the jackscrew-nut assembly. Why should Alaska Airlines have refrained from it? Why should they not make the step-by-step increase in lubrication intervals everybody else made? Maybe they even detected the drift and declared it acceptable, which made sense given the circumstances at the time. (In this respect it has to be mentioned that new rules or adapted procedures may be acceptable, but not the drift itself.)

Let us go back to the reduction of the minimum separation on short final: What happens to the traffic load (allowing more traffic an hour than before)? What about taxiways (might be blocked more frequently due to shorter landing intervals)? If people are so good at achieving 3.9 nm, why are they not at achieving 4 nm? Why not go back to 4 nm instead of doing expensive, time-consuming assessments for 3.9 nm? What is the pressure and where is it coming from?

• Option 3: Reduce the safety buffer – Best option to get rid of the particular problem without spending too many resources, incl. covering regulatory aspects. But how do you measure it? Where is the line and who draws it?

For sure there were reasons for the initial 300h lubrication interval. After some years of operation it seemed possible and reasonable to extend the lubrication intervals as there were no immediate, subsequent problems directly related to lubrication. But how do you determine the next step? And how do you measure the degree of relation between lubrication and problems that arise but are not classified as lubrication issues?

After some more approaches with 3.9 nm the management considers the problem as big enough to react but small enough to not change anything going beyond rules and regulations. After some calculations and assessments the minimum separation is reduced to 3.5 nm initially. Everything works out fine and nobody worries about the initial problem anymore.

A few years later, the problem reoccurs with several approaches with 3.4 nm. Considering the way the problem was handled years before, it seems reasonable to reduce the minimum separation on the ILS to 3 nm. Another decade later people decide that 2.5 nm works out, as

– nobody remembers or cares about the initial 4 nm

– the respective steps of 0.5 nm made sense at the given time, under the given circumstances

– it is all same direction traffic

– traffic volume was generally increasing and – most tempting –

– nothing has happened so far.

• Option 4: Changing the way of thinking and/or working – Best option with regard to safety. Hardest option to achieve.

For several reasons Alaska Airlines did not go for option 4. Unfortunately as a result they had to accept the loss of 88 people, one aircraft, reputation and a huge amount of money for compensation. Other airlines, manufacturers and regulators could possibly learn from that accident and change their way of thinking and acting.


Although people reported separation minima infringements with 2.4 nm as well as safety concerns with the new regulation, the airport/ANSP has fortunately not yet suffered from a serious incident or accident in connection with the reduced minimum separation.

But should something happen, one of the first things to catch the eye would be the fact that over the years the separation on final had been reduced by 1.5 nm or 37.5% without any traceable “need”, and although the rest of the system (infrastructure, aircraft, pilots, controllers, lighting, etc.) did not change at all.
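The arithmetic behind that figure can be sketched as follows. This is a minimal illustration that uses only the separation minima from the thought experiment (4, 3.5, 3 and 2.5 nm); the function name `drift_summary` and the variable `minima_nm` are hypothetical names chosen for this example.

```python
# Separation minima from the thought experiment, in nautical miles.
minima_nm = [4.0, 3.5, 3.0, 2.5]


def drift_summary(values):
    """Return the relative reduction of each step and the cumulative
    reduction measured against the first (original) value."""
    baseline = values[0]
    steps = []
    for previous, current in zip(values, values[1:]):
        steps.append({
            "from": previous,
            "to": current,
            # Reduction relative to the norm valid just before the change.
            "step_pct": (previous - current) / previous * 100,
            # Reduction relative to the original norm.
            "cumulative_pct": (baseline - current) / baseline * 100,
        })
    return steps


for step in drift_summary(minima_nm):
    print(
        f"{step['from']} -> {step['to']} nm: "
        f"step {step['step_pct']:.1f}%, "
        f"cumulative {step['cumulative_pct']:.1f}%"
    )
```

Each individual step is a reduction of roughly 12 to 17% against the norm valid at the time, which is how it looked to the people deciding on it. Only the comparison with the original 4 nm shows the cumulative 37.5%.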

Conclusion

The figures of drift into failure are easy to draw with hindsight. They are fascinating to look at. The realities they represent, however, were not similarly visible to those inside the system at the time. For them, at the given time and under the given circumstances, everything worked perfectly fine, otherwise they would have acted and reacted differently. But drift is a fact, going beyond business boundaries, incrementally, hidden, silent and undetected or, even more worrying, denied: Something erodes to a dangerous state over time without any fundamental change, exposing the system to danger.

Actually the only difference between the Alaska Airlines case and the thought experiment about ATC is the degree to which we are affected. While in the airline’s case we said things like “Too stupid!”, “Should have been obvious to everybody!”, “How could they be so ignorant!”, “This cannot happen to us!”, in the ATC thought experiment we are wondering how things evolved and deploring that this could not have been foreseen as there were no signals at all!

Yes, there were. Maybe only weak ones, maybe not very feasible at that time, rather disturbing and resource-demanding instead of operationally and financially attractive, but there were...

That is why it is strongly recommended to listen to the weak signals, be happy and fortunate to get them, let people know you appreciate their reports (as only interested, dedicated but worried people report), tackle the reports and do not only get rid of the symptoms but solve the problems!

Reto Schorer, ATCO investigator, SRI


(References: S.W.A. Dekker, “Drift into Failure” (2011); K.E. Weick & K.M. Sutcliffe, “Managing the Unexpected” (2001/2007))


Where are we in the HF ladder?

Daily dilemma

Every day we at frontends are faced with questions: “How much and how to integrate Human Factors (HF) knowledge and activities into our ATM system design and development?”, or “Maybe it is not so obvious or systematic but we do already include lots of HF in the form of user requirements, end user participation or placing the human at the centre. Shouldn’t this be enough?”

When Hans and I put a similar question to the HF experts representing the 39 EUROCONTROL Member States who attend the Safety and Human Performance Subgroup (SHP-SG) meetings, the most evident conclusion we often receive is that a large variety of “how to” methods, guidelines and processes exists among the various ANSPs, and that there is no “one size fits all” answer to this.

So simply put: everybody does it differently, period.

Then how do we at skyguide know whether we are doing the right thing or progressing towards our vision of “Placing HF as a central role of the safe and efficient ATM service provision”?

Hans and his HF colleagues have been actively incorporating HF in their daily business at LVNL for almost 8 years. They are also confident that they are doing the right thing, but they would also like to know how and what others are doing, and to share both success stories and lessons learnt, to further improve their HF activities.

As a result, and using the opportunity given by the EUROCONTROL SHP-SG to create a compendium or short overview of common “HF in ATM systems”, Hans and I started advocating a “Quick and easy guidance: where we are in the HF ladder”. The purpose of such guidance is to provide a means to discover that there are more levels above you which can support your change process in a more positive way.

HF ladder in design and development as a quick self-check

So we have learned that various levels of HF integration exist in ATM system design and development. To draw a clear distinction between those different levels of HF maturity, Hans proposed a HF ladder, using a metaphor similar to Patrick Hudson’s ladder of Safety Culture¹.

At the bottom of the ladder is the lowest level of HF integration in an organisation. At the top of the ladder, of course, is the ideal integration of HF in systems, which is what we at skyguide aim to achieve in the future.

The ladder consists of 5 levels, from “pathological” to “generative”. The second column, right next to the ladder, describes the typical organisational reaction or activities where the maturity of HF activities is concerned. The ladder and this corresponding status should give organisations a clue as to where they stand at a particular moment. I have come to some assumptions about where we are by looking at this, but will let you judge on your own.

Once self-realisation takes place, the third and last column on the right will help guide what we can do to progress to the next level or ladder step. This list is just a key summary and is by no means complete, but it should provide some hints as to what major activities should be undertaken.

Within the ladder, where do known HF methodologies and tools fit?

Creating one standard overview of existing HF methods and tools is also not an easy task. At LVNL, besides using typical methodologies, tools, guidelines and standards known through the global HF community, they even developed their own in-house methodologies such as the “HF in ATM system design model based on a competence-based approach”² and “the ATCo Cognitive Process & Operational Situation Model (ACoPOS)”³ to link the operational change to what it does to the cognitive processes of the ATCO.

The HF awareness facilitator team for skyguide’s ongoing HF awareness training for change managers also emphasises several methodologies and tools as recommendations. For example, we have been testing whether the EUROCONTROL HF Case⁴ or SESAR’s HP Case⁵ could guide us better in planning more systematic HF activities in both short-track changes and projects. We also show various recommendations, standards, tools and methods to the training participants – with a word of caution: “There is a variety of guidance, standards, methodologies and tools available, and each has proven to be very helpful. BUT make sure that you use the right tool for the right job, and if you get confused, call known HF experts!” (You can find your domain-specific local HF leaders at our HF/HP intranet page / HF Local Leaders)

While admitting once again that there is no one size fits all for the HF overview or for which methods and tools are recommended at different levels of HF maturity, the majority of the HF expert community agree that the references below are the best so far and worth visiting, both as references and to get educated in the HF subject matter:

• EUROCONTROL Human Factors Integration in Future ATM Systems (HIFA) http://www.eurocontrol.int/hifa/public/subsite_homepage/homepage.html

• FAA Human Factors Work Bench http://www.hf.faa.gov/Portal/default.aspx

Where do we go from here?

Generally speaking, the higher we climb up the ladder, the higher the level of SME HF skill and know-how we will need to conduct more complex HF assessments and evaluations, along with everybody’s commitment and proactive HF inclusion.

Climbing higher on the ladder of course serves one goal: fulfilling the design principle “first time right”. Depending on the complexity of the change, numerous design decisions have to be taken during development. Taking a less than optimal or even the wrong design choice means redesign just before introduction or, even worse, repair after going live.

While we endeavour to be safer and more efficient in our changes and projects and to design our ATM system to better fit end users, including the right level of HF activities in our daily business should be a crucial part of reaching our goal. In addition, the EUROCONTROL white paper⁶ states that “The future of ATM will depend on how the industry handles a number of critical challenges concerning human performance (HP). And there are six key challenges”:

1. Designing the right technology

2. Selecting the right people

3. Organising the people into the right roles and responsibilities

4. Ensuring that the people have the right procedures and training

5. Managing HF process at a project and ANSP level

6. Managing the change and transition process

When we actually list those key challenges as reminders, I personally feel very motivated, all of a sudden, to advocate placing the right amount of our investment efforts in solving HP challenges, or climbing the HF ladder as high as we can, to help develop the right future ATM system.

What is YOUR opinion about the whole HF ladder and future HP challenges? I would love to hear your opinions! Please email [email protected] to share your ideas, opinions and feedback!

Keiko Moebus, SDE

In collaboration with Hans Huisman, Human Factors TD&M, ATC the Netherlands (LVNL)


1 “Implementing a safety culture in a major multi-national”, P. Hudson, Safety Science, volume 45, Issue 6, July 2007, pp 697-722

2 “Integration of HF in ATM System Design – a practical approach and experiences from ATC the Netherlands”, M. I. Roerdink, M. J. Schuver-van Blanken, H. Huisman, HF department, ATC the Netherlands (LVNL), paper presented at the 29th EAAP Conference 2010

3 “The ATCo Cognitive Process & Operational Situation Model – a model for analysing cognitive complexity in ATC”, M.J. Schuver-van Blanken, H. Huisman, M.L. Roerdink, HF department, ATC the Netherlands (LVNL), paper presented at the 29th EAAP Conference 2010

4 “Support Material for Human Factors Case application”, EUROCONTROL, 2011

5 “HP assessment process for projects in V2 (Feasibility) – Guidance for primary projects”, SESAR Project 16.04.01, 2011

6 “Human Performance in Air Traffic Management Safety – a white paper”, EUROCONTROL/FAA, 2010

safety bulletin  –  information

In this section of the bulletin, you will find, as usual, the statistics on the occurrences for the past 3 months.

Incidents are dealt with internally and ATIRs are handed over to FOCA and SAIB. Each individual occurrence receives an action which can be viewed for internal use only in the OIR/SIR monthly publication.

To display those lists, please follow this link: http://skydoc.skyguide.corp/cs.exe/open/8171557. As a reminder, all OIR are confidential and meant for internal use only.

OIR statistics


416 occurrences have been reported for the period July-September 2013:

ATIR: 63%
Incident: 37%

264 ATIRs have been filed for the period July-September 2013:

Procedure: 67%
ACAS: 9%
SMI: 16%
Accident: 1%
IS: 2%
Airprox: 1%
Facility: 2%

182 procedure occurrences have been reported to FOCA and SAIB:

A/C deviation from ATC clearance: 18%
Level bust: 6%
Airspace infringement: 44%
PLOC: 8%
A/C deviation from ATM regulation: 13%
RWY incursion: 11%
A/C deviation from ATM related equipment carriage: 1%

Safety Prize

As usual in this last part of the bulletin, you will find the latest safety prize winners’ ideas.

In May the management decided not to award a safety prize because the reports received during the month were considered not to contribute to safety improvement in the way seen in previous months. The management therefore preferred not to award any prize rather than “giving a prize at any price”!

The June safety improvement idea came from a controller at GVA ACC INI (working with grouped sectors south and east at the time of the incident reported via OIR). While MIL was active on south G5, the controller had to deal with an IFR flight wishing to leave IFR at VADAR and slow IFR traffic outbound BRN. The aircraft wanting to leave IFR was not clear about its intentions, which resulted in a high frequency workload in order to understand the pilot and best accommodate him. While the aircraft concerned did not want to descend below FL190, the incident slowly unfolded as the controller turned the aircraft to the right – whereas his mental plan to separate both aircraft was clearly to turn to the other side. The situation was detected and the appropriate correction was made. The STCA worked as it should and the situation was resolved without any loss of separation. In his report, the controller explained the case well and was keen to share lots of information with the management without hiding his own actions in this. The quick reaction to solve the conflict and the honesty in this report are the reasons why the management elected this report for the prize.

Current status: OGC is currently studying the possibility of isolating the INE monitoring value when this sector is grouped with either INN or INS, in order to be able to apply traffic regulation on INE even if traffic is below capacity at a global level.

July’s safety prize has been awarded for general concerns about service orders, which are in the opinion of the submitter a very weak link in the safety chain. SO OG 2013-030E, dated 20 June 2013, has been chosen as an example; it contains important and critical changes, namely the new release boxes between ZRH & GVA ACCs. It was noticed that many ATCOs were not fully aware of the change, despite the information being available. It is not new that service orders distributed via e-briefing are not easily digestible for ATCOs; what is new, however, is the constructive proposal to improve the situation: it was proposed that each ATCO receives a formal personal briefing from the acting supervisor when critical or important changes are foreseen or already in place. A SIR has been made out of this proposal to ensure proper tracking.

Current status: Currently in progress under OO to bring this issue to a national level.

August’s safety prize goes to a ZRH ACC ATCO, who submitted a report that addresses a new routing issue. Transit flights from Padova now have RESIA as a possible entry point, instead of the usual SUXAN. According to FPL data the routing is RESIA Z50 SOPER UN851 ELMUR, which triggers an unexpected right turn of 60 degrees over SOPER. The submitter questions the fact that transit flights are accepted via RESIA, especially as skyguide was unaware of Padova’s plans to send aircraft via this routing.

Current status: Currently under analysis by OZC.

REMINDER: How to submit your safety improvement idea

Go to the intranet “Safety” homepage and use one of the two well-known existing reporting channels (links at the top of the right side of the page):

• The OIR (Operational Incident Report), normally used by ATCOs after a reportable incident. OIRs are, of course, a requirement under certain circumstances, and nothing in the Safety Prize changes this requirement.

• The SIR (Safety Improvement Report), which is meant for anything else that you think is reportable with regard to safety improvement, and may be used by any skyguide employee.

A third method, which should only be used when one of the two methods above is inappropriate, is to:

Send an e-mail to [email protected]. (If a suggestion submitted via this e-mail is considered as an eligible SIR, it will be integrated into the SIR process to ensure follow-up.)