Ryan Jurgensen RJJ485
Debugging and Verification of Avionics Software
Abstract
Onboard flight computers and avionics software are directly responsible for
hundreds of human lives on every flight. People experience software bugs on a daily
basis - a smartphone application that won't update, a browser that doesn't render
correctly - and dismiss these events as routine and mostly inconsequential. In contrast, a
software bug in a flight computer can kill hundreds of people. Despite some spectacular
blemishes, aviation software has a mostly clean reputation and has never resulted in a
human death. However, further work is needed by avionics engineers to make their
software more usable by pilots under the stress of an aviation emergency. This paper
examines the current state of aviation software correctness and how it is being improved
to keep travelers safe.
Introduction
Any routine air traveler has heard the introductory safety briefing performed by
the flight attendants enough times to summarize it as "when the aircraft levels out, you
can turn the iPad on and take the seatbelt off". Most of the 303 passengers of a Qantas
Airbus A330 in 2008 did exactly this. Their experience with in-flight turbulence was
likely limited to minimal vertical motion and minor vibration. On this flight, however,
the main flight control computer failed, and the aircraft executed violent, uncommanded
dives. 119 passengers who were not wearing their seatbelts were thrown headfirst into
the ceiling as the plane dived, resulting in several critical injuries. Five minutes later, the
flight computer failed again, resulting in the same type of nose dive. The flight
eventually made a safe emergency landing, followed by a three-year investigation into
why and how the flight computer failed. An edge-case software bug within the flight
computer was found to be the cause. Had the plane made the dive at 500 feet rather than
35,000 feet, hundreds would have been killed. Qantas flight 72 is a reminder that
travelers entrust their lives to flight computer software on every flight and that software
engineers have the monumental task of ensuring the predictable operation of these
systems.
Computer Assisted Flight
Starting in the late 1970s, aircraft became more dependent on onboard computers. In
newer aircraft, every decision and action the pilot makes is fed into the flight computer,
which then moves the appropriate parts of the plane, such as the elevators or rudder.
Prior to "fly-by-wire" systems, moving a control device in the cockpit would produce a
mechanical movement (either cable or hydraulic) that would physically move the aircraft
control surfaces, like the braking system in an older car without ABS. Fly-by-wire
systems have disconnected the pilot from direct access to the plane, and sometimes the
flight computer may adjust the pilot's commands or ignore them completely if it believes
them to be unsafe. The Airbus A320, which entered service in 1988, was the first
commercial aircraft to be completely fly-by-wire, and the General Dynamics F-16, in
1978, was the first fly-by-wire production military aircraft.
The benefits of computer-assisted flight are enormous. In commercial airliners,
software logic can be written to prevent the pilot from making mistakes, such as pushing
the aircraft outside its operational limits. The benefits in military aircraft are even greater,
because flight computers can make microsecond adjustments to the flight surfaces to
attain maneuverability that could not be achieved traditionally. Many contemporary
high-performance military aircraft are inherently unstable and could not be flown by a
human directly. The F-117 and B-2 stealth aircraft are extreme examples of aircraft
whose shapes are so unairworthy that continuous flight computer micro-corrections are
the only way they can fly.
Avionics Programming
Computer-assisted flight was created during a time when the Ada programming
language dominated Department of Defense programming projects. Because Ada was
designed for real-time and embedded applications, significant effort went into the
verification of Ada compilers to strive for predictability in mission-critical applications.
Ada's reign in avionics continues into the 21st century, where flight computers are still
written in Ada for its predictability. The newest military aircraft for the United States,
the Joint Strike Fighter, was the first high-profile departure from Ada: a design decision
was made to use C++ for a significant part of the onboard flight computers. A 300-page
design document, created with the direct involvement of the language's creator, Bjarne
Stroustrup, established a C++ coding standard for real-time flight systems.
Failures
The part of the A330 flight computer that failed in 2008 was an Air Data Inertial
Reference Unit (ADIRU) made by Northrop Grumman. The ADIRU is responsible for
collecting airspeed, angle-of-attack, and altitude information from mechanical sensors
and communicating this data to the rest of the flight computer system. The A330 was
equipped with three ADIRUs for redundancy and reliability, each collecting its own data
and sending it to the flight control system. Normally, all three ADIRUs collect
information and send it to the primary flight computer. If two of the ADIRUs report that
something is wrong, such as an airspeed that may indicate an aerodynamic stall, the
flight computer adjusts accordingly. If the airspeed is too low, the flight computer
pitches the nose downward to gain airspeed and recover from the stall. If the ADIRUs
reported values that varied significantly, the flight computer algorithm would ignore their
inputs for 1.2 seconds, treating the extreme values as outliers. On Qantas flight 72,
two ADIRUs reported extreme data for the aircraft's angle of attack (roughly, how far
above or below the oncoming airflow the nose is pointing), and the flight computer
disregarded their input for 1.2 seconds. Exactly 1.2 seconds later, the flight computer
checked the ADIRUs again and still found extreme data. This is where the failure
occurred - no software engineer had planned for the situation where the ADIRUs would
still be incorrect after the 1.2-second waiting period. The primary flight computer
entered an unpredictable state and erroneously concluded that the aircraft was pointed
dangerously upwards, even though it was actually flying completely level and stable at
35,000 feet. The primary flight computer made an autonomous decision to pitch the nose
down to correct the erroneous angle. This happened twice during the flight, producing
dives of 650 and 400 feet within seconds and throwing passengers upwards in the cabin.
Although the software engineers claim that the ADIRUs were thoroughly tested, they had
not anticipated the frequent spikes of erroneous data. How could the engineers have
predicted and accounted for such a complex - both logically and temporally - series of
failures?
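The outlier-rejection logic described in the investigation can be sketched in a few lines of C. This is a minimal illustration, not Northrop Grumman's code: the function name, the state structure, and the spike threshold are invented, and only the 1.2-second window comes from the report. The comment marks the design decision the report faulted - what to do when data is still extreme after the window expires.

```c
#include <math.h>

#define AOA_SPIKE_THRESHOLD 5.0   /* degrees; illustrative value only     */
#define HOLD_SECONDS        1.2   /* rejection window from the ATSB report */

/* State for one angle-of-attack input channel (hypothetical layout). */
typedef struct {
    double last_good;   /* last value accepted as plausible            */
    double hold_until;  /* time until which new readings are ignored   */
} aoa_filter_t;

/* Return the angle-of-attack value the flight computer should use at
 * time `now`, given a fresh reading `raw`. A spike is ignored and the
 * last good value held for HOLD_SECONDS. Crucially, once the window
 * expires the reading is re-validated rather than accepted blindly --
 * the gap identified in the Qantas 72 investigation was the absence
 * of a plan for data that was still extreme after the window. */
double aoa_filter_step(aoa_filter_t *f, double raw, double now) {
    if (now < f->hold_until)
        return f->last_good;                 /* still inside the window  */
    if (fabs(raw - f->last_good) > AOA_SPIKE_THRESHOLD) {
        f->hold_until = now + HOLD_SECONDS;  /* treat as an outlier      */
        return f->last_good;                 /* re-check on next sample  */
    }
    f->last_good = raw;                      /* plausible: accept        */
    return raw;
}
```

Even in this toy form, the hard part is visible: the correctness of the filter depends on the *temporal* sequence of readings, which is exactly the kind of behavior that point-in-time unit tests struggle to cover.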
The Avionics Standard - DO-178B
RTCA, a US non-profit organization focused on airworthy software, created specification
DO-178B in 1992 to document best practices for developing avionics software. DO-
178B was adopted by the FAA in 1993 and remains the de facto, but optional, guideline
for avionics development. DO-178B outlines the entire software development process for
avionics, starting with recommendations for specification documents, and covering best
practices within source code, verification procedures, configuration management, and QA.
DO-178B classifies modules and makes recommendations based on the failure condition
of each module. The more severe the failure condition, the more strictly a module will
be tested and the higher the standard applied.
• Catastrophic (Level A) - Failure may cause a crash. Error or loss of critical
function required to safely fly and land the aircraft. The level has 66 requirements
that must be met.
• Hazardous (Level B) - Failure has a large negative impact on safety or
performance, or reduces the ability of the crew to operate the aircraft due to
physical distress or a higher workload, or causes serious or fatal injuries among
the passengers. The level has 65 requirements that must be met.
• Major (Level C) - Failure is significant, but has a lesser impact than a Hazardous
failure (for example, leads to passenger discomfort rather than injuries) or
significantly increases crew workload. The level has 57 requirements that must be
met.
• Minor (Level D) - Failure is noticeable, but has a lesser impact than a Major
failure (for example, causing passenger inconvenience or a routine flight plan
change). The level has 28 requirements that must be met.
• No Effect (Level E) - Failure has no impact on safety, aircraft operation, or crew
workload. The level has 0 requirements that must be met.
As an example of the design levels, a primary flight computer (level A) will be held to
higher standards and scrutiny than an in-flight entertainment system (level E). A level A
system that complies with DO-178B will conform to the best practices of design, coding,
testing, documentation, QA, and configuration management. Twenty years after the
release of DO-178B, the new specification, DO-178C, was released in 2012 and is
currently being evaluated by the FAA as the next guideline document for the avionics
industry. The improvements and additions in DO-178C indicate the progress that has
occurred in avionics programming over the last 20 years – the new specification
addresses issues such as model-based development and verification, object-oriented
programming, and formal methods.
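The five design assurance levels above reduce to a simple lookup from failure condition to required objectives. The C below is only an illustration of that mapping (the type and function names are invented); the objective counts are the ones listed above.

```c
/* DO-178B design assurance levels and the number of objectives each
 * level must satisfy, per the list above. Names are illustrative. */
typedef enum { LEVEL_A, LEVEL_B, LEVEL_C, LEVEL_D, LEVEL_E } dal_t;

static const int dal_objectives[] = {
    [LEVEL_A] = 66,  /* Catastrophic: failure may cause a crash       */
    [LEVEL_B] = 65,  /* Hazardous:    large negative safety impact    */
    [LEVEL_C] = 57,  /* Major:        significant but lesser impact   */
    [LEVEL_D] = 28,  /* Minor:        noticeable, limited impact      */
    [LEVEL_E] = 0,   /* No Effect:    no impact on safety             */
};

/* A module's level is set by its worst credible failure condition;
 * a stricter level never has fewer objectives than a laxer one. */
int objectives_for(dal_t level) { return dal_objectives[level]; }
```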
Current Formal Methods of Avionics Verification
Hervé Delseny, head of software process definition at Airbus, presented in 2010
the suite of formal methods that the aircraft manufacturer uses to verify its avionics.
The verification process at Airbus is divided into five primary components:
• Verifying the worst-case execution time (WCET)
• Verifying stack consumption
• Verifying the precision of floating-point computations
• Verifying the absence of run-time errors
• Unit proofs (analogous to unit tests) of functional properties
For the Airbus A380, the largest commercial aircraft flying today, the following
process was followed to verify the avionics components. First, the software tool aiT was
used to verify WCET. According to the creators of aiT, “aiT WCET Analyzers provide
the solution to these problems [real time WCET assurance]: they statically analyze a
task’s intrinsic cache and pipeline behavior based on formal cache and pipeline models.
This enables correct and tight upper bounds to be computed for the worst-case execution
time.”
Airbus engineers then run all avionics through the StackAnalyzer tool, produced by
the same company as aiT. This tool reconstructs all control paths directly from the binary
code, computing worst-case stack usage to rule out run-time errors due to stack
overflow. After stack analysis, Airbus software is processed by the Fluctuat tool, which
verifies the precision of the floating-point computations performed by the flight
computer. Initially developed at the French nuclear research center, this tool analyzes
the avionics code and verifies that the results of its floating-point computations match
known values and that rounding errors stay within an acceptable margin. The Astrée
software for proving the absence of run-time errors then processes the Airbus avionics.
According to Astrée's developers and researchers, their software can prove C code is
free of:
• Division by zero
• Out of bounds array indexing
• Erroneous pointer manipulation and dereferencing
• Arithmetic overflows
• Assertion violations
• Unreachable code
• Read access of uninitialized variables
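To make the error classes above concrete, here is a small C function written so that each class is provably absent. This is an invented example, not Airbus code; it only illustrates the kind of property a sound analyzer such as Astrée must establish for every execution path.

```c
#include <stddef.h>

/* Illustrative only: each guard below eliminates one class of
 * run-time error from the list above, for every possible input. */
int safe_average(const int *values, size_t n) {
    if (values == NULL || n == 0)        /* no null dereference,      */
        return 0;                        /* no division by zero       */
    long long sum = 0;                   /* wider accumulator: the    */
    for (size_t i = 0; i < n; i++)       /* sum cannot overflow int   */
        sum += values[i];                /* i < n: stays in bounds    */
    return (int)(sum / (long long)n);    /* n > 0 here: safe divide   */
}
```

The point of a tool like Astrée is that it proves these properties statically, for all inputs, rather than observing them on a finite set of test vectors.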
According to AbsInt, the software company that builds Astrée: "In November 2003,
Astrée proved the absence of any run-time errors in the primary flight-control software
of one of Airbus' models. The analysis was performed completely automatically. The
system's 132,000 lines of C code were analyzed in only 80 minutes on a 2.8GHz 32-bit
PC using 300MB of memory (and in only 50 minutes on an AMD Athlon 64 using
580MB of memory)."
Finally, Airbus avionics are supplemented with a suite of unit proofs. Delseny
stated in his 2010 presentation that unit proofs are used in the development process at
Airbus, but did not mention how they compare or interact with Airbus's unit testing
procedures. Unit proofs are carried out by the software tool Caveat (also developed at the
French nuclear research center) and are based on weakest-precondition rules applied to
annotated C code. This completes the five-stage software verification suite at Airbus.
Delseny claims that this process has proved more cost-effective and produced higher-
quality avionics software compared to traditional testing processes.
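A unit proof differs from a unit test in that a contract is stated over the code and a prover shows every execution satisfies it, rather than checking a handful of sample inputs. The annotation style below is modeled on ACSL (the language used by Frama-C, in the same tool lineage as Caveat); the function and its contract are invented for illustration, not taken from Airbus code.

```c
/* Weakest-precondition style contract (ACSL-like, illustrative):
 * given the precondition, a prover shows the postcondition holds
 * on every path through the function -- no test vectors involved. */

/*@ requires min <= max;
  @ ensures min <= \result && \result <= max;
  @*/
int clamp(int x, int min, int max) {
    if (x < min) return min;   /* path 1: result is min            */
    if (x > max) return max;   /* path 2: result is max            */
    return x;                  /* path 3: min <= x <= max already  */
}
```

Because the annotations live in comments, the code still compiles with an ordinary C compiler; the proof tool consumes the same source.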
Model Checking for Avionics
Traditional software development uses tests and inspections to prevent software
errors. For the Airbus A330 ADIRU programming, the engineers admitted that no test or
inspection they performed found the bug that caused the Qantas incident. Efforts to
introduce model checking into avionics software have started, and initial results are
promising. In an experiment by researcher Darren Cofer, a flight computer was examined
using both traditional software development techniques and a model-checking-based
approach: "Analysis of an early specification of the mode logic found 26 errors.
Seventeen of these were found by the model checker. Of these 17 errors, 13 were
classified by the FCS 5000 engineers as being unlikely to have been found by traditional
techniques."
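What a model checker does differently can be shown with a deliberately tiny piece of mode logic. The mode machine below is invented for illustration (it is not the FCS 5000 logic); the point is that the checker visits *every* state and input combination and verifies an invariant in each, instead of sampling inputs as a test suite would.

```c
#include <stdbool.h>

/* Invented three-mode logic for illustration. */
typedef enum { FC_MANUAL, FC_AUTOPILOT, FC_FLARE } fc_mode_t;

fc_mode_t next_mode(fc_mode_t m, bool ap_engaged, bool below_50ft) {
    (void)m;                               /* next mode here depends   */
    if (!ap_engaged) return FC_MANUAL;     /* only on the inputs       */
    if (below_50ft)  return FC_FLARE;
    return FC_AUTOPILOT;
}

/* Exhaustively explore all (mode, input) pairs and check the
 * invariant "flare mode is never entered with the autopilot off".
 * Real model checkers do this symbolically, over state spaces far
 * too large to enumerate by hand. */
bool check_mode_logic(void) {
    for (int m = FC_MANUAL; m <= FC_FLARE; m++)
        for (int ap = 0; ap <= 1; ap++)
            for (int low = 0; low <= 1; low++) {
                fc_mode_t next = next_mode((fc_mode_t)m, ap, low);
                if (next == FC_FLARE && !ap)
                    return false;          /* invariant violated */
            }
    return true;
}
```

An edge case like the ADIRU's post-1.2-second behavior is exactly the kind of state a human test author fails to imagine but an exhaustive exploration cannot skip.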
User Error
According to the National Transportation Safety Board, 85% of aviation accidents
in the last 20 years have been due to pilot error. Conversely, avionics software has never
resulted in a fatality. One way that avionics software can help reduce the number of
fatalities due to pilot error is to improve the user interface (UI) of aircraft cockpits. Flight
systems display an incredible amount of information to the pilot every second, and in an
emergency the pilot is likely acting on instinct, with reduced decision-making ability.
The problem is a highly complex computer system trying to tell everything about its
state to a pilot who is beyond mental capacity. The solution is to create user interfaces
that present the status of the system as clearly and intuitively as possible.
The crash of Air France 447 (AF447) in 2009 is the starkest example of the
changes that need to happen in cockpit UI design. When AF447 disappeared over the
mid-Atlantic in the middle of the night, en route from Brazil to Paris, accident
investigators attributed the crash to bad weather that iced over the airspeed sensors and
caused a subsequent instrument failure. The cockpit data recorder was miraculously
recovered from the bottom of the ocean in 2011 and explained in detail what really
happened on AF447.
Cockpit data and voice recordings showed that the A330 encountered severe storms at
2am. The external airspeed sensor became covered in a layer of ice and stopped
transmitting airspeed data, just as investigators had predicted. As soon as the flight
computer stopped receiving airspeed data, two things happened: the autopilot
disengaged, and the flight computer entered a reduced-functionality mode called alternate
law. When the flight computer is operating in alternate law, the pilots' inputs are
processed less by the computer before being sent to the control surfaces. As a result, a
pilot input that would otherwise be rejected by the flight computer is now accepted,
because the flight computer does not have the data it needs (airspeed) to make those
decisions. The pilots were now flying the aircraft more directly than they were used to;
this was the first user interface failure on this flight.
In a human error that aviation experts have been unable to explain, the rookie pilot,
ten minutes after entering the storm system, started pulling back on the side stick, raising
the nose of the plane into the air. At this point, the aircraft was completely flyable and
could easily have exited the storm system safely. Despite this, the rookie pilot continued
to irrationally pull back on the stick until the plane was pitched 18 degrees upwards and
eventually stalled. The aircraft had been pushed beyond its limits and the wings were no
longer generating lift. The aircraft was now at 37,000 feet and falling like a rock towards
the ocean at 10,000 feet per minute. If at any time the rookie pilot had let go of the stick,
the plane would have leveled out and the flight would have been saved.
During the four-minute free fall, the three pilots in the cockpit tried to figure out
why the plane was falling towards the ocean. The two other pilots did not notice that the
rookie pilot was pulling the stick all the way back. This was the second user interface
failure on this flight – there was no feedback to the left-seat pilot about what the right-
seat pilot was doing. Seconds before the aircraft hit the water, one of the other pilots
noticed that the rookie pilot's hand was holding the stick all the way back and told him to
stop. The rookie pilot responded, "But what's happening?", 4 seconds before the aircraft
hit the water.
Entering alternate law was the first UI failure in the Airbus A330. There was no
aural warning to the pilots that they were no longer being checked by the flight computer
– only small labels on two of their many displays indicated that the flight computer was
in alternate law. If the pilots did not know the aircraft was in alternate law, they would
assume that the flight computer was still preventing them from stalling the aircraft.
The second UI failure was the failure to communicate inputs between the pilots.
The left-seat pilot did not know that the right-seat pilot was pulling the plane upwards. In
a Boeing aircraft (as opposed to AF447's Airbus), the two control sticks the pilots use are
physically connected to each other. Had AF447 been a Boeing aircraft, the left-seat pilot
would have felt the right-seat pilot pulling up on the stick and could have acted
accordingly. The primary error in AF447 would not have happened in a Boeing aircraft.
Airbus could have resolved this UI problem either by introducing simulated resistance on
both pilots' side sticks to demonstrate tactilely what the other pilot is doing, or by
presenting a visual or auditory alert when the two pilots enter different flight commands
on their sticks. The AF447 crash could have been prevented had the Airbus UI clearly
conveyed the discrepancy in flight inputs. The clean track record of avionics software
excellence needs to extend to cockpit ergonomics and user interfaces in order to cut
down on the primary cause of most plane crashes – human error.
Conclusion
No flight computer error has caused a crash that directly resulted in the loss of
human life, although there have been some perilously close calls. The debugging and
verification of the avionics systems that travelers trust with their lives is a massively
important problem. Avionics manufacturers currently use entire suites of formal methods
to analyze source and machine code as thoroughly as possible. Although the field is still
being evaluated, model checking is being tested and initial results in catching
unpredictable behavior are promising. However, the crash of Air France 447 indicates
that much work is still needed in communicating system status to the pilots, who are
often the cause of fatal accidents.
Works Cited
"A380 Family." A380-800. Airbus, 30 Mar. 2012. Web. 17 Apr. 2012.
<http://www.airbus.com/aircraftfamilies/passengeraircraft/a380family/>.
"AbsInt Angewandte Informatik GmbH, Saarbrücken." AbsInt: Analysis Tools for
Embedded Systems. Web. 17 Apr. 2012. <http://www.absint.com/>.
Lockheed Martin. Joint Strike Fighter Air Vehicle C++ Coding Standards. Rep.
no. 2RDU00001 Rev C. 2005. Print.
Australia. Australian Transport Safety Bureau. Qantas Airbus A330 Accident Media
Conference. 2008. Print.
Cofer, D., M. Whalen, and S. Miller. "Software Model Checking for Avionics Systems."
Digital Avionics Systems Conference IEEE/AIAA 27th (2008): 5-1--8. Print.
Delseny, Hervé. "Formal Method for Avionics Software Verification." Speech.
France. Ministère De L’écologie, Du Développement Durable, Des Transports Et Du
Logement. Bureau D’Enquêtes Et D’Analyses Pour La Sécurité De L’aviation
Civile. Interim Report N°3 On the Accident on 1st June 2009 to the Airbus A330-
203 Registered F-GZCP Operated by Air France Flight AF 447 Rio De Janeiro -
Paris. Print.
Frawley, Gerard. The International Directory of Military Aircraft 2002/03. Fishwick,
ACT: Aerospace Publications, 2002. Print.
Heasley, Andrew. "Qantas Terror Blamed on Computer." Qantas Terror Blamed on
Computer. 19 Dec. 2011. Web. 29 Mar. 2012.
<http://www.stuff.co.nz/travel/travel-troubles/6163633/Qantas-terror-blamed-on-
computer>.
Howard, Courtney. "Safety- and Security-critical Avionics Software." Safety- and
Security-critical Avionics Software. Military & Aerospace Avionics, 1 Feb. 2011.
Web. 20 Mar. 2012. <http://www.militaryaerospace.com/articles/print/volume-
22/issue-2/technology-focus/safety-and-security-critical-avionics-software.html>.
Moir, I., and A. G. Seabridge. Civil Avionics Systems. Reston, VA: American Institute of
Aeronautics and Astronautics, 2003. Print.
Otelli, Jean-Pierre. Erreurs De Pilotage : Tome 5 [Broché]. Web. 01 May 2012.
<http://www.amazon.co.uk/Erreurs-pilotage-Tome-Jean-Pierre-
Otelli/dp/B0050SQ6UA>.
Romanski, George. "Ada in the Avionics Industry." ACM SIGAda Ada Letters XXV.4
(2005): 109-14. Print.
Rosenberg, Barry. "Product Focus: Software." Avionics Magazine. Avionics Today, 1
Aug. 2010. Web. 21 Mar. 2012.
<http://www.aviationtoday.com/av/issue/feature/Product-Focus-
Software_70310.html>.
United States. US Department of Transportation. Federal Aviation Administration.
RTCA, Inc., Document RTCA/DO-178B. Print.
Wlad, Joseph. "DO-178B and Safety-Critical Software." Wind River Technical
Overview. CA, Alameda. 17 Apr. 2012. Lecture.