Ryan Jurgensen RJJ485
Debugging and Verification of Avionics Software
Abstract
Onboard flight computers and avionics software are directly responsible for
hundreds of human lives on every flight. People experience software bugs on a daily
basis - a smartphone application that won't update, a browser that doesn't render
correctly - and dismiss these events as routine and mostly inconsequential. In contrast, a
software bug in a flight computer can kill hundreds of people. Despite some spectacular
blemishes, aviation software has a mostly clean reputation and has never resulted in a
human death. However, further work is needed by avionics engineers to make their
software more usable by pilots under the stress of an aviation emergency. This paper
examines the current state of aviation software correctness and how it is being improved
to keep travelers safe.
Introduction
Any routine air traveler has heard the introductory safety briefing performed by
the flight attendants enough times to summarize it as "when the aircraft levels out, you
can turn the iPad on and take the seatbelt off". Most of the 303 passengers of a Qantas
Airbus A330 in 2008 did exactly this. Their experience with in-flight turbulence was
likely limited to minimal vertical motion and minor vibration. On this flight, however,
the main flight control computer failed, and the aircraft executed violent, uncommanded
dives. 119 passengers who were not wearing their seatbelts were thrown headfirst into
the ceiling as the plane dived, resulting in several critical injuries. Five minutes later, the
flight computer failed again, resulting in the same type of nose dive. The flight
eventually made a safe emergency landing, followed by a three-year investigation into
why and how the flight computer failed. An edge-case software bug within the flight
computer was found to be the cause. Had the plane made the dive at 500 feet rather than
35,000 feet, hundreds would have been killed. Qantas flight 72 is a reminder that
travelers entrust their lives to flight computer software on every flight and that software
engineers have the monumental task of ensuring the predictable operation of these
systems.
Computer Assisted Flight
Starting in the late 1970s, aircraft became more dependent on onboard computers. In
newer aircraft, every decision and action the pilot makes is fed into the flight computer,
which then moves the appropriate parts of the plane, such as the elevators or rudder.
Prior to "fly-by-wire" systems, moving a control device in the cockpit would produce a
mechanical movement (either cable or hydraulic) that would physically move the aircraft
control surfaces, like the braking system in an older car without ABS. Fly-by-wire
systems have disconnected the pilot from direct access to the plane, and sometimes the
flight computer may adjust the pilot's commands or ignore them completely if it believes
them to be unsafe. The Airbus A320, which entered service in 1988, was the first
commercial aircraft to be completely fly-by-wire, and the General Dynamics F-16, in
1978, was the first fly-by-wire production military aircraft.
The benefits of computer-assisted flight are enormous. In commercial airliners,
software logic can be written to prevent the pilot from making mistakes, such as pushing
the aircraft outside its operational limits. The benefits in military aircraft are even greater,
because flight computers can make microsecond adjustments to the flight surfaces to
attain maneuverability that could not be achieved traditionally. Many contemporary
high-performance military aircraft are inherently unstable and could not be flown by a
human directly. The F-117 and B-2 stealth aircraft are extreme examples of aircraft
whose shapes are so unairworthy that continuous flight computer micro-corrections are
the only way they can fly.
Avionics Programming
Computer-assisted flight was created during a time when the Ada programming
language dominated Department of Defense programming projects. Because Ada was
designed for real-time and embedded applications, significant effort went into the
verification of Ada compilers to strive for predictability in mission-critical applications.
Ada's reign in avionics continues into the 21st century, where flight computers are still
written in Ada for its predictability. The newest military aircraft for the United States,
the Joint Strike Fighter, was the first high-profile departure from Ada: a design decision
was made to use C++ for a significant part of the onboard flight computers. A 300-page
design document, created with the direct involvement of the language's creator, Bjarne
Stroustrup, established a C++ coding standard for real-time flight systems.
Failures
The part of the A330 flight computer that failed in 2008 was an Air Data Inertial
Reference Unit (ADIRU) made by Northrop Grumman. The ADIRU is responsible for
collecting airspeed, angle-of-attack, and altitude information from mechanical sensors
and communicating this data to the rest of the flight computer system. The A330 was
equipped with three ADIRUs for redundancy and reliability, each collecting its own data
and sending it to the flight control system. Normally, all three ADIRUs collect
information and send it to the primary flight computer. If two of the ADIRUs report that
something is wrong, such as an airspeed that may indicate an aerodynamic stall, the
flight computer adjusts accordingly. If the airspeed is too low, the flight computer
pitches the nose downward to gain airspeed and recover from the stall. If the ADIRUs
reported values that varied significantly, the flight computer algorithm would ignore their
inputs for 1.2 seconds, treating the extreme values as outliers. On Qantas flight 72,
two ADIRUs reported extreme data for the aircraft's angle of attack (roughly, how far
above or below the oncoming airflow the nose is pointing), and the flight computer
disregarded their input for 1.2 seconds. Exactly 1.2 seconds later, the flight computer
checked the ADIRUs again and still found extreme data. This is where the failure
occurred - no software engineer had planned for the situation where the ADIRUs would
still be incorrect after the 1.2-second waiting period. The primary flight computer
entered an unpredictable state and erroneously concluded that the aircraft was pointed
dangerously upwards, even though it was actually flying completely level and stable at
35,000 feet. The primary flight computer made an autonomous decision to pitch the nose
down to correct the erroneous angle. This happened twice during the flight, producing
dives of 650 and 400 feet within seconds and throwing passengers upwards in the cabin.
Although the software engineers claim that the ADIRUs were thoroughly tested, they had
not anticipated the frequent spikes of erroneous data. How could the engineers have
predicted and accounted for such a complex - both logically and temporally - series of
failures?
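The outlier-rejection logic described in the investigation can be sketched in a few lines of C. This is a minimal illustration, not Northrop Grumman's code: the function name, the state structure, and the spike threshold are invented, and only the 1.2-second window comes from the report. The comment marks the design decision the report faulted - what to do when data is still extreme after the window expires.

```c
#include <math.h>

#define AOA_SPIKE_THRESHOLD 5.0   /* degrees; illustrative value only     */
#define HOLD_SECONDS        1.2   /* rejection window from the ATSB report */

/* State for one angle-of-attack input channel (hypothetical layout). */
typedef struct {
    double last_good;   /* last value accepted as plausible            */
    double hold_until;  /* time until which new readings are ignored   */
} aoa_filter_t;

/* Return the angle-of-attack value the flight computer should use at
 * time `now`, given a fresh reading `raw`. A spike is ignored and the
 * last good value held for HOLD_SECONDS. Crucially, once the window
 * expires the reading is re-validated rather than accepted blindly --
 * the gap identified in the Qantas 72 investigation was the absence
 * of a plan for data that was still extreme after the window. */
double aoa_filter_step(aoa_filter_t *f, double raw, double now) {
    if (now < f->hold_until)
        return f->last_good;                 /* still inside the window  */
    if (fabs(raw - f->last_good) > AOA_SPIKE_THRESHOLD) {
        f->hold_until = now + HOLD_SECONDS;  /* treat as an outlier      */
        return f->last_good;                 /* re-check on next sample  */
    }
    f->last_good = raw;                      /* plausible: accept        */
    return raw;
}
```

Even in this toy form, the hard part is visible: the correctness of the filter depends on the *temporal* sequence of readings, which is exactly the kind of behavior that point-in-time unit tests struggle to cover.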
The Avionics Standard - DO-178B
RTCA, a US non-profit organization focused on airworthy software, created specification
DO-178B in 1992 to document best practices for developing avionics software. DO-
178B was adopted by the FAA in 1993 and remains the de facto, but optional, guideline
for avionics development. DO-178B outlines the entire software development process for
avionics, starting with recommendations for specification documents, and covering best
practices within source code, verification procedures, configuration management, and QA.
DO-178B classifies modules and makes recommendations based on the failure condition
of each module. The more severe the failure condition, the more strictly a module will
be tested and the higher the standard applied.
• Catastrophic (Level A) - Failure may cause a crash. Error or loss of critical
function required to safely fly and land the aircraft. The level has 66 requirements
that must be met.
• Hazardous (Level B) - Failure has a large negative impact on safety or
performance, or reduces the ability of the crew to operate the aircraft due to
physical distress or a higher workload, or causes serious or fatal injuries among
the passengers. The level has 65 requirements that must be met.
• Major (Level C) - Failure is significant, but has a lesser impact than a Hazardous
failure (for example, leads to passenger discomfort rather than injuries) or
significantly increases crew workload. The level has 57 requirements that must be
met.
• Minor (Level D) - Failure is noticeable, but has a lesser impact than a Major
failure (for example, causing passenger inconvenience or a routine flight plan
change). The level has 28 requirements that must be met.
• No Effect (Level E) - Failure has no impact on safety, aircraft operation, or crew
workload. The level has 0 requirements that must be met.
As an example of the design levels, a primary flight computer (level A) will be held to
higher standards and scrutiny than an in-flight entertainment system (level E). A level A
system that complies with DO-178B will conform to the best practices of design, coding,
testing, documentation, QA, and configuration management. Twenty years after the
release of DO-178B, the new specification, DO-178C, was released in 2012 and is
currently being evaluated by the FAA as the next guideline document for the avionics
industry. The improvements and additions in DO-178C indicate the progress that has
occurred in avionics programming over the last 20 years – the new specification
addresses issues such as model-based development and verification, object-oriented
programming, and formal methods.
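The five design assurance levels above reduce to a simple lookup from failure condition to required objectives. The C below is only an illustration of that mapping (the type and function names are invented); the objective counts are the ones listed above.

```c
/* DO-178B design assurance levels and the number of objectives each
 * level must satisfy, per the list above. Names are illustrative. */
typedef enum { LEVEL_A, LEVEL_B, LEVEL_C, LEVEL_D, LEVEL_E } dal_t;

static const int dal_objectives[] = {
    [LEVEL_A] = 66,  /* Catastrophic: failure may cause a crash       */
    [LEVEL_B] = 65,  /* Hazardous:    large negative safety impact    */
    [LEVEL_C] = 57,  /* Major:        significant but lesser impact   */
    [LEVEL_D] = 28,  /* Minor:        noticeable, limited impact      */
    [LEVEL_E] = 0,   /* No Effect:    no impact on safety             */
};

/* A module's level is set by its worst credible failure condition;
 * a stricter level never has fewer objectives than a laxer one. */
int objectives_for(dal_t level) { return dal_objectives[level]; }
```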
Current Formal Methods of Avionics Verification
Hervé Delseny, head of software process definition at Airbus, presented in 2010
the suite of formal methods that the aircraft manufacturer uses to verify its avionics.
The verification process at Airbus is divided into five primary components:
• Verifying the worst-case execution time (WCET)
• Verifying stack consumption
• Verifying the precision of floating-point computations
• Verifying the absence of run-time errors
• Unit proofs (analogous to unit tests) of functional properties
For the Airbus A380, the largest commercial aircraft flying today, the following
process was followed to verify the avionics components. First, the software tool aiT was
used to verify WCET. According to the creators of aiT, “aiT WCET Analyzers provide
the solution to these problems [real time WCET assurance]: they statically analyze a
task’s intrinsic cache and pipeline behavior based on formal cache and pipeline models.
This enables correct and tight upper bounds to be computed for the worst-case execution
time.”
Airbus engineers then run all avionics through the StackAnalyzer tool, produced by
the same company as aiT. This tool reconstructs all control paths directly from the binary
code, computing worst-case stack usage to rule out run-time errors due to stack
overflow. After stack analysis, Airbus software is processed by the Fluctuat tool, which
verifies the precision of the floating-point computations performed by the flight
computer. Initially developed at the French nuclear research center, this tool analyzes
the avionics code and verifies that the results of its floating-point computations match
known values and that rounding errors stay within an acceptable margin. The Astrée
software for proving the absence of run-time errors then processes the Airbus avionics.
According to Astrée's developers and researchers, their software can prove C code is
free of:
• Division by zero
• Out of bounds array indexing
• Erroneous pointer manipulation and dereferencing
• Arithmetic overflows
• Assertion violations
• Unreachable code
• Read access of uninitialized variables
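To make the error classes above concrete, here is a small C function written so that each class is provably absent. This is an invented example, not Airbus code; it only illustrates the kind of property a sound analyzer such as Astrée must establish for every execution path.

```c
#include <stddef.h>

/* Illustrative only: each guard below eliminates one class of
 * run-time error from the list above, for every possible input. */
int safe_average(const int *values, size_t n) {
    if (values == NULL || n == 0)        /* no null dereference,      */
        return 0;                        /* no division by zero       */
    long long sum = 0;                   /* wider accumulator: the    */
    for (size_t i = 0; i < n; i++)       /* sum cannot overflow int   */
        sum += values[i];                /* i < n: stays in bounds    */
    return (int)(sum / (long long)n);    /* n > 0 here: safe divide   */
}
```

The point of a tool like Astrée is that it proves these properties statically, for all inputs, rather than observing them on a finite set of test vectors.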
According to AbsInt, the software company that builds Astrée: "In November 2003,
Astrée proved the absence of any run-time errors in the primary flight-control software
of one of Airbus' models. The analysis was performed completely automatically. The
system's 132,000 lines of C code were analyzed in only 80 minutes on a 2.8GHz 32-bit
PC using 300MB of memory (and in only 50 minutes on an AMD Athlon 64 using
580MB of memory)."
Finally, Airbus avionics are supplemented with a suite of unit proofs. Delseny
stated in his 2010 presentation that unit proofs are used in the development process at
Airbus, but did not mention how they compare or interact with Airbus's unit testing
procedures. Unit proofs are carried out by the software tool Caveat (also developed at the
French nuclear research center) and are based on weakest-precondition rules applied to
annotated C code. This completes the five-stage software verification suite at Airbus.
Delseny claims that this process has proved more cost-effective and produced higher-
quality avionics software compared to traditional testing processes.
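A unit proof differs from a unit test in that a contract is stated over the code and a prover shows every execution satisfies it, rather than checking a handful of sample inputs. The annotation style below is modeled on ACSL (the language used by Frama-C, in the same tool lineage as Caveat); the function and its contract are invented for illustration, not taken from Airbus code.

```c
/* Weakest-precondition style contract (ACSL-like, illustrative):
 * given the precondition, a prover shows the postcondition holds
 * on every path through the function -- no test vectors involved. */

/*@ requires min <= max;
  @ ensures min <= \result && \result <= max;
  @*/
int clamp(int x, int min, int max) {
    if (x < min) return min;   /* path 1: result is min            */
    if (x > max) return max;   /* path 2: result is max            */
    return x;                  /* path 3: min <= x <= max already  */
}
```

Because the annotations live in comments, the code still compiles with an ordinary C compiler; the proof tool consumes the same source.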
Model Checking for Avionics
Traditional software development uses tests and inspections to prevent software
errors. For the Airbus A330 ADIRU programming, the engineers admitted that no test or
inspection they performed found the bug that caused the Qantas incident. Efforts to
introduce model checking into avionics software have started, and initial results are
promising. In an experiment by researcher Darren Cofer, a flight computer was examined
using both traditional software development techniques and a model-checking-based
approach: "Analysis of an early specification of the mode logic found 26 errors.
Seventeen of these were found by the model checker. Of these 17 errors, 13 were
classified by the FCS 5000 engineers as being unlikely to have been found by traditional
techniques."
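What a model checker does differently can be shown with a deliberately tiny piece of mode logic. The mode machine below is invented for illustration (it is not the FCS 5000 logic); the point is that the checker visits *every* state and input combination and verifies an invariant in each, instead of sampling inputs as a test suite would.

```c
#include <stdbool.h>

/* Invented three-mode logic for illustration. */
typedef enum { FC_MANUAL, FC_AUTOPILOT, FC_FLARE } fc_mode_t;

fc_mode_t next_mode(fc_mode_t m, bool ap_engaged, bool below_50ft) {
    (void)m;                               /* next mode here depends   */
    if (!ap_engaged) return FC_MANUAL;     /* only on the inputs       */
    if (below_50ft)  return FC_FLARE;
    return FC_AUTOPILOT;
}

/* Exhaustively explore all (mode, input) pairs and check the
 * invariant "flare mode is never entered with the autopilot off".
 * Real model checkers do this symbolically, over state spaces far
 * too large to enumerate by hand. */
bool check_mode_logic(void) {
    for (int m = FC_MANUAL; m <= FC_FLARE; m++)
        for (int ap = 0; ap <= 1; ap++)
            for (int low = 0; low <= 1; low++) {
                fc_mode_t next = next_mode((fc_mode_t)m, ap, low);
                if (next == FC_FLARE && !ap)
                    return false;          /* invariant violated */
            }
    return true;
}
```

An edge case like the ADIRU's post-1.2-second behavior is exactly the kind of state a human test author fails to imagine but an exhaustive exploration cannot skip.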
User Error
According to the National Transportation Safety Board, 85% of aviation accidents
in the last 20 years have been due to pilot error. Conversely, avionics software has never
resulted in a fatality. One way that avionics software can help reduce the number of
fatalities due to pilot error is to improve the user interface (UI) of aircraft cockpits. Flight
systems display an incredible amount of information to the pilot every second, and in an
emergency the pilot is likely acting on instinct, with reduced decision-making ability.
The problem is a highly complex computer system trying to tell everything about its
state to a pilot who is beyond mental capacity. The solution is to create user interfaces
that present the status of the system as clearly and intuitively as possible.
The crash of Air France 447 (AF447) in 2009 is the starkest example of the
changes that need to happen in cockpit UI design. When AF447 disappeared over the
mid-Atlantic in the middle of the night, en route from Brazil to Paris, accident
investigators attributed the crash to bad weather that iced over the airspeed sensors and
caused a subsequent instrument failure. The cockpit data recorder was miraculously
recovered from the bottom of the ocean in 2011 and explained in detail what really
happened on AF447.
Cockpit data and voice recordings showed that the A330 encountered severe storms at
2am. The external airspeed sensor became covered in a layer of ice and stopped
transmitting airspeed data, just as investigators had predicted. As soon as the flight
computer stopped receiving airspeed data, two things happened: the autopilot
disengaged, and the flight computer entered a reduced-functionality mode called alternate
law. When the flight computer is operating in alternate law, the pilots' inputs are
processed less by the computer before being sent to the control surfaces. As a result, a
pilot input that would otherwise be rejected by the flight computer is now accepted,
because the flight computer does not have the data it needs (airspeed) to make those
decisions. The pilots were now flying the aircraft more directly than they were used to;
this was the first user interface failure on this flight.
In a human error that aviation experts have been unable to explain, the rookie pilot,
ten minutes after entering the storm system, started pulling back on the side stick, raising
the nose of the plane into the air. At this point, the aircraft was completely flyable and
could easily have exited the storm system safely. Despite this, the rookie pilot continued
to irrationally pull back on the stick until the plane was pitched 18 degrees upwards and
eventually stalled. The aircraft had been pushed beyond its limits and the wings were no
longer generating lift. The aircraft was now at 37,000 feet and falling like a rock towards
the ocean at 10,000 feet per minute. If at any time the rookie pilot had let go of the stick,
the plane would have leveled out and the flight would have been saved.
During the four-minute free fall, the three pilots in the cockpit tried to figure out
why the plane was falling towards the ocean. The two other pilots did not notice that the
rookie pilot was pulling the stick all the way back. This was the second user interface
failure on this flight – there was no feedback to the left-seat pilot about what the right-
seat pilot was doing. Seconds before the aircraft hit the water, one of the other pilots
noticed that the rookie pilot's hand was holding the stick all the way back and told him to
stop. The rookie pilot responded, "But what's happening?", 4 seconds before the aircraft
hit the water.
Entering alternate law was the first UI failure in the Airbus A330. There was no
aural warning to the pilots that they were no longer being checked by the flight computer
– only small labels on two of their many displays indicated that the flight computer was
in alternate law. If the pilots did not know the aircraft was in alternate law, they would
assume that the flight computer was still preventing them from stalling the aircraft.
The second UI failure was the failure to communicate inputs between the pilots.
The left-seat pilot did not know that the right-seat pilot was pulling the plane upwards. In
a Boeing aircraft (as opposed to AF447's Airbus), the two control sticks the pilots use are
physically connected to each other. Had AF447 been a Boeing aircraft, the left-seat pilot
would have felt the right-seat pilot pulling up on the stick and could have acted
accordingly. The primary error in AF447 would not have happened in a Boeing aircraft.
Airbus could have resolved this UI problem either by introducing simulated resistance on
both pilots' side sticks to demonstrate tactilely what the other pilot is doing, or by
presenting a visual or auditory alert when the two pilots enter different flight commands
on their sticks. The AF447 crash could have been prevented had the Airbus UI clearly
conveyed the discrepancy in flight inputs. The clean track record of avionics software
excellence needs to extend to cockpit ergonomics and user interfaces in order to cut
down on the primary cause of most plane crashes – human error.
Conclusion
No flight computer error has caused a crash that directly resulted in the loss of
human life, although there have been some perilously close calls. The debugging and
verification of the avionics systems that travelers trust with their lives is a massively
important problem. Avionics manufacturers currently use entire suites of formal methods
to analyze source and machine code as thoroughly as possible. Although the field is still
being evaluated, model checking is being tested and initial results in catching
unpredictable behavior are promising. However, the crash of Air France 447 indicates
that much work is still needed in communicating system status to the pilots, who are
often the cause of fatal accidents.
Works Cited
"A380 Family." A380-800. Airbus, 30 Mar. 2012. Web. 17 Apr. 2012.
<http://www.airbus.com/aircraftfamilies/passengeraircraft/a380family/>.
"AbsInt Angewandte Informatik GmbH, Saarbrücken." AbsInt: Analysis Tools for
Embedded Systems. Web. 17 Apr. 2012. <http://www.absint.com/>.
Lockheed Martin. Joint Strike Fighter Air Vehicle C++ Coding Standards. Rep.
no. 2RDU00001 Rev C. 2005. Print.
Australia. Australian Transport Safety Bureau. Qantas Airbus A330 Accident Media
Conference. 2008. Print.
Cofer, D., M. Whalen, and S. Miller. "Software Model Checking for Avionics Systems."
Digital Avionics Systems Conference IEEE/AIAA 27th (2008): 5-1--8. Print.
Delseny, Hervé. "Formal Method for Avionics Software Verification." Speech.
France. Ministère De L’écologie, Du Développement Durable, Des Transports Et Du
Logement. Bureau D’Enquêtes Et D’Analyses Pour La Sécurité De L’aviation
Civile. Interim Report N°3 On the Accident on 1st June 2009 to the Airbus A330-
203 Registered F-GZCP Operated by Air France Flight AF 447 Rio De Janeiro -
Paris. Print.
Frawley, Gerard. The International Directory of Military Aircraft 2002/03. Fishwick,
ACT: Aerospace Publications, 2002. Print.
Heasley, Andrew. "Qantas Terror Blamed on Computer." Qantas Terror Blamed on
Computer. 19 Dec. 2011. Web. 29 Mar. 2012.
<http://www.stuff.co.nz/travel/travel-troubles/6163633/Qantas-terror-blamed-on-
computer>.
Howard, Courtney. "Safety- and Security-critical Avionics Software." Safety- and
Security-critical Avionics Software. Military & Aerospace Avionics, 1 Feb. 2011.
Web. 20 Mar. 2012. <http://www.militaryaerospace.com/articles/print/volume-
22/issue-2/technology-focus/safety-and-security-critical-avionics-software.html>.
Moir, I., and A. G. Seabridge. Civil Avionics Systems. Reston, VA: American Institute of
Aeronautics and Astronautics, 2003. Print.
Otelli, Jean-Pierre. Erreurs De Pilotage : Tome 5 [Broché]. Web. 01 May 2012.
<http://www.amazon.co.uk/Erreurs-pilotage-Tome-Jean-Pierre-
Otelli/dp/B0050SQ6UA>.
Romanski, George. "Ada in the Avionics Industry." ACM SIGAda Ada Letters XXV.4
(2005): 109-14. Print.
Rosenberg, Barry. "Product Focus: Software." Avionics Magazine. Avionics Today, 1
Aug. 2010. Web. 21 Mar. 2012.
<http://www.aviationtoday.com/av/issue/feature/Product-Focus-
Software_70310.html>.
United States. US Department of Transportation. Federal Aviation Administration.
RTCA, Inc., Document RTCA/DO-178B. Print.
Wlad, Joseph. "DO-178B and Safety-Critical Software." Wind River Technical
Overview. CA, Alameda. 17 Apr. 2012. Lecture.