On NASA Space Shuttle Program Hardware and Software
Challenge to Endeavour Discovery of Atlantis in Columbia
On hardware and software used in the NASA Space Shuttle Program
HP Service Virtualization, Prague, September 23rd 2015, Martin Dvorak
Why?
Change of the motivation, value and complexity
• NASA survival and funding
• Space Task Group
• 1969: Space Transportation System Program (STS)
  – Permanent space station: 6 > 120 men at the top of LEO
  – LEO shuttle
  – Inter-orbit space tug
  – LEO to solar system NERVA engine shuttle
• 1972: Space Shuttle Program
  – STS program de-scope and cost reduction
  – NASA & DoD
  – Reusability, cost, … and much more promises
  – Use cases for the shuttle
James Fletcher & Richard Nixon
Space Shuttle Program approval
1972
Shuttle concepts - early 1970s
VP Spiro Agnew
Space Task Group
1969
Space Shuttle Program
1972 - 2011 (+ DoD)
• Space shuttle purpose
  – LEO van (satellites, telescopes, Earth atmosphere research, …)
  – Hubble (559km): service missions (STS-31, STS-61, STS-125, …)
• Space shuttle = orbiter + external tank + solid rocket boosters
  – 7 astronauts, 1-2 week missions
  – 135* missions (1981 - 2011)
  – 2,000t full w/ 32t payload capacity to LEO
• Orbiter
  – 4-6 million parts; 90-day check
  – $1,700,000,000 base price + $450,000,000/mission
  – 5 orbiters: Atlantis, Challenger, Columbia, Discovery, Endeavour
• Enterprise prototype
GPCs redundancy, IOPs, Data Buses (24) and MMUs > engines, boosters, tank, …
Onboard Hardware
B. J. Thomas, Manager Apollo/Saturn and Shuttle HW, IBM
Lynn Killingbeck, Senior System Analyst (HW redundancy), IBM
Hardware
GPC = CPU + IOP
General Purpose Computer
5x GPC + 2x MMU located below the cockpit
Main engine controller
Hardware
From AGC/PGNCS to GPC
• IBM AP-101
  – IBM mainframe architecture w/ unique IOP & bus system
  – 2x8 32b registers, 154 instructions
  – 550W, 29kg, MTBS 10,000h
  – US Air Force: B-52, B-1B (8 units), F-15 … (JOVIAL/Ada)
  – Advanced self HW/SW test
    • Integrity: 95% of HW failures detected; 5% of SW failures via redundancy
  – No HDD - tape cartridges instead (MMU) as SW didn’t fit
• GPCs for the shuttle: IBM AP-101B/S (IOP+bus)
  – 1st generation (1981-1989): 424kB of magnetic core memory (Apollo AGC), 400,000 instructions/s
  – 2nd generation (1990-2011): 1MB, 1,200,000 instructions/s (3x space & time); semiconductor memory w/ backup battery
• Onboard: 5x GPC = 4x PASS @ lockstep + 1x BFS
IBM AP-101B 1st … generation
IBM AP-101S … 2nd generation
Core memory page
Semiconductor memory board
RAM
Software: Space Shuttle Mission Sequence
SW driven mission sequence
Onboard Software
PASS, HAL/S and OPS
• PASS: Primary Avionics Software System
  – System Software
    • Flight Computer OS (FCOS) w/ redundancy ctrl
    • UI
    • System Control Programs
  – Application Software
    • Guidance & Navigation & Control
    • (Orbit) Systems Management
    • Payload & Checkout
• PASS Functions ~ Mission Sequence
  – Pre-flight > Ascent > On-orbit > Descent
• PASS Development
  – 420,000 lines in HAL/S (IBM Federal Systems…)
  – 700kB (didn’t fit into GPC RAM > split into OPS)
• HAL/S (High-order Assembly Language/Shuttle)
  – Intermetrics: language (spec) and compiler
    • Apollo veterans + Arra Avakian (linker, HP OpenView)
  – Reliability + real-time environments support
  – Free form language: modules, functions, vector arithmetic, multilines, …
• Operational Sequences (OPS)
  – OPSs implement PASS functions
  – OPS = SPECs (ctrl by human) + DISPs (UI)
  – OPS code loaded from MMU (data kept: vectors, …)
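The last bullet (code swapped in from the MMU while data stays resident) can be pictured with a small Python sketch. This is a toy model only, not the PASS loader: the class `Computer`, the OPS names `ascent` and `on_orbit`, and the placeholder state vector are all assumptions introduced for illustration.

```python
class Computer:
    """Toy model of a memory-limited computer that overlays code."""

    def __init__(self, mass_memory):
        self.mass_memory = mass_memory   # stand-in for the MMU tape
        self.resident_data = {}          # survives OPS transitions
        self.loaded_ops = None
        self.code = None

    def transition(self, ops_name):
        """Replace the loaded code; leave the resident data untouched."""
        self.code = self.mass_memory[ops_name]
        self.loaded_ops = ops_name

    def run(self):
        return self.code(self.resident_data)


def ascent(data):
    # Hypothetical ascent code that leaves a state vector behind.
    data["state_vector"] = (1.0, 2.0, 3.0)   # placeholder numbers
    return "ascent done"


def on_orbit(data):
    # Hypothetical on-orbit code that reuses the data kept in memory.
    return f"on orbit with {data['state_vector']}"


gpc = Computer({"ascent": ascent, "on_orbit": on_orbit})
gpc.transition("ascent")
gpc.run()
gpc.transition("on_orbit")
print(gpc.run())
```

Only one OPS is loaded at a time, which is how a 700kB program can be run from a smaller memory; the data written by one OPS is still there for the next.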
OPS overview: mission sequence like structure
Software: Redundancy
Reliability via Redundancy and Quality
• Hardware/Software redundancy (deployment)
• PASS running on 4 GPCs in lockstep
  – On PASS GPC inconsistency/failure: GPCs vote to deselect the failed one
  – FCOS-driven redundancy scheme solved by NASA/Rockwell/IBM in 1975
  – Lockstep synced GPCs every 3-4ms on I/Os
  – OS redesigned to a priority-driven two-level (40ms & 960ms) task scheduler
    • recall Margaret Hamilton’s PGNCS software and the Moon landing overload
  – On total failure of the PASS GPCs, the BFC takes control
    • Backup Flight Computer runs independently w/ different SW
    • Never used
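The deselect vote can be illustrated with a minimal Python sketch. It is not the FCOS mechanism: the function name `vote_to_deselect`, the computer labels and the idea of comparing a single output word per GPC are assumptions made for illustration. Each computer's output is compared with the others, and any computer that disagrees with a strict majority is flagged.

```python
from collections import Counter


def vote_to_deselect(outputs):
    """Return (majority_value, failed_ids) for one synchronization point.

    `outputs` maps a hypothetical GPC id to the value it produced.
    A GPC is flagged when its value differs from the value produced
    by a strict majority of the set. Without a strict majority nobody
    is flagged and the majority value is None.
    """
    counts = Counter(outputs.values())
    value, votes = counts.most_common(1)[0]
    if votes * 2 <= len(outputs):
        return None, []
    failed = sorted(gpc for gpc, out in outputs.items() if out != value)
    return value, failed


# Four PASS computers; GPC3 produces a diverging command word.
sample = {"GPC1": 0x2A, "GPC2": 0x2A, "GPC3": 0x2B, "GPC4": 0x2A}
majority, failed = vote_to_deselect(sample)
print(majority, failed)  # 42 ['GPC3']
```

With four computers, a single faulty one is outvoted three to one; a two-two split has no strict majority, which is one reason an independent backup is useful.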
Annunciator (warning panel) Display Unit
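The two cycle lengths of the task scheduler (40ms and 960ms) can be sketched in Python as well. This is a simplified timing model under stated assumptions, not the FCOS scheduler: the task names and priorities are hypothetical, and the sketch only orders the tasks released at each tick by priority, without modelling preemption.

```python
MINOR_MS = 40    # high-frequency cycle from the slide
MAJOR_MS = 960   # low-frequency cycle from the slide


def schedule(duration_ms, tasks):
    """Simulate which tasks are released at each 40 ms tick.

    `tasks` is a list of (name, period_ms, priority) tuples; lower
    priority numbers run first. Returns a list of (time_ms, [names]).
    """
    timeline = []
    for now in range(0, duration_ms, MINOR_MS):
        due = [t for t in tasks if now % t[1] == 0]
        due.sort(key=lambda t: t[2])
        timeline.append((now, [name for name, _, _ in due]))
    return timeline


# Hypothetical task set: names and priorities are illustrative only.
tasks = [
    ("systems_management", MAJOR_MS, 2),
    ("flight_control", MINOR_MS, 1),
]
timeline = schedule(2 * MAJOR_MS, tasks)
print(timeline[0])   # (0, ['flight_control', 'systems_management'])
print(timeline[1])   # (40, ['flight_control'])
print(timeline[24])  # (960, ['flight_control', 'systems_management'])
```

One 960ms cycle spans 24 of the 40ms cycles, so the low-rate task shares a tick with the high-rate task only once in 24 and always runs after it.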
Software: Development & Quality
Process and statistical analysis driven software development
• PASS development
  – Started ’74 (Apollo + new hires), 1st flight in ’81, released every 6 - 9 months
  – 2,000 requirements
  – 420,000 lines of code
    • … and 1,400,000 lines of code to build/test/develop/simulate/configure
  – 275 people (’95)
• Strategy to achieve high quality
  – Process
    • manage+control+measure+analyze software via (meta)data collected to perform (statistical) analysis (30+ years of statistics, process improvements, experience and lessons learned… 25 year old bugs ;)
  – Resources
    • enough people - highly skilled peers cooperate on a small portion of code
    • enough time
    • infrequent/tiny changes
    • heavyweight (7 level) testing
    • relatively small amount of code in contrast to commercial avionics SW
James Orr, Chief Engineer (PASS), United Space Alliance
Tony Macina, Manager Flight Operations (Test Team), IBM
Lessons Learned
Small things make a huge difference
• Quality (Meta)data Creation
  – Commit messages, bug tracking system descriptions, review reports, …
  – Analytics, metrics, statistics, …
• Incremental Process Improvement
  – Chronicle of systematic incremental improvements w/ analytics
  – Defect elimination process (+ analogous process improvement)
• Core Features Investment
  – Key parts/components of software to be built according to well known quality principles w/ enough resources
  – People, time, reviews, changes, testing, code…
Anyone who sits on top of the largest hydrogen-oxygen fueled system in the world; knowing they're going to light the bottom — and doesn't get a little worried — does not fully understand the situation.
— John Young, after making the first Space Shuttle flight.