recent progress in the apl fault management process · storm probes (rbsp) project ! presentation...
TRANSCRIPT
Recent Progress in the APL Fault Management Process Kristin Fretz & Adrian Hill JHU/APL
§ 2008 FM Workshop published a White Paper Report identifying 12 top-level findings and recommendations for improving FM systems
§ APL recognized these needs and improved its FM and Autonomy processes to address many of these findings q Processes are part of APL’s Quality
Management System (QMS) q Changes implemented on the Radiation Belt
Storm Probes (RBSP) project § Presentation will highlight 6 of the
findings, discuss the APL process changes, and the impact of the changes on both the FM and Autonomy processes
Overview
APL FM Development Process
RBSP Overview
MOC EMFISIS SOC
Inclina.on 10°
Perigee Al.tudes 605 & 625 km
2 Observatories (RBSP-‐A, RBSP-‐B) • Nearly iden.cal, single string architecture • Spin Stabilized ~5 RPM • Spin-‐Axis 15°-‐27° off Sun
• APtude Maneuvers Every 21 days • No on-‐board G&C aPtude control
• Opera.onal Design Life of 2 years
Launch and Orbit Inser=on • Single EELV (Observatories Stacked) • Launch from KSC • Each observatory independently released Sun pointed
• LV performs maneuver to achieve nominal lapping rate
• Launch: May 2012
APL Ground Sta.on • Primary
NEN Sta.on(s) • Data Augmenta.on • Back-‐up
Commands & Telemetry
Instrument commands
Instrument Telemetry EFW
SOC ECT SOC
PSBR SOC
RBSPICE SOC
Decoupled Opera.ons • Basic Approach Based on TIMED, STEREO
Differing apogees allow for simultaneous measurements to be taken over the full range of observatory separa.on distances several .mes over the course of the mission. This design allows one observatory to lap the other every 75 days.
TDRSS
Cri.cal Events at Launch
Apogee Al.tudes 30,410 &
30,540 km
Finding: § “Responsibility for FM currently is diffused throughout
multiple organizations; unclear ownership leads to gaps, overlap and inconsistencies in FM design, implementation and validation”
Recommendation: § “Establish clear roles and responsibilities for FM
engineering” § “Establish a process to train personnel to be FM
engineers and establish or foster dedicated education programs in FM”
Finding 2: Diffused FM
§ APL split FM functionality into two distinct roles: Fault Management and Autonomy
§ FM is a system engineering function that: q Develops concept of operations q Develops FM architecture q Oversees reliability analyses such as FMEA/PRA q Defines requirements and allocation of those requirements to
hardware, software, autonomy, and operations q Verifies and validates system with mission-level tests during I&T
§ Autonomy is a software engineering function that: q Designs and implements autonomy rule-based system derived
from allocated FM requirements q Performs unit-level/subsystem-level verification
Implemented Change for Finding 2: Diffused FM
Finding: § “There is insufficient formality in the documentation of
FM designs and architectures, as well as a lack of principles to guide the processes”
Recommendation: § “Identify representation techniques to improve the
design, implementation and review of FM systems” § “Establish a set of design guidelines to aid in FM design”
Finding 4: Insufficient Formality of Documentation
§ APL’s QMS has formal FM and Autonomy engineering processes which define required documentation and reviews
Implemented Change for Finding 4: Insufficient Formality of Documentation
FM Autonomy
FM Architecture Document N/A
FM Requirements Document Autonomy Requirements Specifica.on
FM Design Specifica.on Autonomy Design Spec & Users Guide
FM Test Plan & Verifica.on Matrix Autonomy Acceptance Test Plan
FM Test Procedures Autonomy Acceptance Test Specifica.on
FM Test Reports Autonomy Acceptance Test Report (includes verifica.on matrix)
§ RBSP FM and Autonomy developed system and subsystem diagrams to communicate information captured in architecture, requirements, and specifications
Implemented Change for Finding 4: Insufficient Formality of Documentation
OPERATIONAL MODE •Normal power-positive conditions exist •Spacecraft is available for science data collection
SAFE MODE •Protects against life-threatening low-power or communication fault •No science data collection
Safe ConfigurationLaunch
Configuration
Power
Avionics Communication
FSW off-pulse command to PSE I/F card Load over-current/power
Instrument power downrequest, instrument
heartbeat failure, or over-power
Battery electronics over-powerLow battery current not during eclipse or current controller over-power
SSPA over-power
Separated
Latch Valvecontinuous current
Minimum sun angle violation, thruster armed in eclipse,off-Nominal spin rate, or
thruster arm time-out
Battery over-temperature Battery Temperaturereturns to nominal
G&C/Propulsion
Payload
Nominal Operations Configurations
RF processor watchdogtimeout
SWCLT expiration orTransceiver over-power
IEM processor watchdog timeout
C&DH FSW self-initiated reset XCVR FSW
self-initiated reset
RF CCD off-pulse command IEM off-pulse
RF CCD Command
IEM Reset CCD Command
Sys
tem
Resp
onse
Loc
atio
n
Lo
cal
Ground Commands
KEYAutonomyHardware
MOPSFSW
HWCLT ExpirationLVS or LBSOC
Launch Vehicle Separation
LVS orLBSOC
NotSeparated
Separated
Separated & LVS or LBSOC
Max Sun Angle Exceedance
Post-Separation Sequence
Rule that triggers post-sep macro disabled by ground after confirmation that macro successfully completed
PDU Detects 3 of 4 Switches and Removes Inhibit
Post-Sep Macro (Enable Safety
Buses, RF ON, Sun Sensor ON)
Autonomy Detects 3 of 4
Switches via IEM telemetry
IEM Provides Switch Status to
AutonomyNominal -OR- Fault SA Deploy Macro
Battery Discharge Protection
Power OffCurrent
ControllerReset PSE
Interface Card
Battery Over-Temperature Recovery
Enable Coulometer
Charge Control
Enable Over-Temperature
Protection
Battery Over-Temperature Protection
Disable Coulometer
Charge Control
Set Charge Current to
Trickle
Enable Over-Temperature
Recovery
VT Off-Pulse
1 of 3 VT Regulators Off-Pulsed
(Limited by PSE I/F)
PDU Load Shed Sequence
PGS VT-onlySwitched Loads OFF
Battery Electronics Protection
Disconnect Battery
ElectronicsTurn Battery
Electronics OffNon-Optimal
Battery Maintenance
PDU Off-Pulse via RF CCD
PDU Off-PulseGround
Reconfigures PDU
Load Over-Current
CB or AUT removes
Switched Power from Load
Hardware Command Loss Timer Sequence
XCVR Off-Pulse
PDU Off-Pulse
IEM Off-Pulse
Reinitialize File System(Destructive)
Software Command Loss Timer
Establish Emergency
Comm
Off-pulse XCVR
(3x max)
RF Electronics defaults to Emergency
Comm
RF Electronics Protection
Power Off SSPA
RF Processor Reset
XCVR FSW Reboots
Establish Emergency
Comm
RF Electronics defaults to Emergency
Comm
Maneuver AbortDisable Thruster Fire I/F & Power
Off Catbeds
Latch Valve Continuous-Current
Power-Off Latch Valve
Individual Instrument Safing
Power Off Instrument
IEM Reset via IEM CCD
*Resets SBC, SSR (lose open files only), SCIF (non-critical)
Recover Existing SSR File System
Reset IEM non-critical modules*
IEM Off-Pulse via RF CCD
*Resets SBC, SSR (lose all data), SCIF (full FPGA)**Restore MET, reload time-tagged cmds & SW updates, restart science data collection
Interrupt IEM Unswitched
Power*
Reinitialize File System(Destructive)
Ground Recovery of
IEM**
IEM Non Power-On Reset Recover
Existing SSR File System
(3x max)
C&DH FSW
RebootsSpin Pulse
ManagementEstablish
Emergency Comm
Go-Safe SequenceDisable
Thruster Fire I/F & Power Off
Catbeds
Power on RF Downlink &
Configure Emer Comm
BME, PSE CC, &
Instruments OFF
Load Re-enforcement
(SS, HTRs, PSE I/F ON)
Resume SSR Operations
Assert PDU Go-Safe Relay
Disable C&DH/Delete Inst Time-tag
Commands
Soft LVS
Separated
KEYAutonomyHardware
MOPSFSW
Communication
SSPA over-power
RF processor watchdogtimeout
SWCLT expiration orTransceiver over-power
XCVR FSW self-initiated reset
HWCLT Expiration
Hardware Command Loss Timer Sequence
XCVR Off-Pulse
PDU Off-Pulse
IEM Off-Pulse
Reinitialize File System(Destructive)
Software Command Loss Timer
Establish Emergency
Comm
Off-pulse XCVR
(3x max)
RF Electronics defaults to Emergency
Comm
RF Electronics Protection
Power Off SSPA
RF Processor Reset
XCVR FSW Reboots
Establish Emergency
Comm
RF Electronics defaults to Emergency
Comm
§ Example of Fault Management Modes Diagram used to convey interactions between Hardware, Software, Autonomy, and Ground/Operations for managing spacecraft faults
Implemented Change for Finding 4: Insufficient Formality of Documentation
SSR MAINTENANCE
SSR INITIALIZATIONSAFE MODE
LATCH VALVE ACTUATION
PROCESSOR RAM DISK
PGS
MANEUVER ABORTS
PRESSURE XDCRS
LAUNCHRF COMMUNICATIONS
PERFORM GOSAFE [M10]q CALL DISABLE THRUSTERS [M20]q Delete instrument timetagsq Suspend C&DH timetags at priorities 4-15q CALL POWER OFF BATTERY
MANAGEMENT ELECTRONICS [M42]q Power off PSE current controllerq CALL SHUTDOWN PAYLOAD [M47]q CALL POWER ON RF DOWNLINK PATH
AND CONFIGURE EMERGENCY COMM [M01]
q CALL RECORD ENGINEERING DATA TO SSR [M27]
q Power on propulsion module heatersq Power on sun sensor electronics unitq Power on bus survival heatersq Inhibit PDU Response to Low Battery State of
Charge from PSE Interface Cardq Power on PSE interface cardq Assert PDU Go-Safe relay
DISABLE THRUSTERS [M20]q Disable IEM thruster fire
interfaceq Power off catbed heaters
DETECT SPIN RATE ERROR [R20]If spacecraft spin rate is below Minimum Spin Rate [SV07] or exceeds Maximum Spin Rate [SV06] and manuever in progress
DETECT MANEUVER EXECUTION TIMEOUT [R21]If maneuver execution longer than Maximum Maneuver Time [SV08]
DETECT MANEUVER EXECUTION IN ECLIPSE [R22]If solar array current is below limits or no sun pulses and maneuver in progress
DETECT HARDWARE LVS [R10]If Hardware LVS detected and either Mission Phase [SV03] is ORBIT or Deployments Complete Flag [SV11] is TRUE
DETECT LOW BATT SOC [R11]If LBSOC is detected and and either Mission Phase [SV03] is ORBIT or Deployments Complete Flag [SV11] is TRUE
DETECT SUN ANGLE MAX [R12]If sun angle exceeds Maximum Sun Angle [SV04] and either Mission Phase [SV03] is ORBIT or Deployments Complete Flag [SV11] is TRUE
DETECT SUN ANGLE VIOLATION [R23]If sun angle is below Maneuver Min Sun Angle [SV10] or beyond Maneuver Max Sun Angle [SV09] and maneuver in progress
DETECT LV-1 OPEN/CLOSE CMD CONTINUOUS CURRENT [R33]If latch valve-1 open or close command asserts continuous current for 2 seconds
REMOVE LATCH VALVE-1 OPEN/CLOSE PULSE POWER [M33]q Terminate latch valve-1 open
command pulse actuationq Terminate latch valve-1 close
command pulse actuation
DETECT LV-2 OPEN/CLOSE CMD CONTINUOUS CURRENT [R34]If latch valve-2 open or close command asserts continuous current for 2 seconds
REMOVE LATCH VALVE-2 OPEN/CLOSE PULSE POWER [M34]q Terminate latch valve-2 open
command pulse actuationq Terminate latch valve-2 close
command pulse actuation
DETECT XOVER LV OPEN CMD CONTINUOUS CURRENT [R35]If xover latch valve open command asserts continuous current for 2 seconds
REMOVE XOVER LATCH VALVE OPEN PULSE POWER [M35]q Terminate xover latch valve
open cmd pulse actuation
A
To SHUTDOWN
PAYLOAD Macro
(Page 2)
PERFORM POST-SEPARATION SEQUENCE [M07]q Enable RF safety buses (A&B)q Enable actuator safety buses (A&B)q Enable propulsion safety buses (A&B)q Enable battery conn relays (1&2)q Enable SAJB Relay at Reduced Curr (A&B)q Select VT Level #4q Disconnect BME cell bypass relayq CALL POWER ON RF DOWNLINK PATH
AND CONFIGURE EMERGENCY COMM [M01]
q Power on sun sensor electronics unit
*** This macro is disabled by default for I&T ***
DETECT SEPARATION [R05]If 3 (or more) of 4 switches indicate separation and Mission Phase [SV03] is LAUNCH for 5 seconds
PERFORM X-AXIS SA DEPLOYMENTS [M08]q Fire +X primary / -X
redundant actuatorsq Sleep 5 secondsq Fire +X redundant / -X
primary actuators
DETECT HARDWARE CLT [R13]If C&DH processor reboots as result of HW CLT expiration (HWCLT Relay set) and (spacecraft is separated or Mission Phase [SV03] is ORBIT)
POWER ON RF DOWNLINK PATH AND CONFIGURE EMERGENCY COMM [M01]q Power on SSPA (A)q Sleep 5 secondsq Issue transceiver firmware
resetq Sleep 15 secondsq CALL CONFIGURE RF FOR
EMERGENCY COMM [M19]
OFF PULSE TRANSCEIVER [M16]q Off-pulse transceiver (h/w-
limited to 3 attempts)
DETECT TRANSCEIVER SOFTWARE REBOOT [R17]If transceiver flight software reboots unexpectedly
off pulse will cause transceiver software reboot
CONFIGURE RF FOR EMERGENCY COMM [M19]q Configure FSW tables for
emergency downlink ratesq Disable FSW playbackq Sleep 30 secondsq Configure transceiver for
emergency uplink/downlink
DETECT SW CLT TIMEOUT [R15]If time elapsed sinceTime of Last SWCLT Restart [SV01] exceeds SWCLT Interval [SV02] and (spacecraft is separated or Mission Phase [SV03] is ORBIT)
POWER ON RF DOWNLINK PATH AND OFF PULSE TRANSCEIVER [M15]q Power on SSPA (A)q CALL OFF PULSE
TRANSCEIVER [M16]q Disable baseband uplink
DETECT TRANSCEIVER EXCESS POWER [R16]If transceiver power exceeds limits
DETECT ANY CDH REBOOT [R01]If C&DH processor reboots for any reason and (spacecraft is separated or Mission Phase [SV03] is ORBIT)
POWER OFF SSPA [M18]q Power off SSPA (A&B)
DETECT SSPA EXCESS POWER [R18]If SSPA power exceeds limits
DETECT BATTERY MANAGEMENT ELECTRONICS EXCESS POWER [R42]If battery management electronics power exceeds limits
POWER OFF BATTERY MANAGEMENT ELECTRONICS [M42]q Throw relay to disconnect
Battery Management Electronics from battery
q Sleep 2 secondsq Power off Battery
Management Electronics
DETECT UNEXPECTED BATTERY DISCHARGE [R48]If battery is discharging while shunt current is above acceptable limits
RESET PSE INTERFACE CARD [M46]q Inhibit PDU Response to Low Battery
State of Charge from PSE Interface Card
q Power off PSE current controllerq Reset PSE Interface Card
DETECT PSE CURR CNTRLR OVERCURRENT [R46]If PSE current controller load current exceeds limits
DETECT BATTERY OVER-TEMPERATURE [R40]If battery temperature exceeds over-temp limit and charge rate is above limit and coulometer charge control is enabled
DISABLE COULOMETER CHARGE CONTROL [M40]q Disable coulometer charge
controlq Set battery chrg rate to C/100
RESUME BATTERY OVERTEMP MONITORING [R41]If battery temperature is below acceptable limits and PSE current controller is powered on and coulometer charge control is disabled
RESUME COULOMETER CHARGE CONTROL [M41]q Enable coulometer charge
control
DETECT PSE INTERFACE CARD EXCESS POWER [R44]If PSE Interface card power exceeds limits
POWER OFF PSE INTERFACE CARD AND CURRENT CONTROLLER [M44]q Inhibit PDU Response to Low
Battery State of Charge from PSE Interface Card
q Power off PSE interface cardq Power off PSE current
controller
DETECT CDH REBOOT WITH SSR CONTENTS PRESERVED [R03]If C&DH processor reboots and SSR contents are preserved (subject to maximum 3 attempts)
RESUME SSR RECORDING [M03]q Enable SSR memory scrubq Recover existing SSR file systemq Sleep 5 secondsq CALL RECORD ENGINEERING
DATA TO SSR [M27]q CALL RECORD INSTRUMENT
DATA TO SSR [M28]
DETECT CDH REBOOT WITH SSR CONTENTS LOST [R04]If C&DH processor reboots and SSR contents are lost (e.g., IEM Off-Pulse) and SSR H/W Initialization of its SDRAM is complete
REINITIALIZE SSR RECORDING [M04]q Enable SSR memory scrubq Create new SSR file system
(destructive)q Sleep 5 secondsq CALL RECORD ENGINEERING
DATA TO SSR [M27]
DETECT BATTERY MANAGEMENT ELECTRONICS CIRCUIT BREAKER TRIP [R43]If battery management electronics trips circuit breaker
DETECT CDH REBOOT [R02]If C&DH reboots and this is the first reboot since last ground contact
RECORD HK TO PROCESSOR RAM DISK [M02]q Change record rate of G&C Maneuver
Tlm Packet to 1 packet every 300 seconds
q Record engineering data (h/k, evt msgs, cmd status) to local RAM disk
q Freeze recording of CFE_EVS logq Dump memory scrub errors to RAM
diskq Sleep 60 minutesq Halt recording of engineering data to
local RAM diskq Restore record rate of G&C Maneuver
Tlm Packet to 1 packet every second
DETECT SA DEPLOY READY [R06]If 3 (or more) of 4 switches indicate sep -AND- Mission Phase [SV03] is LAUNCH -AND- Time of Separation [SV05] is non-zero -AND-( time elapsed since Time of Separation [SV05] exceeds SA Deploy Delay Interval [SV13] -OR- LVS detected -OR- LBSOC is detected )
DETECT PSE INTERFACE CARD CIRCUIT BREAKER TRIP [R45]If PSE Interface card trips circuit breaker
POST-SEPARATION SEQUENCE EXECUTIVE [M05]q Set Time of Separation [SV05] to current
FSW uptimeq CALL PERFORM POST-SEPARATION
SEQUENCE [M07]q Sleep 20 secondsq CALL PERFORM POST-SEPARATION
SEQUENCE [M07]
SA DEPLOYMENTS EXECUTIVE [M06]q Sleep 60 secondsq CALL PERFORM X-AXIS SA DEPLOYMENTS [M08]q Sleep 60 secondsq CALL PERFORM Y-AXIS SA DEPLOYMENTS [M13]q Sleep 60 secondsq CALL PERFORM X-AXIS SA DEPLOYMENTS [M08]q Sleep 60 secondsq CALL PERFORM Y-AXIS SA DEPLOYMENTS [M13]q Set Deployments Complete Flag [SV11] to TRUE
DETECT CDH REBOOT AND CATBED ON [R24]If C&DH processor reboots for any reason and catbed heaters were left powered on
DETECT PSE CURR CNTRLR CIRCUIT BREAKER TRIP [R47]If PSE current controller trips circuit breaker
DETECT PRESSURE TRANSDUCERS OVERCURRENT [R29]If pressure transducers load current exceeds limits
POWER OFF PRESSURE TRANSDUCERS [M29]q Power off pressure
transducers
DETECT XOVER LV CLOSE CMD CONTINUOUS CURRENT [R38]If xover latch valve close command asserts continuous current for 2 seconds
REMOVE XOVER LV CLOSE PULSE POWER [M38]q Terminate xover latch valve
close cmd pulse actuation
SUN SENSORDETECT SUN SPIN PULSE USE MISCONFIGURATION [R31]If sun sensor based spin pulse is selected but sun presence is not detected (i.e., sun sensor eclipse)
SELECT SIMULATED SPIN PULSE [M31]q Configure IEM to use
Hardware Timer Spin Pulse
DETECT SIMULATED SPIN PULSE USE MISCONFIGURATION {R32]If hardware timer spin pulse is selected but sun presence is detected (i.e., sun sensor pulse is sensed)
SELECT SUN SENSOR SPIN PULSE [M32]q Configure IEM to use Sun
Sensor Spin Pulse
DETECT SUN SENSOR OVERCURRENT [R30]If sun sensor load current exceeds limits
POWER OFF SUN SENSOR [M30]q Power off sun sensor
electronics unitq Configure IEM to use
Hardware Timer Spin Pulse
SAVE TRANSCEIVER REBOOT CAUSE [M17]q Save transceiver reboot flags
in Last Transceiver Reboot Cause [SV12]
q Clear transceiver reboot flagsq CALL CONFIGURE RF FOR
EMERGENCY COMM [M19]
LEGEND
MACRO [ID]
RULE [ID]
MOPS
Stor Var [ID]
PERFORM Y-AXIS SA DEPLOYMENTS [M13]q Fire +Y primary / -Y
redundant actuatorsq Sleep 5 secondsq Fire +Y redundant / -Y
primary actuators
C
RECORD INSTRUMENT DATA TO SSR [M28]q Open files on SSR to record
instrument data
RECORD ENGINEERING DATA TO SSR [M27]q Open files on SSR to record
engineering data (includes housekeeping, event message and command status packets)
DETECT SSR RESET [R26]If FSW has induced an SSR Reset
RE-ENABLE SSR MEMORY SCRUB [M26]q Re-enable SSR memory scrubq Clear count of FSW-induced
SSR resets
DETECT SOFTWARE LVS [R14]If batt voltage is less than Software LVS Threshold [SV14] and either Mission Phase [SV03] is ORBIT or Deployments Complete Flag [SV11] is TRUE
LEGEND
MACRO [ID]
RULE [ID]
MOPS
Stor Var [ID]
RF COMMUNICATIONS
POWER ON RF DOWNLINK PATH AND CONFIGURE EMERGENCY COMM [M01]q Power on SSPA (A)q Sleep 5 secondsq Issue transceiver firmware
resetq Sleep 15 secondsq CALL CONFIGURE RF FOR
EMERGENCY COMM [M19]
OFF PULSE TRANSCEIVER [M16]q Off-pulse transceiver (h/w-
limited to 3 attempts)
DETECT TRANSCEIVER SOFTWARE REBOOT [R17]If transceiver flight software reboots unexpectedly
off pulse will cause transceiver software reboot
CONFIGURE RF FOR EMERGENCY COMM [M19]q Configure FSW tables for
emergency downlink ratesq Disable FSW playbackq Sleep 30 secondsq Configure transceiver for
emergency uplink/downlink
DETECT SW CLT TIMEOUT [R15]If time elapsed sinceTime of Last SWCLT Restart [SV01] exceeds SWCLT Interval [SV02] and (spacecraft is separated or Mission Phase [SV03] is ORBIT)
POWER ON RF DOWNLINK PATH AND OFF PULSE TRANSCEIVER [M15]q Power on SSPA (A)q CALL OFF PULSE
TRANSCEIVER [M16]q Disable baseband uplink
DETECT TRANSCEIVER EXCESS POWER [R16]If transceiver power exceeds limits
DETECT ANY CDH REBOOT [R01]If C&DH processor reboots for any reason and (spacecraft is separated or Mission Phase [SV03] is ORBIT)
POWER OFF SSPA [M18]q Power off SSPA (A&B)
DETECT SSPA EXCESS POWER [R18]If SSPA power exceeds limits
SAVE TRANSCEIVER REBOOT CAUSE [M17]q Save transceiver reboot flags
in Last Transceiver Reboot Cause [SV12]
q Clear transceiver reboot flagsq CALL CONFIGURE RF FOR
EMERGENCY COMM [M19]
§ Example of Autonomy Design Diagram used to convey implementation of monitors and responses that comprise the on-board autonomy system for responding to faults
Finding 7: § “The impact of mission-level requirements on FM complexity
and V&V is not fully recognized” q “Review and understand the impacts of mission-level requirements
on FM complexity; FM designers should not suffer in silence, but should assess and elevate impacts to the appropriate levels of management”
Finding 8: § “FM architectures often contain complexity beyond what is
defined by project specific definitions of faults and required fault tolerance”
§ “Increased FM architecture complexity leads to increased challenges during I&T and mission operations” q “Assess the appropriateness of the FM architecture with respect to
the scale and complexity of the mission, and the scope of the autonomy functions to be implemented within the architecture”
Findings 7 & 8: FM Complexity
§ FM involvement on RBSP started in early phase-A as a part of the systems engineering team q Involved in mission-level trades and assessed impact of
mission-level decisions on FM q FM “ownership” of requirements at all levels with
requirements allocated from mission-to-system-to-subsystem
§ Development of FM Architecture Document defined FM approach for the project q Architecture document now a formal part of FM process
which helped to communicate FM concepts/approach q Resulted in project “buy-in” on FM approach early in
mission development
Implemented Change for Findings 7 & 8: FM Complexity
Finding: § “Inadequate testbed resources is a significant schedule driver
during V&V” Recommendation: § “Develop high-fidelity simulations and hardware testbeds to
comprehensively exercise the FM system prior to spacecraft level testing”
Finding 11: Inadequate Testbed Resources
§ RBSP flight software testbeds available for both FM and Autonomy testing q Available early and provided enough fidelity to verify
Autonomy system q Allowed for dry-running portions of FM system-level tests
§ RBSP project also developed high-fidelity hardware-in-the-loop (HIL) simulator q Purpose of HIL was to provide high-fidelity test platform for
the development of FM system tests prior to spacecraft I&T • Due to hardware issues completion and availability of HIL was
delayed until late I&T q FM system-level testing (dry-run and for-score) was
performed on spacecraft without the opportunity to dry-run on the HIL
Implemented Change for Finding 11: Inadequate Testbed Resources
Finding: § “Unexpected cost and schedule growth during final system
integration and test are a result of underestimated Verification and Validation (V&V) complexity combined with late resource availability and staffing”
Recommendation: § “Allocate FM resources and staffing early, with appropriate
schedule, resource scoping, allocation, and prioritizing; schedule V&V time to capitalize on learning opportunity”
§ “Establish Hardware / software / “sequences” /operations function allocations within an architecture early to minimize downstream testing complexity”
§ “Engrain FM into the system architecture. FM should be “dyed into design” rather than “painted on””
Finding 1: Cost & Schedule
§ RBSP spacecraft-level testing for both FM and autonomy started early in I&T q FM tests were ready at start of I&T which allowed for dry-
running of tests (utilized partially integrated spacecraft) q Staffing for FM and Autonomy testing increased in
response to lessons learned from New Horizons and MESSENGER
§ FM Design Specification clearly defined allocations of FM functions to hardware, software, autonomy, and operations
§ Improved approach, planning, and staffing resulted in avoiding “death spiral” and staffing “bump” during I&T
Implemented Change for Finding 1: Cost & Schedule
Implemented Change for Finding 1: Cost & Schedule
-‐70.00 -‐60.00 -‐50.00 -‐40.00 -‐30.00 -‐20.00 -‐10.00 0.00 10.00
Staff
Mon
ths
Months to Launch
§ Past missions resulted in staffing “bump” during I&T
-‐70.00 -‐60.00 -‐50.00 -‐40.00 -‐30.00 -‐20.00 -‐10.00 0.00 10.00
Staff
Mon
ths
Months to Launch
§ Noticeable change in shape of staffing curve for RBSP q FM/Autonomy waited on the spacecraft for testing rather than spacecraft waiting
on FM/Autonomy q Remaining RBSP time is projected
KEY: Past APL Mission #1 Past APL Mission #2 Past APL Mission #3 RBSP
§ Significant changes seen in staffing curve for RBSP as compared to previous APL missions
§ Credit process changes for these improvements: q Better defined FM/Autonomy roles q Required documentation at key milestones q Use of diagrams as communication tool for documentation q Early program involvement to manage complexity q Improved testbed resources
§ APL FM/Autonomy processes continuing to evolve by using lessons learned from RBSP and findings from 2008 FM Workshop
Conclusion