Accelerated Long Range Traverse (ALERT)
Paul Springer
Michael Mossey
Task Manager: Larry H. Matthies
Cognizant Engineer: Paul Springer
Participating Organizations: JPL, Ames
Facilities: Rocky 8
Accelerated Long Range Traverse (ALERT)
Task Schedule
Objective: Demonstrate feasibility of using a commercial-grade processor as a coprocessor for a Mars surface mission
Motivation: COTS computing could double rover exploration range, reduce mission cost, and enable advanced EDL functions for pin-point landing and landing hazard avoidance
Level 1 & 2 Milestones (scheduled month)
• Feb: Demonstrate fault tolerance during application fault injection on rover
• Jul: Demonstrate fault tolerance on rover running traverse science application
• Sep: Demonstrate system fault tolerance during O/S & application fault injection on rover
COTS processor
Overview
• Radiation effects models for Mars surface predict approximately one SEU every 50 hours of operation for a COTS PowerPC 74xx series processor
• Use of a COTS coprocessor for a future Mars rover mission has the potential to significantly increase computing throughput and science return without increasing mission cost
- Rover exploration range could double
- Coprocessor could do traverse science image analysis while rover is moving
• Realizing this promise requires demonstrating fault tolerance of rover flight software via simulated fault injection
• Success with this application could enable use of COTS processors in a variety of other Code S missions
- E.g. Comet Nucleus Sample Return, Asteroid Lander, NGST
- Applicable to Space Station to improve reliability when using COTS products
Technical Strategy
• Task strategy:
(1) Show fault tolerance of MER stereo vision in FY02/03
(2) Extend to all of MER obstacle avoidance system and one traverse science application in FY03
(3) Use FY03 tools to address other missions (e.g., Station) in FY04
• Technical Approach
- Build low-cost virtual testbed for development/testing
• Key facility is software-implemented fault injector to replicate SEU environment
- Augment JPL rover with COTS G4 coprocessor (Apple Laptop running Linux)
• Includes fault injector
- Develop OS fault injector (only have application fault injector today)
- Conduct a series of demos using real flight applications
• Justification
- Irradiating equipment to produce SEUs is costly and impractical
- Real flight application demos are needed to satisfy flight project managers
FY03 Milestones
Conduct fault injection simulations with stereo vision application and analyze results (09/02)
Design and implement fault protection algorithm (11/02)
Demonstrate fault protection during Rocky8 autonomous traverse (02/03)
Complete porting, fault injection, and design of fault mitigation strategies for Gestalt navigation software (05/03)
Develop fault model that takes into account Mars surface radiation levels and microprocessor susceptibility. (6/03)
Complete porting, fault injection, and design of fault mitigation strategies for traverse science application. Demonstrate fault protection with simulated fault injection during autonomous traverse with rover. (7/03)
Demonstrate fault protection for Gestalt and stereo applications running while faults are injected into them and the O/S, during autonomous traverse with rover. (9/03)
Level 1: Use the Rocky8 testbed and a COTS coprocessor to demonstrate that an autonomous traverse can successfully complete while faults are being injected into the coprocessor’s stereo imaging application.
Level 1: Use the same hardware to demonstrate an autonomous traverse while faults are being injected into imaging & Gestalt applications and O/S.
Level 2:
Recent Accomplishments
Integrated laptop image processing with rover-based obstacle avoidance software
- Significance: the combination of rover and laptop models the use of a fast coprocessor for the acceleration of compute-intensive rover tasks
Developed fault-tolerant strategies for occasional single-bit upsets (SEUs) in laptop application
- Significance: these strategies can be used in a Martian environment to handle radiation-induced SEUs occurring in a COTS coprocessor
Demonstrated correct autonomous traverse of rover while faults were injected into the laptop application
- Significance: proof of concept that a commercial coprocessor can be used in a low-radiation environment
Autonomous Traverse Demonstration
Approach:
• Use Macintosh G4 laptop running Yellow Dog Linux as co-processor
• Develop interface between stereo imaging application on laptop and Claraty rover control software
• Develop Claraty code to monitor application for hangs, halts, and data errors. Restart/rerun stereo application as necessary
• Install fault injector on laptop to simulate radiation-generated bit flips in application
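The radiation-generated bit flips that the fault injector simulates can be illustrated with a minimal sketch. This is not the ALERT fault injector, whose implementation the slides do not describe; it only flips one randomly chosen bit in a byte buffer that stands in for application memory. The helper names `flip_bit` and `inject_random_fault` are hypothetical.

```python
import random


def flip_bit(buf: bytearray, bit_index: int) -> None:
    """Invert one bit of buf in place; bit 0 is the least significant bit of byte 0."""
    byte_index, bit_in_byte = divmod(bit_index, 8)
    buf[byte_index] ^= 1 << bit_in_byte


def inject_random_fault(buf: bytearray, rng: random.Random) -> int:
    """Flip one uniformly chosen bit and return its index (hypothetical helper)."""
    bit_index = rng.randrange(len(buf) * 8)
    flip_bit(buf, bit_index)
    return bit_index


if __name__ == "__main__":
    state = bytearray(16)          # stand-in for application memory, all zeros
    rng = random.Random(0)         # seeded so a run can be repeated
    where = inject_random_fault(state, rng)
    print(f"flipped bit {where}; buffer is now {state.hex()}")
```

Seeding the generator is a design choice for the sketch: it makes a given injection sequence repeatable, which helps when analyzing how an application responded to a particular fault.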
Demonstration Highlights:
• Fault injection rate set to 100,000 times the expected rate
• Rover computer successfully handled all problems generated by fault injector
• Demonstrated use of time redundancy fault tolerance technique
• No fault-tolerant modifications to application code were necessary
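To give a feel for the accelerated injection rate, the short calculation below combines the 100,000-fold factor with the Overview's prediction of roughly one SEU every 50 hours. It assumes that the "expected rate" used in the demonstration is that same 50-hour figure, which the slides do not state explicitly.

```python
EXPECTED_SEU_INTERVAL_HOURS = 50.0   # Overview: about one SEU per 50 hours of operation
ACCELERATION_FACTOR = 100_000        # demonstration injection rate multiplier


def accelerated_interval_seconds(interval_hours: float, factor: float) -> float:
    """Mean time between injected faults, in seconds, after speeding up by factor."""
    return interval_hours * 3600.0 / factor


def accelerated_faults_per_hour(interval_hours: float, factor: float) -> float:
    """Mean number of injected faults per hour after speeding up by factor."""
    return factor / interval_hours


if __name__ == "__main__":
    seconds = accelerated_interval_seconds(EXPECTED_SEU_INTERVAL_HOURS, ACCELERATION_FACTOR)
    per_hour = accelerated_faults_per_hour(EXPECTED_SEU_INTERVAL_HOURS, ACCELERATION_FACTOR)
    print(f"about one injected fault every {seconds:.1f} s ({per_hour:.0f} per hour)")
```

Under that assumption the injector would be producing a fault about every 1.8 seconds, or roughly 2,000 per hour.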
ALERT Architecture
[Diagram components: Workstation, LAN, Hub, Radio link, Laptop, Rocky 8, Cameras, Claraty Control Software, Gestalt Navigation Software]
1. User logs in to rover computer and starts Claraty
2. Claraty initiates application and sends it images
3. Laptop runs JPLStereo imaging application, performs fault injection, and returns disparity map
4. Claraty checks disparity map for correctness
5. Gestalt receives disparity map from Claraty and steers rover
6. Workstation used for monitoring logs
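The monitor-and-rerun behavior in this architecture (Claraty starts the stereo application, checks the returned disparity map, and reruns the application when it hangs, halts, or returns bad data) can be sketched as a retry loop. This is only an illustration of time redundancy under stated assumptions, not the Claraty code: hangs and halts are modeled as `TimeoutError` and `RuntimeError`, and the correctness check `looks_valid` (shape and value range) is hypothetical, since the slides do not say how the disparity map is checked.

```python
from typing import Callable, Optional, Sequence

DisparityMap = Sequence[Sequence[float]]


def looks_valid(disparity: DisparityMap, rows: int, cols: int, max_disparity: float) -> bool:
    """Hypothetical sanity check: expected shape and every value within [0, max_disparity]."""
    if len(disparity) != rows:
        return False
    for row in disparity:
        if len(row) != cols:
            return False
        if any(not (0.0 <= d <= max_disparity) for d in row):
            return False
    return True


def run_with_time_redundancy(
    job: Callable[[], DisparityMap],
    validate: Callable[[DisparityMap], bool],
    max_attempts: int = 3,
) -> Optional[DisparityMap]:
    """Run job, rerunning it after a hang, halt, or invalid result; None if all attempts fail."""
    for _ in range(max_attempts):
        try:
            result = job()
        except (TimeoutError, RuntimeError):
            continue  # modeled hang or halt: rerun the application
        if validate(result):
            return result
        # data error detected: fall through and rerun
    return None


if __name__ == "__main__":
    # Simulated outcomes: a hang, then a corrupted map, then a good map.
    outcomes = iter([TimeoutError("hang"), [[999.0, 1.0]], [[2.0, 1.0]]])

    def flaky_stereo() -> DisparityMap:
        outcome = next(outcomes)
        if isinstance(outcome, Exception):
            raise outcome
        return outcome

    print(run_with_time_redundancy(flaky_stereo, lambda d: looks_valid(d, 1, 2, 64.0)))
```

Because all detection and recovery happens in the caller, the application itself needs no changes, which matches the demonstration's note that no fault-tolerant modifications to application code were necessary.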
Current Quarter Accomplishments
O/S Fault Injector
Developed tool for fault injection into Linux operating system
- Significance: Permits simulation of the effects of radiation-induced SEUs on the operating system
[Diagram: Linux Operating System, JPLStereo Application, Fault Injector Thread. Key: Startup process, Fault Injection]
Plans for Remainder of FY03
• Integrate remainder of MER obstacle avoidance software (“GESTALT”) onto the Apple laptop (May’03)
• Demonstrate fault tolerance of an ARC traverse science application (July’03)
• Rocky 8 traverse of Mars Yard showing tolerance of faults injected into both application and operating system (Sept’03)