Xception


Case Study

XCEPTION IN THE FIELD

REE - REMOTE EXPLORATION AND EXPERIMENTATION PROJECT


The Xception tool is being used in the NASA Remote Exploration and Experimentation (REE) Project.

This project aims to move commercial scalable supercomputing technology into space, in a form that meets the demanding environmental requirements, to enable a new class of science investigation and discovery.

High number-crunching capacity in space allows raw data to be processed on board, which serves two objectives. First, since only processed data has to be sent to Earth, the limited bandwidth available is used much more efficiently.


Next Generation Space Telescope
[Original; Raw Final Image; Processed Image (no CRs); Processed Image]

Second, unmanned vehicles like the projected Mars Rover can make autonomous decisions, finding their way in unknown areas.


Mars Rover.

Due to price, weight and power constraints, the computers delivering that high processing capability cannot be duplicated. This raises concerns about the correctness of the results, because cosmic rays can corrupt data. To guard against this, several fault-detection and fault-tolerance mechanisms have to be included in the system.

Xception is the tool chosen to test such a setup. By injecting bit-flips in several parts of the system according to prescribed patterns, it simulates the effects of cosmic rays. The REE testbeds can thus be thoroughly tested and refined before being deployed.
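The idea of bit-flip injection can be illustrated with a minimal sketch. This is not how Xception itself is implemented; it only shows, under simplifying assumptions, what flipping one bit in a memory word means and how a detection mechanism can be exercised against it. The names flip_bit, inject_fault and checksum are hypothetical, memory is modelled as a list of 32-bit words, and the XOR checksum stands in for whatever detection mechanism a real system would use.

```python
import random
from typing import List, Tuple


def flip_bit(value: int, bit: int, width: int = 32) -> int:
    """Return `value` with one bit inverted, as a single bit-flip would leave it."""
    if not 0 <= bit < width:
        raise ValueError("bit index out of range")
    # XOR with a one-bit mask inverts exactly that bit; the final mask keeps
    # the result within the assumed word width.
    return (value ^ (1 << bit)) & ((1 << width) - 1)


def inject_fault(memory: List[int], rng: random.Random, width: int = 32) -> Tuple[int, int]:
    """Flip one randomly chosen bit of one randomly chosen word, in place.

    Returns the (address, bit) pair so the injected fault can be logged.
    """
    address = rng.randrange(len(memory))
    bit = rng.randrange(width)
    memory[address] = flip_bit(memory[address], bit, width)
    return address, bit


def checksum(memory: List[int]) -> int:
    """XOR of all words: a stand-in (assumed) error-detection mechanism."""
    result = 0
    for word in memory:
        result ^= word
    return result


if __name__ == "__main__":
    rng = random.Random(2000)        # fixed seed so a campaign is repeatable
    memory = [0x12345678] * 16       # hypothetical data region
    reference = checksum(memory)
    address, bit = inject_fault(memory, rng)
    detected = checksum(memory) != reference
    print("fault at word", address, "bit", bit, "detected:", detected)
```

A seeded random generator is used so that the same fault pattern can be replayed, which is one simple way to follow a prescribed pattern across test runs.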

Xception has been in use in the REE project at the NASA Jet Propulsion Laboratory since May 2000.

CRON Project - Evaluation of Time Dependencies on Fault-Tolerant Real-Time Control Systems

This research project studies the applicability of low-cost fault-tolerance techniques, based on error detection and recovery, to real-time systems.
Traditional fault-tolerance techniques applied to real-time systems are based on replicating the system and voting on the results, because this approach has a very low time overhead. However, due to the high cost of these techniques, the great majority of real-time control systems used in industry, medical devices, vehicles, etc. have almost no fault-tolerance mechanisms built in, despite how much depends on these systems.
This project exploits the fact that these controller computers usually have a large percentage of idle time, and uses this time to execute error detection and recovery techniques.
Experimental validation of this proposal has already been performed by means of extensive fault-injection campaigns on the controller of a physical process. The chosen testbed was an Inverted Pendulum, since it is a very fast and unstable process.
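As an illustration of using idle time for error detection and recovery, the sketch below re-executes the control computation and compares the two results. The text does not say which techniques the project uses, so everything here is an assumption made for illustration: the PD control law and its gains, the names control_law and checked_step, and the recovery policy of falling back to the last known-good output.

```python
from typing import Callable, Optional, Tuple


def control_law(angle: float, angular_velocity: float,
                kp: float = 12.0, kd: float = 2.5) -> float:
    """Hypothetical PD law for a pendulum; the gains are arbitrary placeholders."""
    return -(kp * angle + kd * angular_velocity)


def checked_step(compute: Callable[..., float],
                 state: Tuple[float, ...],
                 last_good: float,
                 fault: Optional[Callable[[float], float]] = None) -> Tuple[float, bool]:
    """Run `compute` twice on the same inputs and compare the results.

    The second run stands for work done in the controller's idle time.
    `fault`, if given, corrupts the first result to simulate a transient error.
    Returns (output, error_detected).
    """
    first = compute(*state)
    if fault is not None:
        first = fault(first)          # simulated corruption of the first run
    second = compute(*state)          # re-execution during idle time
    if first == second:
        return first, False           # results agree: accept the output
    # Results disagree: a transient fault is assumed, and the assumed
    # recovery policy is to reuse the last output known to be good.
    return last_good, True


if __name__ == "__main__":
    state = (0.05, -0.2)              # hypothetical angle (rad) and angular velocity (rad/s)
    output, detected = checked_step(control_law, state, last_good=0.0)
    print("fault-free run:", output, detected)
    output, detected = checked_step(control_law, state, last_good=0.0,
                                    fault=lambda value: value + 1.0)
    print("run with injected error:", output, detected)
```

Re-execution costs processor time rather than extra hardware, which is why it only fits controllers that have spare idle time within each control period.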


Inverted Pendulum (picture taken from the Amira web site).

RT-Xception was chosen as the fault-injection tool, since it has almost no probe effect, i.e., it introduces a negligible time overhead and thus does not disturb the timing of the controller.

RT-Xception was able to inject faults into precise locations of the control algorithm, allowing a detailed study of particular controller characteristics.
This project is being carried out at the Dependable Systems Group (DSG) of CISUC, and is led by Prof. Mário Rela, Prof. João Gabriel Silva and Prof. João Cunha.