Bringing Fault-Tolerant GigaHertz-Computing to Space: A Multi-Stage Software-Side Fault-Tolerance Approach for Miniaturized Spacecraft

08/23/2017
by Christian M. Fuchs, et al.

Modern embedded technology is a driving factor in satellite miniaturization, contributing to a massive boom in satellite launches and a rapidly evolving new space industry. Miniaturized satellites, however, suffer from low reliability, because traditional hardware-based fault-tolerance (FT) concepts are ineffective for on-board computers (OBCs) that use modern systems-on-a-chip (SoCs). Larger satellites therefore continue to rely on proven processors with large feature sizes. Software-based concepts have largely been ignored by the space industry, as they have been researched only in theory and have not yet reached the maturity necessary for implementation. We present the first integral, real-world solution to enable fault-tolerant general-purpose computing with modern multiprocessor SoCs (MPSoCs) for spaceflight, thereby enabling their use in future high-priority space missions. The presented multi-stage approach consists of three FT stages, combining coarse-grained thread-level distributed self-validation, FPGA reconfiguration, and mixed criticality to assure long-term FT and excellent scalability for both resource-constrained and critical high-priority space missions. Early benchmark results indicate a drastic performance increase over state-of-the-art radiation-hard OBC designs and considerably lower software and hardware development costs. This approach was developed for a 4-year European Space Agency (ESA) project, and we are implementing a tiled MPSoC prototype jointly with two industrial partners.


