Heterogeneity-aware Fault Tolerance using a Self-Organizing Runtime System

05/12/2014
by Mario Kicherer, et al.

Because of their diversity and implicit redundancy in processing units and compute kernels, off-the-shelf heterogeneous systems offer the opportunity to detect and tolerate faults during task execution, in hardware as well as in software. To leverage this diversity automatically, we introduce an extension of an online-learning runtime system that combines the benefits of the existing performance-oriented task mapping with task duplication, a diversity-oriented mapping strategy, and a heterogeneity-aware majority voter. This extension uses a new metric to dynamically rate the remaining benefit of unreliable processing units, and a memory management mechanism for automatic data transfers and checkpointing in the host and device memories.
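As a rough illustration of task duplication combined with majority voting, the following minimal sketch runs the same task on several diverse implementations and accepts the result that a strict majority agrees on. It is not the runtime system's actual voter or mapping strategy, which the abstract does not specify. All names here (`majority_vote`, `run_replicated`, the three example kernels) and the relative tolerance used to compare floating-point results from different implementations are assumptions made for illustration.

```python
import math
from typing import Callable, List, Optional, Sequence, Tuple


def majority_vote(
    results: Sequence[float], rel_tol: float = 1e-6
) -> Tuple[Optional[float], List[int]]:
    """Return (winner, dissenters).

    winner is the first result that a strict majority of replicas agree
    with (within rel_tol), or None if no majority exists. dissenters
    lists the indices of replicas that disagree with the winner.
    Assumption: a tolerance is used because diverse kernels may round
    differently; tolerance-based agreement is not transitive, which is
    acceptable for a sketch but not for a production voter.
    """
    n = len(results)
    for candidate in results:
        agree = [
            j for j, r in enumerate(results)
            if math.isclose(r, candidate, rel_tol=rel_tol)
        ]
        if 2 * len(agree) > n:
            dissent = [j for j in range(n) if j not in agree]
            return candidate, dissent
    return None, list(range(n))


def run_replicated(
    kernels: Sequence[Callable[[Sequence[float]], float]],
    data: Sequence[float],
    rel_tol: float = 1e-6,
) -> Tuple[Optional[float], List[int]]:
    """Run every (hypothetical) kernel on the same input and vote."""
    results = [kernel(data) for kernel in kernels]
    return majority_vote(results, rel_tol)


# Hypothetical stand-ins for diverse implementations of one task.
def loop_sum(xs: Sequence[float]) -> float:
    total = 0.0
    for x in xs:
        total += x
    return total


def exact_sum(xs: Sequence[float]) -> float:
    return math.fsum(xs)


def faulty_sum(xs: Sequence[float]) -> float:
    # Simulates a unit that returns a corrupted result.
    return math.fsum(xs) + 1.0


value, dissenters = run_replicated(
    [loop_sum, exact_sum, faulty_sum], [0.1, 0.2, 0.3]
)
```

In this sketch, the indices in `dissenters` identify the replicas that were outvoted. A runtime could use such information to lower its rating of the corresponding processing units, although how the paper's metric does this is not described in the abstract.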


research
07/25/2019

Collaborative Heterogeneous Computing on MPSoCs

This thesis (extended abstract) presents the software development effort...
research
08/25/2022

Runtime reliability monitoring for complex fault-tolerance policies

Reliability of complex Cyber-Physical Systems is necessary to guarantee ...
research
08/23/2021

ReSpawn: Energy-Efficient Fault-Tolerance for Spiking Neural Networks considering Unreliable Memories

Spiking neural networks (SNNs) have shown a potential for having low ene...
research
06/26/2019

HEATS: Heterogeneity- and Energy-Aware Task-based Scheduling

Cloud providers usually offer diverse types of hardware for their users....
research
05/12/2016

A Fault Tolerance Improved Majority Voter for TMR System Architectures

For digital system designs, triple modular redundancy (TMR), which is a ...
research
06/09/2021

HyCA: A Hybrid Computing Architecture for Fault Tolerant Deep Learning

Hardware faults on the regular 2-D computing array of a typical deep lea...
research
07/03/2022

Representation Heterogeneity

Semantic Heterogeneity is conventionally understood as the existence of ...
