Soft-Error and Hard-fault Tolerant Architecture and Routing Algorithm for Reliable 3D-NoC Systems

03/21/2020
by Khanh N. Dang, et al.

The Network-on-Chip (NoC) paradigm has been proposed as a promising solution for handling the strict communication requirements among the increasingly large number of cores on a single multi-core or many-core chip. However, NoC systems are exposed to a variety of manufacturing, design, and energetic-particle factors that make them vulnerable to permanent (hard) faults and transient (soft) errors. In this paper, we present a comprehensive soft-error and hard-fault tolerant 3D-NoC architecture, named 3D-Hard-Fault-Soft-Error-Tolerant-OASIS-NoC (3D-FETO). With the aid of adaptive algorithms, 3D-FETO detects and recovers from soft errors occurring in the routing pipeline stages, and it leverages reconfigurable components to handle permanent faults in links, input buffers, and the crossbar. In-depth evaluation results show that the 3D-FETO system can work around different kinds of hard faults and soft errors while ensuring graceful performance degradation, minimizing the additional hardware complexity, and remaining power-efficient.

research
03/21/2020

A low-overhead soft-hard fault-tolerant architecture, design and management scheme for reliable high-performance many-core 3D-NoC systems

The Network-on-Chip (NoC) paradigm has been proposed as a favorable solu...
research
03/21/2020

Reliability Assessment and Quantitative Evaluation of Soft-Error Resilient 3D Network-on-Chip Systems

Three-Dimensional Networks-on-Chips (3D-NoCs) have been proposed as an a...
research
12/12/2017

OpenSEA: Semi-Formal Methods for Soft Error Analysis

Alpha-particles and cosmic rays cause bit flips in chips. Protection cir...
research
05/01/2019

Fault-Tolerant Routing in Hypercube Networks by Avoiding Faulty Nodes

Next to the high performance, the essential feature of the multiprocesso...
research
03/12/2021

FT-GCR: a fault-tolerant generalized conjugate residual elliptic solver

With the steady advance of high performance computing systems featuring ...
research
02/08/2017

FASHION: Fault-Aware Self-Healing Intelligent On-chip Network

To avoid packet loss and deadlock scenarios that arise due to faults or ...
research
06/06/2018

Fault Tolerant Control for Networked Mobile Robots

Teams of networked autonomous agents have been used in a number of appli...
