Improving DNN Fault Tolerance using Weight Pruning and Differential Crossbar Mapping for ReRAM-based Edge AI

by Geng Yuan, et al.

Recent research has demonstrated the promise of resistive random access memory (ReRAM) as an emerging technology for performing inherently parallel, analog-domain, in-situ matrix-vector multiplication, the key compute-intensive operation in deep neural networks (DNNs). However, hardware failures such as stuck-at-fault defects are one of the main concerns that prevent ReRAM devices from being a feasible solution for real implementations. Existing solutions to this issue usually require an optimization to be conducted for each individual device, which is impractical for mass-produced products (e.g., IoT devices). In this paper, we rethink the value of weight pruning in ReRAM-based DNN design from the perspective of model fault tolerance. We also propose a differential mapping scheme to improve fault tolerance under a high stuck-on fault rate. Our method can tolerate almost an order of magnitude higher failure rate than the traditional two-column method in representative DNN tasks. More importantly, our method incurs no extra hardware cost compared to the traditional two-column mapping scheme. The improvement is universal and does not require a per-device optimization process.
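To make the baseline concrete, the following is a minimal sketch of the traditional two-column mapping that the abstract compares against, together with a simple stuck-at-fault injection. It is not the differential mapping scheme proposed in the paper, whose details the abstract does not give. The function names, the normalization of weights to [-1, 1], the idealized conductance range [0, g_max], and the uniform random fault model are all assumptions made for illustration.

```python
import random

def map_two_column(weights, g_max=1.0):
    """Map each signed weight (assumed in [-1, 1]) to a (g_pos, g_neg) pair.

    weights[i][j] connects input i to output j. A positive weight is stored
    in the positive column, a negative weight in the negative column, and the
    unused cell is left at zero conductance (idealized high-resistance state).
    """
    return [[(max(w, 0.0) * g_max, max(-w, 0.0) * g_max) for w in row]
            for row in weights]

def inject_stuck_faults(pairs, fault_rate, stuck_on_fraction=0.5,
                        g_max=1.0, seed=0):
    """Hypothetical fault model: each cell independently fails with
    probability fault_rate; a failed cell is stuck on (g_max) with
    probability stuck_on_fraction, otherwise stuck off (0)."""
    rng = random.Random(seed)

    def maybe_fault(g):
        if rng.random() < fault_rate:
            return g_max if rng.random() < stuck_on_fraction else 0.0
        return g

    return [[(maybe_fault(gp), maybe_fault(gn)) for gp, gn in row]
            for row in pairs]

def crossbar_mvm(pairs, x):
    """Ideal analog matrix-vector product: output j is the difference of the
    currents summed along its positive and negative columns."""
    n_out = len(pairs[0])
    return [sum(x[i] * (pairs[i][j][0] - pairs[i][j][1])
                for i in range(len(pairs)))
            for j in range(n_out)]

if __name__ == "__main__":
    W = [[0.5, -0.25], [-1.0, 0.75]]
    x = [1.0, 2.0]
    clean = map_two_column(W)
    faulty = inject_stuck_faults(clean, fault_rate=0.2)
    print("fault-free:", crossbar_mvm(clean, x))
    print("with faults:", crossbar_mvm(faulty, x))
```

Under this idealized model, a pruned (zero) weight occupies two cells that are both at zero conductance, so a stuck-off fault on either cell leaves the stored value unchanged, while a stuck-on fault turns it into a full-magnitude weight. This is one intuition for why pruning and the stuck-on fault rate matter in the abstract's framing, not a reproduction of the paper's analysis.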




FTPipeHD: A Fault-Tolerant Pipeline-Parallel Distributed Training Framework for Heterogeneous Edge Devices

With the increased penetration and proliferation of Internet of Things (...

FORMS: Fine-grained Polarized ReRAM-based In-situ Computation for Mixed-signal DNN Accelerator

Recent works demonstrated the promise of using resistive random access m...

Bulk-Switching Memristor-based Compute-In-Memory Module for Deep Neural Network Training

The need for deep neural network (DNN) models with higher performance an...

Analyzing and Mitigating the Impact of Permanent Faults on a Systolic Array Based Neural Network Accelerator

Due to their growing popularity and computational cost, deep neural netw...

IoTRepair: Systematically Addressing Device Faults in Commodity IoT (Extended Paper)

IoT devices are decentralized and deployed in un-stable environments, wh...

Fatal Brain Damage

The loss of a few neurons in a brain often does not result in a visible ...

Hardware-Robust In-RRAM-Computing for Object Detection

In-memory computing is becoming a popular architecture for deep-learning...
