Fault Injectors for TensorFlow: Evaluation of the Impact of Random Hardware Faults on Deep CNNs

12/13/2020
by Michael Beyer, et al.

Today, Deep Learning (DL) enhances almost every industrial sector, including safety-critical areas. The next generation of safety standards will define appropriate verification techniques for DL-based applications and propose adequate fault tolerance mechanisms. DL-based applications, like any other software, are susceptible to common random hardware faults such as bit flips, which occur in RAM and CPU registers. Such faults can lead to silent data corruption. It is therefore crucial to develop methods and tools that help evaluate how DL components behave in the presence of such faults. In this paper, we introduce two new Fault Injection (FI) frameworks, InjectTF and InjectTF2, for TensorFlow 1 and TensorFlow 2, respectively. Both frameworks are available on GitHub and allow the configurable injection of random faults into Neural Networks (NN). To demonstrate the feasibility of the frameworks, we also present the results of FI experiments conducted on four VGG-based Convolutional NNs using two image sets. The results show how random bit flips in the output of particular mathematical operations and layers of NNs affect the classification accuracy. These results help to identify the most critical operations and layers, to compare the reliability characteristics of functionally similar NNs, and to introduce selective fault tolerance mechanisms.
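To make the fault model concrete, the sketch below shows what a single random bit flip does to a 32-bit floating-point value, such as one element of a layer's output. It is a minimal illustration using only the Python standard library and is not the API of InjectTF or InjectTF2; the names `flip_bit` and `inject_random_faults`, the per-element fault probability, and the choice of float32 are all assumptions made for this example.

```python
import random
import struct


def flip_bit(value: float, bit: int) -> float:
    """Flip one bit (0 = least significant) of the IEEE 754 float32 encoding of value.

    Hypothetical helper for illustration. The input must be representable
    as float32, otherwise struct.pack raises OverflowError.
    """
    if not 0 <= bit < 32:
        raise ValueError("bit must be in the range [0, 31]")
    # Reinterpret the float32 bytes as an unsigned 32-bit integer.
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    # Toggle the selected bit and reinterpret the result as float32.
    (flipped,) = struct.unpack("<f", struct.pack("<I", as_int ^ (1 << bit)))
    return flipped


def inject_random_faults(values, probability, rng):
    """Return a copy of values in which each element suffers one bit flip
    at a uniformly random bit position with the given probability.

    Assumed fault model for illustration only.
    """
    faulty = []
    for v in values:
        if rng.random() < probability:
            v = flip_bit(v, rng.randrange(32))
        faulty.append(v)
    return faulty


if __name__ == "__main__":
    # Flipping the sign bit (bit 31) negates the value.
    print(flip_bit(1.0, 31))
    # Flipping the top exponent bit (bit 30) of 1.0 yields infinity.
    print(flip_bit(1.0, 30))
    # Flipping a low mantissa bit changes the value only slightly.
    print(flip_bit(1.0, 0))

    rng = random.Random(0)  # fixed seed so that runs are repeatable
    print(inject_random_faults([0.25, 0.5, 1.0, 2.0], 0.5, rng))
```

The three calls in the usage section hint at why the position of the flipped bit matters: a flip in the low mantissa bits barely changes the value, while a flip in the exponent or sign bit can change it drastically, which is the kind of corruption that may silently alter a classification result.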

research
10/24/2019

Taxonomy of Real Faults in Deep Learning Systems

The growing application of deep neural networks in safety-critical domai...
research
09/07/2022

Hardware faults that matter: Understanding and Estimating the safety impact of hardware faults on object detection DNNs

Object detection neural network models need to perform reliably in highl...
research
10/16/2022

Towards Dynamic Fault Tolerance for Hardware-Implemented Artificial Neural Networks: A Deep Learning Approach

The functionality of electronic circuits can be seriously impaired by th...
research
02/05/2019

Enhancing Fault Tolerance of Neural Networks for Security-Critical Applications

Neural Networks (NN) have recently emerged as backbone of several sensit...
research
07/28/2023

SafeLS: Toward Building a Lockstep NOEL-V Core

Safety-critical systems such as those in automotive, avionics and space,...
research
07/06/2019

Adversarial Fault Tolerant Training for Deep Neural Networks

Deep Learning Accelerators are prone to faults which manifest in the for...
research
05/24/2022

Reliability Assessment of Neural Networks in GPUs: A Framework For Permanent Faults Injections

Currently, Deep learning and especially Convolutional Neural Networks (C...
