Yield Loss Reduction and Test of AI and Deep Learning Accelerators

06/08/2020
by Mehdi Sadi, et al.

With data-driven analytics becoming mainstream, the global demand for dedicated AI and Deep Learning accelerator chips is soaring. These accelerators, designed with densely packed Processing Elements (PEs), are especially vulnerable to the manufacturing defects and functional faults common in advanced semiconductor process nodes, resulting in significant yield loss. In this work, we demonstrate an application-driven methodology to reduce the yield loss of AI accelerators by correlating circuit faults in the PEs of the accelerator with the required accuracy of the AI workload. We exploit the error-healing properties of backpropagation during training, and the inherent fault-tolerance features of trained deep learning models during inference, to develop the presented yield loss reduction and test methodology. An analytical relationship is derived between fault location, fault rate, and the AI task's accuracy to decide whether the accelerator chip can pass the final yield test. A yield-loss-reduction-aware fault isolation, ATPG, and test flow is presented for the multiply-and-accumulate (MAC) units of the PEs. Results obtained with widely used AI/deep learning benchmarks demonstrate the efficacy of the proposed approach in reducing the yield loss of AI accelerator designs while maintaining the desired accuracy of the AI tasks.
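To make the dependence on fault location concrete, the following is a minimal Python sketch of bit-level fault injection in the fixed-point output of a MAC unit, in the spirit of the PE fault model described above. All names (inject_stuck_at, matmul_with_faults, PE_ROWS, PE_COLS), the 16-bit Q8.8 fixed-point format, and the 4x4 tile size are illustrative assumptions for this sketch, not details taken from the paper.

    import numpy as np

    PE_ROWS, PE_COLS = 4, 4  # hypothetical systolic-array tile size

    def inject_stuck_at(value, bit, stuck_high, width=16, frac_bits=8):
        """Model a stuck-at fault on one output bit of a fixed-point MAC."""
        scale = 1 << frac_bits
        # Quantize to two's-complement fixed point of the given width.
        raw = int(round(value * scale)) & ((1 << width) - 1)
        # Force the chosen bit to its stuck value.
        raw = raw | (1 << bit) if stuck_high else raw & ~(1 << bit)
        # Reinterpret the bit pattern as a signed number.
        if raw >= 1 << (width - 1):
            raw -= 1 << width
        return raw / scale

    def matmul_with_faults(a, b, faulty_pes, bit=13, stuck_high=True):
        """Dense matmul where outputs mapped onto faulty PEs are corrupted."""
        m, n = a.shape[0], b.shape[1]
        out = np.empty((m, n))
        for i in range(m):
            for j in range(n):
                acc = float(a[i] @ b[:, j])
                # Outputs tile onto the PE array; corrupt those landing
                # on a faulty PE position.
                if (i % PE_ROWS, j % PE_COLS) in faulty_pes:
                    acc = inject_stuck_at(acc, bit, stuck_high)
                out[i, j] = acc
        return out

    # Compare a clean and a faulty forward pass of one small layer.
    rng = np.random.default_rng(0)
    x = rng.standard_normal((8, 16)) * 0.1
    w = rng.standard_normal((16, 16)) * 0.1
    clean = x @ w
    faulty = matmul_with_faults(x, w, faulty_pes={(2, 3)})
    print("mean |error|:", np.abs(clean - faulty).mean())

Running this shows that a stuck-at-1 fault on a high-order output bit (here bit 13) shifts every output mapped to the faulty PE by roughly 2^5 in Q8.8, whereas the same fault on a low-order bit perturbs outputs by at most about 2^-8, which a trained model's inherent fault tolerance can typically absorb. This illustrates why fault location, and not just fault rate, governs whether an accelerator chip can still meet the AI task's accuracy target and pass the yield test.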


