Multiscale Co-Design Analysis of Energy, Latency, Area, and Accuracy of a ReRAM Analog Neural Training Accelerator

07/31/2017
by Matthew J. Marinella, et al.

Neural networks are an increasingly attractive algorithm for natural language processing and pattern recognition applications. Deep networks with >50M parameters have been made possible by modern GPU clusters operating at <50 pJ per op and, more recently, by production accelerators capable of <5 pJ per operation at the board level. However, with the slowing of CMOS scaling, new paradigms will be required to achieve the next several orders of magnitude in performance-per-watt gains. Using an analog resistive memory (ReRAM) crossbar to perform key matrix operations in an accelerator is an attractive option that is gaining significant interest. This work presents a detailed design, using a state-of-the-art 14/16 nm PDK, of an analog crossbar circuit block designed to process three key kernels required in the training and inference of neural networks. A detailed circuit- and device-level analysis of energy, latency, area, and accuracy is given and compared to relevant designs using standard digital ReRAM and SRAM operations. It is shown that the analog accelerator has a 310x energy and 270x latency advantage over a similar block utilizing only digital ReRAM and takes only 11 fJ per multiply-and-accumulate (MAC). Although training accuracy is degraded in the analog accelerator, several options to improve it are presented. The possible gains over a similar digital-only version of this accelerator block suggest that continued optimization of analog resistive memories is valuable. This detailed circuit and device analysis of a training accelerator may serve as a foundation for further architecture-level studies.
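The abstract does not spell out the three kernels, but crossbar-based training accelerators conventionally map them to the forward vector-matrix multiply, the transposed multiply used in backpropagation, and a rank-one outer-product weight update performed in parallel across the array. The NumPy sketch below illustrates these three operations under that assumption; the array size, conductance encoding, and function names are illustrative only and are not taken from the paper.

```python
# Illustrative sketch of the three matrix kernels typically mapped onto an
# analog ReRAM crossbar during neural network training (assumed, not quoted
# from the paper): forward VMM, transposed VMM, and outer-product update.
import numpy as np

rows, cols = 256, 256                            # assumed crossbar dimensions
G = np.random.uniform(0.0, 1.0, (rows, cols))    # conductances standing in for weights

def forward_vmm(x):
    """Forward pass: vector-matrix multiply, one parallel analog read."""
    return x @ G

def backward_vmm(delta):
    """Backpropagation: multiply the error vector by the transpose of G."""
    return G @ delta

def outer_product_update(x, delta, lr=0.01):
    """Weight update: rank-one outer product programmed across the array."""
    global G
    G -= lr * np.outer(x, delta)

# One illustrative training step on random data
x = np.random.rand(rows)
y = forward_vmm(x)
delta = np.random.rand(cols)
e = backward_vmm(delta)
outer_product_update(x, delta)
```

In an analog implementation, each of these kernels completes in O(1) array operations rather than O(rows x cols) digital MACs, which is the source of the energy and latency advantages quantified above.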

