MF-Net: Compute-In-Memory SRAM for Multibit Precision Inference using Memory-immersed Data Conversion and Multiplication-free Operators

01/29/2021
by Shamma Nasrin, et al.

We propose a co-design approach for compute-in-memory inference for deep neural networks (DNN). We use multiplication-free function approximators based on the ℓ1 norm, along with a co-adapted processing array and compute flow. With this approach, we overcome several deficiencies in the current art of in-SRAM DNN processing, such as the need for digital-to-analog converters (DACs) at each operating SRAM row/column, the need for high-precision analog-to-digital converters (ADCs), limited support for multi-bit precision weights, and limited vector-scale parallelism. Our co-adapted implementation seamlessly extends to multi-bit precision weights, does not require DACs, and readily scales to higher vector-scale parallelism. We also propose an SRAM-immersed successive approximation ADC (SA-ADC) that exploits the parasitic capacitance of the SRAM bit lines as a capacitive DAC. Since the dominant area overhead in an SA-ADC stems from its capacitive DAC, exploiting the intrinsic parasitics of the SRAM array allows a low-area implementation of the within-SRAM SA-ADC. Our 8×62 SRAM macro, which requires a 5-bit ADC, achieves ∼105 tera operations per second per Watt (TOPS/W) with 8-bit input/weight processing in 45 nm CMOS.
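As a rough illustration of the multiplication-free idea, the sketch below uses the ℓ1-norm-based operator a ⊕ b = sign(a·b)(|a| + |b|), a form commonly used for multiplication-free function approximators, and sums it elementwise in place of a dot product. This is a minimal software sketch under that assumption; the function names are illustrative, and the macro's co-adapted operator, quantization, and compute flow differ in detail.

```python
import numpy as np

def mf_op(a, b):
    # l1-norm-based multiplication-free operator (assumed form):
    # a (+) b = sign(a * b) * (|a| + |b|)
    return np.sign(a) * np.sign(b) * (np.abs(a) + np.abs(b))

def mf_dot(x, w):
    # Multiplication-free analogue of a dot product: accumulate the
    # elementwise MF operator instead of elementwise products.
    return np.sum(mf_op(x, w))

# Toy comparison against the exact dot product
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 8)
w = rng.uniform(-1.0, 1.0, 8)
print("exact dot:", np.dot(x, w))
print("MF dot   :", mf_dot(x, w))
```

Because the operator uses only sign manipulation and addition, each accumulation step maps to add/compare hardware rather than multipliers, which is what lets the SRAM array and ADC budget stay small.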
