Edge AI without Compromise: Efficient, Versatile and Accurate Neurocomputing in Resistive Random-Access Memory

08/17/2021
by   Weier Wan, et al.
39

Realizing today's cloud-level artificial intelligence functionalities directly on devices distributed at the edge of the internet calls for edge hardware capable of processing multiple modalities of sensory data (e.g. video, audio) at unprecedented energy-efficiency. AI hardware architectures today cannot meet the demand due to a fundamental "memory wall": data movement between separate compute and memory units consumes large energy and incurs long latency. Resistive random-access memory (RRAM) based compute-in-memory (CIM) architectures promise to bring orders of magnitude energy-efficiency improvement by performing computation directly within memory. However, conventional approaches to CIM hardware design limit its functional flexibility necessary for processing diverse AI workloads, and must overcome hardware imperfections that degrade inference accuracy. Such trade-offs between efficiency, versatility and accuracy cannot be addressed by isolated improvements on any single level of the design. By co-optimizing across all hierarchies of the design from algorithms and architecture to circuits and devices, we present NeuRRAM - the first multimodal edge AI chip using RRAM CIM to simultaneously deliver a high degree of versatility for diverse model architectures, record energy-efficiency 5× - 8× better than prior art across various computational bit-precisions, and inference accuracy comparable to software models with 4-bit weights on all measured standard AI benchmarks including accuracy of 99.0 classification, 84.7 reduction in image reconstruction error on a Bayesian image recovery task. This work paves a way towards building highly efficient and reconfigurable edge AI hardware platforms for the more demanding and heterogeneous AI applications of the future.

READ FULL TEXT

page 21

page 22

page 23

page 24

page 28

page 32

page 33

page 34

research
09/12/2022

Bit-Line Computing for CNN Accelerators Co-Design in Edge AI Inference

By supporting the access of multiple memory words at the same time, Bit-...
research
05/13/2022

ImageSig: A signature transform for ultra-lightweight image recognition

This paper introduces a new lightweight method for image recognition. Im...
research
04/17/2023

Energy Efficiency Considerations for Popular AI Benchmarks

Advances in artificial intelligence need to become more resource-aware a...
research
07/31/2019

Tuning Algorithms and Generators for Efficient Edge Inference

A surge in artificial intelligence and autonomous technologies have incr...
research
09/28/2020

Breaking the Memory Wall for AI Chip with a New Dimension

Recent advancements in deep learning have led to the widespread adoption...
research
12/25/2022

CINM (Cinnamon): A Compilation Infrastructure for Heterogeneous Compute In-Memory and Compute Near-Memory Paradigms

The rise of data-intensive applications exposed the limitations of conve...
research
12/01/2022

CONVOLVE: Smart and seamless design of smart edge processors

With the rise of Deep Learning (DL), our world braces for AI in every ed...

Please sign up or login with your details

Forgot password? Click here to reset