Edge AI without Compromise: Efficient, Versatile and Accurate Neurocomputing in Resistive Random-Access Memory

by   Weier Wan, et al.

Realizing today's cloud-level artificial intelligence functionalities directly on devices distributed at the edge of the internet calls for edge hardware capable of processing multiple modalities of sensory data (e.g. video, audio) at unprecedented energy-efficiency. AI hardware architectures today cannot meet the demand due to a fundamental "memory wall": data movement between separate compute and memory units consumes large energy and incurs long latency. Resistive random-access memory (RRAM) based compute-in-memory (CIM) architectures promise to bring orders of magnitude energy-efficiency improvement by performing computation directly within memory. However, conventional approaches to CIM hardware design limit its functional flexibility necessary for processing diverse AI workloads, and must overcome hardware imperfections that degrade inference accuracy. Such trade-offs between efficiency, versatility and accuracy cannot be addressed by isolated improvements on any single level of the design. By co-optimizing across all hierarchies of the design from algorithms and architecture to circuits and devices, we present NeuRRAM - the first multimodal edge AI chip using RRAM CIM to simultaneously deliver a high degree of versatility for diverse model architectures, record energy-efficiency 5× - 8× better than prior art across various computational bit-precisions, and inference accuracy comparable to software models with 4-bit weights on all measured standard AI benchmarks including accuracy of 99.0 classification, 84.7 reduction in image reconstruction error on a Bayesian image recovery task. This work paves a way towards building highly efficient and reconfigurable edge AI hardware platforms for the more demanding and heterogeneous AI applications of the future.


page 21

page 22

page 23

page 24

page 28

page 32

page 33

page 34


Bit-Line Computing for CNN Accelerators Co-Design in Edge AI Inference

By supporting the access of multiple memory words at the same time, Bit-...

ImageSig: A signature transform for ultra-lightweight image recognition

This paper introduces a new lightweight method for image recognition. Im...

Energy Efficiency Considerations for Popular AI Benchmarks

Advances in artificial intelligence need to become more resource-aware a...

Tuning Algorithms and Generators for Efficient Edge Inference

A surge in artificial intelligence and autonomous technologies have incr...

Breaking the Memory Wall for AI Chip with a New Dimension

Recent advancements in deep learning have led to the widespread adoption...

CINM (Cinnamon): A Compilation Infrastructure for Heterogeneous Compute In-Memory and Compute Near-Memory Paradigms

The rise of data-intensive applications exposed the limitations of conve...

CONVOLVE: Smart and seamless design of smart edge processors

With the rise of Deep Learning (DL), our world braces for AI in every ed...

Please sign up or login with your details

Forgot password? Click here to reset