Bit-Line Computing for CNN Accelerators Co-Design in Edge AI Inference

09/12/2022
by   Marco Rios, et al.

By supporting simultaneous access to multiple memory words, Bit-line Computing (BC) architectures allow the parallel execution of bit-wise operations in-memory. Arithmetic operations are then derived at the array periphery with little additional overhead. This paradigm opens novel opportunities for Artificial Intelligence (AI) at the edge, thanks to the massive parallelism inherent in memory arrays and the extreme energy efficiency of computing in-situ, which avoids data transfers. Previous works have shown that BC brings disruptive gains in efficiency, a key metric in emerging edge AI scenarios, when targeting AI workloads. This manuscript builds on these findings by proposing an end-to-end framework that leverages BC-specific optimizations to enable high parallelism and aggressive compression of AI models. Our approach is supported by a novel hardware module performing real-time decoding, as well as new algorithms to enable BC-friendly model compression. Our hardware/software approach results in a 91 (for a 1 computing approaches.
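
The bit-line principle described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the authors' design. It assumes the commonly described SRAM behaviour in which activating two word lines at once yields the AND of the stored bits on the bit line and their NOR on the complementary bit line. It also assumes a plain ripple-carry adder at the periphery. The names `bitline_read` and `periphery_add` and the 8-bit word width are hypothetical, chosen for illustration.

```python
from typing import Tuple


def bitline_read(word_a: int, word_b: int, width: int = 8) -> Tuple[int, int]:
    """Model a simultaneous read of two memory words (assumed behaviour).

    Returns (AND, NOR) of the two words, one bit per array column.
    """
    mask = (1 << width) - 1
    and_bits = word_a & word_b              # bit line stays high only if both cells hold 1
    nor_bits = ~(word_a | word_b) & mask    # complementary line stays high only if both hold 0
    return and_bits, nor_bits


def periphery_add(word_a: int, word_b: int, width: int = 8) -> int:
    """Derive an addition from the bit-wise results (hypothetical periphery logic)."""
    mask = (1 << width) - 1
    and_bits, nor_bits = bitline_read(word_a, word_b, width)
    xor_bits = ~(and_bits | nor_bits) & mask  # XOR = NOT(AND OR NOR)

    result = 0
    carry = 0
    for i in range(width):
        generate = (and_bits >> i) & 1    # both operand bits are 1
        propagate = (xor_bits >> i) & 1   # exactly one operand bit is 1
        result |= (propagate ^ carry) << i
        carry = generate | (propagate & carry)
    return result & mask                  # the sum wraps modulo 2**width


if __name__ == "__main__":
    print(bitline_read(0b1100, 0b1010, width=4))
    print(periphery_add(5, 3))
```

In this model every column produces its AND and NOR bits in the same step, which is where the parallelism comes from; only the carry chain in the adder is sequential.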

