End-to-end 100-TOPS/W Inference With Analog In-Memory Computing: Are We There Yet?

09/03/2021
by Gianmarco Ottavi, et al.

In-Memory Acceleration (IMA) promises major efficiency improvements in deep neural network (DNN) inference, but challenges remain in integrating IMA within a digital system. We propose a heterogeneous architecture coupling 8 RISC-V cores with an IMA in a shared-memory cluster, and analyze the benefits and trade-offs of in-memory computing on the realistic use case of a MobileNetV2 bottleneck layer. We explore several IMA integration strategies, analyzing performance, area, and energy efficiency. We show that while pointwise layers achieve significant speed-ups over a software implementation, on depthwise layers the inability to efficiently map parameters onto the accelerator leads to a significant trade-off between throughput and area. We propose a hybrid solution in which pointwise convolutions are executed on the IMA and depthwise convolutions on the cluster cores, achieving a speed-up of 3x over SW execution while saving 50% of area compared to an all-in IMA solution with similar performance.
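The mapping problem on depthwise layers can be illustrated with a rough count of occupied array cells. The sketch below is not taken from the paper: it assumes a naive mapping in which each layer is stored as a single dense weight matrix on the in-memory array, and the function names and the channel count of 128 are hypothetical values chosen only for illustration.

```python
# Illustrative sketch only. The dense-matrix mapping model, the function
# names, and the layer sizes are assumptions, not figures from the paper.

def pointwise_cells(c_in, c_out):
    """Cells for a 1x1 convolution stored as a dense (c_in x c_out) matrix.

    Every input channel feeds every output channel, so all cells hold weights.
    Returns (cells holding a weight, cells occupied on the array).
    """
    used = c_in * c_out
    occupied = c_in * c_out
    return used, occupied


def depthwise_cells(channels, k=3):
    """Cells for a k x k depthwise convolution stored as one dense matrix.

    Each output channel reads only its own k*k inputs, so the matrix has
    (channels * k * k) rows and (channels) columns but is block-diagonal:
    only channels * k * k cells hold a weight, the rest are zeros.
    """
    used = channels * k * k
    occupied = (channels * k * k) * channels
    return used, occupied


def utilization(used, occupied):
    """Fraction of occupied cells that actually store a weight."""
    return used / occupied


if __name__ == "__main__":
    channels = 128  # hypothetical channel count
    print("pointwise:", utilization(*pointwise_cells(channels, channels)))
    print("depthwise:", utilization(*depthwise_cells(channels)))
```

Under this assumed mapping, the pointwise layer fills every cell it occupies, while the depthwise layer uses only one cell in every `channels` occupied. This is one way to read the throughput-versus-area trade-off described in the abstract, and why running depthwise convolutions on the cores can be attractive.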


