Impact of On-Chip Interconnect on In-Memory Acceleration of Deep Neural Networks

07/06/2021
by   Gokul Krishnan, et al.
53

With the widespread use of Deep Neural Networks (DNNs), machine learning algorithms have evolved in two diverse directions – one with ever-increasing connection density for better accuracy and the other with more compact sizing for energy efficiency. The increase in connection density increases on-chip data movement, which makes efficient on-chip communication a critical function of the DNN accelerator. The contribution of this work is threefold. First, we illustrate that the point-to-point (P2P)-based interconnect is incapable of handling a high volume of on-chip data movement for DNNs. Second, we evaluate P2P and network-on-chip (NoC) interconnect (with a regular topology such as a mesh) for SRAM- and ReRAM-based in-memory computing (IMC) architectures for a range of DNNs. This analysis shows the necessity for the optimal interconnect choice for an IMC DNN accelerator. Finally, we perform an experimental evaluation for different DNNs to empirically obtain the performance of the IMC architecture with both NoC-tree and NoC-mesh. We conclude that, at the tile level, NoC-tree is appropriate for compact DNNs employed at the edge, and NoC-mesh is necessary to accelerate DNNs with high connection density. Furthermore, we propose a technique to determine the optimal choice of interconnect for any given DNN. In this technique, we use analytical models of NoC to evaluate end-to-end communication latency of any given DNN. We demonstrate that the interconnect optimization in the IMC architecture results in up to 6× improvement in energy-delay-area product for VGG-19 inference compared to the state-of-the-art ReRAM-based IMC architectures.

READ FULL TEXT

page 1

page 15

page 16

page 17

research
08/14/2021

SIAM: Chiplet-based Scalable In-Memory Acceleration with Mesh for Deep Neural Networks

In-memory computing (IMC) on a monolithic chip for deep learning faces d...
research
08/17/2020

DORY: Automatic End-to-End Deployment of Real-World DNNs on Low-Cost IoT MCUs

The deployment of Deep Neural Networks (DNNs) on end-nodes at the extrem...
research
03/03/2022

Weightless Neural Networks for Efficient Edge Inference

Weightless Neural Networks (WNNs) are a class of machine learning model ...
research
12/03/2019

Understanding the Impact of On-chip Communication on DNN Accelerator Performance

Deep Neural Networks have flourished at an unprecedented pace in recent ...
research
07/10/2018

Eyeriss v2: A Flexible and High-Performance Accelerator for Emerging Deep Neural Networks

The design of DNNs has increasingly focused on reducing the computationa...
research
05/04/2023

CAMEL: Co-Designing AI Models and Embedded DRAMs for Efficient On-Device Learning

The emergence of the Internet of Things (IoT) has resulted in a remarkab...
research
07/18/2021

Domino: A Tailored Network-on-Chip Architecture to Enable Highly Localized Inter- and Intra-Memory DNN Computing

The ever-increasing computation complexity of fast-growing Deep Neural N...

Please sign up or login with your details

Forgot password? Click here to reset