Understanding the Impact of On-chip Communication on DNN Accelerator Performance

12/03/2019
by   Robert Guirado, et al.
0

Deep Neural Networks have flourished at an unprecedented pace in recent years. They have achieved outstanding accuracy in fields such as computer vision, natural language processing, medicine or economics. Specifically, Convolutional Neural Networks (CNN) are particularly suited to object recognition or identification tasks. This, however, comes at a high computational cost, prompting the use of specialized GPU architectures or even ASICs to achieve high speeds and energy efficiency. ASIC accelerators streamline the execution of certain dataflows amenable to CNN computation that imply the constant movement of large amounts of data, thereby turning on-chip communication into a critical function within the accelerator. This paper studies the communication flows within CNN inference accelerators of edge devices, with the aim to justify current and future decisions in the design of the on-chip networks that interconnect their processing elements. Leveraging this analysis, we then qualitatively discuss the potential impact of introducing the novel paradigm of wireless on-chip network in this context.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
05/20/2016

Ristretto: Hardware-Oriented Approximation of Convolutional Neural Networks

Convolutional neural networks (CNN) have achieved major breakthroughs in...
research
11/08/2019

Communication Lower Bound in Convolution Accelerators

In current convolutional neural network (CNN) accelerators, communicatio...
research
07/06/2021

Impact of On-Chip Interconnect on In-Memory Acceleration of Deep Neural Networks

With the widespread use of Deep Neural Networks (DNNs), machine learning...
research
12/05/2017

On-Chip Communication Network for Efficient Training of Deep Convolutional Networks on Heterogeneous Manycore Systems

Convolutional Neural Networks (CNNs) have shown a great deal of success ...
research
03/25/2023

Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures

Executing machine learning inference tasks on resource-constrained edge ...
research
08/26/2021

Efficient On-Chip Communication for Parallel Graph-Analytics on Spatial Architectures

Large-scale graph processing has drawn great attention in recent years. ...
research
12/02/2018

Training for 'Unstable' CNN Accelerator:A Case Study on FPGA

With the great advancements of convolution neural networks(CNN), CNN acc...

Please sign up or login with your details

Forgot password? Click here to reset