Understanding the Impact of On-chip Communication on DNN Accelerator Performance

by   Robert Guirado, et al.

Deep Neural Networks have flourished at an unprecedented pace in recent years. They have achieved outstanding accuracy in fields such as computer vision, natural language processing, medicine or economics. Specifically, Convolutional Neural Networks (CNN) are particularly suited to object recognition or identification tasks. This, however, comes at a high computational cost, prompting the use of specialized GPU architectures or even ASICs to achieve high speeds and energy efficiency. ASIC accelerators streamline the execution of certain dataflows amenable to CNN computation that imply the constant movement of large amounts of data, thereby turning on-chip communication into a critical function within the accelerator. This paper studies the communication flows within CNN inference accelerators of edge devices, with the aim to justify current and future decisions in the design of the on-chip networks that interconnect their processing elements. Leveraging this analysis, we then qualitatively discuss the potential impact of introducing the novel paradigm of wireless on-chip network in this context.


page 1

page 2

page 3

page 4


Ristretto: Hardware-Oriented Approximation of Convolutional Neural Networks

Convolutional neural networks (CNN) have achieved major breakthroughs in...

Communication Lower Bound in Convolution Accelerators

In current convolutional neural network (CNN) accelerators, communicatio...

Impact of On-Chip Interconnect on In-Memory Acceleration of Deep Neural Networks

With the widespread use of Deep Neural Networks (DNNs), machine learning...

On-Chip Communication Network for Efficient Training of Deep Convolutional Networks on Heterogeneous Manycore Systems

Convolutional Neural Networks (CNNs) have shown a great deal of success ...

Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures

Executing machine learning inference tasks on resource-constrained edge ...

Efficient On-Chip Communication for Parallel Graph-Analytics on Spatial Architectures

Large-scale graph processing has drawn great attention in recent years. ...

Training for 'Unstable' CNN Accelerator:A Case Study on FPGA

With the great advancements of convolution neural networks(CNN), CNN acc...

Please sign up or login with your details

Forgot password? Click here to reset