EcoFlow: Efficient Convolutional Dataflows for Low-Power Neural Network Accelerators

02/04/2022
by   Lois Orosa, et al.

Dilated and transposed convolutions are widely used in modern convolutional neural networks (CNNs). These kernels are used extensively during CNN training and inference of applications such as image segmentation and high-resolution image generation. Although these kernels have grown in popularity, they stress current compute systems due to their high memory intensity, exascale compute demands, and large energy consumption. We find that commonly-used low-power CNN inference accelerators based on spatial architectures are not optimized for either of these convolutional kernels. Dilated and transposed convolutions introduce significant zero padding when mapped to the underlying spatial architecture, significantly degrading performance and energy efficiency. Existing approaches that address this issue require significant design changes to the otherwise simple, efficient, and well-adopted architectures used to compute direct convolutions. To address this challenge, we propose EcoFlow, a new set of dataflows and mapping algorithms for dilated and transposed convolutions. These algorithms are tailored to execute efficiently on existing low-cost, small-scale spatial architectures and require minimal changes to the network-on-chip of existing accelerators. EcoFlow eliminates zero padding through careful dataflow orchestration and data mapping tailored to the spatial architecture. EcoFlow enables flexible and high-performance transposed and dilated convolutions on architectures that are otherwise optimized for CNN inference. We evaluate the efficiency of EcoFlow on CNN training workloads and Generative Adversarial Network (GAN) training workloads. Experiments in our new cycle-accurate simulator show that EcoFlow 1) reduces end-to-end CNN training time between 7-85%, and 2) improves end-to-end GAN training performance between 29-42%, compared to a state-of-the-art CNN inference accelerator.
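To see where the zero padding that the abstract refers to comes from, consider how a transposed convolution is mapped onto a direct-convolution dataflow: the input is "zero-stuffed" (stride-1 zeros inserted between elements, plus border padding) and then fed through an ordinary convolution, so most multiply operands are zeros. The following minimal NumPy sketch (an illustration of the general technique, not code from the paper) shows this for a 1-D case; `zero_insert` and `conv1d` are hypothetical helper names chosen here for clarity.

```python
import numpy as np

def zero_insert(x, stride):
    """Insert stride-1 zeros between input elements, as done when a
    transposed convolution is mapped onto a direct-convolution engine."""
    out = np.zeros(len(x) + (len(x) - 1) * (stride - 1))
    out[::stride] = x
    return out

def conv1d(x, w):
    """Direct 1-D 'valid' convolution (cross-correlation, for simplicity)."""
    k = len(w)
    return np.array([np.dot(x[i:i + k], w) for i in range(len(x) - k + 1)])

x = np.array([1.0, 2.0, 3.0])       # small example input
w = np.array([1.0, 1.0, 1.0])       # 3-tap kernel

# Transposed convolution with stride 2, expressed as a direct
# convolution over a zero-stuffed, border-padded input.
stuffed = np.pad(zero_insert(x, stride=2), 2)   # [0 0 1 0 2 0 3 0 0]
y = conv1d(stuffed, w)

# Fraction of input operands that are zeros: these positions still
# occupy PEs and memory bandwidth in a naive spatial-array mapping,
# which is the waste EcoFlow's dataflows are designed to eliminate.
zero_frac = 1.0 - len(x) / len(stuffed)
```

Here two thirds of the operands fed to the array are zeros; for 2-D kernels with larger strides the wasted fraction grows further, which is why eliminating this padding in the dataflow itself pays off.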


