FuSeConv: Fully Separable Convolutions for Fast Inference on Systolic Arrays

05/27/2021
by   Surya Selvam, et al.

Both efficient neural networks and hardware accelerators are being explored to speed up DNN inference on edge devices. For example, MobileNet uses depthwise separable convolution to achieve much lower latency, while systolic arrays provide much higher performance per watt. Interestingly, however, the combination of these two ideas is inefficient: the computational patterns of depthwise separable convolution are not systolic and lack the data reuse needed to saturate the systolic array's constrained dataflow. In this paper, we propose FuSeConv (Fully-Separable Convolution) as a drop-in replacement for depthwise separable convolution. FuSeConv generalizes the decomposition, fully factoring convolutions into separable 1D convolutions along the spatial and depth dimensions. The resultant computation is systolic and efficiently utilizes the systolic array with a slightly modified dataflow. With FuSeConv, we achieve a significant speed-up of 3x-7x with the MobileNet family of networks on a systolic array of size 64x64, with comparable accuracy on the ImageNet dataset. The high speed-up motivates exploration of hardware-aware Neural Operator Search (NOS) as a complement to ongoing efforts on Neural Architecture Search (NAS).
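
To make the decomposition concrete, the sketch below counts multiply-accumulate operations (MACs) for one depthwise separable layer and for one possible fully separable layer. It is an illustration only, not the authors' implementation. The function names, the stride-1 "same"-padding setting, and the choice to apply a 1xK row filter to half of the channels and a Kx1 column filter to the other half are all assumptions made for this example; the abstract states only that the 2D depthwise filter is replaced by separable 1D convolutions.

```python
def depthwise_separable_macs(h, w, c_in, c_out, k):
    """MACs for a KxK depthwise conv followed by a 1x1 pointwise conv.

    Assumes stride 1 and 'same' padding, so the output is h x w.
    """
    depthwise = h * w * c_in * k * k   # one KxK filter per input channel
    pointwise = h * w * c_in * c_out   # 1x1 conv mixes channels
    return depthwise + pointwise


def fully_separable_macs(h, w, c_in, c_out, k):
    """MACs for a hypothetical fully separable layer.

    ASSUMPTION (not stated in the abstract): half of the channels get a
    1xK row filter, the remaining channels get a Kx1 column filter, and
    a 1x1 pointwise conv then mixes all channels.
    """
    row_channels = c_in // 2
    col_channels = c_in - row_channels
    row = h * w * row_channels * k     # 1D filter along the width
    col = h * w * col_channels * k     # 1D filter along the height
    pointwise = h * w * c_in * c_out
    return row + col + pointwise


if __name__ == "__main__":
    # Example layer shape chosen for illustration, not taken from the paper.
    args = dict(h=14, w=14, c_in=256, c_out=256, k=3)
    print("depthwise separable:", depthwise_separable_macs(**args))
    print("fully separable    :", fully_separable_macs(**args))
```

Under these assumptions the MAC counts differ by only a few percent, because the pointwise convolution dominates both. The reported 3x-7x speed-up is therefore attributed in the abstract to better utilization of the systolic array, since 1D convolutions map onto it with data reuse, and not to a reduction in arithmetic.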


research
10/21/2019

Depth-wise Decomposition for Accelerating Separable Convolutions in Efficient Convolutional Neural Networks

Very deep convolutional neural networks (CNNs) have been firmly establis...
research
07/21/2022

Efficient CNN Architecture Design Guided by Visualization

Modern efficient Convolutional Neural Networks (CNNs) always use Depthwis...
research
10/02/2020

Rotated Ring, Radial and Depth Wise Separable Radial Convolutions

Simple image rotations significantly reduce the accuracy of deep neural ...
research
05/09/2022

Augmentations: An Insight into their Effectiveness on Convolution Neural Networks

Augmentations are the key factor in determining the performance of any n...
research
12/12/2018

Concentrated-Comprehensive Convolutions for lightweight semantic segmentation

The semantic segmentation requires a lot of computational cost. The dila...
research
04/29/2021

CASSOD-Net: Cascaded and Separable Structures of Dilated Convolution for Embedded Vision Systems and Applications

The field of view (FOV) of convolutional neural networks is highly relat...
research
01/28/2023

Efficient Latency-Aware CNN Depth Compression via Two-Stage Dynamic Programming

Recent works on neural network pruning advocate that reducing the depth ...
