The Indirect Convolution Algorithm

07/03/2019
by Marat Dukhan, et al.

Deep learning frameworks commonly implement convolution operators with GEMM-based algorithms. In these algorithms, convolution is implemented on top of matrix-matrix multiplication (GEMM) functions provided by highly optimized BLAS libraries. Convolutions with 1x1 kernels can be represented directly as a GEMM call, but convolutions with larger kernels require a special memory layout transformation (im2col or im2row) to fit the GEMM interface. The Indirect Convolution algorithm provides the efficiency of the GEMM primitive without the overhead of the im2col transformation. In contrast to GEMM-based algorithms, the Indirect Convolution algorithm does not reshuffle the data to fit the GEMM primitive; instead, it introduces an indirection buffer, a buffer of pointers to the start of each row of image pixels. This broadens the application of our modified GEMM function to convolutions with arbitrary kernel size, padding, stride, and dilation. The Indirect Convolution algorithm reduces memory overhead proportionally to the number of input channels and outperforms the GEMM-based algorithm by up to 62% due to the elimination of the im2col transformation used in GEMM-based algorithms. This, however, comes at the cost of a minor performance reduction on 1x1 stride-1 convolutions.
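The indirection-buffer idea described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names `build_indirection_buffer` and `indirect_conv2d`, the nested-list NHWC-style layout (`image[y][x]` is a list of channel values), the weight layout `weights[oc][ky][kx][c]`, and the shared zero row used for padded positions are all assumptions made for illustration. Python list references stand in for the pointers, and a plain loop stands in for the modified GEMM kernel.

```python
def build_indirection_buffer(image, kh, kw, stride=1, padding=0, dilation=1):
    """Return (buffer, out_h, out_w).

    buffer holds one entry per output pixel; each entry is a list of
    kh*kw references to input pixel rows (lists of channel values).
    No pixel data is copied, unlike im2col.
    """
    in_h, in_w = len(image), len(image[0])
    channels = len(image[0][0])
    zero_row = [0] * channels  # assumption: one shared row stands in for padding
    out_h = (in_h + 2 * padding - dilation * (kh - 1) - 1) // stride + 1
    out_w = (in_w + 2 * padding - dilation * (kw - 1) - 1) // stride + 1
    buffer = []
    for oy in range(out_h):
        for ox in range(out_w):
            taps = []
            for ky in range(kh):
                for kx in range(kw):
                    iy = oy * stride - padding + ky * dilation
                    ix = ox * stride - padding + kx * dilation
                    inside = 0 <= iy < in_h and 0 <= ix < in_w
                    taps.append(image[iy][ix] if inside else zero_row)
            buffer.append(taps)
    return buffer, out_h, out_w


def indirect_conv2d(image, weights, stride=1, padding=0, dilation=1):
    """Convolve by walking the indirection buffer instead of an im2col matrix."""
    kh, kw = len(weights[0]), len(weights[0][0])
    buffer, out_h, out_w = build_indirection_buffer(
        image, kh, kw, stride, padding, dilation
    )
    # Flatten each filter to [kh*kw][channels] so it lines up with the taps.
    flat_weights = [[tap for row in w for tap in row] for w in weights]
    flat_out = []
    for taps in buffer:
        pixel_out = []
        for filt in flat_weights:
            acc = 0
            for pixel, wtap in zip(taps, filt):
                for x, w in zip(pixel, wtap):
                    acc += x * w
            pixel_out.append(acc)
        flat_out.append(pixel_out)
    # Reshape the flat list of output pixels into out_h rows of out_w pixels.
    return [flat_out[y * out_w:(y + 1) * out_w] for y in range(out_h)]


# Usage: 3x3 single-channel image, one 3x3 all-ones filter, padding 1.
image = [[[1], [2], [3]], [[4], [5], [6]], [[7], [8], [9]]]
weights = [[[[1], [1], [1]] for _ in range(3)]]
result = indirect_conv2d(image, weights, padding=1)
print([px[0] for px in result[0]])  # [12, 21, 16]
```

In this sketch the buffer stores one reference per (output pixel, kernel tap) pair, whereas an im2col matrix would store a copy of every channel value for each such pair. That difference is where the abstract's claim of a memory saving proportional to the number of input channels comes from.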


research
06/21/2017

MEC: Memory-efficient Convolution for Deep Neural Network

Convolution is a critical component in modern deep neural networks, thus...
research
06/25/2023

Im2win: Memory Efficient Convolution On SIMD Architectures

Convolution is the most expensive operation among neural network operati...
research
06/25/2023

Im2win: An Efficient Convolution Paradigm on GPU

Convolution is the most time-consuming operation in deep neural network ...
research
09/20/2018

High Performance Zero-Memory Overhead Direct Convolutions

The computation of convolution layers in deep neural networks typically ...
research
06/24/2022

Towards Effective Depthwise Convolutions on ARMv8 Architecture

Depthwise convolutions are widely used in lightweight convolutional neur...
research
03/20/2019

Convolution with even-sized kernels and symmetric padding

Compact convolutional neural networks gain efficiency mainly through dep...
research
07/25/2019

HUGE2: a Highly Untangled Generative-model Engine for Edge-computing

As a type of prominent studies in deep learning, generative models have ...
