High Performance and Portable Convolution Operators for ARM-based Multicore Processors

05/13/2020
by Pablo San Juan, et al.

The considerable impact of Convolutional Neural Networks on many Artificial Intelligence tasks has led to the development of various high performance algorithms for the convolution operator present in this type of network. One of these approaches applies the IM2COL transform followed by a general matrix multiplication (GEMM) in order to take advantage of the highly optimized realizations of the GEMM kernel in many linear algebra libraries. The main problems of this approach are 1) the large memory workspace required to host the intermediate matrices generated by the IM2COL transform; and 2) the time to perform the IM2COL transform, which is not negligible for complex neural networks. This paper presents a portable high performance convolution algorithm based on the BLIS realization of the GEMM kernel that avoids the use of the intermediate memory by taking advantage of the BLIS structure. In addition, the proposed algorithm eliminates the cost of the explicit IM2COL transform, while maintaining the portability and performance of the underlying realization of GEMM in BLIS.
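To make the baseline concrete, the following is a minimal sketch of the IM2COL-plus-GEMM approach that the abstract describes as the starting point. It is not the paper's proposed algorithm and does not use BLIS. The function names (`im2col`, `gemm`, `conv2d_im2col`, `conv2d_direct`) are hypothetical, the convolution is single-channel with stride 1 and no padding, and the naive `gemm` only stands in for an optimized library kernel.

```python
def im2col(image, kh, kw):
    """Unroll every kh x kw patch of a 2D image into one column.

    Assumes stride 1 and no padding. Returns the workspace matrix
    (kh*kw rows, oh*ow columns) plus the output height and width.
    """
    h, w = len(image), len(image[0])
    oh, ow = h - kh + 1, w - kw + 1
    cols = [[0] * (oh * ow) for _ in range(kh * kw)]
    for i in range(kh):
        for j in range(kw):
            row = i * kw + j  # one row per kernel element
            for y in range(oh):
                for x in range(ow):
                    cols[row][y * ow + x] = image[y + i][x + j]
    return cols, oh, ow


def gemm(a, b):
    """Naive matrix product; a stand-in for an optimized GEMM kernel."""
    inner, width = len(b), len(b[0])
    return [[sum(row[p] * b[p][c] for p in range(inner)) for c in range(width)]
            for row in a]


def conv2d_im2col(image, kernels):
    """Convolve one image with a list of kh x kw filters via IM2COL + GEMM."""
    kh, kw = len(kernels[0]), len(kernels[0][0])
    cols, oh, ow = im2col(image, kh, kw)
    # Flatten each filter into one row so that GEMM computes all outputs.
    filt = [[k[i][j] for i in range(kh) for j in range(kw)] for k in kernels]
    flat = gemm(filt, cols)
    # Reshape each flat output row back into an oh x ow map.
    return [[row[y * ow:(y + 1) * ow] for y in range(oh)] for row in flat]


def conv2d_direct(image, kernels):
    """Reference direct convolution (cross-correlation, as used in CNNs)."""
    kh, kw = len(kernels[0]), len(kernels[0][0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[[sum(k[i][j] * image[y + i][x + j]
                  for i in range(kh) for j in range(kw))
              for x in range(ow)] for y in range(oh)] for k in kernels]
```

In this sketch, a 4 x 4 input with a 2 x 2 filter produces a workspace of 4 x 9 = 36 elements for only 16 input elements, which illustrates the first problem the abstract lists: the IM2COL matrix replicates input data and grows with the filter size. The nested loops in `im2col` correspond to the second problem, the cost of the explicit transform.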


research
02/15/2023

Toward matrix multiplication for deep learning inference on the Xilinx Versal

The remarkable positive impact of Deep Neural Networks on many Artificia...
research
07/11/2023

MG3MConv: Multi-Grained Matrix-Multiplication-Mapping Convolution Algorithm toward the SW26010 Processor

As the core of artificial intelligence applications, the research of con...
research
09/25/2021

NUMA-aware FFT-based Convolution on ARMv8 Many-core CPUs

Convolutional Neural Networks (CNNs), one of the most representative alg...
research
05/15/2023

Fast Matrix Multiplication via Compiler-only Layered Data Reorganization and Intrinsic Lowering

The resurgence of machine learning has increased the demand for high-per...
research
09/20/2018

High Performance Zero-Memory Overhead Direct Convolutions

The computation of convolution layers in deep neural networks typically ...
research
07/22/2019

Recursion, Probability, Convolution and Classification for Computations

The main motivation of this work was practical, to offer computationally...
research
12/15/2019

A Tutorial and Open Source Software for the Efficient Evaluation of Gravity and Magnetic Kernels

Fast computation of three-dimensional gravity and magnetic forward model...
