NUMA-aware FFT-based Convolution on ARMv8 Many-core CPUs

09/25/2021
by Xiandong Huang, et al.

Convolutional Neural Networks (CNNs), one of the most representative algorithms of deep learning, are widely used in various artificial intelligence applications. Convolution operations often account for most of the computational overhead of CNNs. The FFT-based algorithm can improve the efficiency of convolution by reducing its algorithmic complexity, and there is a large body of work on high-performance implementations of FFT-based convolution on many-core CPUs. However, existing work does not optimize for the non-uniform memory access (NUMA) characteristics of many-core CPUs. In this paper, we present a NUMA-aware FFT-based convolution implementation on ARMv8 many-core CPUs with NUMA architectures. The implementation reduces the number of remote memory accesses through data reordering in the FFT transformations and three-level parallelization of the complex matrix multiplication. Experimental results on an ARMv8 many-core CPU with a NUMA architecture demonstrate that our NUMA-aware implementation performs much better than the state-of-the-art work in most cases.
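
To illustrate the idea the abstract builds on, the following is a minimal sketch of FFT-based convolution in one dimension: both inputs are zero-padded, transformed, multiplied element-wise, and transformed back, which gives the same result as the direct sliding-window sum. This is not the paper's implementation; it shows none of the NUMA-aware data reordering or the three-level parallel complex matrix multiplication. The function names (fft, direct_convolve, fft_convolve) are hypothetical and chosen only for this example.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 Cooley-Tukey FFT (unnormalized).

    Assumes len(x) is a power of two. With inverse=True the sign of the
    exponent is flipped; the caller divides by the length afterwards.
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def direct_convolve(signal, kernel):
    """Direct linear convolution: O(len(signal) * len(kernel)) multiplies."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, w in enumerate(kernel):
            out[i + j] += s * w
    return out


def fft_convolve(signal, kernel):
    """Linear convolution through the frequency domain.

    Both inputs are zero-padded to a power-of-two length that is at least
    the full output length, so the circular convolution computed by the
    FFT matches the linear one.
    """
    out_len = len(signal) + len(kernel) - 1
    size = 1
    while size < out_len:
        size *= 2
    fs = fft(list(signal) + [0.0] * (size - len(signal)))
    fk = fft(list(kernel) + [0.0] * (size - len(kernel)))
    product = [a * b for a, b in zip(fs, fk)]  # element-wise multiply
    result = fft(product, inverse=True)
    return [v.real / size for v in result[:out_len]]
```

In a CNN layer the same element-wise product is accumulated over input channels for every output channel and frequency point, which is why it can be organized as the complex matrix multiplication mentioned in the abstract.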

research
07/11/2023

MG3MConv: Multi-Grained Matrix-Multiplication-Mapping Convolution Algorithm toward the SW26010 Processor

As the core of artificial intelligence applications, the research of con...
research
09/08/2022

Kernel-Segregated Transpose Convolution Operation

Transpose convolution has shown prominence in many deep learning applica...
research
10/12/2018

Compact NUMA-Aware Locks

Modern multi-socket architectures exhibit non-uniform memory access (NUM...
research
03/04/2019

Efficient Winograd or Cook-Toom Convolution Kernel Implementation on Widely Used Mobile CPUs

The Winograd or Cook-Toom class of algorithms help to reduce the overall...
research
05/13/2020

High Performance and Portable Convolution Operators for ARM-based Multicore Processors

The considerable impact of Convolutional Neural Networks on many Artific...
research
11/01/2021

Fast Convolution based on Winograd Minimum Filtering: Introduction and Development

Convolutional Neural Network (CNN) has been widely used in various field...
research
01/24/2021

Analytical Characterization and Design Space Exploration for Optimization of CNNs

Moving data through the memory hierarchy is a fundamental bottleneck tha...
