Minimizing communication in the multidimensional FFT

03/22/2022
by Thomas Koopman, et al.

We present a parallel algorithm for the fast Fourier transform (FFT) in higher dimensions. This algorithm generalizes the cyclic-to-cyclic one-dimensional parallel algorithm to a cyclic-to-cyclic multidimensional parallel algorithm while retaining the property of needing only a single all-to-all communication step. This holds under the constraint that we use at most √N processors for an FFT on an array with a total of N elements, irrespective of the dimension d or the shape of the array. The only assumption we make is that N is sufficiently composite. Our algorithm starts and ends in the same distribution. We present our multidimensional implementation FFTU, which uses the sequential FFTW program for its local FFTs and can handle any dimension d. We obtain experimental results for d ≤ 5 using MPI on up to 4096 cores of the supercomputer Snellius, comparing FFTU with the parallel FFTW program and with PFFT. These results show that FFTU is competitive with the state of the art and that it allows the use of a larger number of processors, while keeping communication limited to a single all-to-all operation. For arrays of size 1024^3 and 64^5, FFTU achieves speedups of a factor 149 and 176, respectively, on 4096 processors.
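To make the cyclic-to-cyclic idea concrete, here is a minimal one-dimensional sketch that simulates p processors inside a single Python process. It is not the FFTU implementation and is not taken from the paper: the function names `dft` and `cyclic_fft_1d` are hypothetical, the naive `dft` merely stands in for a fast local routine such as FFTW, and the all-to-all is simulated by list slicing rather than MPI. The sketch assumes that p² divides n, which is one simple way to satisfy both the p ≤ √N bound and the "sufficiently composite" requirement in the one-dimensional case.

```python
import cmath

def dft(v):
    """Naive O(len^2) DFT; a stand-in for a fast local FFT routine."""
    n = len(v)
    return [sum(v[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def cyclic_fft_1d(x, p):
    """Simulate a 1D FFT on p processors, cyclic in and cyclic out.

    Returns a list `out` where out[t] holds y[t], y[t+p], y[t+2p], ...
    for the full transform y of x. Assumes p*p divides len(x).
    """
    n = len(x)
    if n % (p * p) != 0:
        raise ValueError("this sketch needs p*p to divide n")
    m = n // p          # local vector length
    cols = m // p       # entries each processor receives from each peer

    # Cyclic distribution: simulated processor s owns x[s], x[s+p], ...
    local = [x[s::p] for s in range(p)]

    # Local transforms of length m, followed by twiddle factors w_n^(s*k1).
    local = [dft(v) for v in local]
    local = [[v[k1] * cmath.exp(-2j * cmath.pi * s * k1 / n) for k1 in range(m)]
             for s, v in enumerate(local)]

    # The single all-to-all: processor t collects, from every processor s,
    # the entries whose index k1 satisfies k1 % p == t.
    recv = [[local[s][t::p] for s in range(p)] for t in range(p)]

    # Local transforms of length p across the sender index s.
    out = []
    for t in range(p):
        rows = [[0j] * cols for _ in range(p)]
        for q in range(cols):
            col = dft([recv[t][s][q] for s in range(p)])
            for k2 in range(p):
                rows[k2][q] = col[k2]
        # Row-major flattening yields y[t::p] in order, i.e. cyclic again.
        out.append([val for row in rows for val in row])
    return out
```

Calling `cyclic_fft_1d(x, 4)` on a length-32 list returns four lists of eight values, and the list at index t matches every fourth entry of the full transform starting at t. The output lands in the same cyclic distribution as the input because global index k1 + m·k2 has the same remainder modulo p as k1 whenever p divides m.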


