Parallelization of the FFT on SO(3)

08/02/2018
by Denis-Michael Lux, et al.

In this paper, a work-optimal parallelization of Kostelec and Rockmore's well-known fast Fourier transform and its inverse on the three-dimensional rotation group SO(3) is designed, implemented, and tested. To this end, the sequential algorithms are first reviewed briefly. In the subsequent design and implementation of the parallel algorithms, we use Foster's well-known PCAM method and the OpenMP standard. The parallelization itself is based on symmetries of the underlying basis functions and on a geometric approach in which the resulting index range is transformed so that distinct work packages can be distributed efficiently to the computation nodes. The practical benefit of the parallel algorithms is demonstrated in a benchmark test assessing speedup and efficiency on a system with 64 cores. Here, for the first time, we present positive results for the full transforms at bandwidth 512, which is critical with respect to both accuracy and memory. Using all 64 available cores, the speedup for the largest considered bandwidths 128, 256, and 512 amounted to 29.57, 36.86, and 34.36 in the forward transform, and 24.57, 26.69, and 24.25 in the inverse transform, respectively.
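To put the reported speedups in perspective, the following minimal sketch converts them into parallel efficiencies. It assumes the standard definition, efficiency = speedup / number of cores, which the abstract does not spell out; the names `CORES`, `SPEEDUPS`, `efficiency`, and `efficiency_table` are illustrative and not taken from the paper.

```python
# Speedups reported in the abstract for 64 cores, keyed by bandwidth.
CORES = 64
SPEEDUPS = {
    "forward": {128: 29.57, 256: 36.86, 512: 34.36},
    "inverse": {128: 24.57, 256: 26.69, 512: 24.25},
}


def efficiency(speedup, cores):
    """Parallel efficiency under the assumed definition E = S / p."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


def efficiency_table(speedups, cores):
    """Map each transform direction and bandwidth to its efficiency."""
    return {
        direction: {bw: efficiency(s, cores) for bw, s in per_bw.items()}
        for direction, per_bw in speedups.items()
    }


if __name__ == "__main__":
    for direction, per_bw in efficiency_table(SPEEDUPS, CORES).items():
        for bw, eff in per_bw.items():
            print(f"{direction:8s} B={bw:3d}  efficiency={eff:.1%}")
```

Under this definition, the forward transform stays between roughly 46% and 58% efficiency on 64 cores, while the inverse transform stays near 38% to 42%.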

research
05/08/2017

Block-Parallel IDA* for GPUs (Extended Manuscript)

We investigate GPU-based parallelization of Iterative-Deepening A* (IDA*...
research
07/29/2018

Automatic Parallelization of Sequential Programs

Prior work on Automatically Scalable Computation (ASC) suggests that it ...
research
06/20/2018

Parallelization of XPath Queries using Modern XQuery Processors

A practical and promising approach to parallelizing XPath queries was pr...
research
08/16/2018

Novel Model-based Methods for Performance Optimization of Multithreaded 2D Discrete Fourier Transform on Multicore Processors

In this paper, we use multithreaded fast Fourier transforms provided in ...
research
11/20/2015

mplrs: A scalable parallel vertex/facet enumeration code

We describe a new parallel implementation, mplrs, of the vertex enumerat...
research
02/09/2020

Large-Scale Discrete Fourier Transform on TPUs

In this work, we present two parallel algorithms for the large-scale dis...
research
09/29/2022

Wafer-Scale Fast Fourier Transforms

We have implemented fast Fourier transforms for one, two, and three-dime...
