Novel Model-based Methods for Performance Optimization of Multithreaded 2D Discrete Fourier Transform on Multicore Processors

08/16/2018
by   Semyon Khokhriakov, et al.
0

In this paper, we use multithreaded fast Fourier transforms provided in three highly optimized packages, FFTW-2.1.5, FFTW-3.3.7, and Intel MKL FFT, to present a novel model-based parallel computing technique as a very effective and portable method for optimization of scientific multithreaded routines for performance, especially in the current multicore era where the processors have abundant number of cores. We propose two optimization methods, PFFT-FPM and PFFT-FPM-PAD, based on this technique. They compute 2D-DFT of a complex signal matrix of size NxN using p abstract processors. Both algorithms take as inputs, discrete 3D functions of performance against problem size of the processors and output the transformed signal matrix. Based on our experiments on a modern Intel Haswell multicore server consisting of 36 physical cores, the average and maximum speedups observed for PFFT-FPM using FFTW-3.3.7 are 1.9x and 6.8x respectively and the average and maximum speedups observed using Intel MKL FFT are 1.3x and 2x respectively. The average and maximum speedups observed for PFFT-FPM-PAD using FFTW-3.3.7 are 2x and 9.4x respectively and the average and maximum speedups observed using Intel MKL FFT are 1.4x and 5.9x respectively.

READ FULL TEXT

page 6

page 7

page 9

page 10

page 12

page 16

page 18

page 19

research
03/26/2019

Matrix multiplication and universal scalability of the time on the Intel Scalable processors

Matrix multiplication is one of the core operations in many areas of sci...
research
08/02/2018

Parallelization of the FFT on SO(3)

In this paper, a work-optimal parallelization of Kostelec and Rockmore's...
research
02/09/2020

Large-Scale Discrete Fourier Transform on TPUs

In this work, we present two parallel algorithms for the large-scale dis...
research
10/15/2019

Modern Multicore CPUs are not Energy Proportional: Opportunity for Bi-objective Optimization for Performance and Energy

Energy proportionality is the key design goal followed by architects of ...
research
07/01/2019

Bridging the Architecture Gap: Abstracting Performance-Relevant Properties of Modern Server Processors

We describe a universal modeling approach for predicting single- and mul...
research
05/14/2017

Application of the Computer Capacity to the Analysis of Processors Evolution

The notion of computer capacity was proposed in 2012, and this quantity ...

Please sign up or login with your details

Forgot password? Click here to reset