GPU Fast Convolution via the Overlap-and-Save Method in Shared Memory

10/04/2019
by Karel Adámek, et al.

We present a GPU-tailored implementation of the overlap-and-save method, which convolves very long signals with short response functions. We have implemented several FFT algorithms in CUDA that exploit GPU shared memory, enabling GPU-accelerated convolution. We compare our implementation with an overlap-and-save implementation based on the NVIDIA FFT library (cuFFT). We demonstrate that a shared-memory-based FFT can achieve significant speed-ups for certain problem sizes and lower the memory requirements of the overlap-and-save method on GPUs.
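
For readers unfamiliar with overlap-and-save, the following is a minimal CPU sketch in plain Python, not the CUDA shared-memory implementation described above. It assumes a simple recursive radix-2 FFT and a small power-of-two segment length; the names `fft`, `overlap_save`, `direct_convolution` and the parameter `n_fft` are hypothetical and chosen only for illustration. Each segment of `n_fft` samples overlaps the previous one by `len(h) - 1` samples, and the first `len(h) - 1` outputs of every circular convolution are discarded because they are contaminated by wrap-around.

```python
import cmath


def fft(a, inverse=False):
    """Unnormalised recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], inverse)
    odd = fft(a[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def overlap_save(x, h, n_fft=8):
    """Linear convolution of signal x with short filter h via overlap-and-save.

    n_fft is an assumed segment (FFT) length: a power of two larger than len(h) - 1.
    """
    m = len(h)
    step = n_fft - (m - 1)  # number of valid output samples per segment
    if n_fft & (n_fft - 1) or step <= 0:
        raise ValueError("n_fft must be a power of two larger than len(h) - 1")

    out_len = len(x) + m - 1
    n_seg = -(-out_len // step)  # ceiling division

    # Transform the zero-padded filter once and reuse it for every segment.
    H = fft(list(h) + [0.0] * (n_fft - m))

    # Prepend m - 1 zeros so the first segment has history to overlap with,
    # and append zeros so the last segment is full length.
    padded = [0.0] * (m - 1) + list(x) + [0.0] * (n_seg * step - len(x))

    y = []
    for s in range(n_seg):
        segment = padded[s * step : s * step + n_fft]
        spectrum = [a * b for a, b in zip(fft(segment), H)]
        block = fft(spectrum, inverse=True)
        # Discard the first m - 1 wrapped samples; keep and normalise the rest.
        y.extend(v.real / n_fft for v in block[m - 1 :])
    return y[:out_len]


def direct_convolution(x, h):
    """Reference O(len(x) * len(h)) linear convolution for checking results."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xv in enumerate(x):
        for j, hv in enumerate(h):
            y[i + j] += xv * hv
    return y


signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
response = [1.0, -1.0, 2.0]
fast = overlap_save(signal, response, n_fft=8)
slow = direct_convolution(signal, response)
print(max(abs(a - b) for a, b in zip(fast, slow)) < 1e-9)
```

In this sketch the segments are independent of one another once the filter spectrum is known, which is the property that makes the method attractive for parallel hardware.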

research
08/05/2020

MGPU-TSM: A Multi-GPU System with Truly Shared Memory

The sizes of GPU applications are rapidly growing. They are exhausting t...
research
05/12/2020

Porting and optimizing UniFrac for GPUs

UniFrac is a commonly used metric in microbiome research for comparing m...
research
11/28/2017

Implementing implicit OpenMP data sharing on GPUs

OpenMP is a shared memory programming model which supports the offloadin...
research
05/21/2022

MapReduce for Counting Word Frequencies with MPI and GPUs

In this project, the goal was to use the Julia programming language and ...
research
12/12/2017

Intra-node Memory Safe GPU Co-Scheduling

GPUs in High-Performance Computing systems remain under-utilised due to ...
research
07/07/2017

GPU-Accelerated Algorithms for Compressed Signals Recovery with Application to Astronomical Imagery Deblurring

Compressive sensing promises to enable bandwidth-efficient on-board comp...
research
07/21/2017

Memory-Efficient Implementation of DenseNets

The DenseNet architecture is highly computationally efficient as a resul...
