
Array relocation approach for radial scanning algorithms on multi-GPU systems: total viewshed problem as a case study

by A. J. Sanchez, et al.

In geographic information systems, Digital Elevation Models (DEMs) are commonly processed with radial scanning algorithms. These algorithms are particularly popular for calculating parameters whose magnitudes decrease with the square of the distance, such as those related to radio signals, sound waves, and human eyesight. However, radial scanning algorithms require a large number of accesses to 2D arrays, and these accesses, despite being regular, exhibit poor data locality. This paper proposes a new methodology, termed sDEM, which substantially improves the locality of memory accesses and greatly increases the parallelism available in radial scanning computations. In particular, sDEM applies a data restructuring technique before accessing memory and performing the computation. To demonstrate the efficiency of sDEM, we use total viewshed computation as a case study. Sequential, parallel, single-GPU, and multi-GPU implementations are analyzed and compared with the state-of-the-art total viewshed computation algorithm. Experiments show that sDEM achieves a speedup of up to 827.3 times for the best multi-GPU execution approach with respect to the best multi-core implementation.
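To make the locality issue concrete, the following is a minimal sketch of one radial scan over a toy DEM. It is not the sDEM method itself, whose details the abstract does not give. The function names (`ray_cells`, `relocate`, `visible_along_profile`), the nearest-cell sampling, and the running-maximum-slope visibility test are all illustrative assumptions. The sketch only shows the general idea of gathering scattered 2D samples into a contiguous array before computing on them.

```python
import math

def ray_cells(n_rows, n_cols, r0, c0, angle_deg):
    """Cells crossed by a ray leaving (r0, c0) at angle_deg.

    Assumption: nearest-cell sampling with unit steps; angle 0 points
    toward increasing column, 90 toward decreasing row.
    """
    dr = -math.sin(math.radians(angle_deg))
    dc = math.cos(math.radians(angle_deg))
    cells, step = [], 1
    while True:
        r, c = round(r0 + step * dr), round(c0 + step * dc)
        if not (0 <= r < n_rows and 0 <= c < n_cols):
            break
        if not cells or cells[-1] != (r, c):  # skip repeated cells
            cells.append((r, c))
        step += 1
    return cells

def relocate(dem, cells):
    """Gather the scattered DEM samples into one contiguous profile."""
    return [dem[r][c] for r, c in cells]

def visible_along_profile(observer_elev, profile, distances):
    """Count visible points with a running maximum slope (hypothetical test)."""
    max_slope, visible = -math.inf, 0
    for elev, dist in zip(profile, distances):
        slope = (elev - observer_elev) / dist
        if slope > max_slope:  # point rises above everything nearer
            visible += 1
            max_slope = slope
    return visible

# Toy 3x5 DEM (made-up elevations), observer at row 1, column 0.
dem = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 4, 1],
    [0, 0, 0, 0, 0],
]
r0, c0 = 1, 0
cells = ray_cells(3, 5, r0, c0, angle_deg=0)
profile = relocate(dem, cells)  # contiguous copy of the scattered samples
distances = [math.hypot(r - r0, c - c0) for r, c in cells]
count = visible_along_profile(dem[r0][c0], profile, distances)
```

For non-axis-aligned angles, `cells` jumps across rows of the 2D array, which is where the poor locality comes from; after `relocate`, the visibility loop reads a single contiguous list.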




GPU-Accelerated Computation of Vietoris-Rips Persistence Barcodes

The computation of Vietoris-Rips persistence barcodes is both execution-...

Out-of-Core GPU Gradient Boosting

GPU-based algorithms have greatly accelerated many machine learning meth...

GGArray: A Dynamically Growable GPU Array

We present a dynamically Growable GPU array (GGArray) fully implemented ...

Multi-Objective Task Assignment and Multiagent Planning with Hybrid GPU-CPU Acceleration

Allocation and planning with a collection of tasks and a group of agents...

Acceleration for Timing-Aware Gate-Level Logic Simulation with One-Pass GPU Parallelism

Witnessing the advancing scale and complexity of chip design and benefit...

Scanning and Sequential Decision Making for Multi-Dimensional Data - Part II: the Noisy Case

We consider the problem of sequential decision making on random fields c...

Implementing CUDA Streams into AstroAccelerate – A Case Study

To be able to run tasks asynchronously on NVIDIA GPUs a programmer must ...