Temporal blocking of finite-difference stencil operators with sparse "off-the-grid" sources

10/20/2020
by George Bisbas, et al.

Stencil kernels dominate a range of scientific applications, including seismic and medical imaging, image processing, and neural networks. Temporal blocking is a performance optimization that aims to reduce the memory bandwidth required by stencil computations by reusing data held in cache across multiple time steps. It has already been shown to be beneficial for this class of algorithms. However, applying temporal blocking to the stencils of practical applications remains challenging. These computations often include sparsely located operators that are not aligned with the computational grid ("off-the-grid"). Our work is motivated by modeling problems in which source injections produce wavefields that must then be measured at receivers by interpolation from the gridded wavefield. The resulting data dependencies make the adoption of temporal blocking much more challenging. We propose a methodology to inspect these data dependencies and reorder the computation, leading to performance gains in stencil codes where temporal blocking was previously not applicable. We implement this novel scheme in the Devito domain-specific compiler toolchain. Devito implements a domain-specific language embedded in Python that generates optimized partial differential equation solvers, based on the finite-difference method, from high-level symbolic problem definitions. We evaluate our scheme using isotropic acoustic, anisotropic acoustic, and isotropic elastic wave propagators of industrial significance. After auto-tuning, our evaluation shows that temporal blocking with this scheme improves performance by up to 1.6x over highly optimized, vectorized, spatially blocked code.
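To make the dependency problem concrete, the following is a minimal sketch in plain Python, not the code Devito generates and not necessarily the exact scheme of the paper. It assumes a 1D grid, a 3-point wave-equation stencil, and linear interpolation for a single off-the-grid source. All function names (`linear_weights`, `stencil_step`, `precompute_grid_sources`, and so on) are hypothetical. The first variant injects the source in a separate sparse step after every stencil update. The second precomputes the source's contribution at the affected grid points, so the injection becomes an ordinary grid-aligned term that could be fused into the stencil loop, which is the kind of reordering that makes time tiling easier to reason about.

```python
import math

def linear_weights(x_src, dx, n):
    """Grid indices and linear-interpolation weights for a point at x_src."""
    i = min(max(int(math.floor(x_src / dx)), 0), n - 2)
    frac = x_src / dx - i
    return [i, i + 1], [1.0 - frac, frac]

def stencil_step(u_prev, u, coeff):
    """One explicit time step of a 1D wave equation (3-point stencil)."""
    n = len(u)
    u_next = [0.0] * n
    for i in range(1, n - 1):
        u_next[i] = (2.0 * u[i] - u_prev[i]
                     + coeff * (u[i + 1] - 2.0 * u[i] + u[i - 1]))
    return u_next

def run_off_grid(n, nt, coeff, idx, w, wavelet):
    """Stencil update followed by a separate sparse injection each step."""
    u_prev, u = [0.0] * n, [0.0] * n
    for t in range(nt):
        u_next = stencil_step(u_prev, u, coeff)
        for i, wi in zip(idx, w):          # sparse, off-the-grid operator
            u_next[i] += wi * wavelet[t]
        u_prev, u = u, u_next
    return u

def precompute_grid_sources(n, nt, idx, w, wavelet):
    """Assumed preprocessing: source contribution per time step, on the grid."""
    src = [[0.0] * n for _ in range(nt)]
    for t in range(nt):
        for i, wi in zip(idx, w):
            src[t][i] += wi * wavelet[t]
    return src

def run_grid_aligned(n, nt, coeff, src_grid):
    """Same update, with the injection expressed as a grid-aligned term."""
    u_prev, u = [0.0] * n, [0.0] * n
    for t in range(nt):
        u_next = stencil_step(u_prev, u, coeff)
        u_next = [a + b for a, b in zip(u_next, src_grid[t])]
        u_prev, u = u, u_next
    return u

# Illustrative parameters (assumptions, not taken from the paper).
n, nt, dx, coeff = 64, 40, 1.0, 0.25
idx, w = linear_weights(20.3, dx, n)
wavelet = [math.sin(0.3 * t) for t in range(nt)]

u_a = run_off_grid(n, nt, coeff, idx, w, wavelet)
u_b = run_grid_aligned(n, nt, coeff, precompute_grid_sources(n, nt, idx, w, wavelet))
max_diff = max(abs(a - b) for a, b in zip(u_a, u_b))
```

Both variants compute the same wavefield. The dense `src_grid` array is used here only for readability; a practical implementation would presumably store just the few nonzero grid points per source.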


