Automated Tiling of Unstructured Mesh Computations with Application to Seismological Modelling

08/10/2017
by Fabio Luporini, et al.

Sparse tiling is a technique to fuse loops that access common data, thus increasing data locality. Unlike traditional loop fusion or blocking, the loops may have different iteration spaces and access shared datasets through indirect memory accesses, such as A[map[i]] -- hence the name "sparse". One notable example of such loops arises in discontinuous-Galerkin finite element methods, because of the computation of numerical integrals over different domains (e.g., cells, facets). The major challenge with sparse tiling is implementation: not only is it cumbersome to understand and synthesize, but it is also onerous to maintain and generalize, as it requires a complete rewrite of the bulk of the numerical computation. In this article, we propose an approach to extend the applicability of sparse tiling based on raising the level of abstraction. Through a sequence of compiler passes, the mathematical specification of a problem is progressively lowered, and eventually sparse-tiled C for-loops are generated. Besides automation, we advance the state of the art by introducing: a revisited, more efficient sparse tiling algorithm; support for distributed-memory parallelism; a range of fine-grained optimizations for increased run-time performance; implementation in a publicly available library, SLOPE; and an in-depth study of the performance impact in Seigen, a real-world elastic wave equation solver for seismological problems, which shows speed-ups of up to 1.28x on a platform consisting of 896 Intel Broadwell cores.
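To make the idea of fusing loops with different iteration spaces and indirect accesses concrete, here is a minimal sketch in C. It is not SLOPE's API and not the algorithm from the article; the 1D mesh, the names (cell2vert, build_tiles, run_tiled) and the tile-assignment rule are all assumptions chosen for illustration. A loop over cells writes to vertices through a map, and a loop over vertices then reads them. Cells are split into contiguous tiles, and each vertex is assigned to the highest-numbered tile among the cells that touch it, so that when a tile runs its vertex iterations, every cell contribution has already been applied.

```c
#include <assert.h>

/* Hypothetical 1D mesh: cell c touches vertices c and c+1. */
enum { NCELLS = 8, NVERTS = 9, NTILES = 2 };

static const int cell2vert[NCELLS][2] = {
    {0, 1}, {1, 2}, {2, 3}, {3, 4}, {4, 5}, {5, 6}, {6, 7}, {7, 8}
};

/* Loop 1 body: indirect write, acc[map[c][k]] += w[c]. */
static void cell_kernel(int c, const double *w, double *acc) {
    acc[cell2vert[c][0]] += w[c];
    acc[cell2vert[c][1]] += w[c];
}

/* Loop 2 body: reads the value accumulated by loop 1. */
static void vert_kernel(int v, const double *acc, double *out) {
    out[v] = 0.5 * acc[v];
}

/* Reference schedule: the two loops run one after the other. */
void run_unfused(const double *w, double *acc, double *out) {
    for (int v = 0; v < NVERTS; v++) acc[v] = 0.0;
    for (int c = 0; c < NCELLS; c++) cell_kernel(c, w, acc);
    for (int v = 0; v < NVERTS; v++) vert_kernel(v, acc, out);
}

/* Assign cells to contiguous tiles, then give each vertex the maximum
 * tile of its adjacent cells. Tiles execute in increasing order, so a
 * vertex is only read after all cells writing to it have run. */
void build_tiles(int *cell_tile, int *vert_tile) {
    for (int c = 0; c < NCELLS; c++) cell_tile[c] = c * NTILES / NCELLS;
    for (int v = 0; v < NVERTS; v++) vert_tile[v] = -1;
    for (int c = 0; c < NCELLS; c++) {
        for (int k = 0; k < 2; k++) {
            int v = cell2vert[c][k];
            if (cell_tile[c] > vert_tile[v]) vert_tile[v] = cell_tile[c];
        }
    }
    /* Every vertex in this mesh belongs to at least one cell. */
    for (int v = 0; v < NVERTS; v++) assert(vert_tile[v] >= 0);
}

/* Tiled schedule: for each tile, run its cell iterations and then its
 * vertex iterations while the touched data is still likely in cache. */
void run_tiled(const double *w, double *acc, double *out) {
    int cell_tile[NCELLS], vert_tile[NVERTS];
    build_tiles(cell_tile, vert_tile);
    for (int v = 0; v < NVERTS; v++) acc[v] = 0.0;
    for (int t = 0; t < NTILES; t++) {
        for (int c = 0; c < NCELLS; c++)
            if (cell_tile[c] == t) cell_kernel(c, w, acc);
        for (int v = 0; v < NVERTS; v++)
            if (vert_tile[v] == t) vert_kernel(v, acc, out);
    }
}
```

Calling run_unfused and run_tiled on the same weights should produce identical output arrays. The scan over all iterations with an `if` on the tile number keeps the sketch short; a practical implementation would presumably precompute per-tile iteration lists instead, and would need additional care (such as coloring) before running tiles in parallel.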


research
04/03/2017

Loop Tiling in Large-Scale Stencil Codes at Run-time with OPS

The key common bottleneck in most stencil codes is data movement, and pr...
research
05/23/2022

SparseLNR: Accelerating Sparse Tensor Computations Using Loop Nest Restructuring

Sparse tensor algebra computations have become important in many real-wo...
research
02/11/2018

Locality Optimized Unstructured Mesh Algorithms on GPUs

Unstructured-mesh based numerical algorithms such as finite volume and f...
research
08/25/2022

Polyhedral Specification and Code Generation of Sparse Tensor Contraction with Co-Iteration

This paper presents a code generator for sparse tensor contraction compu...
research
04/12/2022

Sparse grid time-discontinuous Galerkin method with streamline diffusion for transport equations

High-dimensional transport equations frequently occur in science and eng...
research
06/04/2020

A Memory-efficient Implementation of Perfectly Matched Layer with Smoothly-varying Coefficients in Discontinuous Galerkin Time-Domain Method

Wrapping a computation domain with a perfectly matched layer (PML) is on...
research
03/17/2022

FUSED-PAGERANK: Loop-Fusion based Approximate PageRank

PageRank is a graph centrality metric that gives the importance of each ...
