An Efficient Vectorization Scheme for Stencil Computation

by Kun Li, et al.

Stencil computation is one of the most important kernels in scientific and engineering applications. A variety of work has focused on vectorization and tiling techniques, which aim to exploit in-core data parallelism and data locality, respectively. In this paper, we analyze the downsides of existing vectorization schemes: they either incur data alignment conflicts or hurt data locality when integrated with tiling. We then propose a novel transpose layout that preserves data locality for tiling while reducing the data reorganization overhead of vectorization. To further improve data reuse at the register level, we design a time loop unroll-and-jam strategy that performs multistep stencil computation along the time dimension. Experimental results on AVX-2 and AVX-512 CPUs show that our approach obtains competitive performance.
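
As a rough illustration of the multistep idea mentioned in the abstract, the sketch below fuses two time steps of a 1D 3-point averaging stencil into a single sweep. It is a scalar toy under stated assumptions, not the paper's vectorized scheme: the stencil, the fixed boundary values, and the names `step` and `two_steps_fused` are all hypothetical choices made for this example. The local variables `left`, `mid`, and `right` stand in for registers that hold intermediate first-step values, so those values are never written back to memory.

```python
def step(a):
    """One time step of a 3-point averaging stencil; end points stay fixed."""
    n = len(a)
    b = list(a)
    for i in range(1, n - 1):
        b[i] = (a[i - 1] + a[i] + a[i + 1]) / 3.0
    return b


def two_steps_fused(a):
    """Two time steps in one sweep (time loop unroll-and-jam, scalar sketch).

    Intermediate first-step values live only in the sliding window
    (left, mid, right) instead of a full intermediate array.
    """
    n = len(a)
    if n < 3:
        return list(a)  # no interior points, nothing to update

    def first(i):
        # First-step value at index i; boundaries are copied unchanged.
        if i == 0 or i == n - 1:
            return a[i]
        return (a[i - 1] + a[i] + a[i + 1]) / 3.0

    c = list(a)
    left, mid = first(0), first(1)
    for i in range(1, n - 1):
        right = first(i + 1)
        c[i] = (left + mid + right) / 3.0  # second step from windowed values
        left, mid = mid, right             # slide the window by one point
    return c


if __name__ == "__main__":
    grid = [float(i * i % 7) for i in range(16)]
    # The fused sweep should match two separate sweeps.
    print(two_steps_fused(grid) == step(step(grid)))
```

The fused version reads the input array once per pair of time steps instead of twice, which is the kind of reuse a time-dimension unroll-and-jam is after. A real implementation would work on SIMD vectors and would have to handle the layout and alignment issues the abstract describes.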






