Data-Oriented Language Implementation of Lattice-Boltzmann Method for Dense and Sparse Geometries

by   Tadeusz Tomczak, et al.

The performance of lattice-Boltzmann solver implementations usually depends mainly on memory access patterns. Achieving high performance requires then complex code which handles careful data placement and ordering of memory transactions. In this work, we analyse the performance of an implementation based on a new approach called the data-oriented language, which allows the combining of complex memory access patterns with simple source code. As a use case, we present and provide the source code of a solver for D2Q9 lattice and show its performance on GTX Titan Xp GPU for dense and sparse geometries up to 4096 2 nodes. The obtained results are promising, around 1000 lines of code allowed us to achieve performance in the range of 0.6 to 0.7 of maximum theoretical memory bandwidth (over 2.5 and 5.0 GLUPS for double and single precision, respectively) for meshes of size above 1024 2 nodes, which is close to the current state-of-the-art. However, we also observed relatively high and sometimes difficult to predict overheads, especially for sparse data structures. The additional issue was also a rather long compilation, which extended the time of short simulations, and a lack of access to low-level optimisation mechanisms.



There are no comments yet.


page 7

page 9

page 14


A new GPU implementation for lattice-Boltzmann simulations on sparse geometries

We describe a high-performance implementation of the lattice Boltzmann m...

Lattice Boltzmann Benchmark Kernels as a Testbed for Performance Analysis

Lattice Boltzmann methods (LBM) are an important part of current computa...

Cross-platform programming model for many-core lattice Boltzmann simulations

We present a novel, hardware-agnostic implementation strategy for lattic...

An Abstract Method Linearization for Detecting Source Code Plagiarism in Object-Oriented Environment

Despite the fact that plagiarizing source code is a trivial task for mos...

Modeling memory bandwidth patterns on NUMA machines with performance counters

Computers used for data analytics are often NUMA systems with multiple s...

Spatter: A Benchmark Suite for Evaluating Sparse Access Patterns

Recent characterizations of data movement performance have evaluated opt...

Opening the Black Box: Performance Estimation during Code Generation for GPUs

Automatic code generation is frequently used to create implementations o...
This week in AI

Get the week's most popular data science and artificial intelligence research sent straight to your inbox every Saturday.