Semi-Lagrangian Vlasov simulation on GPUs

07/18/2019
by Lukas Einkemmer et al.

In this paper, our goal is to solve the Vlasov equation efficiently on GPUs. A semi-Lagrangian discontinuous Galerkin scheme is used for the discretization. Such kinetic computations are extremely expensive because of the high-dimensional phase space. The SLDG code, which is publicly available under the MIT license, abstracts the number of dimensions and uses a shared codebase for both GPU- and CPU-based simulations. We investigate the performance of the implementation on a range of both Tesla (V100, Titan V, K80) and consumer (GTX 1080 Ti) GPUs. Our implementation typically achieves approximately 470 GB/s on a single GPU and 1600 GB/s on four V100 GPUs connected via NVLink. This results in a speedup of about a factor of ten (comparing a single GPU with a dual-socket Intel Xeon Gold node) and of approximately a factor of 35 (comparing a single node with and without GPUs). In addition, we investigate the effect of single-precision computation on the performance of the SLDG code and demonstrate that a template-based, dimension-independent implementation can achieve good performance regardless of the dimensionality of the problem.
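To make the idea of a template-based, dimension-independent implementation more concrete, here is a minimal C++ sketch. It is not taken from the SLDG code. The names `linear_index`, `num_cells`, and `achieved_bandwidth_gbs` are hypothetical, as is the memory layout (first index varies fastest). The bandwidth helper assumes that every stored value is read once and written once per step, which may differ from how the paper counts memory traffic.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the SLDG code): map a D-dimensional cell
// index to a flat array offset. Assumed layout: first index varies fastest.
template <std::size_t D>
std::size_t linear_index(const std::array<std::size_t, D>& idx,
                         const std::array<std::size_t, D>& extent) {
    std::size_t offset = 0;
    std::size_t stride = 1;
    for (std::size_t d = 0; d < D; ++d) {
        assert(idx[d] < extent[d]);  // index must lie inside the grid
        offset += idx[d] * stride;
        stride *= extent[d];
    }
    return offset;
}

// Total number of cells in a D-dimensional grid.
template <std::size_t D>
std::size_t num_cells(const std::array<std::size_t, D>& extent) {
    std::size_t n = 1;
    for (std::size_t d = 0; d < D; ++d) {
        n *= extent[d];
    }
    return n;
}

// Achieved bandwidth in GB/s under the ASSUMPTION that each of the n_values
// stored values is read once and written once during the timed step.
// Real is the floating-point type (float or double), so the same code
// covers single- and double-precision runs.
template <typename Real>
double achieved_bandwidth_gbs(std::size_t n_values, double seconds) {
    const double bytes = 2.0 * static_cast<double>(n_values) * sizeof(Real);
    return bytes / seconds / 1e9;
}
```

Because the dimension `D` and the floating-point type `Real` are template parameters, the same source can be instantiated for, say, a 2D or a 4D phase space in single or double precision, and the compiler sees fixed loop bounds in each case.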


