Sapporo2: A versatile direct N-body library

10/14/2015
by Jeroen Bédorf, et al.

Astrophysical direct N-body methods were among the first production algorithms to be implemented using NVIDIA's CUDA architecture. Now, almost seven years later, the GPU is the most widely used accelerator device in astronomy for simulating stellar systems. In this paper we present the implementation of the Sapporo2 N-body library, which allows researchers to use the GPU for N-body simulations with little to no effort. The first version, released five years ago, is actively used, but it lacks advanced features, versatility in numerical precision, and support for higher-order integrators. In this updated version we have rebuilt the code from scratch and added support for OpenCL, multi-precision arithmetic, and higher-order integrators. We show how to tune these codes for different GPU architectures and how to continue utilizing the GPU optimally even when only a small number of particles (N < 100) is integrated. This careful tuning allows Sapporo2 to be faster than Sapporo1 even with the added options and double-precision data loads. The code runs on a range of NVIDIA and AMD GPUs in single- and double-precision accuracy. With the addition of OpenCL support, the library is also able to run on CPUs and other accelerators that support OpenCL.
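
For readers unfamiliar with the term, a direct N-body method evaluates the gravitational force on each particle by summing the contributions of all other particles, which costs O(N^2) operations per step. The snippet below is a minimal CPU sketch of that summation in plain Python, assuming Plummer softening and units with G = 1. It is not Sapporo2's implementation or API; the function name `direct_accelerations` and its parameters are hypothetical and chosen for illustration only.

```python
import math


def direct_accelerations(positions, masses, softening=0.0, G=1.0):
    """Return the acceleration on every particle by direct O(N^2) summation.

    positions: list of (x, y, z) tuples; masses: list of floats.
    softening: Plummer softening length (an assumption of this sketch).
    """
    n = len(positions)
    eps2 = softening * softening
    acc = []
    for i in range(n):
        xi, yi, zi = positions[i]
        ax = ay = az = 0.0
        # Sum the pull of every other particle j on particle i.
        for j in range(n):
            if i == j:
                continue
            dx = positions[j][0] - xi
            dy = positions[j][1] - yi
            dz = positions[j][2] - zi
            r2 = dx * dx + dy * dy + dz * dz + eps2
            inv_r3 = 1.0 / (r2 * math.sqrt(r2))
            s = G * masses[j] * inv_r3
            ax += s * dx
            ay += s * dy
            az += s * dz
        acc.append((ax, ay, az))
    return acc


# Two unit masses separated by one length unit attract each other equally.
acc = direct_accelerations([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)], [1.0, 1.0])
```

The outer loop over `i` is independent for each particle, which is why this kind of calculation is commonly mapped onto GPU threads.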

