The MPI + CUDA Gaia AVU-GSR Parallel Solver Toward Next-generation Exascale Infrastructures

08/01/2023
by Valentina Cesare, et al.

We ported the Astrometric Verification Unit-Global Sphere Reconstruction (AVU-GSR) Parallel Solver, developed for the ESA Gaia mission, to the GPU with CUDA, optimizing a previous OpenACC port of this application. The code aims to find, with a [10,100] μas precision, the astrometric parameters of ∼10^8 stars, the attitude and instrumental settings of the Gaia satellite, and the global parameter γ of the parametrized Post-Newtonian formalism, by solving a system of linear equations, A x = b, with the LSQR iterative algorithm. The coefficient matrix A of the final Gaia dataset is large, with ∼10^11 × 10^8 elements, and sparse, reaching a size of ∼10-100 TB, typical of Big Data analysis; it therefore requires an efficient parallelization to obtain scientific results in reasonable timescales. The speedup of the CUDA code over the original AVU-GSR solver, parallelized on the CPU with MPI+OpenMP, increases with the system size and the number of resources, reaching a maximum of ∼14x; over the OpenACC application, the speedup is >9x. This result is obtained by comparing the two codes on the CINECA cluster Marconi100, with 4 V100 GPUs per node. We verified that the solutions of a set of systems of different sizes computed with the CUDA and the OpenMP codes agree and show the required precision; the CUDA code was then put into production on Marconi100, a step essential for an optimal AVU-GSR pipeline and for the successive Gaia Data Releases. This analysis represents a first step toward understanding the (pre-)Exascale behavior of a class of applications that follow the same structure as this code. In the coming months, we plan to run this code on the pre-Exascale platform Leonardo of CINECA, with 4 next-generation A200 GPUs per node, toward a port to this infrastructure, where we expect to obtain even higher performance.
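
To illustrate the kind of computation the abstract refers to, the following is a minimal sketch of the textbook LSQR iteration (Paige and Saunders) for a sparse system A x = b, written in plain Python. It is not the AVU-GSR code, which runs with MPI + CUDA on distributed data; the function names (`matvec`, `rmatvec`, `lsqr`), the coordinate-list storage of A, and the stopping tolerance are assumptions made here for illustration only. The two sparse products, A v and A^T u, are the operations that dominate each iteration and are the natural targets for GPU parallelization.

```python
import math


def matvec(entries, x, n_rows):
    """Compute y = A x, with A given as (row, col, value) triplets."""
    y = [0.0] * n_rows
    for i, j, a in entries:
        y[i] += a * x[j]
    return y


def rmatvec(entries, y, n_cols):
    """Compute x = A^T y for the same triplet storage."""
    x = [0.0] * n_cols
    for i, j, a in entries:
        x[j] += a * y[i]
    return x


def norm(v):
    return math.sqrt(sum(t * t for t in v))


def lsqr(entries, b, n_cols, max_iter=100, tol=1e-12):
    """Least-squares solution of A x = b by LSQR (no damping, no preconditioning)."""
    n_rows = len(b)
    x = [0.0] * n_cols
    beta = norm(b)
    if beta == 0.0:
        return x
    u = [t / beta for t in b]
    v = rmatvec(entries, u, n_cols)
    alpha = norm(v)
    if alpha == 0.0:
        return x
    v = [t / alpha for t in v]
    w = list(v)
    phi_bar, rho_bar = beta, alpha
    for _ in range(max_iter):
        # Golub-Kahan bidiagonalization step: the two sparse products.
        av = matvec(entries, v, n_rows)
        u = [p - alpha * q for p, q in zip(av, u)]
        beta = norm(u)
        if beta > 0.0:
            u = [t / beta for t in u]
        atu = rmatvec(entries, u, n_cols)
        v = [p - beta * q for p, q in zip(atu, v)]
        alpha = norm(v)
        if alpha > 0.0:
            v = [t / alpha for t in v]
        # Plane rotation that eliminates the subdiagonal element beta.
        rho = math.hypot(rho_bar, beta)
        c, s = rho_bar / rho, beta / rho
        theta = s * alpha
        rho_bar = -c * alpha
        phi = c * phi_bar
        phi_bar = s * phi_bar
        # Update the solution and the search direction.
        x = [xi + (phi / rho) * wi for xi, wi in zip(x, w)]
        w = [vi - (theta / rho) * wi for vi, wi in zip(v, w)]
        # phi_bar * alpha * |c| estimates ||A^T r||, which vanishes at the solution.
        if phi_bar * alpha * abs(c) <= tol:
            break
    return x


if __name__ == "__main__":
    # Toy 3x2 overdetermined system, consistent with x = (1, 2).
    A = [(0, 0, 1.0), (1, 1, 2.0), (2, 0, 1.0), (2, 1, 1.0)]
    b = [1.0, 4.0, 3.0]
    print(lsqr(A, b, n_cols=2))
```

In this sketch each iteration costs one pass over the nonzero entries for each of the two products, so the work scales with the number of nonzeros rather than with the full matrix dimensions, which is why sparsity matters for a system of the size quoted above.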

research
12/22/2022

The Gaia AVU-GSR parallel solver: preliminary studies of a LSQR-based application in perspective of exascale systems

The Gaia Astrometric Verification Unit-Global Sphere Reconstruction (AVU...
research
09/04/2017

From MPI to MPI+OpenACC: Conversion of a legacy FORTRAN PCG solver for the spherical Laplace equation

A real-world example of adding OpenACC to a legacy MPI FORTRAN Precondit...
research
09/01/2021

Accelerating an Iterative Eigensolver for Nuclear Structure Configuration Interaction Calculations on GPUs using OpenACC

To accelerate the solution of large eigenvalue problems arising from man...
research
04/10/2023

An Experimental Study of Two-Level Schwarz Domain Decomposition Preconditioners on GPUs

The generalized Dryja–Smith–Widlund (GDSW) preconditioner is a two-level...
research
10/26/2020

Parallelizing multiple precision Taylor series method for integrating the Lorenz system

A hybrid MPI+OpenMP strategy for parallelizing multiple precision Taylor...
research
03/24/2020

Gadget3 on GPUs with OpenACC

We present preliminary results of a GPU porting of all main Gadget3 modu...
research
05/30/2018

Monodromy Solver: Sequential and Parallel

We describe, study, and experiment with an algorithm for finding all sol...