An Experimental Study of Two-Level Schwarz Domain Decomposition Preconditioners on GPUs

04/10/2023
by Ichitaro Yamazaki, et al.

The generalized Dryja–Smith–Widlund (GDSW) preconditioner is a two-level overlapping Schwarz domain decomposition (DD) preconditioner that couples a classical one-level overlapping Schwarz preconditioner with an energy-minimizing coarse space. When used to accelerate the convergence of Krylov subspace iterative methods, the GDSW preconditioner provides robustness and scalability for the solution of sparse linear systems arising from the discretization of a wide range of partial differential equations. In this paper, we present FROSch (Fast and Robust Schwarz), a domain decomposition solver package that implements GDSW-type preconditioners for both CPU and GPU clusters. To improve the solver performance on GPUs, we use a novel decomposition that runs multiple MPI processes on each GPU, reducing both the computational and the storage costs of the solver and potentially improving the convergence rate. This allowed us to obtain competitive or faster performance using GPUs compared to using CPUs alone. We demonstrate the performance of FROSch on the Summit supercomputer with NVIDIA V100 GPUs, where we used the NVIDIA Multi-Process Service (MPS) to implement our decomposition strategy. The solver offers a wide variety of algorithmic and implementation choices, which pose both opportunities and challenges for its GPU implementation. We conduct a thorough experimental study with different solver options, including the exact or inexact solution of the local overlapping subdomain problems on a GPU. We also discuss the effect of using the iterative variants of the incomplete LU factorization and sparse triangular solve as the approximate local solver, and of using lower precision for computing the whole FROSch preconditioner. Overall, the solve time was reduced by a factor of about 2× using GPUs, while the GPU acceleration of the numerical setup time depends on the solver options and the local matrix sizes.
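
To make the two-level structure described above concrete, the following is a minimal sketch of an additive two-level overlapping Schwarz preconditioner inside preconditioned conjugate gradients, for a 1D Poisson matrix, in plain Python. It is not FROSch code and does not use its API. All function names (`poisson_1d`, `make_two_level_schwarz`, `pcg`) are hypothetical, the local problems are solved exactly with dense Gaussian elimination, and a simple piecewise-constant coarse space is assumed as a stand-in for the energy-minimizing GDSW coarse space.

```python
from math import sqrt

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    return [dot(row, x) for row in A]

def solve_dense(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def poisson_1d(n):
    """Dense tridiagonal matrix (-1, 2, -1) of size n."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 2.0
        if i > 0:
            A[i][i - 1] = -1.0
        if i < n - 1:
            A[i][i + 1] = -1.0
    return A

def make_two_level_schwarz(A, nsub, overlap):
    """Return r -> sum_i R_i^T A_i^{-1} R_i r + Phi A_0^{-1} Phi^T r."""
    n = len(A)
    size = n // nsub
    # Nonoverlapping blocks, then the same blocks extended by `overlap` nodes.
    blocks = [range(i * size, n if i == nsub - 1 else (i + 1) * size)
              for i in range(nsub)]
    subs = [list(range(max(0, b.start - overlap), min(n, b.stop + overlap)))
            for b in blocks]
    local = [[[A[i][j] for j in idx] for i in idx] for idx in subs]
    # Assumed coarse space: one constant basis function per nonoverlapping
    # block (NOT the GDSW coarse space); A_0 = Phi^T A Phi.
    A0 = [[sum(A[i][j] for i in bi for j in bj) for bj in blocks]
          for bi in blocks]

    def apply(r):
        z = [0.0] * n
        # First level: exact local solves on the overlapping subdomains.
        for idx, Ai in zip(subs, local):
            zi = solve_dense(Ai, [r[i] for i in idx])
            for i, v in zip(idx, zi):
                z[i] += v
        # Second level: coarse solve, prolongated back to the fine grid.
        z0 = solve_dense(A0, [sum(r[i] for i in b) for b in blocks])
        for b, v in zip(blocks, z0):
            for i in b:
                z[i] += v
        return z

    return apply

def pcg(A, b, apply_prec, tol=1e-10, maxiter=500):
    """Preconditioned conjugate gradients; returns (x, iterations)."""
    x = [0.0] * len(b)
    r = b[:]
    z = apply_prec(r)
    p = z[:]
    rz = dot(r, z)
    bnorm = sqrt(dot(b, b))
    for it in range(1, maxiter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, Ap)]
        if sqrt(dot(r, r)) <= tol * bnorm:
            return x, it
        z = apply_prec(r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, maxiter

A = poisson_1d(64)
b = matvec(A, [1.0] * 64)
x, iterations = pcg(A, b, make_two_level_schwarz(A, nsub=4, overlap=2))
print("PCG iterations:", iterations)
```

In this sketch, the loop over `subs` corresponds to the local overlapping subdomain solves, which are the part the abstract describes replacing with inexact solvers on a GPU, and the small `A0` system plays the role of the coarse problem that gives the method its second level.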


