A scalable H-matrix approach for the solution of boundary integral equations on multi-GPU clusters

06/20/2018
by Helmut Harbrecht, et al.

In this work, we consider the solution of boundary integral equations by means of a scalable hierarchical matrix approach on clusters equipped with graphics processing units (GPUs). To this end, we extend our existing single-GPU hierarchical matrix library hmglib so that it scales across many GPUs and can be coupled to arbitrary application codes. Using a model GPU implementation of a boundary element method (BEM) solver, we achieve more than 67 percent relative parallel speed-up going from 128 to 1024 GPUs, both for a model geometry test case with 1.5 million unknowns and for a real-world geometry test case with almost 1.2 million unknowns. On 1024 GPUs of the cluster Titan, it takes less than 6 minutes to solve the problem with 1.5 million unknowns, with 5.7 minutes for the setup phase and 20 seconds for the iterative solver. To the best of the authors' knowledge, we discuss here the first fully GPU-based, distributed-memory parallel, open-source hierarchical matrix library that uses the traditional H-matrix format and adaptive cross approximation, with an application to BEM problems.
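
For readers unfamiliar with adaptive cross approximation (ACA), the following is a minimal single-threaded sketch of the textbook partially pivoted variant, which builds a low-rank approximation of a matrix block from a few of its rows and columns. It is not taken from hmglib, whose GPU implementation is not shown here. The function names `aca` and `dot`, the callback `entry`, and the parameters `tol` and `max_rank` are hypothetical and chosen only for illustration.

```python
import math


def dot(a, b):
    """Euclidean inner product of two equally long lists."""
    return sum(x * y for x, y in zip(a, b))


def aca(entry, m, n, tol=1e-8, max_rank=50):
    """Partially pivoted ACA (illustrative sketch, not hmglib code).

    Approximates the m-by-n block A[i][j] = entry(i, j) by sum_k u_k v_k^T
    while evaluating only selected rows and columns of A.
    Returns the lists (us, vs) of column and row vectors.
    """
    us, vs = [], []
    used_rows = set()
    frob2 = 0.0  # running value of ||sum_k u_k v_k^T||_F^2
    i = 0        # first pivot row (assumption: start at row 0)
    while len(us) < min(max_rank, m, n):
        used_rows.add(i)
        # Residual of row i: original row minus the current approximation.
        row = [entry(i, j) - sum(u[i] * v[j] for u, v in zip(us, vs))
               for j in range(n)]
        j_piv = max(range(n), key=lambda j: abs(row[j]))
        pivot = row[j_piv]
        if abs(pivot) < 1e-14:
            # Row is already reproduced; move on to another unused row.
            remaining = [r for r in range(m) if r not in used_rows]
            if not remaining:
                break
            i = remaining[0]
            continue
        v = [x / pivot for x in row]
        # Residual of the pivot column.
        u = [entry(r, j_piv) - sum(uk[r] * vk[j_piv] for uk, vk in zip(us, vs))
             for r in range(m)]
        nu2, nv2 = dot(u, u), dot(v, v)
        # Update the squared Frobenius norm of the approximation.
        frob2 += 2.0 * sum(dot(u, uk) * dot(v, vk) for uk, vk in zip(us, vs))
        frob2 += nu2 * nv2
        us.append(u)
        vs.append(v)
        # Stop once the new rank-one term is small relative to the whole.
        if math.sqrt(nu2 * nv2) <= tol * math.sqrt(max(frob2, 0.0)):
            break
        # Next pivot row: largest entry of u among rows not used so far.
        remaining = [r for r in range(m) if r not in used_rows]
        if not remaining:
            break
        i = max(remaining, key=lambda r: abs(u[r]))
    return us, vs
```

For a block generated by a smooth kernel on two well-separated point sets, for example `entry = lambda i, j: 1.0 / ((3.0 + j / 40) - i / 40)` with `m = n = 40`, the loop typically stops after far fewer than 40 terms. In the H-matrix setting this kind of compression is applied to the admissible (far-field) blocks of the BEM system matrix.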


