Improving the Performance of the GMRES Method using Mixed-Precision Techniques

11/03/2020
by Neil Lindquist, et al.

The GMRES method is used to solve sparse, non-symmetric systems of linear equations arising from many scientific applications. The solver performance within a single node is memory bound, due to the low arithmetic intensity of its computational kernels. To reduce the amount of data movement, and thus improve performance, we investigated the effect of using a mix of single and double precision while retaining double-precision accuracy. Previous efforts have explored reduced precision in the preconditioner, but the use of reduced precision in the solver itself has received limited attention. We found that GMRES only needs double precision in computing the residual and updating the approximate solution to achieve double-precision accuracy, although it must restart after each improvement of single-precision accuracy. This finding holds for the tested orthogonalization schemes: Modified Gram-Schmidt (MGS) and Classical Gram-Schmidt with Re-orthogonalization (CGSR). Furthermore, our mixed-precision GMRES, when restarted at least once, performed 19% and 24% faster on average than double-precision GMRES for MGS and CGSR, respectively. Our implementation uses generic programming techniques to ease the burden of coding implementations for different data types. Our use of the Kokkos library allowed us to exploit parallelism and optimize data management. Additionally, KokkosKernels was used when producing performance results. In conclusion, using a mix of single and double precision in GMRES can improve performance while retaining double-precision accuracy.
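The precision split described in the abstract can be illustrated with a minimal sketch. It is not the authors' Kokkos implementation: it assumes a small dense matrix instead of a sparse one, restarts after a fixed number of inner iterations `m` rather than on reaching single-precision accuracy, and uses only MGS. All names (`Dense`, `gmres_cycle`, `mixed_gmres`) are hypothetical. The inner cycle is a template on the scalar type, in the spirit of the generic programming the abstract mentions, and is instantiated with `float`; only the residual and the solution update run in `double`.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical dense row-major matrix in precision T (the paper targets sparse matrices).
template <typename T>
struct Dense {
    std::size_t n;
    std::vector<T> a;
    T operator()(std::size_t i, std::size_t j) const { return a[i * n + j]; }
};

template <typename T>
std::vector<T> matvec(const Dense<T>& A, const std::vector<T>& x) {
    std::vector<T> y(A.n, T(0));
    for (std::size_t i = 0; i < A.n; ++i)
        for (std::size_t j = 0; j < A.n; ++j) y[i] += A(i, j) * x[j];
    return y;
}

template <typename T>
T norm2(const std::vector<T>& v) {
    T s = T(0);
    for (T e : v) s += e * e;
    return std::sqrt(s);
}

// One GMRES cycle of at most m steps, entirely in precision T, with MGS
// orthogonalization. Returns a correction z that approximately solves A z = r.
template <typename T>
std::vector<T> gmres_cycle(const Dense<T>& A, const std::vector<T>& r, std::size_t m) {
    const std::size_t n = A.n;
    std::vector<T> z(n, T(0));
    const T beta = norm2(r);
    if (beta == T(0)) return z;
    std::vector<std::vector<T>> V(1, std::vector<T>(n));             // Krylov basis
    for (std::size_t i = 0; i < n; ++i) V[0][i] = r[i] / beta;
    std::vector<std::vector<T>> H(m + 1, std::vector<T>(m, T(0)));   // Hessenberg matrix
    std::vector<T> cs(m), sn(m), g(m + 1, T(0));
    g[0] = beta;
    std::size_t k = 0;
    while (k < m) {
        std::vector<T> w = matvec(A, V[k]);
        for (std::size_t j = 0; j <= k; ++j) {                       // Modified Gram-Schmidt
            T h = T(0);
            for (std::size_t i = 0; i < n; ++i) h += w[i] * V[j][i];
            H[j][k] = h;
            for (std::size_t i = 0; i < n; ++i) w[i] -= h * V[j][i];
        }
        const T nw = norm2(w);
        H[k + 1][k] = nw;
        for (std::size_t j = 0; j < k; ++j) {                        // apply earlier Givens rotations
            const T t = cs[j] * H[j][k] + sn[j] * H[j + 1][k];
            H[j + 1][k] = -sn[j] * H[j][k] + cs[j] * H[j + 1][k];
            H[j][k] = t;
        }
        const T d = std::hypot(H[k][k], H[k + 1][k]);
        if (d == T(0)) break;                                        // degenerate column: stop early
        cs[k] = H[k][k] / d;
        sn[k] = H[k + 1][k] / d;
        H[k][k] = d;
        H[k + 1][k] = T(0);
        g[k + 1] = -sn[k] * g[k];
        g[k] = cs[k] * g[k];
        ++k;
        if (nw == T(0)) break;                                       // exact solution found in the subspace
        V.emplace_back(n);
        for (std::size_t i = 0; i < n; ++i) V[k][i] = w[i] / nw;
    }
    std::vector<T> y(k);
    for (std::size_t i = k; i-- > 0;) {                              // back substitution
        T s = g[i];
        for (std::size_t j = i + 1; j < k; ++j) s -= H[i][j] * y[j];
        y[i] = s / H[i][i];
    }
    for (std::size_t j = 0; j < k; ++j)
        for (std::size_t i = 0; i < n; ++i) z[i] += y[j] * V[j][i];
    return z;
}

// Restarted GMRES: residual and solution update in double, inner cycle in float.
std::vector<double> mixed_gmres(const Dense<double>& A, const std::vector<double>& b,
                                std::size_t m, double tol, int max_restarts) {
    const std::size_t n = A.n;
    const Dense<float> Af{n, std::vector<float>(A.a.begin(), A.a.end())};  // single-precision copy
    std::vector<double> x(n, 0.0);
    const double bnorm = norm2(b);
    for (int it = 0; it < max_restarts; ++it) {
        const std::vector<double> Ax = matvec(A, x);                 // double precision
        std::vector<double> r(n);
        for (std::size_t i = 0; i < n; ++i) r[i] = b[i] - Ax[i];
        if (norm2(r) <= tol * bnorm) break;
        const std::vector<float> rf(r.begin(), r.end());
        const std::vector<float> z = gmres_cycle(Af, rf, m);         // single precision
        for (std::size_t i = 0; i < n; ++i) x[i] += static_cast<double>(z[i]);  // double precision
    }
    return x;
}
```

Calling `mixed_gmres(A, b, 2, 1e-12, 100)` on a small, well-conditioned system should reach a relative residual well below what a pure `float` solve can deliver, because each restart recomputes the residual in `double` and hands the single-precision cycle a fresh, smaller right-hand side.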


