Adaptively restarted block Krylov subspace methods with low-synchronization skeletons

08/21/2022
by Kathryn Lund, et al.

With the recent realization of exascale performance by Oak Ridge National Laboratory's Frontier supercomputer, reducing communication in kernels like QR factorization has become even more imperative. Low-synchronization Gram-Schmidt methods, first introduced in [K. Świrydowicz, J. Langou, S. Ananthan, U. Yang, and S. Thomas, Low Synchronization Gram-Schmidt and Generalized Minimum Residual Algorithms, Numer. Lin. Alg. Appl., Vol. 28(2), e2343, 2020], have been shown to improve the scalability of the Arnoldi method in high-performance distributed computing. Block versions of low-synchronization Gram-Schmidt show further potential for speeding up algorithms, as column-batching allows for maximizing cache usage with matrix-matrix operations. In this work, low-synchronization block Gram-Schmidt variants from [E. Carson, K. Lund, M. Rozložník, and S. Thomas, Block Gram-Schmidt algorithms and their stability properties, Lin. Alg. Appl., 638, pp. 150–195, 2022] are transformed into block Arnoldi variants for use in block full orthogonalization methods (BFOM) and block generalized minimal residual methods (BGMRES). An adaptive restarting heuristic is developed to handle instabilities that arise with the increasing condition number of the Krylov basis. The performance, accuracy, and stability of these methods are assessed via a flexible benchmarking tool written in MATLAB. The modularity of the tool additionally permits generalized block inner products, like the global inner product.
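To make the last point concrete, here is a minimal sketch of the difference between the classical block inner product and a global inner product for two tall block vectors. It is an illustration under stated assumptions and not code from the paper's MATLAB tool: the function names are hypothetical, the matrices are real, and the global inner product is taken as trace(X^T Y) times the identity (some references include an extra scaling factor).

```python
# Illustrative sketch only; names are hypothetical and not from the paper's tool.
# Matrices are lists of rows with real entries.

def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Plain matrix-matrix product A @ B."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def classical_block_inner(X, Y):
    """Classical block inner product: the full s-by-s matrix X^T Y."""
    return matmul(transpose(X), Y)

def global_block_inner(X, Y):
    """Global inner product (assumed unscaled): trace(X^T Y) times I_s."""
    G = classical_block_inner(X, Y)
    s = len(G)
    t = sum(G[i][i] for i in range(s))
    return [[t if i == j else 0 for j in range(s)] for i in range(s)]

# Two 3-by-2 block vectors (n = 3 rows, s = 2 columns).
X = [[1, 2], [3, 4], [5, 6]]
Y = [[1, 0], [0, 1], [1, 1]]

classical = classical_block_inner(X, Y)  # keeps all column interactions
global_ip = global_block_inner(X, Y)     # collapses them to one scalar
```

The classical variant retains every pairwise column interaction in an s-by-s matrix, whereas the global variant reduces the block to a single scalar (the Frobenius inner product) placed on the diagonal. A modular tool can swap one for the other without changing the surrounding block Arnoldi skeleton.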

research
10/22/2020

An overview of block Gram-Schmidt methods and their stability properties

Block Gram-Schmidt algorithms comprise essential kernels in many scienti...
research
10/17/2022

Using Mixed Precision in Low-Synchronization Reorthogonalized Block Classical Gram-Schmidt

Using lower precision in algorithms can be beneficial in terms of reduci...
research
01/23/2023

Augmented Block-Arnoldi Recycling CFD Solvers

One of the limitations of recycled GCRO methods is the large amount of c...
research
09/16/2018

Low synchronization GMRES algorithms

Communication-avoiding and pipelined variants of Krylov solvers are crit...
research
04/06/2021

Hardware-Oriented Krylov Methods for High-Performance Computing

Krylov subspace methods are an essential building block in numerical sim...
research
11/29/2021

Randomized block Gram-Schmidt process for solution of linear systems and eigenvalue problems

We propose a block version of the randomized Gram-Schmidt process for co...
research
11/10/2020

Randomized Gram-Schmidt process with application to GMRES

A randomized Gram-Schmidt algorithm is developed for orthonormalization ...
