Predict-and-recompute conjugate gradient variants

05/04/2019
by Tyler Chen, et al.

The standard implementation of the conjugate gradient algorithm suffers from communication bottlenecks on parallel architectures, due primarily to the two global reductions required every iteration. In this paper, we introduce several predict-and-recompute type conjugate gradient variants, which decrease the runtime per iteration by overlapping global synchronizations and, in the case of our pipelined variants, matrix-vector products. Through the use of a predict-and-recompute scheme, whereby recursively updated quantities are first used as a predictor for their true values and then recomputed exactly at a later point in the iteration, our variants are observed to have convergence properties nearly as good as the standard conjugate gradient implementation on every problem we tested. It is also verified experimentally that our variants do indeed reduce runtime per iteration in practice, and that they scale similarly to previously studied communication-hiding variants. Finally, because our variants achieve good convergence without the use of any additional input parameters, they have the potential to be used in place of the standard conjugate gradient implementation in a range of applications.

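To make the scheme described above concrete, below is a minimal, simplified Python sketch of a predict-and-recompute style conjugate gradient iteration. It is not one of the paper's exact variants: here the recursively updated residual and its inner product are used as cheap predictors so the next search direction can be formed immediately, and the residual and inner product are then recomputed exactly from b - Ax before the next iteration begins. The function name cg_predict_recompute and the parameters tol and maxiter are illustrative choices, and in this sequential sketch the recomputation appears as an extra matrix-vector product per iteration, whereas the paper's variants overlap that work with the global reductions.

import numpy as np

def cg_predict_recompute(A, b, x0=None, tol=1e-8, maxiter=1000):
    # Solve A x = b for a symmetric positive definite matrix A.
    x = np.zeros_like(b) if x0 is None else x0.copy()
    r = b - A @ x                        # exact initial residual
    p = r.copy()
    nu = r @ r                           # nu = <r, r>
    for _ in range(maxiter):
        s = A @ p
        alpha = nu / (p @ s)
        x = x + alpha * p
        r_pred = r - alpha * s           # "predict": recursive residual update
        nu_pred = r_pred @ r_pred        # predicted <r, r>
        p = r_pred + (nu_pred / nu) * p  # next direction built from predicted values
        r = b - A @ x                    # "recompute": exact residual replaces predictor
        nu = r @ r                       # exact <r, r> replaces the predicted value
        if np.sqrt(nu) < tol:
            break
    return x

if __name__ == "__main__":
    # Small example on a random symmetric positive definite system.
    rng = np.random.default_rng(0)
    M = rng.standard_normal((100, 100))
    A = M @ M.T + 100 * np.eye(100)
    b = rng.standard_normal(100)
    x = cg_predict_recompute(A, b)
    print("residual norm:", np.linalg.norm(b - A @ x))

The point of the sketch is that the predicted quantities make the next search direction available before the exact residual is known, which is what allows the paper's variants to overlap global reductions and matrix-vector products; the exact recomputation then keeps the recurrences close to their true values. In exact arithmetic the sketch reduces to standard conjugate gradient.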
