The Communication-Hiding Conjugate Gradient Method with Deep Pipelines

01/15/2018
by Jeffrey Cornelis, et al.

Krylov subspace methods are among the most efficient present-day solvers for large-scale linear algebra problems. Nevertheless, classic Krylov subspace algorithms do not scale well on massively parallel hardware, because the dot products computed throughout the algorithm induce synchronization bottlenecks. Communication-hiding pipelined Krylov subspace methods offer increased parallel scalability. One of the first published methods in this class is the pipelined Conjugate Gradient method (p-CG), which achieves improved speedups on parallel machines by overlapping the time-consuming global communication phase with useful independent computations such as sparse matrix-vector products (SpMVs), thus reducing the impact of global communication as a synchronization bottleneck and avoiding excessive processor idling. However, on large numbers of processors the time spent in the global communication phase can be significantly larger than the time required to compute a single SpMV. This work extends the pipelined CG method to deeper pipelines, which allows further scaling when the global communication phase is the dominant time-consuming factor. By overlapping the global all-to-all reduction phase in each CG iteration with the next l SpMVs (pipelining), the method hides communication latency behind computational work. The derivation of the p(l)-CG algorithm is based on the existing p(l)-GMRES method. Moreover, a number of theoretical and implementation properties of the p(l)-CG method are presented, including a preconditioned version of the algorithm. Experimental results demonstrate the possible performance gains of using deeper pipelines for solving large-scale symmetric linear systems with the new CG method variant.
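The sketch below (in C with MPI, not the authors' implementation) illustrates the core communication-hiding idea for pipeline depth l = 1: the global reduction for the two dot products of an iteration is started with a non-blocking MPI_Iallreduce, and the next SpMV is computed while that reduction is in flight. The helper spmv_local, the constant N_LOCAL, and the 1D stencil used as a stand-in matrix are hypothetical placeholders; halo exchange and the full CG recurrences are omitted for brevity.

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N_LOCAL 1000   /* rows owned by this rank (illustrative only) */

/* Placeholder local SpMV: a 1D Laplacian stencil on the local block
 * (neighbour/halo communication is left out of this sketch). */
static void spmv_local(const double *v, double *w, int n) {
    for (int i = 0; i < n; i++) {
        double left  = (i > 0)     ? v[i - 1] : 0.0;
        double right = (i < n - 1) ? v[i + 1] : 0.0;
        w[i] = 2.0 * v[i] - left - right;
    }
}

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);
    MPI_Comm comm = MPI_COMM_WORLD;

    double *r = malloc(N_LOCAL * sizeof(double));  /* residual vector */
    double *w = malloc(N_LOCAL * sizeof(double));  /* w = A r         */
    double *z = malloc(N_LOCAL * sizeof(double));  /* z = A w         */
    for (int i = 0; i < N_LOCAL; i++) r[i] = 1.0;
    spmv_local(r, w, N_LOCAL);

    /* Local contributions to the two dot products needed this iteration. */
    double local[2] = {0.0, 0.0}, global[2];
    for (int i = 0; i < N_LOCAL; i++) {
        local[0] += r[i] * r[i];   /* (r, r) */
        local[1] += w[i] * r[i];   /* (w, r) */
    }

    /* Start the global reduction without blocking ... */
    MPI_Request req;
    MPI_Iallreduce(local, global, 2, MPI_DOUBLE, MPI_SUM, comm, &req);

    /* ... and hide its latency behind the next SpMV, which does not
     * depend on the reduction results. */
    spmv_local(w, z, N_LOCAL);

    /* The reduced dot products are needed only now, to update the
     * CG scalars and vector recurrences. */
    MPI_Wait(&req, MPI_STATUS_IGNORE);

    int rank;
    MPI_Comm_rank(comm, &rank);
    if (rank == 0)
        printf("(r,r) = %g, (w,r) = %g\n", global[0], global[1]);

    free(r); free(w); free(z);
    MPI_Finalize();
    return 0;
}

For deeper pipelines (l > 1), the paper's p(l)-CG method keeps several such reductions in flight simultaneously, each overlapped with the l subsequent SpMVs, so that a global reduction whose latency exceeds one SpMV can still be hidden.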


Related research

- Improving strong scaling of the Conjugate Gradient method for solving large linear systems using global reduction pipelining (05/15/2019)
- Numerically Stable Recurrence Relations for the Communication Hiding Pipelined Conjugate Gradient Method (02/08/2019)
- Recent Developments in Iterative Methods for Reducing Synchronization (12/02/2019)
- Low synchronization GMRES algorithms (09/16/2018)
- Analyzing and improving maximal attainable accuracy in the communication hiding pipelined BiCGStab method (09/06/2018)
- The Adaptive s-step Conjugate Gradient Method (01/15/2017)
