Numerically Stable Variants of the Communication-hiding Pipelined Conjugate Gradients Algorithm for the Parallel Solution of Large Scale Symmetric Linear Systems
By reducing the number of global synchronization bottlenecks per iteration and hiding communication behind useful computational work, pipelined Krylov subspace methods achieve significantly improved parallel scalability on present-day HPC hardware. However, this typically comes at the cost of a reduced maximal attainable accuracy. This paper presents and compares several stabilized versions of the communication-hiding pipelined Conjugate Gradients method. The main novel contribution of this work is the reformulation of the multi-term recurrence pipelined CG algorithm by introducing shifts in the recursions for specific auxiliary variables. These shifts reduce the amplification of local rounding errors on the residual. The stability analysis presented in this work provides a rigorous method for selection of the optimal shift value in practice. It is shown that, given a proper choice for the shift parameter, the resulting shifted pipelined CG algorithm restores the attainable accuracy and displays nearly identical robustness to local rounding error propagation compared to classical CG. Numerical results on a variety of SPD benchmark problems compare different stabilization techniques for the pipelined CG algorithm, showing that the shifted pipelined CG algorithm is able to attain a high accuracy while displaying excellent parallel performance.
READ FULL TEXT