Parallel memory-efficient all-at-once algorithms for the sparse matrix triple products in multigrid methods

05/21/2019
by Fande Kong, et al.

Multilevel/multigrid methods are among the most popular approaches for solving large sparse linear systems of equations, which typically arise from the discretization of partial differential equations. One critical step in multilevel/multigrid methods is to form coarse matrices through a sequence of sparse matrix triple products. A commonly used approach computes each triple product explicitly in two steps, each of which is a sparse matrix-matrix multiplication. This approach works well for many applications and has good computational efficiency, but it has a high memory overhead because auxiliary matrices must be stored temporarily to complete the calculation. In this work, we propose two new algorithms that construct a coarse matrix in a single pass through the input matrices, without any auxiliary matrices, in order to save memory. The new approaches are referred to as "all-at-once" and "merged all-at-once", and the traditional method is denoted as "two-step". The all-at-once and merged all-at-once algorithms are implemented in PETSc using hash tables as part of this work, with careful attention to performance in terms of both compute time and memory usage. We show numerically that the proposed algorithms and their implementations are perfectly scalable in both compute time and memory usage with up to 32,768 processor cores for a model problem with 27 billion unknowns. Scalability is also demonstrated for a realistic neutron transport problem with over 2 billion unknowns on a supercomputer with 10,000 processor cores. Compared with the traditional two-step method, the all-at-once and merged all-at-once algorithms consume much less memory for both the model problem and the realistic neutron transport problem, while maintaining computational efficiency.
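
To make the contrast between the two-step and single-pass strategies concrete, the following is a minimal serial sketch in Python. It is not the PETSc implementation described above: it assumes the triple product has the Galerkin form C = P^T A P, stores each matrix as a dictionary of row dictionaries, and uses Python dictionaries as a stand-in for hash tables. All function names (`spgemm`, `transpose`, `ptap_two_step`, `ptap_all_at_once`) are hypothetical and chosen for illustration only.

```python
from collections import defaultdict

def spgemm(X, Y):
    """Sparse product X*Y; matrices are {row: {col: value}} dictionaries."""
    Z = {}
    for i, x_row in X.items():
        acc = defaultdict(float)
        for k, x in x_row.items():
            for j, y in Y.get(k, {}).items():
                acc[j] += x * y
        if acc:
            Z[i] = dict(acc)
    return Z

def transpose(X):
    """Explicit transpose (an extra matrix the two-step variant has to store)."""
    T = defaultdict(dict)
    for i, row in X.items():
        for j, v in row.items():
            T[j][i] = v
    return dict(T)

def ptap_two_step(A, P):
    """Two-step variant: form the auxiliary matrix AP, then P^T * (AP)."""
    AP = spgemm(A, P)                 # auxiliary matrix held in memory
    return spgemm(transpose(P), AP)

def ptap_all_at_once(A, P):
    """Single-pass sketch: only one row of A*P exists at any time."""
    C = defaultdict(lambda: defaultdict(float))
    for i, a_row in A.items():
        # Row i of A*P, accumulated in a temporary hash table.
        ap_row = defaultdict(float)
        for k, a in a_row.items():
            for j, p in P.get(k, {}).items():
                ap_row[j] += a * p
        # Scatter P[i, I] * (A*P)[i, :] into row I of C; no explicit P^T.
        for I, p in P.get(i, {}).items():
            c_row = C[I]
            for j, v in ap_row.items():
                c_row[j] += p * v
    return {I: dict(row) for I, row in C.items()}

# Illustrative data (an assumption, not from the paper): a 4x4 1D Laplacian
# and a piecewise-constant interpolation that merges points in pairs.
A = {0: {0: 2.0, 1: -1.0},
     1: {0: -1.0, 1: 2.0, 2: -1.0},
     2: {1: -1.0, 2: 2.0, 3: -1.0},
     3: {2: -1.0, 3: 2.0}}
P = {0: {0: 1.0}, 1: {0: 1.0}, 2: {1: 1.0}, 3: {1: 1.0}}

C_two_step = ptap_two_step(A, P)
C_single_pass = ptap_all_at_once(A, P)
```

In this sketch the two variants produce the same coarse matrix; the difference is that the single-pass version never holds the full product A*P or an explicit transpose of P, only one temporary row at a time. The parallel communication and the "merged" variant mentioned in the abstract are not modeled here.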

research
06/18/2019

A scalable multilevel domain decomposition preconditioner with a subspace-based coarsening algorithm for the neutron transport calculations

The multigroup neutron transport equations has been widely used to study...
research
03/06/2018

Scaling Structured Multigrid to 500K+ Cores through Coarse-Grid Redistribution

The efficient solution of sparse, linear systems resulting from the disc...
research
12/02/2016

Implementation and evaluation of data-compression algorithms for irregular-grid iterative methods on the PEZY-SC processor

Iterative methods on irregular grids have been used widely in all areas ...
research
08/23/2019

Stencil scaling for vector-valued PDEs on hybrid grids with applications to generalized Newtonian fluids

Matrix-free finite element implementations for large applications provid...
research
08/23/2019

Stencil scaling for vector-valued PDEs with applications to generalized Newtonian fluids

Matrix-free finite element implementations for large applications provid...
