ILU Smoothers for Low Mach Navier-Stokes Pressure Solvers

by Stephen Thomas, et al.

Incomplete LU (ILU) smoothers are effective in the algebraic multigrid (AMG) V-cycle for reducing high-frequency components of the error. However, the requisite direct triangular solves are comparatively slow on GPUs. Previous work by Anzt et al. (2015) demonstrated the advantages of Jacobi iteration as an alternative to direct solution of these systems. Depending on the threshold and fill-level parameters chosen, the factors can be highly non-normal, and in this case Jacobi is unlikely to converge in a small number of iterations. We demonstrate that row scaling can reduce the departure from normality, allowing us to replace the inherently sequential solve with a rapidly converging Richardson iteration. There are several advantages beyond the lower compute time. Scaling is performed locally for a diagonal block of the global matrix because it is applied directly to the factor. Further, an ILUT Schur complement smoother maintains a constant GMRES iteration count as the number of MPI ranks increases, and thus parallel strong scaling is improved. Our algorithms have been incorporated into hypre, and we demonstrate improved time to solution for Nalu-Wind and PeleLM pressure solvers. For large problem sizes, GMRES+AMG executes at least five times faster when using iterative triangular solves compared with direct solves on massively parallel GPUs.
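
To make the idea of an iterative triangular solve concrete, the following is a minimal NumPy sketch, not the hypre implementation. It assumes a dense lower-triangular factor with a nonzero diagonal, scales each row by its diagonal entry so the factor becomes I + N with N strictly lower triangular, and then applies the Richardson update x ← x + (b − (I + N)x). The function name `richardson_lower_solve` and the parameter `num_iters` are hypothetical and chosen for illustration only.

```python
import numpy as np

def richardson_lower_solve(L, b, num_iters):
    """Approximately solve L x = b for lower-triangular L (illustrative sketch).

    Assumption: L is a dense square array with nonzero diagonal entries.
    """
    d = np.diag(L)
    L_scaled = L / d[:, None]   # row scaling: the scaled factor has a unit diagonal
    b_scaled = b / d            # scale the right-hand side consistently
    x = b_scaled.copy()         # initial guess
    for _ in range(num_iters):
        # Richardson update: add the current residual of the scaled system.
        # Each sweep is a matrix-vector product, which parallelizes well,
        # unlike the sequential forward substitution of a direct solve.
        x = x + (b_scaled - L_scaled @ x)
    return x

# Small usage example on a made-up 6 x 6 factor.
rng = np.random.default_rng(0)
n = 6
L = np.tril(rng.uniform(-0.5, 0.5, (n, n)), k=-1) + np.diag(rng.uniform(1.0, 2.0, n))
b = rng.standard_normal(n)
x = richardson_lower_solve(L, b, num_iters=n)
residual_norm = np.linalg.norm(L @ x - b)
```

Because N is strictly lower triangular it is nilpotent, so in exact arithmetic the iteration reproduces the direct solve after at most n sweeps. The practical interest is in stopping much earlier, which works well only when the off-diagonal part of the scaled factor is small, that is, when the departure from normality is modest.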




Neumann Series in GMRES and Algebraic Multigrid Smoothers

Neumann series underlie both Krylov methods and algebraic multigrid smoo...

A Direct Õ(1/ε) Iteration Parallel Algorithm for Optimal Transport

Optimal transportation, or computing the Wasserstein or “earth mover's” ...

A Hybrid Direct-Iterative Method for Solving KKT Linear Systems

We propose a solution strategy for linear systems arising in interior me...

Inexact subdomain solves using deflated GMRES for Helmholtz problems

We examine the use of a two-level deflation preconditioner combined with...

Low-Synch Gram-Schmidt with Delayed Reorthogonalization for Krylov Solvers

The parallel strong-scaling of Krylov iterative methods is largely deter...

Partitioned Coupling vs. Monolithic Block-Preconditioning Approaches for Solving Stokes-Darcy Systems

We consider the time-dependent Stokes-Darcy problem as a model case for ...

Two-Stage Gauss–Seidel Preconditioners and Smoothers for Krylov Solvers on a GPU cluster

Gauss-Seidel (GS) relaxation is often employed as a preconditioner for a...
