Chebyshev Filter Diagonalization on Modern Manycore Processors and GPGPUs

03/06/2018
by Moritz Kreutzer, et al.

Chebyshev filter diagonalization is well established in quantum chemistry and quantum physics to compute bulks of eigenvalues of large sparse matrices. Choosing a block vector implementation, we investigate optimization opportunities on the new class of high-performance compute devices featuring both high-bandwidth and low-bandwidth memory. We focus on the transparent access to the full address space supported by both architectures under consideration: Intel Xeon Phi "Knights Landing" and Nvidia "Pascal." We propose two optimizations: (1) Subspace blocking is applied for improved performance and data access efficiency. We also show that it allows transparently handling problems much larger than the high-bandwidth memory without significant performance penalties. (2) Pipelining of communication and computation phases of successive subspaces is implemented to hide communication costs without extra memory traffic. As an application scenario we use filter diagonalization studies on topological insulator materials. Performance numbers on up to 512 nodes of the OakForest-PACS and Piz Daint supercomputers are presented, achieving beyond 100 Tflop/s for computing 100 inner eigenvalues of sparse matrices of dimension one billion.
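
The following is a minimal sketch of the two ideas named in the abstract: applying a Chebyshev polynomial filter to a block of vectors, and splitting that block into smaller column subspaces that are filtered one after another. It is not the authors' implementation. The function names, the list-based sparse format, and the parameters `coeffs`, `center`, `half_width`, and `chunk` are assumptions made for illustration; the filter coefficients are taken as given, and the pipelining of communication and computation is not shown.

```python
def spmmv(rows, X):
    """Multiply a sparse matrix by a block vector.

    rows[i] lists the (column, value) pairs of matrix row i; X[i] holds
    row i of the block, one entry per vector in the block.
    """
    width = len(X[0])
    Y = []
    for row in rows:
        acc = [0.0] * width
        for col, val in row:
            x_row = X[col]
            for j in range(width):
                acc[j] += val * x_row[j]
        Y.append(acc)
    return Y


def chebyshev_filter(rows, X, coeffs, center, half_width):
    """Return sum_k coeffs[k] * T_k(H_s) X, with H_s = (H - center) / half_width.

    center and half_width are assumed to map the spectrum of H into [-1, 1].
    """
    def apply_scaled(W):
        HW = spmmv(rows, W)
        return [[(h - center * w) / half_width for h, w in zip(h_row, w_row)]
                for h_row, w_row in zip(HW, W)]

    prev = [list(x_row) for x_row in X]                  # T_0(H_s) X
    out = [[coeffs[0] * x for x in x_row] for x_row in prev]
    if len(coeffs) == 1:
        return out
    cur = apply_scaled(prev)                             # T_1(H_s) X
    out = [[o + coeffs[1] * c for o, c in zip(o_row, c_row)]
           for o_row, c_row in zip(out, cur)]
    for c_k in coeffs[2:]:
        # Three-term recurrence: T_{k+1} = 2 H_s T_k - T_{k-1}
        nxt = [[2.0 * s - p for s, p in zip(s_row, p_row)]
               for s_row, p_row in zip(apply_scaled(cur), prev)]
        prev, cur = cur, nxt
        out = [[o + c_k * c for o, c in zip(o_row, c_row)]
               for o_row, c_row in zip(out, cur)]
    return out


def filter_in_subspaces(rows, X, coeffs, center, half_width, chunk):
    """Apply the full filter to `chunk` columns of X at a time."""
    width = len(X[0])
    Y = [[0.0] * width for _ in X]
    for start in range(0, width, chunk):
        stop = min(start + chunk, width)
        sub = [x_row[start:stop] for x_row in X]         # one column subspace
        filtered = chebyshev_filter(rows, sub, coeffs, center, half_width)
        for y_row, f_row in zip(Y, filtered):
            y_row[start:stop] = f_row
    return Y
```

Because the filter acts on each column independently, filtering in subspaces gives the same result as filtering the whole block at once; only the size of the working set changes. In the setting described above, `chunk` would be chosen so that the vectors of one subspace fit in high-bandwidth memory.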
