High-performance evaluation of high angular momentum 4-center Gaussian integrals on modern accelerated processors

07/07/2023
by   Andrey Asadchev, et al.

We present a high-performance evaluation method for 4-center 2-particle integrals over Gaussian atomic orbitals (AOs) with high angular momenta (l≥4) and arbitrary contraction degrees on graphics processing units (GPUs) and other accelerators. The implementation uses the matrix form of the McMurchie-Davidson recurrences. Evaluation of the 4-center integrals over four l=6 (i) Gaussian AOs in double precision (FP64) on an NVIDIA V100 GPU outperforms the reference implementation of the Obara-Saika recurrences (Libint) running on a single Intel Xeon core by more than a factor of 1000, healthily exceeding the 73:1 ratio of the respective hardware peak FLOP rates while reaching almost 50% of the V100 peak. The approach can be extended to support AOs with even higher angular momenta; for low angular momenta, alternative approaches will be needed to achieve optimal performance. The implementation is part of the open-source LibintX library, freely available at github.com:ValeevGroup/LibintX.
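To give a sense of scale for the l=6 case mentioned above, the following minimal sketch counts the functions in a shell and the integrals in one shell quartet using the standard textbook formulas. The function names are hypothetical and chosen only for illustration; nothing here is taken from LibintX or Libint, and the sketch does not reproduce the matrix formulation used in the paper.

```python
# Standard counting formulas for Gaussian shells (illustrative helpers,
# not part of any library API).

def n_cartesian(l: int) -> int:
    """Number of Cartesian Gaussians x^i y^j z^k with i + j + k = l."""
    return (l + 1) * (l + 2) // 2


def n_solid_harmonic(l: int) -> int:
    """Number of real solid-harmonic (pure) Gaussians of angular momentum l."""
    return 2 * l + 1


def n_hermite(L: int) -> int:
    """Number of Hermite Gaussians with total order 0..L.

    In the McMurchie-Davidson scheme a product of two Gaussians with
    angular momenta la and lb is expanded in Hermite Gaussians up to
    order L = la + lb.
    """
    return (L + 1) * (L + 2) * (L + 3) // 6


l = 6  # i-type shell, as in the benchmark quoted in the abstract

print(n_cartesian(l))            # 28 Cartesian functions per shell
print(n_solid_harmonic(l))       # 13 solid-harmonic functions per shell
print(n_cartesian(l) ** 4)       # 614656 Cartesian integrals per (ii|ii) quartet
print(n_solid_harmonic(l) ** 4)  # 28561 solid-harmonic integrals per quartet
print(n_hermite(2 * l))          # 455 Hermite Gaussians for one i-i product
```

These counts apply to a single primitive shell quartet; contraction multiplies the work further, which is one reason high angular momentum integrals are dominated by arithmetic rather than by data movement.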
