High-performance evaluation of high angular momentum 4-center Gaussian integrals on modern accelerated processors
We present a high-performance evaluation method for 4-center 2-particle integrals over Gaussian atomic orbitals (AOs) with high angular momenta (l≥4) and arbitrary contraction degrees on graphics processing units (GPUs) and other accelerators. The implementation uses the matrix form of the McMurchie-Davidson recurrences. Evaluation of the 4-center integrals over four l=6 (i) Gaussian AOs in double precision (FP64) on an NVIDIA V100 GPU outperforms the reference implementation of the Obara-Saika recurrences (Libint) running on a single Intel Xeon core by more than a factor of 1000, well above the 73:1 ratio of the respective hardware peak FLOP rates, while reaching almost 50% of the V100 peak. The approach can be extended to support AOs with even higher angular momenta; for low angular momenta, alternative approaches will be needed to achieve optimal performance. The implementation is part of the open-source LibintX library, freely available at github.com:ValeevGroup/LibintX.
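To illustrate what a matrix form of the McMurchie-Davidson scheme can look like, the following is a minimal one-dimensional sketch, not the LibintX implementation. It uses the textbook two-term recurrence for the Hermite expansion coefficients E_t^{ij}, stacks them into a matrix whose rows are the Cartesian index pairs (i, j) and whose columns are the Hermite index t, and obtains 1D overlap integrals as a matrix-vector product with the vector of Hermite integrals. The function names (`hermite_coeff`, `expansion_matrix`, `overlaps_1d`) are hypothetical and chosen only for this example; the 4-center case applies the same kind of transformation to both the bra and the ket.

```python
import math

def hermite_coeff(i, j, t, a, b, xab):
    """Hermite expansion coefficient E_t^{ij} for two 1D Gaussians with
    exponents a, b and center separation xab = A - B (textbook recurrence)."""
    p = a + b
    if t < 0 or t > i + j:
        return 0.0
    if i == 0 and j == 0:
        # Only t == 0 reaches this point: the Gaussian product prefactor.
        return math.exp(-(a * b / p) * xab * xab)
    if i > 0:
        xpa = -b * xab / p  # P - A, with P the exponent-weighted center
        return (hermite_coeff(i - 1, j, t - 1, a, b, xab) / (2 * p)
                + xpa * hermite_coeff(i - 1, j, t, a, b, xab)
                + (t + 1) * hermite_coeff(i - 1, j, t + 1, a, b, xab))
    xpb = a * xab / p  # P - B
    return (hermite_coeff(i, j - 1, t - 1, a, b, xab) / (2 * p)
            + xpb * hermite_coeff(i, j - 1, t, a, b, xab)
            + (t + 1) * hermite_coeff(i, j - 1, t + 1, a, b, xab))

def expansion_matrix(la, lb, a, b, xab):
    """Matrix E with one row per (i, j), i <= la, j <= lb, and one column
    per Hermite index t = 0..la+lb. Row index is i * (lb + 1) + j."""
    return [[hermite_coeff(i, j, t, a, b, xab) for t in range(la + lb + 1)]
            for i in range(la + 1) for j in range(lb + 1)]

def overlaps_1d(la, lb, a, b, xab):
    """1D overlaps for all (i, j) as a matrix-vector product E @ h, where
    h holds the integrals of the Hermite Gaussians (nonzero only for t = 0)."""
    p = a + b
    hermite_integrals = [math.sqrt(math.pi / p)] + [0.0] * (la + lb)
    E = expansion_matrix(la, lb, a, b, xab)
    return [sum(e * h for e, h in zip(row, hermite_integrals)) for row in E]
```

For example, `overlaps_1d(2, 1, 0.8, 1.3, 0.7)` returns six values, one per (i, j) pair with i ≤ 2 and j ≤ 1. The recursive evaluation here is chosen for readability only; a performance-oriented code would tabulate the coefficients and batch the transformation as dense matrix multiplications.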